Client Setup
Configure your AI tools and SDKs to use Harbangan as their backend. Each client points at your gateway’s base URL and authenticates with a personal API key.
Before You Start
Your configuration depends on your deployment mode:
| | Proxy-Only | Full Deployment |
|---|---|---|
| API Key | `PROXY_API_KEY` from `.env.proxy` | Personal key from Web UI (`/_ui/`) |
| Providers | All providers via env vars | All providers via Web UI OAuth + Admin UI |
| Models | `claude-*` names, `auto` | `claude-*` names, `auto`, plus `anthropic/`, `openai_codex/`, `copilot/` prefixes |
| Web UI | Not available | Available at `/_ui/` for OAuth setup |
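Whichever mode you run, clients authenticate the same way: one key in the auth header. The sketch below shows one way a script could pick the right credential at startup; the `HARBANGAN_PERSONAL_KEY` variable name is purely illustrative (only `PROXY_API_KEY` comes from this page).

```python
import os

def resolve_api_key() -> str:
    # Full deployment: prefer a personal key minted in the Web UI (/_ui/).
    # HARBANGAN_PERSONAL_KEY is a hypothetical variable name for this sketch.
    personal = os.environ.get("HARBANGAN_PERSONAL_KEY")
    if personal:
        return personal
    # Proxy-only mode: fall back to the shared PROXY_API_KEY from .env.proxy.
    shared = os.environ.get("PROXY_API_KEY")
    if shared:
        return shared
    raise RuntimeError("No gateway credential configured")

os.environ["PROXY_API_KEY"] = "example-key"
print(resolve_api_key())
```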
Claude Code CLI
The fastest way to get started. Claude Code works out of the box with the Anthropic-compatible /v1/messages endpoint.
One-liner
```sh
ANTHROPIC_BASE_URL=https://gateway.example.com \
ANTHROPIC_AUTH_TOKEN=YOUR_API_KEY \
CLAUDE_CODE_ENABLE_TELEMETRY=0 \
DISABLE_PROMPT_CACHING=1 \
DISABLE_NON_ESSENTIAL_MODEL_CALLS=1 \
CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1 \
claude
```
Shell profile (persistent)
Add the following to your ~/.bashrc, ~/.zshrc, or equivalent:
```sh
export ANTHROPIC_BASE_URL=https://gateway.example.com
export ANTHROPIC_AUTH_TOKEN=YOUR_API_KEY
export CLAUDE_CODE_ENABLE_TELEMETRY=0
export DISABLE_PROMPT_CACHING=1
export DISABLE_NON_ESSENTIAL_MODEL_CALLS=1
export CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1
```
Environment variables
| Variable | Value | Purpose |
|---|---|---|
| `ANTHROPIC_BASE_URL` | `https://gateway.example.com` | Points Claude Code at your gateway |
| `ANTHROPIC_AUTH_TOKEN` | `YOUR_API_KEY` | API key from the Web UI or `PROXY_API_KEY` |
| `CLAUDE_CODE_ENABLE_TELEMETRY` | `0` | Disables telemetry (calls that would fail through the proxy) |
| `DISABLE_PROMPT_CACHING` | `1` | Disables prompt caching (not supported by the Kiro backend) |
| `DISABLE_NON_ESSENTIAL_MODEL_CALLS` | `1` | Prevents background model calls that waste tokens |
| `CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC` | `1` | Prevents non-essential network requests |
Replace gateway.example.com with your actual domain. For the API key: use a personal key from the Web UI (/_ui/ → API Keys) in full deployment, or your PROXY_API_KEY from .env.proxy in proxy-only mode.
Zed Editor
Add the following to ~/.config/zed/settings.json:
```json
{
  "agent_servers": {
    "claude": {
      "env": {
        "ANTHROPIC_BASE_URL": "https://gateway.example.com",
        "ANTHROPIC_AUTH_TOKEN": "YOUR_API_KEY",
        "CLAUDE_CODE_ENABLE_TELEMETRY": "0",
        "DISABLE_PROMPT_CACHING": "1",
        "DISABLE_NON_ESSENTIAL_MODEL_CALLS": "1",
        "CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC": "1"
      }
    }
  }
}
```
Restart Zed after saving the configuration for changes to take effect.
OpenCode
The configuration below uses Kiro models (works in both modes). In full deployment, you can add direct provider models using the provider/model prefix — see Model Naming.
Tested model limits
| Model | Context Window | Max Output |
|---|---|---|
| Opus 4.6 | ~195K tokens | Unknown |
| Sonnet 4.6 | ~195K tokens | Unknown |
| Haiku 4.5 | ~195K tokens | Unknown |
Thinking mode is supported. The standard output token limit is 8192 tokens when thinking is not enabled.
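To make the limits concrete, here is a sketch of an Anthropic-style `/v1/messages` payload with thinking enabled. The figures (16000 output tokens, a 10000-token thinking budget) are taken from the config on this page; they are assumptions, not gateway-enforced constants.

```python
import json

def thinking_payload(model: str, prompt: str, budget: int = 10000) -> dict:
    payload = {
        "model": model,
        "max_tokens": 16000,  # must exceed the thinking budget
        "thinking": {"type": "enabled", "budget_tokens": budget},
        "messages": [{"role": "user", "content": prompt}],
    }
    # The Anthropic Messages API requires max_tokens > budget_tokens.
    assert payload["max_tokens"] > budget
    return payload

print(json.dumps(thinking_payload("sonnet-4.6-thinking", "Hello")["thinking"]))
# {"type": "enabled", "budget_tokens": 10000}
```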
Configuration
Create or edit ~/.config/opencode/opencode.json:
```json
{
  "provider": {
    "kiro": {
      "name": "Kiro Gateway",
      "type": "anthropic",
      "api_key": "YOUR_API_KEY",
      "url": "https://gateway.example.com",
      "models": {
        "auto": {
          "name": "Auto (Gateway Default)",
          "max_tokens": 8192,
          "context_window": 195000,
          "supports_reasoning": true,
          "reasoning": {
            "budget_tokens": 10000
          }
        },
        "haiku-4.5": {
          "name": "Claude Haiku 4.5",
          "max_tokens": 8192,
          "context_window": 195000,
          "supports_reasoning": true,
          "reasoning": {
            "budget_tokens": 10000
          }
        },
        "sonnet-4.6": {
          "name": "Claude Sonnet 4.6",
          "max_tokens": 8192,
          "context_window": 195000,
          "supports_reasoning": true,
          "reasoning": {
            "budget_tokens": 10000
          }
        },
        "opus-4.6": {
          "name": "Claude Opus 4.6",
          "max_tokens": 8192,
          "context_window": 195000,
          "supports_reasoning": true,
          "reasoning": {
            "budget_tokens": 10000
          }
        },
        "sonnet-4.6-thinking": {
          "name": "Claude Sonnet 4.6 (Thinking)",
          "max_tokens": 16000,
          "context_window": 195000,
          "supports_reasoning": true,
          "reasoning": {
            "budget_tokens": 10000
          }
        },
        "opus-4.6-thinking": {
          "name": "Claude Opus 4.6 (Thinking)",
          "max_tokens": 16000,
          "context_window": 195000,
          "supports_reasoning": true,
          "reasoning": {
            "budget_tokens": 10000
          }
        }
      }
    }
  }
}
```
Cursor / VS Code
- Open settings and find the AI provider configuration.
- Set the API base URL to `https://your-domain.com/v1`.
- Enter your personal API key from the Web UI.
- Select any supported Claude model.
OpenAI Python SDK
The gateway’s /v1/chat/completions endpoint is OpenAI-compatible, so the standard SDK works directly:
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://your-domain.com/v1",
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="claude-sonnet-4",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```
Anthropic Python SDK
The gateway’s /v1/messages endpoint is Anthropic-compatible:
```python
import anthropic

client = anthropic.Anthropic(
    base_url="https://your-domain.com",
    api_key="YOUR_API_KEY",
)

message = client.messages.create(
    model="claude-sonnet-4",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}],
)
print(message.content[0].text)
```
Model Naming
Kiro pipeline (default)
Use Claude model names directly. The gateway normalizes variants automatically:
| You Send | Gateway Resolves |
|---|---|
| `claude-sonnet-4-6` | `claude-sonnet-4.6` |
| `claude-3-7-sonnet-20250219` | `claude-3.7-sonnet` |
| `claude-haiku-4-5-latest` | `claude-haiku-4.5` |
| `auto` | Gateway picks the best available model |
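The normalization in the table above can be approximated by two rewrites: strip a `-latest` or date suffix, then convert a dashed version number to a dotted one. This is a sketch of the observable behavior, not the gateway's actual code:

```python
import re

def normalize(model: str) -> str:
    m = re.sub(r"-(latest|\d{8})$", "", model)      # drop -latest / -20250219
    m = re.sub(r"-(\d+)-(\d+)-", r"-\1.\2-", m)     # mid-name 3-7 -> 3.7
    m = re.sub(r"-(\d+)-(\d+)$", r"-\1.\2", m)      # trailing 4-6 -> 4.6
    return m

print(normalize("claude-sonnet-4-6"))          # claude-sonnet-4.6
print(normalize("claude-3-7-sonnet-20250219")) # claude-3.7-sonnet
print(normalize("claude-haiku-4-5-latest"))    # claude-haiku-4.5
```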
Direct providers (full deployment only)
Prefix with the provider name to bypass Kiro and route to a direct API:
| Model String | Routes To |
|---|---|
| `anthropic/claude-opus-4-6` | Anthropic API directly |
| `openai_codex/gpt-4` | OpenAI Codex |
| `copilot/gpt-4` | GitHub Copilot |
Direct providers require per-user OAuth tokens configured in the Web UI Providers page. Without OAuth tokens, requests fall back to Kiro automatically.
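The routing rule described above reduces to splitting on the first slash, with unprefixed names staying on the default Kiro pipeline. A minimal sketch of that behavior (not the gateway's implementation):

```python
def route(model: str) -> tuple[str, str]:
    # Provider-prefixed names ("anthropic/...") bypass Kiro; everything else,
    # including "auto", goes to the default Kiro pipeline.
    if "/" in model:
        provider, name = model.split("/", 1)
        return provider, name
    return "kiro", model

print(route("anthropic/claude-opus-4-6"))  # ('anthropic', 'claude-opus-4-6')
print(route("auto"))                       # ('kiro', 'auto')
```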
Known Limitations
Web Search in Claude Code
Web search does not work through the proxy. The Kiro backend does not support the tool_use / tool_result round-trip that Claude Code’s built-in web search relies on.
Workaround: Use local MCP servers for web access instead. Add the following to ~/.claude.json:
```json
{
  "mcpServers": {
    "mcp-server-fetch": {
      "command": "uvx",
      "args": ["mcp-server-fetch"]
    },
    "exa-mcp-server": {
      "command": "npx",
      "args": ["-y", "exa-mcp-server"],
      "env": {
        "EXA_API_KEY": "YOUR_EXA_API_KEY"
      }
    }
  }
}
```
| MCP Server | Description |
|---|---|
| `mcp-server-fetch` | Fetches and extracts content from URLs |
| `exa-mcp-server` | AI-powered web search via the Exa API (requires `EXA_API_KEY`) |