# Deployment Guide

Deployment instructions for Harbangan. Covers both Proxy-Only Mode (a single container) and Full Deployment (multi-user, via Docker Compose).
## Table of Contents
- Deployment Modes
- Proxy-Only Mode Deployment
- Full Deployment
  - Prerequisites
  - Step 1: Clone the Repository
  - Step 2: Configure Environment Variables
  - Step 3: Build and Start
  - Step 4: Complete Web UI Setup
  - Step 5: Verify
- Docker Compose File Reference
- Day-to-Day Operations
- PostgreSQL
- Health Monitoring
- Datadog APM (Optional)
- Next Steps
## Deployment Modes

Harbangan supports two deployment modes:

- **Proxy-Only Mode** — A single backend container with no database or web UI. Uses `docker-compose.gateway.yml`. Best for personal use or quick evaluation.
- **Full Deployment** — Three containers (backend, PostgreSQL, frontend) with Google SSO, per-user API keys, and a web dashboard. Uses `docker-compose.yml`. Best for teams and development. Production deployment targets Kubernetes, with TLS handled by an Ingress controller.
## Proxy-Only Mode Deployment

### Architecture

```
Client (localhost) → gateway container (127.0.0.1:8000, plain HTTP)
                         ├── Kiro API (AWS CodeWhisperer)
                         ├── Anthropic API
                         ├── OpenAI API
                         ├── GitHub Copilot API
                         └── Custom OpenAI-compatible endpoint
```

A single container running the Rust backend — no database or web UI. Supports all providers (Kiro, Anthropic, OpenAI Codex, Copilot, Custom) via environment variables.

- **Authentication** — a single `PROXY_API_KEY` environment variable.
- **Kiro credentials** — obtained via an AWS SSO device code flow on first boot and cached to a Docker volume (`gateway-data:/data/tokens.json`).
- **Additional providers** — configured via API key or token environment variables (see the Configuration Reference).
- **Network exposure** — the port binds to 127.0.0.1 only and is not accessible from external networks.
- **Hardening** — the container runs as a non-root user (`appuser`) with a 512 MB memory limit.
| Service | Image | Purpose |
|---|---|---|
| `gateway` | `ghcr.io/if414013/harbangan-backend:latest` | Rust API server on a configurable port (default 8000) |
### Prerequisites
- Docker Engine 20.10+ and Docker Compose v2
- An AWS Builder ID (free) or Identity Center (pro) account
### Step 1: Clone the Repository

```bash
git clone https://github.com/if414013/harbangan.git
cd harbangan
```
### Step 2: Configure Environment Variables

Copy `.env.proxy.example` to `.env.proxy` and set your values:

```bash
GATEWAY_MODE=proxy
PROXY_API_KEY=your-secret-api-key

# Optional — defaults to us-east-1:
# KIRO_REGION=us-east-1

# For Identity Center (pro): set your SSO start URL
# KIRO_SSO_URL=https://your-org.awsapps.com/start
# KIRO_SSO_REGION=us-east-1
```
See the Configuration Reference for all available variables.
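Before starting the gateway, you can sanity-check the file with a few lines of Python. This is an illustrative sketch, not part of Harbangan — `parse_env` and `check_proxy_env` are hypothetical helpers, and the check covers only the two required variables shown above:

```python
# Minimal .env.proxy sanity check (illustrative; not Harbangan code).
# Parses KEY=VALUE lines and confirms the required settings are present.
def parse_env(text: str) -> dict[str, str]:
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env

def check_proxy_env(env: dict[str, str]) -> list[str]:
    problems = []
    if env.get("GATEWAY_MODE") != "proxy":
        problems.append("GATEWAY_MODE must be 'proxy'")
    if not env.get("PROXY_API_KEY"):
        problems.append("PROXY_API_KEY is required")
    return problems

sample = """GATEWAY_MODE=proxy
PROXY_API_KEY=your-secret-api-key
# KIRO_REGION=us-east-1
"""
print(check_proxy_env(parse_env(sample)))  # → []
```

An empty list means the two required settings are in place; commented-out optional variables are ignored, as in the example file.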
### Step 3: Start the Gateway

```bash
docker compose -f docker-compose.gateway.yml --env-file .env.proxy up -d
```

On first boot, the container runs an AWS SSO device code flow. Check the logs:

```bash
docker compose -f docker-compose.gateway.yml logs -f gateway
```

You'll see a URL and user code to authorize in your browser. After authorization, credentials are cached in the `gateway-data` Docker volume and reused on subsequent restarts.
### Step 4: Verify

```bash
curl http://localhost:8000/health
# Expected: {"status":"ok"}

curl http://localhost:8000/v1/chat/completions \
  -H "Authorization: Bearer your-secret-api-key" \
  -H "Content-Type: application/json" \
  -d '{"model":"claude-sonnet-4-6","messages":[{"role":"user","content":"Hello!"}],"max_tokens":50}'
```
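The same chat-completion request can be built from Python using only the standard library. This is a sketch under this section's assumptions (gateway on `localhost:8000`, the `PROXY_API_KEY` from Step 2); `build_chat_request` is a hypothetical helper, and the actual network call is left commented out so nothing runs without a live gateway:

```python
import json
import urllib.request

# Build the same OpenAI-compatible chat-completion request as the curl example.
# Assumes the gateway is listening on localhost:8000 (Proxy-Only Mode default).
def build_chat_request(api_key: str, model: str, prompt: str, max_tokens: int = 50):
    url = "http://localhost:8000/v1/chat/completions"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(url, data=json.dumps(body).encode(), headers=headers)

req = build_chat_request("your-secret-api-key", "claude-sonnet-4-6", "Hello!")
# with urllib.request.urlopen(req) as resp:   # uncomment with a running gateway
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

Because the endpoint is OpenAI-compatible, any OpenAI client library pointed at this base URL should also work.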
### Proxy-Only Mode Operations

```bash
# View logs
docker compose -f docker-compose.gateway.yml logs -f gateway

# Stop the gateway
docker compose -f docker-compose.gateway.yml down

# Restart (reuses cached credentials)
docker compose -f docker-compose.gateway.yml --env-file .env.proxy up -d

# Rebuild after code changes
docker compose -f docker-compose.gateway.yml --env-file .env.proxy up -d --build

# Re-authorize (clear cached credentials)
docker volume rm harbangan_gateway-data
docker compose -f docker-compose.gateway.yml --env-file .env.proxy up
```
### Proxy-Only Volume Layout

| Volume | Type | Purpose |
|---|---|---|
| `gateway-data` | Named volume | Cached Kiro credentials (`/data/tokens.json`) |
## Full Deployment

### Architecture

The Full Deployment runs via Docker Compose with three services:
```mermaid
graph LR
    subgraph Internet
        client1["OpenAI Client<br/>(Python/Node.js)"]
        client2["Anthropic Client"]
        client3["Web Browser"]
    end

    subgraph Docker["Docker Compose Stack"]
        subgraph frontend_container["frontend (nginx)"]
            vite["nginx<br/>:5173 → :80"]
        end
        subgraph backend_container["backend"]
            gw["Rust API Server<br/>:9999 (HTTP)"]
        end
        subgraph db_container["db"]
            pg["PostgreSQL 16<br/>:5432"]
        end
    end

    subgraph AWS["AWS"]
        kiro["Kiro API<br/>(CodeWhisperer)"]
        sso["AWS SSO OIDC"]
    end

    client1 -->|"HTTP /v1/*"| gw
    client2 -->|"HTTP /v1/*"| gw
    client3 -->|"HTTP /_ui/"| vite
    vite -->|"Proxy /_ui/api"| gw
    gw -->|"HTTPS Event Stream"| kiro
    gw -->|"OAuth token refresh"| sso
    gw -->|"Config + credentials"| pg

    style vite fill:#4a9eff,color:#fff
    style gw fill:#4a9eff,color:#fff
    style pg fill:#336791,color:#fff
    style kiro fill:#ff9900,color:#fff
    style sso fill:#ff9900,color:#fff
```
| Service | Image | Purpose |
|---|---|---|
| `db` | `postgres:16-alpine` | PostgreSQL database for config, credentials, and user data |
| `backend` | `harbangan-backend:latest` (built locally) | Rust API server — plain HTTP on port 9999 |
| `frontend` | `harbangan-frontend:latest` (built locally) | nginx — serves the built React SPA and proxies API requests to the backend |
> **Note:** This Docker Compose setup is intended for development. Production deployment targets Kubernetes, where TLS is handled by an Ingress controller.
### Prerequisites

- Docker Engine 20.10+ and Docker Compose v2
- At least 1 GB RAM and 2 GB disk space
- Optionally, Google OAuth credentials if you want Google SSO (password auth can also be used exclusively — see the Configuration Reference)
### Step 1: Clone the Repository

```bash
git clone https://github.com/if414013/harbangan.git
cd harbangan
```
### Step 2: Configure Environment Variables

```bash
cp .env.example .env
```

Edit `.env`:

```bash
# PostgreSQL password (required)
POSTGRES_PASSWORD=your_secure_password_here

# Optional: seed an admin user for password-based login (first-run only)
# INITIAL_ADMIN_EMAIL=admin@example.com
# INITIAL_ADMIN_PASSWORD=changeme
# INITIAL_ADMIN_TOTP_SECRET=JBSWY3DPEHPK3PXP
```

> **Note:** Google SSO is configured via the Admin UI after initial login, not via environment variables. You can use password auth for the first login by setting the `INITIAL_ADMIN_*` variables above.

The following are managed automatically by `docker-compose.yml` — do not set them in `.env`:

- `SERVER_HOST` — set to `0.0.0.0` for the backend
- `SERVER_PORT` — set to `9999` for the backend
- `DATABASE_URL` — constructed from `POSTGRES_PASSWORD`
### Step 3: Build and Start

```bash
docker compose up -d --build
```

The first build compiles the Rust backend and React frontend, which takes a few minutes. Subsequent builds are fast unless dependencies change.

Watch the logs to confirm startup:

```bash
docker compose logs -f
```
### Step 4: Complete Web UI Setup

On first launch, the backend starts in setup-only mode — the `/v1/*` proxy endpoints return 503 until an admin completes setup.

Open http://localhost:5173/_ui/ and:

- **Sign in with Google** — the first user is automatically granted the Admin role
- **Add Kiro credentials** — connect your AWS account via the SSO device code flow on the Profile page (the URL appears in the web UI)
- **Create an API key** — generate a personal API key for programmatic access
### Step 5: Verify

```bash
# Health check
curl http://localhost:9999/health
# Expected: {"status":"ok"}

# List models (use your personal API key)
curl -H "Authorization: Bearer YOUR_API_KEY" \
  http://localhost:9999/v1/models

# Test a chat completion
curl -X POST http://localhost:9999/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"claude-sonnet-4","messages":[{"role":"user","content":"Hello!"}],"max_tokens":50}'
```
## Docker Compose File Reference

The `docker-compose.yml` defines three services:

```yaml
services:
  db:
    image: postgres:16-alpine
    restart: unless-stopped
    volumes:
      - pgdata:/var/lib/postgresql/data

  backend:
    build: ./backend
    restart: on-failure:5
    ports:
      - "9999:9999"
    env_file:
      - .env
    environment:
      SERVER_HOST: "0.0.0.0"
      SERVER_PORT: "9999"
      DATABASE_URL: postgres://kiro:${POSTGRES_PASSWORD}@db:5432/kiro_gateway
      INITIAL_ADMIN_EMAIL: ${INITIAL_ADMIN_EMAIL:-}
      INITIAL_ADMIN_PASSWORD: ${INITIAL_ADMIN_PASSWORD:-}
      INITIAL_ADMIN_TOTP_SECRET: ${INITIAL_ADMIN_TOTP_SECRET:-}
    depends_on:
      db: { condition: service_healthy }

  frontend:
    build: ./frontend
    restart: unless-stopped
    ports:
      - "5173:80"
    depends_on:
      backend: { condition: service_healthy }
```
### Volume Layout

| Volume | Type | Purpose |
|---|---|---|
| `pgdata` | Named volume | PostgreSQL data (users, credentials, config, history) |
## Day-to-Day Operations

```bash
# View live logs
docker compose logs -f

# Check container health (should show "healthy" after ~30s)
docker compose ps

# Stop the stack
docker compose down

# Rebuild after code changes
docker compose up -d --build

# Restart without rebuild
docker compose restart backend

# View backend logs only
docker compose logs -f backend
```
### Database Backup

```bash
# Dump the database
docker compose exec db pg_dump -U kiro kiro_gateway > backup.sql

# Restore from backup
docker compose exec -T db psql -U kiro kiro_gateway < backup.sql
```
## PostgreSQL
The gateway uses PostgreSQL for persistent storage of:
- User accounts and roles
- Per-user Kiro credentials (refresh tokens)
- Per-user API keys (SHA-256 hashed)
- Runtime configuration
- Configuration change history
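The hash-at-rest scheme for API keys can be illustrated in a few lines. This is a sketch of the general pattern only — the `hb_` key prefix and helper names here are made up for illustration, not Harbangan's actual key format or code:

```python
import hashlib
import secrets

# Illustrative only: generate an API key, store its SHA-256 digest,
# and verify an incoming key by re-hashing and comparing digests.
def generate_api_key() -> str:
    return "hb_" + secrets.token_urlsafe(32)  # "hb_" prefix is a made-up convention

def hash_api_key(key: str) -> str:
    return hashlib.sha256(key.encode()).hexdigest()

def verify_api_key(presented: str, stored_hash: str) -> bool:
    # Constant-time comparison avoids timing side channels.
    return secrets.compare_digest(hash_api_key(presented), stored_hash)

key = generate_api_key()
stored = hash_api_key(key)        # only this digest goes in the database
assert verify_api_key(key, stored)
assert not verify_api_key("wrong-key", stored)
```

Storing only the digest means a database dump cannot be replayed against the proxy: the plaintext key is shown to the user once and never persisted.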
### Database tables

Tables are created automatically on first connection via an incremental migration system (25 versions). Key tables include:
| Table | Purpose |
|---|---|
| `users` | User accounts (identity, role, status, auth method) |
| `api_keys` | Per-user API keys (SHA-256 hashed, with labels) |
| `sessions` | Persistent user sessions |
| `user_kiro_tokens` | Per-user Kiro refresh tokens |
| `user_provider_tokens` | Per-user provider credentials (Anthropic, OpenAI, Copilot) |
| `user_provider_priority` | Per-user provider priority ordering |
| `user_copilot_tokens` | Copilot-specific token storage |
| `user_provider_keys` | Per-user provider API keys |
| `admin_provider_pool` | Shared provider accounts (admin pool) |
| `model_registry` | Admin-configured model entries |
| `model_visibility_defaults` | Default model visibility settings per provider |
| `model_routes` | Model-to-provider routing rules |
| `provider_settings` | Per-provider enabled/disabled state (admin toggle) |
| `config` | Key-value configuration store |
| `config_history` | Audit log of configuration changes |
| `schema_version` | Database migration tracking |
| `allowed_domains` | Google SSO domain allowlist |
| `usage_records` | Token usage tracking per request |
| `pending_2fa_logins` | Temporary 2FA login tokens (5-minute TTL) |
| `totp_recovery_codes` | TOTP recovery codes (SHA-256 hashed) |
| `guardrail_profiles` | AWS Bedrock guardrail profiles (credentials encrypted) |
| `guardrail_rules` | Guardrail rules (CEL expressions, sampling, timeouts) |
| `guardrail_rule_profiles` | Many-to-many mapping of rules to profiles |
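The incremental migration pattern above — a `schema_version` table gating an ordered list of migrations — can be sketched in a few lines. This illustrates the general technique only, using SQLite as a stand-in and made-up table definitions; Harbangan's actual migrations run against PostgreSQL from the Rust backend:

```python
import sqlite3

# Ordered migrations; each runs at most once, tracked in schema_version.
MIGRATIONS = [
    "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL)",
    "CREATE TABLE api_keys (id INTEGER PRIMARY KEY, user_id INTEGER, key_hash TEXT)",
    "ALTER TABLE users ADD COLUMN role TEXT DEFAULT 'member'",
]

def migrate(conn: sqlite3.Connection) -> int:
    conn.execute("CREATE TABLE IF NOT EXISTS schema_version (version INTEGER NOT NULL)")
    row = conn.execute("SELECT MAX(version) FROM schema_version").fetchone()
    current = row[0] or 0
    for version, sql in enumerate(MIGRATIONS, start=1):
        if version > current:  # apply only migrations newer than what's recorded
            conn.execute(sql)
            conn.execute("INSERT INTO schema_version (version) VALUES (?)", (version,))
    conn.commit()
    return len(MIGRATIONS)

conn = sqlite3.connect(":memory:")
print(migrate(conn))  # → 3
print(migrate(conn))  # idempotent: re-running applies nothing new, still → 3
```

Because the applied version is persisted, upgrading the backend image simply replays any migrations newer than the recorded version on next boot.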
### Connection string

The `DATABASE_URL` is constructed by docker-compose from `POSTGRES_PASSWORD`:

```
postgres://kiro:<POSTGRES_PASSWORD>@db:5432/kiro_gateway
```
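If you assemble this URL outside of docker-compose (for example in a deployment script), special characters in the password must be percent-encoded. A small sketch assuming the same `kiro` user and `kiro_gateway` database as the template above; `database_url` is a hypothetical helper:

```python
from urllib.parse import quote

# Construct a DATABASE_URL matching the docker-compose template above.
# Percent-encode the password so characters like '@' or '/' don't break the URL.
def database_url(password: str, host: str = "db", port: int = 5432) -> str:
    return f"postgres://kiro:{quote(password, safe='')}@{host}:{port}/kiro_gateway"

print(database_url("s3cret"))     # → postgres://kiro:s3cret@db:5432/kiro_gateway
print(database_url("p@ss/word"))  # → postgres://kiro:p%40ss%2Fword@db:5432/kiro_gateway
```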
## Health Monitoring

### Health check endpoint

```bash
curl http://localhost:9999/health
```

Returns `200 OK` with:

```json
{"status":"ok"}
```
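A deployment script can poll this endpoint until the backend reports healthy. A stdlib-only sketch — `wait_healthy` is a hypothetical helper, not part of Harbangan, and the timeout values are arbitrary:

```python
import json
import time
import urllib.error
import urllib.request

def is_healthy(body: bytes) -> bool:
    # The gateway reports {"status":"ok"} when ready.
    try:
        return json.loads(body).get("status") == "ok"
    except json.JSONDecodeError:
        return False

def wait_healthy(url: str = "http://localhost:9999/health",
                 timeout: float = 60.0, interval: float = 2.0) -> bool:
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                if resp.status == 200 and is_healthy(resp.read()):
                    return True
        except (urllib.error.URLError, OSError):
            pass  # backend not up yet; retry
        time.sleep(interval)
    return False

print(is_healthy(b'{"status":"ok"}'))  # → True
```

This mirrors what the Docker health checks do internally: repeated probes with a grace period rather than a single attempt.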
### Docker health checks

All services include built-in health checks:

```bash
docker compose ps
# NAME                   SERVICE    STATUS        PORTS
# harbangan-db-1         db         Up (healthy)  5432/tcp
# harbangan-backend-1    backend    Up (healthy)  0.0.0.0:9999->9999/tcp
# harbangan-frontend-1   frontend   Up (healthy)  0.0.0.0:5173->80/tcp
```
### Web UI monitoring
The Web UI at /_ui/ provides usage tracking and system information:
- Token usage statistics by day, model, and provider
- System info (CPU, memory, uptime)
- Configuration management and change history
- User management (admin-only)
For real-time metrics, latency percentiles, and distributed tracing, use the optional Datadog APM integration (see below).
### Log access

```bash
# All services
docker compose logs -f

# Backend only
docker compose logs -f backend

# Frontend only
docker compose logs -f frontend
```
The backend uses structured logging via `tracing`:

```
INFO kiro_gateway::routes: Request to /v1/chat/completions: model=claude-sonnet-4, stream=true, messages=3
```
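Log lines in this shape can be grepped or parsed for ad-hoc analysis. A small sketch — the regex below is an assumption fitted to the example line above, not a guaranteed stable log format:

```python
import re

# Extract key=value pairs from a tracing request log line like the example above.
LOG_RE = re.compile(r"(\w+)=([^,\s]+)")

def parse_request_log(line: str) -> dict[str, str]:
    return dict(LOG_RE.findall(line))

line = ("INFO kiro_gateway::routes: Request to /v1/chat/completions: "
        "model=claude-sonnet-4, stream=true, messages=3")
print(parse_request_log(line))
# → {'model': 'claude-sonnet-4', 'stream': 'true', 'messages': '3'}
```

For anything beyond quick one-off analysis, the Datadog integration below forwards these logs with trace correlation instead.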
## Datadog APM (Optional)

Harbangan supports Datadog APM for distributed tracing, metrics, log forwarding, and frontend RUM. The integration is zero-overhead when not configured — when `DD_AGENT_HOST` is unset, no Datadog code runs.

> **Note:** Datadog APM is intended for production/Kubernetes deployments where a Datadog Agent is running separately. There is no `--profile datadog` in the Docker Compose files. Set `DD_AGENT_HOST` manually to point at your Datadog Agent.
### Configure Datadog environment variables

Add to your `.env` (Full Deployment) or `.env.proxy` (Proxy-Only):

```bash
DD_API_KEY=your-datadog-api-key
DD_SITE=datadoghq.com   # or datadoghq.eu, us3.datadoghq.com, etc.
DD_ENV=production
```

For frontend Real User Monitoring (RUM), set these before building the frontend image:

```bash
VITE_DD_CLIENT_TOKEN=your-rum-client-token
VITE_DD_APPLICATION_ID=your-rum-application-id
VITE_DD_ENV=production
```
| Variable | Required | Default | Description |
|---|---|---|---|
| `DD_API_KEY` | Yes | — | Datadog API key |
| `DD_SITE` | No | `datadoghq.com` | Datadog intake site |
| `DD_ENV` | No | — | Environment tag (e.g. `production`, `staging`) |
| `VITE_DD_CLIENT_TOKEN` | No | — | RUM client token (baked into the frontend bundle at build time) |
| `VITE_DD_APPLICATION_ID` | No | — | RUM application ID (baked into the frontend bundle at build time) |
### Verify Datadog connectivity

Once your Datadog Agent is running and `DD_AGENT_HOST` is set in the backend's environment, traces appear in your Datadog APM dashboard within ~30 seconds of the first request.
What you'll see in Datadog:

- Distributed traces for every `/v1/*` request with model, user, and latency breakdown
- Metrics: request rate, error rate, latency percentiles, and token usage (per model and user)
- Logs correlated to traces via injected `dd.trace_id`/`dd.span_id` fields
- Frontend RUM sessions linked to backend traces (if the `VITE_DD_*` vars are set at build time)
See the Configuration Reference for all Datadog variables and the Architecture docs for implementation details.
## Next Steps
- Configuration Reference — Environment variables for both Proxy-Only Mode and Full Deployment
- Getting Started — Full setup walkthrough with both deployment modes