
Credential Proxy

CodeBuddy includes a built-in credential proxy — a localhost HTTP server that sits between the webview and upstream LLM providers. Instead of passing API keys to the client-side code, the proxy injects credentials server-side so they never leave the extension host.

Extensions run in two contexts: the extension host (Node.js, trusted) and webviews (browser sandbox, less trusted). Without a proxy, API keys would need to be passed to the webview for direct LLM calls. The proxy keeps credentials in the extension host and only exposes a local URL + session token.

sequenceDiagram
    participant WV as Webview (Chat UI)
    participant Proxy as Credential Proxy<br/>(127.0.0.1:random)
    participant KS as OS Keychain<br/>(SecretStorage)
    participant UP as Upstream Provider<br/>(api.anthropic.com)
    Note over Proxy: Start: bind 127.0.0.1:0<br/>Generate session token
    WV->>Proxy: POST /anthropic/v1/messages<br/>x-codebuddy-proxy-token: abc123
    Proxy->>Proxy: Validate session token
    Proxy->>Proxy: Rate limit check (token bucket)
    Proxy->>KS: Fetch API key for "anthropic"
    KS-->>Proxy: sk-ant-...
    Proxy->>Proxy: Strip sensitive headers<br/>Inject x-api-key + anthropic-version
    Proxy->>UP: POST /v1/messages<br/>x-api-key: sk-ant-...
    UP-->>Proxy: 200 OK (streaming)
    Proxy-->>WV: 200 OK (streaming passthrough)
    Proxy->>Proxy: Log audit entry


| Provider | Upstream URL | Auth header | Auth format |
|---|---|---|---|
| Anthropic | https://api.anthropic.com | x-api-key | Raw key |
| OpenAI | https://api.openai.com | Authorization | Bearer {key} |
| Groq | https://api.groq.com/openai | Authorization | Bearer {key} |
| DeepSeek | https://api.deepseek.com | Authorization | Bearer {key} |
| Qwen | https://dashscope-intl.aliyuncs.com/compatible-mode | Authorization | Bearer {key} |
| GLM | https://open.bigmodel.cn/api/paas | Authorization | Bearer {key} |
| Grok | https://api.x.ai | Authorization | Bearer {key} |
| Tavily | https://api.tavily.com | Authorization | Bearer {key} |
| Local | http://localhost:11434 | Authorization | Bearer {key} |

Anthropic additionally receives the anthropic-version: 2023-06-01 header on every request.
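
The provider table and the request flow (POST /anthropic/v1/messages forwarded as POST /v1/messages) suggest a lookup keyed by the first path segment. The following is a minimal sketch of that idea, not CodeBuddy's actual code: the names PROVIDERS, resolveRoute, and authHeadersFor are hypothetical, and only three of the nine providers are listed.

```typescript
type AuthFormat = "raw" | "bearer";

interface ProviderRoute {
  upstream: string; // base URL from the provider table
  authHeader: string; // header that carries the credential
  authFormat: AuthFormat;
  extraHeaders?: Record<string, string>;
}

// Hypothetical registry (assumption); a Map avoids accidental hits on
// inherited object properties such as "constructor".
const PROVIDERS = new Map<string, ProviderRoute>([
  ["anthropic", {
    upstream: "https://api.anthropic.com",
    authHeader: "x-api-key",
    authFormat: "raw",
    extraHeaders: { "anthropic-version": "2023-06-01" },
  }],
  ["openai", { upstream: "https://api.openai.com", authHeader: "authorization", authFormat: "bearer" }],
  ["groq", { upstream: "https://api.groq.com/openai", authHeader: "authorization", authFormat: "bearer" }],
]);

/** Splits "/anthropic/v1/messages" into the provider name and the upstream URL. */
function resolveRoute(requestPath: string): { provider: string; url: string } | undefined {
  const match = /^\/([^/]+)(\/.*)?$/.exec(requestPath);
  if (!match) return undefined;
  const provider = match[1] ?? "";
  const route = PROVIDERS.get(provider);
  if (!route) return undefined;
  return { provider, url: route.upstream + (match[2] ?? "/") };
}

/** Builds the credential headers the proxy would inject for one provider. */
function authHeadersFor(provider: string, apiKey: string): Record<string, string> {
  const route = PROVIDERS.get(provider);
  if (!route) throw new Error(`Unknown provider: ${provider}`);
  const value = route.authFormat === "bearer" ? `Bearer ${apiKey}` : apiKey;
  return { [route.authHeader]: value, ...(route.extraHeaders ?? {}) };
}
```

Keeping the auth format as data rather than branching on provider names means a new Bearer-style provider is a one-line addition to the registry.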

Each time the proxy starts, it generates a 256-bit random session token (crypto.randomBytes(32)). Every incoming request must include this token in the x-codebuddy-proxy-token header. Requests without a valid token receive 403 Forbidden.

This prevents other processes on the machine from using the proxy — only the CodeBuddy extension (which knows the token) can make authenticated calls.
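
As an illustration of the token check described above, here is a small sketch. Hex encoding of the token and the constant-time comparison are assumptions (the source only states crypto.randomBytes(32) and the header name); generateSessionToken and isValidToken are hypothetical helper names.

```typescript
import { randomBytes, timingSafeEqual } from "node:crypto";

const TOKEN_HEADER = "x-codebuddy-proxy-token";

/** 32 random bytes = 256 bits; hex encoding (64 characters) is an assumption. */
function generateSessionToken(): string {
  return randomBytes(32).toString("hex");
}

/**
 * Returns true when the request carries the expected token.
 * A caller would answer 403 Forbidden when this returns false.
 */
function isValidToken(
  headers: Record<string, string | string[] | undefined>,
  expected: string,
): boolean {
  const presented = headers[TOKEN_HEADER];
  if (typeof presented !== "string") return false;
  const a = Buffer.from(presented, "utf8");
  const b = Buffer.from(expected, "utf8");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return a.length === b.length && timingSafeEqual(a, b);
}
```

Node lowercases incoming header names, so the lookup uses the lowercase form of the header.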

The proxy implements per-provider token-bucket rate limiting:

  • Each provider gets its own bucket with configurable maxTokens and refillRate
  • Local providers are exempt from rate limiting (no API rate caps)
  • When a bucket is empty, the request receives 429 Too Many Requests with a Retry-After header
  • Rate limits are configurable via editor settings and live-reload on change (buckets are reset)
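
The bullets above can be sketched as a classic token bucket. This is an illustrative version under stated assumptions: refillRate is taken to be tokens per second and greater than zero, the exempt provider key is assumed to be "local", and a provider with no bucket is treated as unrestricted. TokenBucket and checkRateLimit are hypothetical names.

```typescript
class TokenBucket {
  private tokens: number;
  private lastRefillMs: number;
  private readonly maxTokens: number;
  private readonly refillRate: number; // tokens per second (assumed unit)

  constructor(maxTokens: number, refillRate: number, nowMs: number = Date.now()) {
    this.maxTokens = maxTokens;
    this.refillRate = refillRate;
    this.tokens = maxTokens; // start full
    this.lastRefillMs = nowMs;
  }

  /**
   * Tries to take one token. Returns 0 on success, otherwise the number of
   * whole seconds to wait, suitable for a Retry-After header on a 429.
   */
  tryConsume(nowMs: number = Date.now()): number {
    const elapsedSec = Math.max(0, nowMs - this.lastRefillMs) / 1000;
    this.tokens = Math.min(this.maxTokens, this.tokens + elapsedSec * this.refillRate);
    this.lastRefillMs = nowMs;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return 0;
    }
    return Math.ceil((1 - this.tokens) / this.refillRate);
  }
}

// Assumed key for local providers, which are exempt from rate limiting.
const EXEMPT_PROVIDERS = new Set(["local"]);

/** Returns 0 when the request may proceed, else Retry-After seconds. */
function checkRateLimit(
  provider: string,
  buckets: Map<string, TokenBucket>,
  nowMs: number = Date.now(),
): number {
  if (EXEMPT_PROVIDERS.has(provider)) return 0;
  const bucket = buckets.get(provider);
  return bucket ? bucket.tryConsume(nowMs) : 0;
}
```

Resetting buckets on a settings change then amounts to replacing the Map with freshly constructed buckets.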

The proxy binds exclusively to 127.0.0.1 — it is never exposed to the network. The listen call specifies the loopback address explicitly:

srv.listen(0, "127.0.0.1", callback)

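Credential-bearing and connection-level headers from the client are always stripped before forwarding upstream. This prevents credential leakage and keeps the client's Host and connection settings from reaching the provider: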

  • authorization
  • x-api-key
  • x-goog-api-key
  • host
  • connection, keep-alive, transfer-encoding

The proxy then injects the correct credentials from the OS keychain.
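
A rough sketch of the strip-then-inject step follows. The header list is taken from the bullets above; buildUpstreamHeaders and the HeaderMap type are hypothetical, and normalising names to lowercase is an assumption.

```typescript
// Headers removed from every client request before forwarding.
const STRIPPED_HEADERS = new Set([
  "authorization",
  "x-api-key",
  "x-goog-api-key",
  "host",
  "connection",
  "keep-alive",
  "transfer-encoding",
]);

type HeaderMap = Record<string, string | string[] | undefined>;

/**
 * Copies client headers minus the stripped set, then overlays the
 * credentials fetched from the keychain so they always win.
 */
function buildUpstreamHeaders(client: HeaderMap, injected: Record<string, string>): HeaderMap {
  const out: HeaderMap = {};
  for (const [name, value] of Object.entries(client)) {
    const lower = name.toLowerCase();
    if (value !== undefined && !STRIPPED_HEADERS.has(lower)) out[lower] = value;
  }
  for (const [name, value] of Object.entries(injected)) {
    out[name.toLowerCase()] = value;
  }
  return out;
}
```

Because injection happens after stripping, a webview that sends its own x-api-key or Authorization header cannot override the stored credential.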

  • Maximum body size: 10 MB — requests exceeding this receive 413 Payload Too Large
  • Client receive timeout: 30 seconds — connections idle for more than 30 seconds between body chunks are terminated
  • Upstream timeout: 5 minutes — LLM streaming responses can be long, so this is generous
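
One way to enforce the body cap while reading chunks is sketched below. The constants mirror the limits listed above; treating 10 MB as 10 × 1024 × 1024 bytes is an assumption, and BodyCollector is a hypothetical name.

```typescript
const MAX_BODY_BYTES = 10 * 1024 * 1024; // 10 MB (binary megabytes assumed)
const CLIENT_IDLE_TIMEOUT_MS = 30_000; // 30 seconds between body chunks
const UPSTREAM_TIMEOUT_MS = 5 * 60_000; // 5 minutes for streaming responses

class BodyCollector {
  private readonly chunks: Uint8Array[] = [];
  private size = 0;
  private readonly maxBytes: number;

  constructor(maxBytes: number = MAX_BODY_BYTES) {
    this.maxBytes = maxBytes;
  }

  /** Returns false when the chunk would exceed the cap; the caller answers 413. */
  push(chunk: Uint8Array): boolean {
    if (this.size + chunk.length > this.maxBytes) return false;
    this.chunks.push(chunk);
    this.size += chunk.length;
    return true;
  }

  /** Concatenates the accepted chunks into one buffer. */
  toBytes(): Uint8Array {
    const out = new Uint8Array(this.size);
    let offset = 0;
    for (const chunk of this.chunks) {
      out.set(chunk, offset);
      offset += chunk.length;
    }
    return out;
  }
}
```

Checking the running total on each chunk lets the proxy reject an oversized request before buffering all of it.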

Upstream failures are mapped to safe HTTP responses without exposing internal details:

| Error code | Proxy response | Message |
|---|---|---|
| ECONNREFUSED | 502 | Upstream refused connection |
| ECONNRESET | 502 | Upstream reset connection |
| ETIMEDOUT | 504 | Upstream connection timed out |
| ENOTFOUND | 502 | Upstream host not found |
| ECONNABORTED | 504 | Request timed out |
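
The mapping in the table above could be expressed as a simple lookup. This is a sketch only: mapUpstreamError is a hypothetical name, and the generic 502 fallback for unlisted error codes is an assumption the source does not specify.

```typescript
interface ProxyError {
  status: 502 | 504;
  message: string;
}

const UPSTREAM_ERRORS = new Map<string, ProxyError>([
  ["ECONNREFUSED", { status: 502, message: "Upstream refused connection" }],
  ["ECONNRESET", { status: 502, message: "Upstream reset connection" }],
  ["ETIMEDOUT", { status: 504, message: "Upstream connection timed out" }],
  ["ENOTFOUND", { status: 502, message: "Upstream host not found" }],
  ["ECONNABORTED", { status: 504, message: "Request timed out" }],
]);

/** Maps a Node-style error code to a safe response with no internal details. */
function mapUpstreamError(err: { code?: string }): ProxyError {
  return (
    UPSTREAM_ERRORS.get(err.code ?? "") ??
    // Assumed fallback for codes not in the table.
    { status: 502, message: "Upstream request failed" }
  );
}
```

Only the fixed message reaches the client; the original error (stack, hostnames, key material) stays in the extension host.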

Every proxied request is logged in a ring buffer (O(1) writes, capped at 1,000 entries):

{
  "timestamp": 1711612800000,
  "provider": "anthropic",
  "method": "POST",
  "path": "/v1/messages",
  "statusCode": 200,
  "latencyMs": 1523
}

The ring buffer overwrites the oldest entries when full — no unbounded memory growth.
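
To make the ring-buffer behaviour concrete, here is a minimal sketch. The AuditEntry fields follow the JSON example; the class name AuditRingBuffer and its methods are hypothetical.

```typescript
interface AuditEntry {
  timestamp: number;
  provider: string;
  method: string;
  path: string;
  statusCode: number;
  latencyMs: number;
}

class AuditRingBuffer {
  private readonly slots: (AuditEntry | undefined)[];
  private readonly capacity: number;
  private next = 0; // index of the slot the next write goes to
  private count = 0; // number of filled slots, at most capacity

  constructor(capacity: number = 1000) {
    this.capacity = capacity;
    this.slots = new Array<AuditEntry | undefined>(capacity);
  }

  /** O(1) write: overwrite the slot at `next`, then advance with wrap-around. */
  record(entry: AuditEntry): void {
    this.slots[this.next] = entry;
    this.next = (this.next + 1) % this.capacity;
    if (this.count < this.capacity) this.count++;
  }

  /** Returns the retained entries, oldest first. */
  entries(): AuditEntry[] {
    const start = this.count < this.capacity ? 0 : this.next;
    const out: AuditEntry[] = [];
    for (let i = 0; i < this.count; i++) {
      const entry = this.slots[(start + i) % this.capacity];
      if (entry) out.push(entry);
    }
    return out;
  }
}
```

Once the buffer is full, `next` always points at the oldest entry, which is why reads start there.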

The proxy has explicit lifecycle states that prevent invalid transitions:

stateDiagram-v2
    idle --> starting : start()
    starting --> running : Listening
    starting --> stopped : Startup error
    running --> draining : dispose()
    draining --> stopped : Connections closed
    stopped --> idle : restart

  • Start is promise-coalesced — multiple concurrent start() calls share the same promise
  • Drain on dispose — the server stops accepting new connections and force-closes existing sockets after a grace period
  • Socket tracking — all active connections are tracked in a Set and force-destroyed on dispose
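
The state diagram and the promise-coalesced start can be sketched together. This is an illustrative outline, not the extension's code: ProxyLifecycle, ALLOWED, and the injected listen function (assumed to resolve with the bound port) are hypothetical.

```typescript
type ProxyState = "idle" | "starting" | "running" | "draining" | "stopped";

// Allowed transitions, copied from the state diagram.
const ALLOWED: Record<ProxyState, ProxyState[]> = {
  idle: ["starting"],
  starting: ["running", "stopped"],
  running: ["draining"],
  draining: ["stopped"],
  stopped: ["idle"],
};

class ProxyLifecycle {
  state: ProxyState = "idle";
  private startPromise: Promise<number> | undefined;
  private readonly listen: () => Promise<number>;

  constructor(listen: () => Promise<number>) {
    this.listen = listen;
  }

  /** Moves to a new state, throwing on any edge not in the diagram. */
  transition(to: ProxyState): void {
    if (!ALLOWED[this.state].includes(to)) {
      throw new Error(`Invalid transition: ${this.state} -> ${to}`);
    }
    this.state = to;
  }

  /** Concurrent callers share one in-flight promise instead of binding twice. */
  start(): Promise<number> {
    if (this.startPromise) return this.startPromise;
    if (this.state !== "idle") {
      return Promise.reject(new Error(`Cannot start from state: ${this.state}`));
    }
    this.transition("starting");
    this.startPromise = this.listen().then(
      (port) => {
        this.transition("running");
        return port;
      },
      (err) => {
        this.transition("stopped");
        this.startPromise = undefined;
        throw err;
      },
    );
    return this.startPromise;
  }
}
```

Storing the promise before awaiting it is what makes the coalescing work: a second start() call arriving while the first is still binding sees the stored promise and returns it.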

The proxy tracks every active TCP socket. On dispose:

  1. Server stops accepting new connections
  2. All active sockets are destroyed (force-close)
  3. Config watcher is disposed
  4. State transitions to stopped
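
The dispose sequence above might look like the following sketch. SocketLike is a structural stand-in for a TCP socket so the example runs without a real server; SocketTracker, disposeProxy, and the shape of the deps object are all hypothetical. In a real Node server, sockets would typically be registered from the server's "connection" event.

```typescript
// Minimal shape of a socket for this sketch (assumption).
interface SocketLike {
  destroy(): void;
  once(event: "close", listener: () => void): unknown;
}

class SocketTracker {
  private readonly sockets = new Set<SocketLike>();

  /** Remembers the socket and forgets it again when it closes on its own. */
  track(socket: SocketLike): void {
    this.sockets.add(socket);
    socket.once("close", () => {
      this.sockets.delete(socket);
    });
  }

  get size(): number {
    return this.sockets.size;
  }

  /** Force-closes every tracked socket. */
  destroyAll(): void {
    for (const socket of this.sockets) socket.destroy();
    this.sockets.clear();
  }
}

interface DisposeDeps {
  closeServer(): void; // stop accepting new connections
  tracker: SocketTracker;
  configWatcher: { dispose(): void };
  setState(state: "stopped"): void;
}

/** Runs the four dispose steps in the documented order. */
function disposeProxy(deps: DisposeDeps): void {
  deps.closeServer();
  deps.tracker.destroyAll();
  deps.configWatcher.dispose();
  deps.setState("stopped");
}
```

Without socket tracking, closing the server would only stop new connections; long-lived streaming connections could keep the process alive.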

| Setting | Type | Default | Description |
|---|---|---|---|
| codebuddy.credentialProxy.enabled | boolean | false | Enable the credential proxy |
| codebuddy.credentialProxy.rateLimits | object | {} | Per-provider rate limit overrides |

When enabled, getAPIKeyAndModel() returns a proxy URL instead of the real upstream URL. The LangChain wrapper (or completion provider) makes calls to http://127.0.0.1:{port}/{provider}/... and the proxy transparently injects credentials and forwards to the real upstream.
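
From the caller's side, a request through the proxy needs only the local URL and the session token. The helpers below are a hypothetical sketch of that: the source does not show the return shape of getAPIKeyAndModel(), so proxyBaseUrl and proxyRequestInit are illustrative names, and the JSON content type is an assumption.

```typescript
/** Builds the base URL a client would use in place of the real upstream URL. */
function proxyBaseUrl(port: number, provider: string): string {
  return `http://127.0.0.1:${port}/${provider}`;
}

/** Builds request options carrying the session token instead of an API key. */
function proxyRequestInit(
  sessionToken: string,
  body: unknown,
): { method: string; headers: Record<string, string>; body: string } {
  return {
    method: "POST",
    headers: {
      "content-type": "application/json", // assumed payload format
      "x-codebuddy-proxy-token": sessionToken,
    },
    body: JSON.stringify(body),
  };
}
```

For example, `proxyBaseUrl(5123, "anthropic") + "/v1/messages"` yields `http://127.0.0.1:5123/anthropic/v1/messages`, which the proxy forwards to the Anthropic upstream with the real key attached.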