Proxy Reference Server

In production, your browser should never talk directly to Google APIs. A proxy server sits between the browser and the upstream endpoint, attaching server-side credentials and enforcing security policies. The proxy-reference app is a working example you can deploy or use as a starting point.

When you need a proxy

  • Production deployments where API credentials must stay server-side.
  • Regulated environments that require audit logging and request redaction.
  • Any setup where the browser should not know the upstream Google endpoint URL.

Architecture

Browser (SDK)  -->  Your Proxy  -->  Google Session API
                    - attaches Authorization header
                    - redacts sensitive fields
                    - rate-limits per origin/session
                    - logs for audit

The browser only sees your proxy. The upstream URL and bearer token never leave the server.
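
The header-rewriting step at the heart of this flow can be sketched as a small pure function. This is an illustrative sketch, not the actual proxy-reference code: `buildUpstreamHeaders`, `HeaderMap`, and the exact drop list are assumptions.

```typescript
type HeaderMap = Record<string, string>;

// Build the headers for the upstream request server-side, so the browser's
// own credentials never reach Google and the server token never reaches the browser.
function buildUpstreamHeaders(clientHeaders: HeaderMap, upstreamToken: string): HeaderMap {
  const headers: HeaderMap = {};
  for (const [key, value] of Object.entries(clientHeaders)) {
    const k = key.toLowerCase();
    // Drop anything auth-related coming from the client.
    if (k === "authorization" || k === "cookie") continue;
    headers[k] = value;
  }
  // Attach the server-side credential.
  headers["authorization"] = `Bearer ${upstreamToken}`;
  return headers;
}
```

The same boundary applies to the response path: the proxy pipes the upstream SSE body back to the browser without exposing the upstream URL.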

Run it locally

pnpm install
pnpm --filter proxy-reference dev

The server starts on port 8080 by default (override with the PORT env var).

Connect the showcase to the proxy

In your showcase app, configure the SDK transport to point at the proxy:

import { createProxyTransport } from 'gecx-chat';

const transport = createProxyTransport({
  endpoint: 'http://localhost:8080/chat/stream',
});

Routes

| Path | Method | What it does |
| --- | --- | --- |
| /chat/token | POST | Issues short-lived chat tokens. Uses createChatTokenHandler from the SDK. Replace the stub issueToken before deploying. |
| /chat/stream | POST | Forwards requests to the upstream Google endpoint with server-side credentials. Preserves SSE streaming. Honors Idempotency-Key. |
| /chat/upload | POST | Forwards multipart file uploads to the upstream upload endpoint. |
| /chat/tool-call | POST | Server-tool lane. Dispatches to in-process handlers via createServerToolHandler. When the request targets the computer_use tool, routes through createComputerUseHandler instead. |
| /chat/voice-token | POST | Mints a short-lived ephemeral token for Gemini Live. Supports stub mode for local dev (ALLOW_STUB_VOICE_TOKEN=1, non-production). |
| /chat/computer-use/:sessionId/stream | GET | Signed SSE stream of screenshots and action-log events. HMAC + 60s TTL + constant-time verify; the verifier runs before any session lookup. |
| /chat/computer-use/:sessionId/control | POST | Per-action approval decisions, abort, and the admin kill switch. |
| /chat/forget-me | POST | Right-to-be-forgotten endpoint backing governance.forgetMe(). |
| /chat/conversations/:sessionId | DELETE | Conversation delete, backing governance.deleteConversation(). |
| /chat/conversations/:sessionId/export | GET | Conversation export, backing governance.exportConversation(). |
| /health | GET | Liveness probe; returns { status, version, authConfigured }. /healthz is kept as a legacy alias for k8s probes. |
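
The signed-stream check on /chat/computer-use/:sessionId/stream can be sketched with Node's crypto primitives. The function names and token format below are assumptions for illustration, not the proxy's actual wire format:

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Issue an HMAC over the session id and an absolute expiry timestamp.
function sign(sessionId: string, expiresAt: number, secret: string): string {
  return createHmac("sha256", secret).update(`${sessionId}:${expiresAt}`).digest("hex");
}

// Verify before any session lookup: TTL first, then a constant-time comparison.
function verify(
  sessionId: string,
  expiresAt: number,
  sig: string,
  secret: string,
  now = Date.now(),
): boolean {
  if (now > expiresAt) return false; // enforce the 60s TTL set at issue time
  const expected = Buffer.from(sign(sessionId, expiresAt, secret), "hex");
  const given = Buffer.from(sig, "hex");
  // Length check first: timingSafeEqual throws on mismatched lengths.
  return given.length === expected.length && timingSafeEqual(expected, given);
}
```

Running the verifier before touching session state keeps unauthenticated requests from probing which session ids exist.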

Environment variables

| Variable | Required | Notes |
| --- | --- | --- |
| PORT | No | Defaults to 8080. |
| ALLOWED_ORIGINS | Yes | Comma-separated origins for CORS and the token handler. |
| GECX_CHAT_STREAM_URL | Yes (stream) | Upstream Google Session API endpoint. |
| GECX_CHAT_UPLOAD_URL | Yes (upload) | Upstream upload endpoint. |
| GECX_CHAT_TOOL_URL | No | Upstream tool endpoint. When unset, /chat/tool-call dispatches to embedded reference handlers. |
| UPSTREAM_TOKEN | Yes (stream/upload) | Static server-side bearer token. Production deployments mint tokens dynamically via GECX_AUTH_MODE=wif; see apps/proxy-reference/README.md. |
| GECX_AUTH_MODE | No | One of wif (production default), service-account, or mock. mock is refused in production. |
| GOOGLE_APPLICATION_CREDENTIALS | Yes for wif/service-account | Path to the WIF config JSON or service-account key. Mount from Secret Manager in production. |
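
A startup check in the spirit of the table can fail fast on missing configuration. The required/optional logic below is an assumption derived from the notes above, not the server's actual validation:

```typescript
// Return the names of missing required settings, given the process environment.
function checkEnv(env: Record<string, string | undefined>): string[] {
  const missing: string[] = [];
  if (!env.ALLOWED_ORIGINS) missing.push("ALLOWED_ORIGINS");
  const mode = env.GECX_AUTH_MODE ?? "wif";
  // Either a static token or credentials for dynamic minting must be present
  // for the stream/upload routes (mock mode excepted, and refused in production).
  if (mode !== "mock" && !env.UPSTREAM_TOKEN && !env.GOOGLE_APPLICATION_CREDENTIALS) {
    missing.push("UPSTREAM_TOKEN or GOOGLE_APPLICATION_CREDENTIALS");
  }
  return missing;
}
```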

Deploy to Cloud Run

gcloud builds submit --tag gcr.io/$PROJECT_ID/ceai-proxy-reference

gcloud run deploy ceai-proxy-reference \
  --image gcr.io/$PROJECT_ID/ceai-proxy-reference \
  --region us-central1 \
  --set-env-vars ALLOWED_ORIGINS=https://yourapp.example.com,GECX_CHAT_STREAM_URL=...,GECX_CHAT_UPLOAD_URL=... \
  --no-allow-unauthenticated

After deploying, verify with the doctor CLI:

npx gecx doctor --token-endpoint https://proxy.example.com/chat/token --deployment <id>

Built-in security

The reference server already wires up redaction (redactRequestBody), audit logging (defaultAuditSink), and per-origin/session rate limiting (createRateLimiter). Override REDACT_KEYS, RATE_LIMIT_MAX, and RATE_LIMIT_WINDOW_MS via environment variables, or swap the sinks for your production audit pipeline and rate-limit backend before deploying at scale.
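
A key-based redaction pass in the spirit of redactRequestBody might look like the sketch below; the real helper's key matching and placeholder text may differ:

```typescript
// Recursively replace values whose keys are in the redaction set.
function redact(value: unknown, keys: Set<string>): unknown {
  if (Array.isArray(value)) return value.map((v) => redact(v, keys));
  if (value && typeof value === "object") {
    const out: Record<string, unknown> = {};
    for (const [k, v] of Object.entries(value as Record<string, unknown>)) {
      out[k] = keys.has(k.toLowerCase()) ? "[REDACTED]" : redact(v, keys);
    }
    return out;
  }
  return value; // primitives pass through unchanged
}
```

Redacting before the audit sink sees the body means sensitive fields never land in logs, which is the property REDACT_KEYS exists to tune.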

The proxy ships with observability and lifecycle hardening: graceful shutdown drains in-flight streams before exiting, structured logs include request-id / session-id / route, and the audit pipeline emits governance.* events with a consistent shape across every route. WIF deployment is the documented production path; see Proxy Deployment.

What's next

Source: docs/demos/proxy-reference.md