Proxy Deployment
The customer proxy is the trust boundary between browser code and Google Customer Engagement Suite (CES). It holds the service-account-scoped credentials, applies redaction and rate limits, and emits the audit log your security team needs. Every regulated CES deployment (telco, banking, healthcare, retail) goes through a proxy like this.
apps/proxy-reference/ is a working production-grade implementation. It runs on Google Cloud Run, mints tokens via Workload Identity Federation, and is designed to be a customer security team's first read. This guide walks you through deploying it.
When you need a proxy
In production, the browser should never talk directly to CES. A proxy lets you:
- Keep credentials server-side via Workload Identity Federation (no JSON keys at rest).
- Redact PII before it leaves your network.
- Rate-limit per (origin, sessionId).
- Audit every request for compliance.
- Apply custom authorization logic (session cookies, JWT, mTLS).
Browser --> Your Proxy --> CES
The browser only knows the proxy URL. Credentials, upstream URLs, and OAuth scopes stay on the server.
Quick deploy (under 30 minutes)
The full step-by-step playbook lives in apps/proxy-reference/README.md. The short version:
# Prerequisites
gcloud services enable run.googleapis.com customerengagementsuite.googleapis.com \
iamcredentials.googleapis.com sts.googleapis.com secretmanager.googleapis.com
# Service account + IAM
gcloud iam service-accounts create gecx-proxy
gcloud projects add-iam-policy-binding "$PROJECT_ID" \
--member="serviceAccount:gecx-proxy@${PROJECT_ID}.iam.gserviceaccount.com" \
--role="roles/customerengagementsuite.client"
# WIF pool + provider (full commands in the README)
gcloud iam workload-identity-pools create gecx-proxy-pool --location=global
# ... see README for the OIDC provider, attribute mapping, and IAM binding.
# Deploy
gcloud run deploy gecx-proxy \
--source=apps/proxy-reference \
--region=us-central1 \
--service-account="gecx-proxy@${PROJECT_ID}.iam.gserviceaccount.com" \
--set-env-vars="NODE_ENV=production,GECX_AUTH_MODE=wif,ALLOWED_ORIGINS=https://app.example.com,GECX_CHAT_STREAM_URL=..." \
--set-secrets="/var/secrets/gecx-wif-config/wif.json=gecx-wif-config:latest" \
--no-allow-unauthenticated
# Verify
curl -fsS "$(gcloud run services describe gecx-proxy --region=us-central1 --format='value(status.url)')/health"
A cloudrun.yaml manifest is included at apps/proxy-reference/cloudrun.yaml for gcloud run services replace.
Authentication modes
GECX_AUTH_MODE selects how the proxy mints chat tokens. Mode selection happens at startup; misconfigurations log kind: auth.misconfigured and cause /chat/token to return 500 PROXY_AUTH_MISCONFIGURED until they're fixed. /health keeps responding so the deployment doesn't churn.
- wif: production default. Reads a WIF config file at GOOGLE_APPLICATION_CREDENTIALS and exchanges it with Google STS via ExternalAccountClient.fromJSON. No JSON keys in the container.
- service-account: non-prod testing escape hatch. Reads a service-account key. Emits a one-time stderr warning. Refused unless GECX_AUTH_MODE=service-account is set explicitly.
- mock: tests and local dev. Opaque tokens. Refused entirely when NODE_ENV=production.
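The startup checks described above can be sketched roughly as follows. This is a minimal illustration, not the reference proxy's code: resolveAuthMode, AuthMode, and AuthModeResult are hypothetical names, and the sketch assumes that an unset GECX_AUTH_MODE falls back to wif.

```typescript
// Hypothetical names: AuthMode, AuthModeResult, and resolveAuthMode are
// illustrative and are not taken from apps/proxy-reference.
type AuthMode = "wif" | "service-account" | "mock";

type AuthModeResult =
  | { ok: true; mode: AuthMode }
  | { ok: false; reason: string };

const AUTH_MODES: readonly AuthMode[] = ["wif", "service-account", "mock"];

function resolveAuthMode(env: Record<string, string | undefined>): AuthModeResult {
  // Assumption: an unset GECX_AUTH_MODE means the production default (wif),
  // so service-account and mock are only reachable when requested explicitly.
  const raw = env.GECX_AUTH_MODE ?? "wif";
  const mode = AUTH_MODES.find((candidate) => candidate === raw);

  if (mode === undefined) {
    return { ok: false, reason: `unknown GECX_AUTH_MODE: ${raw}` };
  }
  // mock is refused entirely in production.
  if (mode === "mock" && env.NODE_ENV === "production") {
    return { ok: false, reason: "mock auth is not allowed when NODE_ENV=production" };
  }
  // wif needs the path of the WIF config file.
  if (mode === "wif" && !env.GOOGLE_APPLICATION_CREDENTIALS) {
    return { ok: false, reason: "wif mode requires GOOGLE_APPLICATION_CREDENTIALS" };
  }
  return { ok: true, mode };
}
```

A caller would run this once at startup, log the failure reason, and keep serving /health while /chat/token reports the misconfiguration.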
Routes
The proxy must implement these routes. The SDK expects them at these paths when you configure createProxyTransport.
GET /health
Liveness probe. Returns { status: 'ok', version, authConfigured }. Used by Cloud Run, Kubernetes, and load balancers. No auth required, never rate-limited.
/healthz is a legacy alias that returns { ok: true } — kept for Kubernetes-style probes that hardcode the older path.
POST /chat/token
Issues a short-lived chat token to the browser. The proxy authenticates the caller (session cookie, JWT, etc.), mints a CES-scoped access token via the configured auth mode, and returns it.
Response body: { token, expiresAt, tokenType, scopes? }
The browser never sees the WIF config or the service-account credential.
POST /chat/stream
Forwards a send request to the CES streamRunSession endpoint and streams the SSE / NDJSON response back. The proxy parses the JSON body, applies redaction, attaches the WIF-minted bearer token, and pipes the upstream stream to the browser.
Request: JSON body with sessionId and message payload. X-Goog-Request-Id header (or legacy Idempotency-Key) is preserved.
Response: Content-Type matches CES — application/x-ndjson or text/event-stream. Cache-Control: no-store, X-Accel-Buffering: no to prevent intermediary buffering.
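When the response is application/x-ndjson, a record can be split across network chunks, so a consumer has to buffer the trailing partial line. The sketch below shows one way to do that. createNdjsonSplitter is a hypothetical helper, not an SDK export, and it assumes the caller has already decoded bytes to text.

```typescript
// Hypothetical helper: createNdjsonSplitter is illustrative, not an SDK export.
// It returns a function that accepts decoded text chunks and yields every
// complete JSON record seen so far, holding back any trailing partial line.
function createNdjsonSplitter(): (chunk: string) => unknown[] {
  let buffer = "";
  return (chunk: string): unknown[] => {
    buffer += chunk;
    const lines = buffer.split("\n");
    // The last element is "" when the chunk ended on a newline,
    // otherwise it is an incomplete record that must wait for more data.
    buffer = lines.pop() ?? "";
    return lines
      .map((line) => line.trim())
      .filter((line) => line.length > 0)
      .map((line) => JSON.parse(line) as unknown);
  };
}
```

Feeding it the output of a TextDecoder in streaming mode keeps multi-byte characters intact across chunk boundaries.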
POST /chat/upload
Forwards multipart file uploads to the upstream upload endpoint. CES v1 prefers inline blobs (SessionInput.blob = { mimeType, data }) over a dedicated upload lane; leave GECX_CHAT_UPLOAD_URL empty unless you front a custom upload service.
The reference proxy validates every file before forwarding:
- MIME allowlist from UPLOAD_ALLOWED_MIME_TYPES, including wildcards such as image/*.
- Per-file size limit from UPLOAD_MAX_BYTES.
- File count limit from UPLOAD_MAX_FILES.
- Rate limit by (origin, sessionId).
- Metadata-only scan hook. The default hook returns passed; replace it with a malware/PII scanner in production.
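The MIME, size, and count checks in the list above could look something like the following. This is a sketch under assumptions: UploadLimits, FileMeta, mimeAllowed, and validateUpload are hypothetical names, and the file metadata is assumed to have been parsed from the multipart body already.

```typescript
// Hypothetical shapes and helpers; not taken from the reference proxy.
interface UploadLimits {
  allowedMimeTypes: string[]; // parsed from UPLOAD_ALLOWED_MIME_TYPES
  maxBytes: number; // UPLOAD_MAX_BYTES
  maxFiles: number; // UPLOAD_MAX_FILES
}

interface FileMeta {
  name: string;
  mimeType: string;
  size: number;
}

function mimeAllowed(mimeType: string, allowlist: string[]): boolean {
  const candidate = mimeType.toLowerCase();
  return allowlist.some((entry) => {
    const pattern = entry.trim().toLowerCase();
    if (pattern.endsWith("/*")) {
      // "image/*" matches any subtype that starts with "image/".
      return candidate.startsWith(pattern.slice(0, -1));
    }
    return candidate === pattern;
  });
}

// Returns a list of human-readable problems; an empty list means the
// request passed the metadata checks.
function validateUpload(files: FileMeta[], limits: UploadLimits): string[] {
  const errors: string[] = [];
  if (files.length > limits.maxFiles) {
    errors.push(`too many files: ${files.length} > ${limits.maxFiles}`);
  }
  for (const file of files) {
    if (!mimeAllowed(file.mimeType, limits.allowedMimeTypes)) {
      errors.push(`${file.name}: MIME type ${file.mimeType} is not allowed`);
    }
    if (file.size > limits.maxBytes) {
      errors.push(`${file.name}: ${file.size} bytes exceeds ${limits.maxBytes}`);
    }
  }
  return errors;
}
```

Collecting every problem instead of failing on the first one makes the rejection easier to audit, since the event can record the full set of reasons without touching file contents.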
Audit events include route, session, MIME type, size, count, scan status, and upstream status. They do not include file bytes, base64, extracted text, or file contents.
Response body should include { attachmentId?, url, scanStatus?, metadata? }. createProxyTransport maps that into UploadProgressEvent; ChatSession.attachFile(file) then stages a typed file part when lifecycle reaches ready_to_send.
POST /chat/tool-call
Server-tool dispatch. The reference implementation includes a local dispatch table for tools like apply_refund and update_shipping_address. If GECX_CHAT_TOOL_URL is set, requests are forwarded upstream.
Server tools run with service-account-scoped credentials that never reach the browser. Database writes, payment processing, and entitlement checks belong here.
For local dispatch, createServerToolHandler supports manifest-shaped
definitions with inputSchema, outputSchema, approvalPolicy,
sideEffectLevel, idempotency, auth, audit, and timeoutMs. The
reference apply_refund and update_shipping_address actions require
Idempotency-Key, validate inputs and outputs, record duplicate disposition,
thread ctx.signal, and return one of these status envelopes:
{ "status": "completed", "output": {}, "idempotencyKey": "..." }
{ "status": "denied", "error": "not allowed", "errorCode": "SERVER_TOOL_UNAUTHORIZED" }
{ "status": "pending", "approvalPolicy": "supervisor_approve", "error": "pending approval" }
{ "status": "duplicate", "duplicateDisposition": "replayed", "output": {} }
{ "status": "failed", "error": "validation failed", "errorCode": "TOOL_VALIDATION_FAILED" }
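To make the envelope shapes and the duplicate handling concrete, here is a small sketch of an idempotency wrapper. It is not createServerToolHandler: ToolEnvelope and withIdempotency are hypothetical names, the store is an in-memory Map, and mapping a missing key to TOOL_VALIDATION_FAILED is an assumption.

```typescript
// Hypothetical types and helper; field names follow the envelopes above.
type ToolEnvelope =
  | { status: "completed"; output: unknown; idempotencyKey: string }
  | { status: "denied"; error: string; errorCode: string }
  | { status: "pending"; approvalPolicy: string; error: string }
  | { status: "duplicate"; duplicateDisposition: string; output: unknown }
  | { status: "failed"; error: string; errorCode: string };

function withIdempotency(
  run: (input: unknown) => unknown,
): (idempotencyKey: string | undefined, input: unknown) => ToolEnvelope {
  // In-memory store for the sketch; a real deployment would need shared,
  // durable storage so that replays are detected across instances.
  const completed = new Map<string, unknown>();

  return (idempotencyKey, input) => {
    if (!idempotencyKey) {
      // Assumption: a missing Idempotency-Key is reported as a validation failure.
      return {
        status: "failed",
        error: "Idempotency-Key is required",
        errorCode: "TOOL_VALIDATION_FAILED",
      };
    }
    if (completed.has(idempotencyKey)) {
      // Replay the stored output instead of running the side effect again.
      return {
        status: "duplicate",
        duplicateDisposition: "replayed",
        output: completed.get(idempotencyKey),
      };
    }
    const output = run(input);
    completed.set(idempotencyKey, output);
    return { status: "completed", output, idempotencyKey };
  };
}
```

Because status is a discriminant, callers can switch on it and TypeScript narrows the remaining fields for each case.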
DELETE /chat/conversations/:sessionId
Used by the governance deleteConversation method.
GET /chat/conversations/:sessionId/export
Used by the governance exportConversation method. Accepts an optional ?format=json or ?format=ndjson query parameter.
POST /chat/forget-me
Right-to-be-forgotten endpoint. Accepts { userId, sessionIds?, reason? } and fans out delete requests for each session. Returns 202 on success or 207 if some deletes failed.
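The fan-out and the 202 versus 207 decision can be sketched as below. fanOutDeletes and the injected deleteSession callback are hypothetical; the sketch assumes each delete is an independent async call and that one failure should not stop the others.

```typescript
// Hypothetical helper; deleteSession stands in for whatever call removes
// a single conversation upstream.
async function fanOutDeletes(
  sessionIds: string[],
  deleteSession: (sessionId: string) => Promise<void>,
): Promise<{ status: 202 | 207; failed: string[] }> {
  // allSettled lets every delete run to completion even if some reject.
  const results = await Promise.allSettled(
    sessionIds.map(async (sessionId) => deleteSession(sessionId)),
  );
  const failed = sessionIds.filter((_, index) => results[index]?.status === "rejected");
  // 202 when every delete was accepted, 207 when at least one failed.
  return { status: failed.length === 0 ? 202 : 207, failed };
}
```

Returning the failed session IDs alongside the status gives the caller something to retry and gives the audit log a precise record of what was not erased.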
POST /chat/voice-token
Mints a short-lived ephemeral token for Gemini Live so the browser can open the realtime WebSocket directly. The proxy validates the chat session, exchanges a server-side Gemini API key (or service account credential) for an ephemeral token, and returns { token, expiresAt, model, voice }.
The reference implementation in apps/proxy-reference/src/server.ts (handleVoiceToken) supports a stub mode for local dev (ALLOW_STUB_VOICE_TOKEN=1, NODE_ENV !== 'production'). The production code path, commented inline, calls @google/genai's authTokens.create.
Computer-use routes
Three additional routes back the computer_use server tool when enabled:
- POST /chat/tool-call (existing) routes the computer_use tool name to a createComputerUseHandler instance, which spins up a ComputerUseSession with the configured ComputerUseProvider (Browserbase, Mock, or a custom adapter).
- GET /chat/computer-use/:sessionId/stream: signed SSE stream of PNG screenshots and structured action-log events. The signature uses HMAC over (sessionId, expiry) with constant-time verification and a 60s TTL. The verifier runs before any session lookup so callers cannot probe for valid session IDs.
- POST /chat/computer-use/:sessionId/control: per-action approval decisions, abort, and the global admin kill switch.
Every action emits a governance.computer_use.* audit event through the existing ChatGovernance sink with request-id correlation. See Computer-use and the threat model.
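The stream signature scheme described above (HMAC over sessionId and expiry, constant-time comparison, 60s TTL) might be implemented along these lines. The function names, the SHA-256 digest, the hex encoding, and the "sessionId.expiry" message layout are all assumptions made for illustration; the reference proxy may differ.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical helpers. TTL follows the 60s value stated in this guide.
const TTL_SECONDS = 60;

function signStream(secret: string, sessionId: string, expiry: number): string {
  // Assumption: SHA-256 and a "sessionId.expiry" message layout.
  return createHmac("sha256", secret).update(`${sessionId}.${expiry}`).digest("hex");
}

function issueStreamSignature(
  secret: string,
  sessionId: string,
  nowSeconds: number,
): { expiry: number; signature: string } {
  const expiry = nowSeconds + TTL_SECONDS;
  return { expiry, signature: signStream(secret, sessionId, expiry) };
}

function verifyStreamSignature(
  secret: string,
  sessionId: string,
  expiry: number,
  signature: string,
  nowSeconds: number,
): boolean {
  // Reject expired values and values further out than one TTL.
  if (!Number.isInteger(expiry)) return false;
  if (expiry < nowSeconds || expiry > nowSeconds + TTL_SECONDS) return false;

  const expected = Buffer.from(signStream(secret, sessionId, expiry), "hex");
  const provided = Buffer.from(signature, "hex");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  if (provided.length !== expected.length) return false;
  return timingSafeEqual(provided, expected);
}
```

Note that verification needs only the secret and the request parameters, which is what allows it to run before any session lookup.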
IAM bindings
The proxy service account needs:
| Role | Bound on | Why |
|---|---|---|
| roles/customerengagementsuite.client | Project | Call CES streamRunSession and generateChatToken. |
| roles/iam.workloadIdentityUser | The WIF pool's principal | Exchange the Cloud Run service identity for an impersonated CES token. |
| roles/secretmanager.secretAccessor | The gecx-wif-config secret | Read the WIF config file mounted into the container. |
| roles/logging.logWriter | Project | Emit structured access and audit logs. |
The CES IAM role surfaces as roles/customerengagementsuite.client in most projects. Verify with gcloud iam roles list --filter='name~customerengagement' if it's not found in yours.
Environment variables (uploads, rate limiting, redaction)
| Variable | Purpose | Default |
|---|---|---|
| PORT | Server port | 8080 |
| ALLOWED_ORIGINS | Comma-separated list of allowed CORS origins | http://localhost:3000 |
| GECX_CHAT_STREAM_URL | Upstream stream endpoint URL | (required for streaming) |
| GECX_CHAT_UPLOAD_URL | Upstream upload endpoint URL | (required for uploads) |
| GECX_CHAT_TOOL_URL | Upstream tool-call endpoint URL | (optional; uses local dispatch if unset) |
| GECX_CHAT_DELETE_URL_BASE | Upstream delete endpoint base URL | (optional; uses dev stub if unset) |
| GECX_CHAT_EXPORT_URL_BASE | Upstream export endpoint base URL | (optional; uses dev stub if unset) |
| UPSTREAM_TOKEN | Bearer token for upstream API calls | (empty) |
| REDACT_KEYS | Comma-separated list of additional field names to redact | (empty) |
| RATE_LIMIT_MAX | Max requests per window per (origin, session) pair | 60 |
| RATE_LIMIT_WINDOW_MS | Rate-limit window in milliseconds | 60000 |
| UPLOAD_ALLOWED_MIME_TYPES | Comma-separated MIME allowlist for /chat/upload; supports image/* style wildcards | image/*,text/plain,application/pdf |
| UPLOAD_MAX_BYTES | Max size per uploaded file in bytes | 10485760 |
| UPLOAD_MAX_FILES | Max files per upload request | 5 |
Security features
Redaction
The default redaction list covers password, secret, apiKey, api_key, token, accessToken, access_token, privateKey, private_key, creditCard, credit_card, ssn, socialSecurityNumber. Pattern-based catches: Google API keys (AIza…), PEM private-key headers, credit-card-shaped digit runs.
Add custom keys via REDACT_KEYS. Redaction is destructive; the upstream never sees the original values.
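As an illustration of key-based plus pattern-based redaction, the sketch below walks a JSON value, masks any field whose name is on the list, and masks strings shaped like Google API keys. The redact function, the "[REDACTED]" placeholder, exact-case key matching, and the specific regular expression are assumptions; the PEM and credit-card patterns are omitted for brevity.

```typescript
// Key list copied from the defaults named in this guide.
const DEFAULT_REDACT_KEYS = [
  "password", "secret", "apiKey", "api_key", "token", "accessToken",
  "access_token", "privateKey", "private_key", "creditCard", "credit_card",
  "ssn", "socialSecurityNumber",
];

// Assumption: Google API keys are "AIza" followed by 35 URL-safe characters.
const GOOGLE_API_KEY = /AIza[0-9A-Za-z_-]{35}/g;

// Hypothetical helper. Returns a redacted copy so that only the copy is
// forwarded upstream; the placeholder text is illustrative.
function redact(value: unknown, extraKeys: string[] = []): unknown {
  const keys = new Set<string>([...DEFAULT_REDACT_KEYS, ...extraKeys]);

  const walk = (node: unknown): unknown => {
    if (typeof node === "string") {
      return node.replace(GOOGLE_API_KEY, "[REDACTED]");
    }
    if (Array.isArray(node)) {
      return node.map(walk);
    }
    if (node !== null && typeof node === "object") {
      const out: Record<string, unknown> = {};
      for (const [key, child] of Object.entries(node)) {
        // A listed key hides its whole subtree, whatever the value type.
        out[key] = keys.has(key) ? "[REDACTED]" : walk(child);
      }
      return out;
    }
    return node;
  };

  return walk(value);
}
```

Passing the parsed REDACT_KEYS list as extraKeys extends the defaults without replacing them.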
Rate limiting
In-memory (origin, sessionId) token bucket. Default: 60 requests per 60 seconds. When exceeded, the proxy returns 429 with Retry-After. For multi-instance deployments, swap the in-memory limiter for a Redis- or Memorystore-backed implementation — the function signature stays the same.
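For readers who want to see the shape of such a limiter, here is a minimal in-memory token bucket keyed by (origin, sessionId). createRateLimiter and its return shape are hypothetical, not the reference proxy's function signature; the clock is injected so the behaviour can be tested without waiting.

```typescript
interface Bucket {
  tokens: number;
  updatedAt: number;
}

// Hypothetical factory. Defaults mirror RATE_LIMIT_MAX and RATE_LIMIT_WINDOW_MS.
function createRateLimiter(
  max: number = 60,
  windowMs: number = 60000,
  now: () => number = Date.now,
): (origin: string, sessionId: string) => { allowed: boolean; retryAfterSeconds: number } {
  const buckets = new Map<string, Bucket>();
  const refillPerMs = max / windowMs;

  return (origin, sessionId) => {
    // NUL separator avoids collisions between different (origin, session) pairs.
    const key = `${origin}\u0000${sessionId}`;
    const t = now();
    const bucket = buckets.get(key) ?? { tokens: max, updatedAt: t };

    // Refill in proportion to elapsed time, capped at the bucket size.
    bucket.tokens = Math.min(max, bucket.tokens + (t - bucket.updatedAt) * refillPerMs);
    bucket.updatedAt = t;
    buckets.set(key, bucket);

    if (bucket.tokens >= 1) {
      bucket.tokens -= 1;
      return { allowed: true, retryAfterSeconds: 0 };
    }
    // Time until one full token is available, rounded up for Retry-After.
    const retryAfterSeconds = Math.ceil((1 - bucket.tokens) / refillPerMs / 1000);
    return { allowed: false, retryAfterSeconds };
  };
}
```

A denied result maps to a 429 response with retryAfterSeconds as the Retry-After value. This sketch never evicts idle buckets, which a long-running process would need to handle.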
Audit logging
Every route emits structured audit events to stdout as JSON. Cloud Logging picks them up automatically. Events include token.issued, stream.started, stream.completed, tool.invoked, rate.limited, conversation.delete_requested, and the user-erasure lifecycle. Swap the sink in apps/proxy-reference/src/server.ts for your SIEM.
Structured access logs
Every request also emits an http.access JSON line with requestId, route, method, status, latencyMs, and origin. The requestId is echoed in the response X-Request-Id header — pass it to the SDK's onError callback for end-to-end correlation in Cloud Logging.
Server tool audit events include tool.invoked, tool.completed,
tool.denied, tool.pending, tool.duplicate, and tool.error. They include
the tool name, session ID, tool call ID, approval policy, audit classification,
idempotency key, and duplicate disposition when applicable. The reference does
not log raw tool input or service credentials by default.
CORS
Origins are checked against an explicit allowlist. Access-Control-Allow-Origin echoes the matched origin — never *.
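The allowlist check amounts to an exact string match against the parsed ALLOWED_ORIGINS value. A minimal sketch follows; resolveAllowOrigin is a hypothetical helper, and a null result is assumed to mean "send no Access-Control-Allow-Origin header".

```typescript
// Hypothetical helper. Returns the origin to echo, or null when the
// request origin is missing or not on the allowlist.
function resolveAllowOrigin(
  requestOrigin: string | undefined,
  allowedOrigins: string,
): string | null {
  if (!requestOrigin) return null;
  const allowlist = allowedOrigins
    .split(",")
    .map((origin) => origin.trim())
    .filter((origin) => origin.length > 0);
  // Exact match only: no wildcard, prefix, or suffix matching.
  return allowlist.includes(requestOrigin) ? requestOrigin : null;
}
```

When a server echoes the request origin like this, it is generally advisable to also send Vary: Origin so that shared caches do not serve one origin's response to another.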
Tracing
The server exports setTraceHook so OpenTelemetry can be wired without imposing it as a dependency. The reference ships no @opentelemetry/* package; hosts register a hook before startup. Example in apps/proxy-reference/README.md.
Container
The bundled Dockerfile is multi-stage, built on node:20-alpine, and runs as the node user (uid 1000). Image size is under 200 MB even with google-auth-library. HEALTHCHECK probes /health every 30 seconds.
docker build apps/proxy-reference -t gecx-proxy:test
docker run --rm -p 8080:8080 -e GECX_AUTH_MODE=mock gecx-proxy:test
curl -fsS http://localhost:8080/health
Doctor
After deployment, verify the proxy with the SDK doctor:
npx gecx doctor \
--token-endpoint https://your-proxy.run.app/chat/token \
--deployment your-deployment-id
What's next
- Data Governance: how deleteConversation and forgetMe call through the proxy.
- Server Tools: defining server-side tools that run behind the proxy.
- Error Handling: how proxy errors map to SDK error codes.