Testing
The SDK ships test utilities under gecx-chat/testing. They let you mock the full client, the transport layer, scenarios, and tool execution -- no backend required.
createMockChatClient
Creates a fully configured ChatClient with mock auth, mock transport, and in-memory storage. This is the fastest way to write a unit test:
import { createMockChatClient } from 'gecx-chat/testing';
const client = createMockChatClient();
const session = await client.createSession();
const result = await session.send('Hello');
expect(result.messages).toHaveLength(1);
createMockChatClient accepts an optional MockChatClientOptions object:
| Option | Type | Purpose |
|---|---|---|
| scenarios | MockScenario[] | Canned conversations the mock transport plays back |
| activeScenarioId | string | Lock the transport to one scenario by ID |
| transportOptions | Partial<MockTransportOptions> | Fine-tune latency, delay, callbacks |
| clientConfig | Partial<ChatClientConfig> | Override any client-level config |
createMockTransport
Lower-level primitive. Creates just the transport so you can pair it with a real createChatClient for more control over auth, storage, or config:
import { createMockTransport } from 'gecx-chat/testing';
import { createChatClient } from 'gecx-chat';
const transport = createMockTransport({ latencyMs: 50 });
const client = createChatClient({
  auth: myAuth,
  transport,
});
MockTransportOptions fields:
| Option | Type | Default | Purpose |
|---|---|---|---|
| scenarios | MockScenario[] | [] | Scenario list the transport matches against |
| latencyMs | number | 30 | Default delay between streamed events |
| defaultDelayMs | number | none | Alias for latencyMs |
| activeScenarioId | string | none | Force a single scenario |
| onToolCall | function | none | Callback when the transport emits tool.call |
| uploadFailure | { fileName?, message?, atProgress? } | none | Deterministically fail matching mock uploads |
The returned transport also exposes resolveToolCall(toolCallId, result) and setActiveScenario(scenarioId) for imperative control in tests.
Testing Attachments
Use session.attachFile(file) in tests exactly as a host app would. The returned async iterable emits safe, metadata-only progress objects, each carrying both a coarse status and a full lifecycle value:
const events = [];
for await (const event of session.attachFile(new File(['hello'], 'note.txt', { type: 'text/plain' }))) {
  events.push(event);
}
expect(events.map((event) => event.lifecycle)).toContain('ready_to_send');
await session.sendText('Here is the file');
expect(session.getMessages()[0].parts).toContainEqual(expect.objectContaining({ type: 'file' }));
Validation can be tested without transport setup:
import { validateFile } from 'gecx-chat';
expect(
  validateFile(
    { name: 'large.pdf', size: 6 * 1024 * 1024, type: 'application/pdf' },
    { allowedMimeTypes: ['application/pdf'], maxBytes: 5 * 1024 * 1024 },
  ).valid,
).toBe(false);
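To make the rule being asserted concrete, here is a minimal sketch of the kind of check such a validator performs. It is not the SDK's implementation: validateFileSketch, FileMeta, FilePolicy, and the reason field are hypothetical names, and only allowedMimeTypes, maxBytes, and valid are taken from the example above.

```typescript
// Hypothetical stand-ins for the SDK's types. Only the names
// allowedMimeTypes, maxBytes, and valid come from the guide.
interface FileMeta {
  name: string;
  size: number; // bytes
  type: string; // MIME type
}

interface FilePolicy {
  allowedMimeTypes: string[];
  maxBytes: number;
}

interface ValidationResult {
  valid: boolean;
  reason?: 'mime_not_allowed' | 'too_large'; // assumed, for illustration only
}

function validateFileSketch(file: FileMeta, policy: FilePolicy): ValidationResult {
  // Reject MIME types that are not on the allow-list.
  if (policy.allowedMimeTypes.indexOf(file.type) === -1) {
    return { valid: false, reason: 'mime_not_allowed' };
  }
  // Reject files larger than the configured limit.
  if (file.size > policy.maxBytes) {
    return { valid: false, reason: 'too_large' };
  }
  return { valid: true };
}

const pdfPolicy: FilePolicy = {
  allowedMimeTypes: ['application/pdf'],
  maxBytes: 5 * 1024 * 1024,
};

const oversized = validateFileSketch(
  { name: 'large.pdf', size: 6 * 1024 * 1024, type: 'application/pdf' },
  pdfPolicy,
);
const wrongType = validateFileSketch(
  { name: 'photo.png', size: 1024, type: 'image/png' },
  pdfPolicy,
);
const accepted = validateFileSketch(
  { name: 'small.pdf', size: 1024, type: 'application/pdf' },
  pdfPolicy,
);
```

Covering one file per branch (too large, wrong type, accepted) is usually enough to pin down a policy like this in a unit test.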
For proxy deployments, exercise /chat/upload with multipart FormData and assert:
- oversized or disallowed MIME files return PROXY_UPLOAD_REJECTED;
- upstream failures map to PROXY_UPSTREAM_BAD_RESPONSE;
- audit output contains metadata such as MIME type and size, never file bytes;
- successful responses include url and optional scanStatus.
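A request body for such a test can be assembled with the standard FormData and Blob APIs, which are global in browsers and in Node 18+. The sketch below is only a starting point: buildUploadForm is a hypothetical helper, and the form field name file is an assumption, so check which field name your proxy actually reads.

```typescript
// Builds a multipart body for an upload test.
// Assumption: the proxy reads the upload from a form field named "file".
function buildUploadForm(contents: string, fileName: string, mimeType: string): FormData {
  const form = new FormData();
  form.append('file', new Blob([contents], { type: mimeType }), fileName);
  return form;
}

const uploadForm = buildUploadForm('hello', 'note.txt', 'text/plain');

// A Blob appended with a file name is read back as a File entry.
const uploadEntry = uploadForm.get('file');
const uploadedName =
  typeof uploadEntry === 'string' || uploadEntry === null ? undefined : uploadEntry.name;
const uploadedSize =
  typeof uploadEntry === 'string' || uploadEntry === null ? undefined : uploadEntry.size;
```

Passing uploadForm as the body of a POST request with fetch lets the runtime set the multipart Content-Type header and boundary, so the test should not set that header by hand.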
Mock scenarios
Pass the built-in mockScenarios array to get pre-built conversations covering support, commerce, tools, rich content, handoff, errors, and more:
import { createMockTransport, mockScenarios } from 'gecx-chat/testing';
const transport = createMockTransport({ scenarios: mockScenarios });
Each MockScenario has an id, a trigger (string or regex matched against the user's message), and a steps array of { event, delayMs? } objects that the transport streams back.
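The SDK's exact matching rules are not spelled out here. As a rough mental model, the sketch below assumes a string trigger matches when the message contains it (case-insensitively), a regex trigger matches via test, and the first matching scenario wins. ScenarioLike, matchesTrigger, and pickScenario are hypothetical names, not SDK exports.

```typescript
// Minimal local shape; the real MockScenario also carries a steps array.
interface ScenarioLike {
  id: string;
  trigger: string | RegExp;
}

function matchesTrigger(trigger: string | RegExp, message: string): boolean {
  if (typeof trigger === 'string') {
    // Assumption: string triggers are case-insensitive substring matches.
    return message.toLowerCase().indexOf(trigger.toLowerCase()) !== -1;
  }
  // Reset lastIndex so global or sticky regexes behave the same on every call.
  trigger.lastIndex = 0;
  return trigger.test(message);
}

function pickScenario<T extends ScenarioLike>(scenarios: T[], message: string): T | undefined {
  // Assumption: scenarios are checked in order and the first match wins.
  for (const scenario of scenarios) {
    if (matchesTrigger(scenario.trigger, message)) return scenario;
  }
  return undefined;
}

const candidates: ScenarioLike[] = [
  { id: 'refund-flow', trigger: /refund/i },
  { id: 'greeting', trigger: 'hello' },
];

const refundMatch = pickScenario(candidates, 'I want a REFUND please');
const greetingMatch = pickScenario(candidates, 'Hello there');
const noMatch = pickScenario(candidates, 'What is the weather?');
```

If two of your triggers can match the same message, pin the test to one of them with activeScenarioId rather than relying on ordering.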
The scenarioCatalog provides lookup helpers:
import { scenarioCatalog } from 'gecx-chat/testing';
scenarioCatalog.list(); // all built-in + extra scenarios
scenarioCatalog.get('handoff'); // by ID
scenarioCatalog.random(); // random pick (accepts { seed })
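A seed matters because it makes a "random" pick reproducible across test runs. How the SDK derives its pick from the seed is not documented here. The sketch below shows the general idea using the well-known mulberry32 generator; seededPick and the sample IDs are hypothetical.

```typescript
// mulberry32: a small deterministic PRNG returning values in [0, 1).
function mulberry32(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Hypothetical helper: the same seed always selects the same element.
function seededPick<T>(items: readonly T[], seed: number): T {
  if (items.length === 0) {
    throw new Error('cannot pick from an empty list');
  }
  const next = mulberry32(seed);
  const index = Math.floor(next() * items.length);
  return items[index] as T;
}

// Illustrative IDs only; use scenarioCatalog.list() for the real ones.
const sampleIds = ['support', 'commerce', 'handoff', 'errors'];
const firstPick = seededPick(sampleIds, 42);
const secondPick = seededPick(sampleIds, 42);
```

Logging the seed whenever a randomized test fails lets you replay the exact same scenario later.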
Custom scenarios
Define your own scenarios for specific test cases:
import { createMockTransport } from 'gecx-chat/testing';
import type { MockScenario } from 'gecx-chat/testing';
const refundScenario: MockScenario = {
  id: 'refund-flow',
  trigger: /refund/i,
  steps: [
    { event: { type: 'response.started', responseId: 'r1', requestId: 'q1', timestamp: new Date().toISOString() } },
    { event: { type: 'text.delta', delta: 'Your refund has been processed.', responseId: 'r1', requestId: 'q1', timestamp: new Date().toISOString() }, delayMs: 50 },
    { event: { type: 'response.completed', responseId: 'r1', requestId: 'q1', timestamp: new Date().toISOString() } },
  ],
};
const transport = createMockTransport({ scenarios: [refundScenario] });
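The started/delta/completed shape repeats in most text-only scenarios, so a small builder can cut the boilerplate. This is a sketch, not an SDK helper: textReplySteps and the local SketchStep type are hypothetical, and the event fields mirror only those shown in the example above.

```typescript
// Local approximation of the step shape used above. The SDK's real
// MockScenario step type may carry more event types and fields.
interface SketchBaseEvent {
  responseId: string;
  requestId: string;
  timestamp: string;
}

type SketchEvent =
  | (SketchBaseEvent & { type: 'response.started' })
  | (SketchBaseEvent & { type: 'text.delta'; delta: string })
  | (SketchBaseEvent & { type: 'response.completed' });

interface SketchStep {
  event: SketchEvent;
  delayMs?: number;
}

// Hypothetical builder for a single-message text reply.
function textReplySteps(
  text: string,
  responseId: string,
  requestId: string,
  delayMs: number = 50,
): SketchStep[] {
  const base = (): SketchBaseEvent => ({
    responseId,
    requestId,
    timestamp: new Date().toISOString(),
  });
  return [
    { event: { type: 'response.started', ...base() } },
    { event: { type: 'text.delta', delta: text, ...base() }, delayMs },
    { event: { type: 'response.completed', ...base() } },
  ];
}

const refundSteps = textReplySteps('Your refund has been processed.', 'r1', 'q1');
```

The returned array could then be supplied as the steps of a scenario such as refundScenario, provided the SDK's step type accepts these fields.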
You can also record and replay scenarios:
import { recordMockScenario, replayMockScenario } from 'gecx-chat/testing';
const recording = recordMockScenario(capturedEvents, { id: 'my-recording' });
const scenario = replayMockScenario(recording);
Testing tools
Client tool fakes
Use createToolFake for a quick one-off, or use the pre-built fakes (lookupOrderFake, addToCartFake, etc.):
import { createToolFake, lookupOrderFake, addToCartFake } from 'gecx-chat/testing';
// One-off fake
const fakeSearch = createToolFake('search_products', { results: [] });
// Pre-built fakes include: lookupOrderFake, lookupInvoiceFake,
// addToCartFake, applyCouponFake, openOrderDetailsFake,
// createSupportTicketFake
Server tool fakes
Use createServerToolFake to exercise the server-tool path (timeout, validation, error mapping) without a real HTTP server:
import { createServerToolFake } from 'gecx-chat/testing';
const fakeRefund = createServerToolFake({
  name: 'apply_refund',
  output: { refundId: 'REF-123', status: 'queued' },
});
// Test error paths
const failingTool = createServerToolFake({
  name: 'apply_refund',
  status: 401, // triggers SERVER_TOOL_UNAUTHORIZED
});
// Test timeouts
const slowTool = createServerToolFake({
  name: 'slow_tool',
  delayMs: 10_000,
  timeoutMs: 100, // will time out
});
Testing React components
Wrap your component with ChatProvider using a mock client:
import { render, screen } from '@testing-library/react';
import { ChatProvider } from 'gecx-chat/react';
import { createMockChatClient } from 'gecx-chat/testing';
const client = createMockChatClient();
render(
  <ChatProvider client={client}>
    <YourChat />
  </ChatProvider>
);
This gives your component a fully functional chat context backed entirely by in-memory mocks.
Playwright E2E
The showcase app (apps/showcase) includes Playwright tests. Run them with:
pnpm e2e
Use the showcase tests as patterns for your own E2E suite. They exercise streaming, tool calls, rich content rendering, and error recovery against the mock transport.
Validating custom transports
If you build a custom ChatTransport, use the bundled contract harness instead of writing the boilerplate by hand:
import { runTransportContractTests } from 'gecx-chat/testing/vitest';
import { describe } from 'vitest';
import { createMyTransport } from './myTransport';
describe('my custom transport', () => {
  runTransportContractTests({
    name: 'myTransport',
    factory: () => createMyTransport({ /* ... */ }),
  });
});
The harness validates: connect() accepts a session id and resolves; capability class advertised matches actual behaviour; protocolVersion is set; stream() returns an AsyncIterable<TransportEvent> that respects AbortSignal; close() resolves; and (for tier-2 transports) conditional reconnect/resume behaviour matches the contract. The bundled mockTransport is itself validated by this harness.
Scenario-replay tests with gecx eval
Unit tests against the mock transport are great for narrow assertions. For full-turn quality checks (the agent didn't hallucinate, latency stayed under budget, the right tool was called, no handoff fired on a deflectable query), use gecx eval:
pnpm gecx eval ./apps/showcase/scenarios \
  --baseline ./apps/showcase/baseline.eval.json \
  --fail-on-regress
Scenarios are TS or YAML files in a directory you choose. Sixteen scorers ship: thirteen deterministic and three LLM-judge. See Evaluation and the Eval CLI reference.
What's next
- Client Tools -- defining and registering tools
- Error Handling -- error codes and recovery
- Error Recovery -- recovery patterns by error category
- Evaluation -- scenario-replay tests with deterministic and LLM-judge scorers
- Transport Events -- the mock protocol event format