Scenario Catalog

Scenarios live in packages/gecx-chat/src/testing/scenarios.ts and are reused by the showcase app and tests.

| Scenario | Purpose | Expected behavior |
| --- | --- | --- |
| welcome | First-run greeting | Streams welcome text and suggestion chips. |
| order-status | Support flow for an order lookup | Emits tool call, tool result, shipping text, citations, and chips. |
| invoice-lookup | Account support flow | Exercises structured lookup and citation rendering. |
| commerce-recommendation | Shopping assistant flow | Emits product carousel and add-to-cart suggestion chips. |
| add-to-cart | Approval-gated cart mutation | Emits an add_to_cart tool call that requires approval before visible cart state changes. |
| unknown-tool | Fail-closed tool behavior | Emits an unregistered tool call and surfaces a failed tool-call part. |
| invalid-tool-input | Schema validation behavior | Emits a registered tool call with invalid input and surfaces a failed tool-call part. |
| rich-content-sampler | Renderer contract sampler | Exercises markdown, citation, chips, custom payload, and diagnostics. |
| a2ui-return-flow | Generative UI return workflow | Streams A2UI v0.9 frames into an a2ui-surface, renders a return workflow, and logs an action envelope. |
| handoff | Live-agent transfer simulation | Moves through requested, queued, connected, and completed handoff states. |
| file-upload | Upload preview flow | Prompts the showcase upload UI. Selected files use chat.attachFile; the mock transport exercises the real transport.upload? contract end to end (validation → bytes uploaded → file part appended on the next user message). Progress events are emitted by the transport itself, not stubbed by the UI. |
| error-lab | Error handling flow | Emits structured error parts for diagnostics. |
| angry-customer | Sentiment escalation demo | Drives the SignalEscalator from a frustration-rule rising past threshold to a requestTransfer call. |
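
The unknown-tool and invalid-tool-input scenarios both end in a failed tool-call part rather than an executed tool. The sketch below shows one way a fail-closed dispatcher could produce that outcome. Every name in it (`ToolCallPart`, `RegisteredTool`, `toolRegistry`, `dispatchToolCall`, `lookup_order`) is hypothetical and chosen for illustration; the real gecx-chat types and registry may look different.

```typescript
// Hypothetical shapes for illustration; gecx-chat's real part and tool types may differ.
type ToolCallPart =
  | { type: "tool-call"; toolName: string; status: "completed"; output: unknown }
  | { type: "tool-call"; toolName: string; status: "failed"; reason: "unknown-tool" | "invalid-input" };

interface RegisteredTool {
  // Returns true only when the raw input matches the tool's schema.
  isValidInput(raw: unknown): boolean;
  // Called only after isValidInput has passed.
  run(raw: unknown): unknown;
}

// Hypothetical registry holding a single order-lookup tool.
const toolRegistry = new Map<string, RegisteredTool>([
  [
    "lookup_order",
    {
      isValidInput: (raw) =>
        typeof raw === "object" &&
        raw !== null &&
        typeof (raw as Record<string, unknown>).orderId === "string",
      run: (raw) => ({ orderId: (raw as { orderId: string }).orderId, status: "shipped" }),
    },
  ],
]);

function dispatchToolCall(toolName: string, rawInput: unknown): ToolCallPart {
  const tool = toolRegistry.get(toolName);
  if (tool === undefined) {
    // Fail closed: an unregistered tool is never executed.
    return { type: "tool-call", toolName, status: "failed", reason: "unknown-tool" };
  }
  if (!tool.isValidInput(rawInput)) {
    // Fail closed: a registered tool is not run with input that fails validation.
    return { type: "tool-call", toolName, status: "failed", reason: "invalid-input" };
  }
  return { type: "tool-call", toolName, status: "completed", output: tool.run(rawInput) };
}
```

In this sketch the dispatcher returns a failed part instead of throwing, so a renderer can show the failure inline in the transcript, which is the behavior both scenarios describe.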

gecx eval scenarios

The 22 reference gecx eval scenarios live in apps/showcase/scenarios/ as a mix of .scenario.ts and .scenario.yaml files. They cover welcome flows, order status, invoice lookup, commerce recommendation, apply refund, file upload, handoff for billing, invalid tool input, latency budget, helpfulness baseline, hallucination guard, no-handoff resolution, rich-content sampler, server-side refund, tool-call accuracy, unknown tool, welcome tone, and more.

Run them with:

pnpm eval:showcase
# or
pnpm gecx eval ./apps/showcase/scenarios \
  --baseline ./apps/showcase/baseline.eval.json \
  --fail-on-regress

See Evaluation and Eval CLI.
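
This document does not describe the schema of baseline.eval.json or the exact semantics of `--fail-on-regress`. As a rough mental model only, a regression gate could compare per-scenario scores against a stored baseline and report any scenario that dropped. The sketch below assumes a hypothetical `EvalScores` shape (scenario id mapped to a score between 0 and 1) and a hypothetical `findRegressions` helper; neither is taken from the gecx CLI.

```typescript
// Hypothetical shape: scenario id -> score in [0, 1].
// The real baseline.eval.json schema is not shown in this document.
type EvalScores = Record<string, number>;

interface Regression {
  scenario: string;
  baseline: number;
  current: number;
}

// Returns every baseline scenario whose current score fell by more than `tolerance`.
function findRegressions(baseline: EvalScores, current: EvalScores, tolerance = 0): Regression[] {
  const regressions: Regression[] = [];
  for (const [scenario, baselineScore] of Object.entries(baseline)) {
    // Assumption: a scenario missing from the current run counts as a score of 0.
    const currentScore = current[scenario] ?? 0;
    if (currentScore < baselineScore - tolerance) {
      regressions.push({ scenario, baseline: baselineScore, current: currentScore });
    }
  }
  return regressions;
}

// Example data, invented for illustration.
const baselineScores: EvalScores = { welcome: 1, "order-status": 0.9 };
const currentScores: EvalScores = { welcome: 1, "order-status": 0.7 };
const regressed = findRegressions(baselineScores, currentScores);
```

Under this model, a CI job would exit non-zero whenever the returned list is non-empty. Scenarios that exist only in the current run are ignored here, so adding a new scenario would not trip the gate.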

Showcase Routes

| Route | Scenario focus |
| --- | --- |
| / | Overview and architecture |
| /wow-tour | Guided PRD proof path across runtime, tools, diagnostics, and handoff |
| /agent-builder | Agent-ready integration packet generator |
| /vibe-lab | Agent recipe preview, message-part proof, and scaffold packet |
| /components | A2UI catalog gallery: 20 installable surface presets |
| /support | Order support, tools, citations, chips |
| /commerce | Product recommendations and cart state |
| /rich-content | Message part rendering |
| /generative-ui | A2UI frame stream, generated surface rendering, frame inspector, and action log |
| /agent-graph | Multi-specialist routing with live inspector |
| /sentiment-demo | Sentiment / intent signals and escalation |
| /voice | Voice composer, firstAudioMs overlay |
| /computer-use | Sandboxed agent browsing |
| /dashboards | Pre-built analytics widgets over seeded events |
| /tools | Client tool approval and execution |
| /handoff | Human handoff and upload |
| /diagnostics | Trace timeline and debug bundle |
| /analytics | Product journey metrics, analytics events, CSAT, and resolution tracking |
| /integration-wizard | Setup snippets |
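
The handoff scenario, which the /handoff route exercises, moves through four states: requested, queued, connected, and completed. A minimal sketch of that progression as a linear state machine follows. The state names come from the scenario catalog; the `HandoffState` type, `nextHandoffState`, and `runHandoff` are hypothetical names, and the strictly linear transition rule is an assumption.

```typescript
type HandoffState = "requested" | "queued" | "connected" | "completed";

// Order taken from the handoff scenario description.
// Assumption: transitions are strictly linear, with no cancel or retry paths.
const HANDOFF_ORDER: readonly HandoffState[] = ["requested", "queued", "connected", "completed"];

// Returns the state that follows `state`, or null once the handoff is completed.
function nextHandoffState(state: HandoffState): HandoffState | null {
  const index = HANDOFF_ORDER.indexOf(state);
  return HANDOFF_ORDER[index + 1] ?? null;
}

// Walks the whole lifecycle and records each state visited, in order.
function runHandoff(): HandoffState[] {
  const visited: HandoffState[] = [];
  let state: HandoffState | null = "requested";
  while (state !== null) {
    visited.push(state);
    state = nextHandoffState(state);
  }
  return visited;
}
```

An E2E assertion for the scenario could compare the sequence of observed states against `runHandoff()` to confirm that no state is skipped or repeated.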

Verification

Run the scenario unit tests and the default showcase E2E suite from the repo root:

pnpm test
pnpm e2e

Run the Coolaid demo E2E suite when you change apps/agi-coolaid-stand:

pnpm e2e:coolaid

Coolaid Demo Scenarios

The Coolaid Stand app uses its own local scenario catalog in apps/agi-coolaid-stand/src/lib/coolaidScenarios.ts.

| Scenario | Purpose | Expected behavior |
| --- | --- | --- |
| coolaid-welcome | Branded concierge greeting | Streams welcome text and recommendation chips. |
| coolaid-focus | Product recommendation | Streams recommendation text, product carousel rich content, citation, and chips. |
| coolaid-refresh-picker | Branded generated UI | Streams an A2UI flavor tuner with sliders and an action routed into the cart approval flow. |
| coolaid-add-cart | Cart mutation | Emits an add_coolaid_case client tool call that requires approval before cart state changes. |
| coolaid-order | Order lookup | Emits a lookup_coolaid_order client tool call and order-status text. |
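
The coolaid-add-cart scenario, like add-to-cart in the shared catalog, holds the cart mutation until the user approves the tool call. The sketch below illustrates that gate with plain data. The tool name add_coolaid_case comes from the scenario table; `CartState`, `PendingToolCall`, `proposeAddCase`, and `resolveApproval` are hypothetical and do not reflect the app's actual client tool API.

```typescript
// Hypothetical shapes; the real client tool and approval APIs may differ.
interface CartState {
  cases: number;
}

interface PendingToolCall {
  toolName: "add_coolaid_case";
  quantity: number;
  status: "awaiting-approval" | "approved" | "rejected";
}

// Emitting the tool call only records intent; the cart is not touched here.
function proposeAddCase(quantity: number): PendingToolCall {
  return { toolName: "add_coolaid_case", quantity, status: "awaiting-approval" };
}

// Applies the user's decision. The cart changes only on approval.
function resolveApproval(
  cart: CartState,
  call: PendingToolCall,
  approved: boolean,
): { cart: CartState; call: PendingToolCall } {
  if (call.status !== "awaiting-approval") {
    // Already-resolved calls are returned unchanged so a decision cannot be applied twice.
    return { cart, call };
  }
  if (!approved) {
    return { cart, call: { ...call, status: "rejected" } };
  }
  return {
    cart: { cases: cart.cases + call.quantity },
    call: { ...call, status: "approved" },
  };
}
```

Returning new objects instead of mutating the cart in place keeps the pre-approval state available for comparison, which is convenient when a test needs to assert that nothing changed before the user approved.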

Relevant files:

  • packages/gecx-chat/src/testing/scenarios.ts
  • apps/showcase/src/lib/demoState.tsx
  • apps/showcase/tests/navigation.spec.ts
  • apps/showcase/tests/support-gecx-chat.spec.ts
  • apps/showcase/tests/commerce.spec.ts
  • apps/showcase/tests/generative-ui.spec.ts
  • apps/showcase/tests/tools.spec.ts
  • apps/agi-coolaid-stand/src/lib/coolaidScenarios.ts
  • apps/agi-coolaid-stand/tests/a2ui.spec.ts
  • apps/agi-coolaid-stand/tests/coolaid-demo.spec.ts
Source: docs/reference/scenarios.md