Memory

Long-term memory in gecx-chat lets the assistant remember facts about a user across conversations. The feature is opt-in: when ChatClientConfig.memory is omitted, no tools are registered, no interceptor is installed, and no extractor runs — existing apps are unaffected.

This guide walks through enabling memory, choosing a backend, and wiring the React surface. See the reference for the full API.

Quickstart

import {
  createChatClient,
  createLocalMemoryAdapter,
  createMemoryStorage,
} from 'gecx-chat';

const storage = createMemoryStorage();
const client = createChatClient({
  // ...auth, transport, etc.
  storage,
  memory: {
    adapter: createLocalMemoryAdapter({ storage }),
    // Defaults: read.mode = 'inject', write.tool = true
  },
});

// React side:
import { useMemory, MemoryList } from 'gecx-chat/react';

function MemoryPanel() {
  return <MemoryList />;
}

That's it. The SDK now:

  • Registers memory.save, memory.update, memory.delete as tools the model can call.
  • Injects a system message with the user's saved facts into every outbound send.
  • Exposes useMemory() and <MemoryList> for app-side display & control.
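
A minimal panel sketch building on the hook named above. Only setEnabled and approve are confirmed elsewhere in this guide; entries and remove are illustrative assumptions about useMemory()'s return shape:

```typescript
import { useMemory } from 'gecx-chat/react';

function MemoryControls() {
  // entries/remove are assumed fields for illustration; setEnabled is
  // documented in the Governance section of this guide.
  const { entries, remove, setEnabled } = useMemory();
  return (
    <div>
      <button onClick={() => setEnabled(false)}>Disable memory</button>
      <ul>
        {entries.map((e) => (
          <li key={e.id}>
            {e.text} <button onClick={() => remove(e.id)}>Forget</button>
          </li>
        ))}
      </ul>
    </div>
  );
}
```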

Choosing an adapter

| Adapter | Lives where | Survives device change | Setup cost | Good for |
| --- | --- | --- | --- | --- |
| createLocalMemoryAdapter | Client only (StorageAdapter) | No | None | Prototypes, single-device consumer apps |
| createServerMemoryAdapter | Customer-defined REST | Yes | Implement the endpoint | Strict privacy postures, headless agents |
| createHybridMemoryAdapter | Cache + remote | Yes | Implement the endpoint | Production B2C/B2B — recommended default |

The hybrid adapter wraps a remote source-of-truth with a local cache, doing optimistic writes and server-wins reconciliation:

const remote = createServerMemoryAdapter({ endpoint: '/api/memory' });
const cache = createLocalMemoryAdapter({ storage });
const adapter = createHybridMemoryAdapter({ remote, cache });

createChatClient({ memory: { adapter } });

If the remote save fails, the cache rolls back by default. Set onRemoteFailure: 'keep' to keep the optimistic write — useful for offline posture where you'll resync later.
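
A sketch of that knob, assuming onRemoteFailure is accepted alongside remote and cache in the hybrid adapter's options:

```typescript
const adapter = createHybridMemoryAdapter({
  remote,
  cache,
  // Keep the optimistic cache write even when the server save fails,
  // instead of the default rollback — suits offline-first apps that
  // resync later.
  onRemoteFailure: 'keep',
});
```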

Choosing a read mode

MemoryConfig.read.mode controls how saved memory reaches the model:

  • inject (default) — Prepends a system message with the top-N most relevant entries on every send. Cheap, deterministic, debuggable. Hits a ceiling around 1–2KB of text.
  • recall — No automatic injection. Registers a memory.recall tool the model calls when it wants context. Token-efficient but the model has to remember to ask.
  • hybrid — Inject a small pinned summary AND expose memory.recall. Best balance for production.
  • off — Nothing flows to the model. Useful for write-only / export pipelines.

memory: {
  adapter,
  read: { mode: 'hybrid', inject: { maxEntries: 5, semantic: true } },
}
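
For a token-lean setup, the same block with the recall mode leaves injection off entirely (sketch; field names as documented above):

```typescript
memory: {
  adapter,
  // No automatic injection; the model pulls context through the
  // memory.recall tool when it decides it needs it.
  read: { mode: 'recall' },
}
```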

Write paths

Memory has two write paths, both optional:

Tool-call (default ON when memory is configured) — the model invokes memory.save({ text, key?, scope? }) when it judges a fact worth keeping. Same pattern as Anthropic's memory tool.

Auto-extraction (opt-in via write.extractor) — runs after each assistant turn over the last N turns. Two built-in extractors:

// Zero-LLM, regex-based:
import { createHeuristicExtractor, COMMON_HEURISTIC_PATTERNS } from 'gecx-chat';
memory: { adapter, write: { extractor: createHeuristicExtractor({ patterns: COMMON_HEURISTIC_PATTERNS }) } }

// Second LLM pass — more thorough, costs another round-trip:
const { createLLMExtractor } = await import('gecx-chat/memory/extractors/llm');
memory: { adapter, write: { extractor: createLLMExtractor({ transport, model: 'gemini-flash' }) } }

Set write.requireUserApproval: true to gate extractor outputs behind a user confirmation; the candidates surface as a memory-approval part and persist only on useMemory().approve(id).
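
Putting the approval gate together — a sketch combining the options above; candidateId is a hypothetical variable standing in for the id your UI reads off the memory-approval part:

```typescript
memory: {
  adapter,
  write: {
    extractor: createHeuristicExtractor({ patterns: COMMON_HEURISTIC_PATTERNS }),
    requireUserApproval: true, // extractor output surfaces as a memory-approval part
  },
}

// React side — persist an approved candidate:
const { approve } = useMemory();
await approve(candidateId); // candidateId taken from the rendered approval part
```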

Scoping: identity vs conversation

Every memory has a scope:

  • Identity-wide — applies across every conversation that user has. ("Prefers vegan recipes.")
  • Conversation-scoped — applies only to one thread. ("This conversation is about an order RMA-1234.")

By default, memory.save saves identity-wide. Pass scope: 'conversation' to scope to the active conversation.

At read time, conversation memories rank higher than identity memories so a narrower context wins when both match.
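
One way the ranking described above could be implemented — a self-contained sketch, not the SDK's internals; tie-breaking by recency within each scope is an assumption:

```typescript
type Scope = 'identity' | 'conversation';

interface RankedEntry {
  text: string;
  scope: Scope;
  savedAt: number; // epoch millis
}

// Conversation-scoped entries outrank identity-wide ones, so the
// narrower context wins; within a scope, newest first (assumed).
function rankForInjection(entries: RankedEntry[]): RankedEntry[] {
  return [...entries].sort((a, b) => {
    if (a.scope !== b.scope) return a.scope === 'conversation' ? -1 : 1;
    return b.savedAt - a.savedAt;
  });
}
```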

Conflict resolution

Memory is append-only: every save() inserts a new row. If an entry declares a key, prior unarchived entries with the same (scope, key) get archivedAt stamped — they're hidden from default lists but preserved for audit. Reads naturally pick the newest non-archived entry per key.

This avoids the entire class of "in-place update over a race" bugs and gives you a free audit trail.
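
The append-only scheme can be sketched in a few lines. This is a standalone illustration of the (scope, key) archiving rule, not the adapter's actual storage code:

```typescript
interface MemoryRow {
  id: number;
  scope: string;
  key?: string;
  text: string;
  archivedAt?: number;
}

class AppendOnlyMemory {
  private rows: MemoryRow[] = [];
  private nextId = 1;

  // Never mutate a row's content in place: archive prior live rows
  // sharing (scope, key), then insert a fresh row.
  save(entry: { scope: string; key?: string; text: string }): MemoryRow {
    if (entry.key !== undefined) {
      for (const row of this.rows) {
        if (!row.archivedAt && row.scope === entry.scope && row.key === entry.key) {
          row.archivedAt = Date.now(); // hidden from default lists, kept for audit
        }
      }
    }
    const row: MemoryRow = { id: this.nextId++, ...entry };
    this.rows.push(row);
    return row;
  }

  // Default read path: newest non-archived entry per key falls out
  // naturally because older duplicates were archived on save.
  list(): MemoryRow[] {
    return this.rows.filter((r) => !r.archivedAt);
  }
}
```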

Governance & user control

Memory is gated by three independent switches, evaluated top-down:

  1. DataGovernancePolicy.consent (strongest) — withdrawing consent disables and unregisters the memory tools, throws MEMORY_CONSENT_WITHDRAWN on save/update/delete, and cancels in-flight extractions. Existing entries are preserved on disk until the standard governance "delete my data" flow clears them.
  2. temporary: true on client.createSession(...) — for that session's lifetime only: bypasses both write paths and the inject step. Mirrors ChatGPT's Temporary Chat.
  3. useMemory().setEnabled(false) — user-controllable kill switch persisted to identity-scoped storage. No reads, no writes, no extraction. Existing entries preserved.

If multiple gates apply, the strongest wins. A governance-disabled state cannot be re-enabled by setEnabled(true).
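
The top-down evaluation amounts to a short-circuit check. A minimal sketch of the precedence rules above (names of the flags are illustrative, not SDK types):

```typescript
type MemoryGateState = {
  consentWithdrawn: boolean; // DataGovernancePolicy.consent withdrawn
  temporarySession: boolean; // createSession({ temporary: true })
  userEnabled: boolean;      // useMemory().setEnabled(...)
};

// Strongest gate wins: a governance block cannot be overridden by
// setEnabled(true), and a temporary session ignores the user switch.
function memoryAllowed(s: MemoryGateState): boolean {
  if (s.consentWithdrawn) return false; // 1. governance (strongest)
  if (s.temporarySession) return false; // 2. temporary conversation
  return s.userEnabled;                 // 3. user kill switch
}
```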

Analytics

Memory operations emit ProductAnalyticsEvents:

  • memory_saved, memory_deleted, memory_cleared
  • memory_recalled (with count, mode, semantic)
  • memory_extraction_proposed, memory_extraction_resolved
  • memory_consent_changed

These flow through the same analytics sinks as everything else. The applied-ai-retail demo wires these into its /analytics dashboard.
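
A sink can filter on the memory event names listed above. The event shape here is a hypothetical stand-in — gecx-chat's real analytics interface may differ:

```typescript
const memoryEventNames = new Set([
  'memory_saved', 'memory_deleted', 'memory_cleared',
  'memory_recalled', 'memory_extraction_proposed',
  'memory_extraction_resolved', 'memory_consent_changed',
]);

// Hypothetical sink: forward only memory-related events to a dashboard.
function onAnalyticsEvent(event: { name: string; props?: Record<string, unknown> }) {
  if (memoryEventNames.has(event.name)) {
    console.log('[memory analytics]', event.name, event.props ?? {});
  }
}
```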

Error handling

Memory errors are first-class ChatSdkError codes. See error-codes.md for the full list. Critically: a memory failure NEVER blocks a send. The interceptor catches adapter errors silently and the conversation proceeds without injected context.
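
When you call memory operations directly (rather than through a send), you can branch on the error code. A sketch, assuming ChatSdkError exposes a string code field and that a direct save surface like client.memory.save exists — both are assumptions for illustration:

```typescript
import { ChatSdkError } from 'gecx-chat';

try {
  await client.memory.save({ text: 'Prefers metric units.' });
} catch (err) {
  // MEMORY_CONSENT_WITHDRAWN is documented in the Governance section.
  if (err instanceof ChatSdkError && err.code === 'MEMORY_CONSENT_WITHDRAWN') {
    // Surface a consent prompt instead of retrying.
  } else {
    throw err;
  }
}
```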

When not to use memory

  • Strict-PII contexts where any user-identifying fact would violate policy. Either disable memory, or back the local adapter with session-retention storage so entries die with the session.
  • One-shot use cases with no continuity expectation (CSAT polls, FAQ bots).
  • Sessions where the user has consented to ephemeral mode — pass temporary: true to createSession.

Where to go next

Source: docs/guides/memory.md