Memory Extraction — Dedup Context & Telemetry
What changed
Three changes shipped on 19 March 2026: two improvements to the memory extraction pipeline and one dashboard change.
1. Existing memories fed into extraction prompt
Before calling smartExtractMemories(), the system now fetches up to 10 existing memories for the project and includes them as “Already Known” context in the extraction prompt. The LLM instruction was updated to: “Do NOT extract anything already covered by the Already Known memories — even if worded differently.”
This prevents paraphrased duplicates. For example, one user had three near-identical memories about PaaS classification, each extracted during a separate compaction with slightly different wording. The semantic dedup (0.85 threshold) did not catch them because their titles and embeddings differed enough to fall below the threshold.
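A minimal sketch of how the "Already Known" context could be assembled into the prompt. The function name buildExtractionPrompt() comes from the file list below, but its signature, the Memory shape, and the prompt layout are assumptions made for illustration, not the shipped implementation.

```typescript
// Hypothetical record shape; the real memory type in memory-mcp may differ.
interface Memory {
  title: string;
  content: string;
}

// Cap described in this changelog: up to 10 existing memories per project.
const MAX_KNOWN_MEMORIES = 10;

// Instruction text quoted from the changelog entry above.
const DEDUP_INSTRUCTION =
  "Do NOT extract anything already covered by the Already Known memories — even if worded differently.";

// Assumed signature: takes the transcript and the memories already stored
// for the project, and returns the full prompt text.
function buildExtractionPrompt(transcript: string, existing: Memory[]): string {
  const known = existing.slice(0, MAX_KNOWN_MEMORIES);
  const knownBlock =
    known.length > 0
      ? known.map((m) => `- ${m.title}: ${m.content}`).join("\n")
      : "(none)";

  return [
    "Extract durable memories from the transcript below.",
    "",
    "Already Known:",
    knownBlock,
    "",
    DEDUP_INSTRUCTION,
    "",
    "Transcript:",
    transcript,
  ].join("\n");
}
```

Passing the stored memories as plain text lets the model judge overlap by meaning rather than by embedding distance, which is the case the 0.85 threshold missed.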
2. Extraction telemetry
Structured logging added at two points:
- smartExtractMemories(): logs extraction_skipped with a reason (transcript_too_short or no_api_key) when extraction is bypassed
- deepExtractMemories(): logs extraction_complete with project, user, transcript length, candidates from LLM, stored vs skipped counts, skip reasons, and memory types
Grep production logs for [Memory:telemetry] to diagnose extraction gaps.
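The two log points might look roughly like the sketch below. Only the [Memory:telemetry] prefix, the event names, and the two skip reasons come from this changelog; the JSON payload format, the helper names, the order of the checks, and the 200-character minimum are hypothetical.

```typescript
type SkipReason = "transcript_too_short" | "no_api_key";

// Formats one telemetry line. The prefix is from the changelog; emitting
// the fields as JSON after it is an assumption.
function formatTelemetry(event: string, fields: Record<string, unknown>): string {
  return `[Memory:telemetry] ${JSON.stringify({ event, ...fields })}`;
}

// Hypothetical pre-check mirroring the two documented skip reasons.
// Returns null when extraction should proceed.
function skipReason(
  transcript: string,
  apiKey: string | undefined,
  minLength: number = 200, // illustrative threshold, not the real value
): SkipReason | null {
  if (!apiKey) return "no_api_key";
  if (transcript.length < minLength) return "transcript_too_short";
  return null;
}

// Builds the skip log line, or returns null if extraction is not bypassed.
function skipLogLine(transcript: string, apiKey: string | undefined): string | null {
  const reason = skipReason(transcript, apiKey);
  return reason === null
    ? null
    : formatTelemetry("extraction_skipped", { reason, transcriptLength: transcript.length });
}
```

Keeping every line behind one fixed prefix means a single grep over production logs returns both extraction_skipped and extraction_complete events.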
3. “For Review” dashboard banner
The unextracted memories card was moved from a buried position in the dashboard grid to a full-width amber banner between the Hero KPIs and the grid. Each chat now has both Extract and Ignore buttons.
Files changed
| File | Change |
|---|---|
| memory-mcp/src/chat.ts | ~50 lines added: fetchExistingMemoriesForPrompt(), buildExtractionPrompt(), telemetry logging |
| admin-ui/src/features/dashboard/UserDashboardPage.tsx | For Review banner wrapping UnextractedMemoriesCard |
| admin-ui/src/features/dashboard/UnextractedMemoriesCard.tsx | Review & Extract toggle, Ignore button |