Memory Architecture
Not Just Vectors
Most AI memory systems throw everything into a vector database and hope semantic search finds the right thing. CollectHive’s memory system currently uses a single memories table in Convex with type classification, but the target architecture splits knowledge across three purpose-built backends. This article describes where we’re heading.
Facts — Target: 80% of Queries
Structured knowledge. “Project Alpha uses Stripe.” “We deploy to DigitalOcean.” The plan is to store these as entity/key/value records with full-text search. When you ask “what payment provider does Project Alpha use?”, you’d get an instant, exact, inspectable answer — not a fuzzy similarity match.
Conversations — Target: 15% of Queries
Chat history. “What were we discussing about pricing last week?” Vector-indexed for semantic similarity search. This is where traditional AI memory shines — finding relevant context from past conversations.
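As a rough illustration of what semantic similarity search over conversation history involves, the sketch below ranks stored chunks by cosine similarity to a query embedding. It is an in-memory approximation, not the planned vector index; `ConversationChunk` and `topKConversations` are hypothetical names, and the embeddings are assumed to come from some embedding model not shown here.

```typescript
// Hypothetical shape: a conversation chunk with a precomputed embedding.
interface ConversationChunk {
  text: string;
  embedding: number[];
}

// Cosine similarity between two equal-length vectors; returns 0 for zero vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("embedding dimensions differ");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k chunks most similar to the query embedding, best first.
function topKConversations(
  query: number[],
  chunks: ConversationChunk[],
  k: number,
): Array<{ chunk: ConversationChunk; score: number }> {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

A linear scan like this only works for small collections; a real vector index would avoid comparing the query against every chunk.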
Decisions — Target: 5% of Queries
Active configuration with side effects. Choosing Stripe wouldn’t just store a fact — it would activate the Stripe skill, suggest the Stripe MCP connection, and inform briefings. Decisions compound into capabilities.
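One way to picture "decisions with side effects" is a rule table that maps a decision category to the effects it should trigger. The sketch below is only an assumption about how that could look: `Decision`, `SideEffect`, `effectRules`, and the generated skill and connection names are all hypothetical and do not describe CollectHive's actual design.

```typescript
// Hypothetical decision record; field names are illustrative only.
interface Decision {
  category: string; // e.g. "payment_provider"
  choice: string; // e.g. "Stripe"
  rationale: string;
}

// Something to do once a decision has been recorded.
type SideEffect =
  | { kind: "activate_skill"; skill: string }
  | { kind: "suggest_connection"; connection: string }
  | { kind: "inform_briefings"; note: string };

// Maps a decision to the effects it should trigger.
type EffectRule = (decision: Decision) => SideEffect[];

// Assumed rule: choosing a payment provider triggers all three effect kinds.
const paymentProviderRule: EffectRule = (d) => [
  { kind: "activate_skill", skill: d.choice.toLowerCase() },
  { kind: "suggest_connection", connection: d.choice + " MCP" },
  { kind: "inform_briefings", note: "Payment provider is " + d.choice },
];

const effectRules = new Map<string, EffectRule>([
  ["payment_provider", paymentProviderRule],
]);

// Store the decision and return the effects to run; unknown categories yield none.
function recordDecision(log: Decision[], decision: Decision): SideEffect[] {
  log.push(decision);
  const rule = effectRules.get(decision.category);
  return rule ? rule(decision) : [];
}
```

Returning the effects as data, rather than executing them inside `recordDecision`, keeps the decision log inspectable and lets the caller decide when to run them.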
Technical Detail — Storage Backends
The planned facts table would use Convex with entity/key/value structure and full-text search:
// facts table (Convex) — planned, not yet implemented
{
  entity: string,                      // "project_alpha", "user_alice", "hive"
  key: string,                         // "payment_provider", "deploy_target"
  value: string,                       // "Stripe", "Vercel"
  scope: "hive" | "user" | "project",
  domains: string[],                   // ["business", "engineering"]
  source_session: Id<"chatSessions">,
  created_by: Id<"users">,
  last_confirmed_at: number,
  superseded_at: number | null,
  sensitivity: "normal" | "high",
}
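To show why entity/key/value records give exact answers, here is a minimal in-memory lookup over a subset of the fields above. It does not use the Convex query API; `FactRow` and `lookupFact` are hypothetical names introduced for illustration, and preferring the most recently confirmed row is an assumption.

```typescript
// Subset of the planned facts schema, enough for a lookup sketch.
interface FactRow {
  entity: string;
  key: string;
  value: string;
  last_confirmed_at: number;
  superseded_at: number | null;
}

// Return the current (non-superseded) fact for an entity/key pair, if any.
// When several rows are current, prefer the most recently confirmed one.
function lookupFact(rows: FactRow[], entity: string, key: string): FactRow | undefined {
  return rows
    .filter((r) => r.entity === entity && r.key === key && r.superseded_at === null)
    .sort((a, b) => b.last_confirmed_at - a.last_confirmed_at)[0];
}
```

A question like "what payment provider does Project Alpha use?" would first have to be mapped to `("project_alpha", "payment_provider")`; that mapping step is outside this sketch.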
A unified search API would fan out across all backends — the agent never needs to know which backend answered:
memory_search("what payment provider does project alpha use?")
  → facts: exact match (confidence: 1.0)
  → decisions: category match (with rationale)
  → vectors: semantic search on conversations
  → Merged, deduplicated, ranked by confidence + recency
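The fan-out, merge, deduplicate, and rank flow above could be sketched as follows. Everything here is an assumption made for illustration: the `MemoryHit` shape, deduplicating on normalized text, and the scoring formula (confidence plus a recency bonus of up to 0.1 that fades over 30 days) are not taken from the actual design.

```typescript
// Hypothetical result shape shared by every backend.
interface MemoryHit {
  backend: "facts" | "decisions" | "vectors";
  text: string;
  confidence: number; // 0 to 1
  timestamp: number; // ms since epoch
}

type MemoryBackend = (query: string) => Promise<MemoryHit[]>;

const DAY_IN_MS = 24 * 60 * 60 * 1000;

// Assumed scoring: confidence plus a recency bonus of up to 0.1 that fades over 30 days.
function rankScore(hit: MemoryHit, now: number): number {
  const ageDays = Math.max(0, now - hit.timestamp) / DAY_IN_MS;
  return hit.confidence + 0.1 * Math.max(0, 1 - ageDays / 30);
}

// Deduplicate on normalized text (keeping the highest-confidence copy), then rank.
function mergeHits(hits: MemoryHit[], now: number): MemoryHit[] {
  const best = new Map<string, MemoryHit>();
  for (const hit of hits) {
    const key = hit.text.trim().toLowerCase();
    const seen = best.get(key);
    if (seen === undefined || hit.confidence > seen.confidence) best.set(key, hit);
  }
  return Array.from(best.values()).sort((a, b) => rankScore(b, now) - rankScore(a, now));
}

// Fan out to every backend in parallel; a failing backend contributes no hits.
async function memorySearch(
  query: string,
  backends: MemoryBackend[],
  now: number,
): Promise<MemoryHit[]> {
  const perBackend = await Promise.all(
    backends.map((backend) => backend(query).catch((): MemoryHit[] => [])),
  );
  return mergeHits(([] as MemoryHit[]).concat(...perBackend), now);
}
```

Swallowing backend errors keeps the search available when one store is down, at the cost of silently incomplete results; a production version would probably log or surface the failure.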
How Knowledge Gets Extracted
Knowledge doesn’t just appear in the system. There’s a pipeline from conversation to stored memory. The extraction functions (smartExtractMemories() and deepExtractMemories()) exist today in memory-mcp/src/chat.ts. The full pipeline described below is the target design — some steps are implemented, others are planned.
- Session ends: conversation content enters the extraction pipeline.
- Pre-extraction scrubbing: credentials, PII, and sensitive patterns are redacted before the LLM ever sees the content. Code blocks are excluded entirely.
- LLM extraction: a constrained prompt extracts candidates (tool usage, architecture patterns, debugging techniques, workflows, integration gotchas), capped at 10 to 15 per session.
- Draft review: candidates are presented as “things I learned this session.” You can accept, reject, or edit each one. Conflicts with existing memories are flagged.
- Storage: accepted memories enter your scope with provenance tracking.
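The scrubbing and extraction steps above can be sketched as a small pure function. This is not the code in memory-mcp/src/chat.ts: the two redaction patterns are deliberately simplistic examples, `scrubTranscript` and `extractDrafts` are hypothetical names, the LLM call is replaced by an injected `extract` function, and the cap of 15 is an assumed reading of "10 to 15".

```typescript
// Illustrative patterns only; a real scrubber would need a much broader set.
const REDACTION_PATTERNS: Array<[RegExp, string]> = [
  [/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g, "[REDACTED_EMAIL]"],
  [/\b(?:api[_-]?key|token|password|secret)\s*[:=]\s*\S+/gi, "[REDACTED_CREDENTIAL]"],
];

// Drop fenced code blocks (three backticks), then redact sensitive patterns.
function scrubTranscript(transcript: string): string {
  const withoutCode = transcript.replace(/`{3}[\s\S]*?`{3}/g, "");
  return REDACTION_PATTERNS.reduce(
    (text, [pattern, label]) => text.replace(pattern, label),
    withoutCode,
  );
}

// Hypothetical candidate shape produced by the extraction prompt.
interface MemoryCandidate {
  category: string;
  summary: string;
}

// Stand-in for the LLM call; it only ever receives scrubbed text.
type Extractor = (scrubbed: string) => MemoryCandidate[];

// Assumed cap, taken from the upper end of "10 to 15 per session".
const MAX_CANDIDATES = 15;

// Scrub first, extract second, then cap the drafts shown for review.
function extractDrafts(transcript: string, extract: Extractor): MemoryCandidate[] {
  return extract(scrubTranscript(transcript)).slice(0, MAX_CANDIDATES);
}
```

Passing the extractor in as a parameter makes the ordering guarantee testable: the model can never see text that has not been through `scrubTranscript`.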
How Memories Would Stay Fresh
Three planned mechanisms target different types of staleness. These are not yet implemented.
Explicit contradiction: a new extraction contradicts an existing memory. You’d be prompted: “You previously used Stripe. Update to Paddle?” The old memory would be archived with a timestamp, never deleted.
Confidence decay: memories unconfirmed for 30 days would be surfaced with a prompt such as “Are these still accurate?”, delivered via a heartbeat prompt, the dashboard, or a session-start nudge.
Anchored verification: memories tagged with verifiable anchors (a URL, file path, or API endpoint) could be checked automatically by heartbeat agents. If the anchor breaks, the memory gets flagged for review.
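Anchored verification, the third mechanism, might look roughly like the sketch below. The checkers are injected so the example stays independent of any HTTP or filesystem API; `AnchoredMemory`, the `needs_review` flag, and `verifyAnchors` are hypothetical names, and treating a checker error as a broken anchor is an assumption.

```typescript
interface Anchor {
  type: string; // e.g. "url", "file_path"
  value: string;
}

// Hypothetical memory shape; needs_review is an assumed flag, not a planned field.
interface AnchoredMemory {
  id: string;
  anchor: Anchor | null;
  needs_review: boolean;
}

// One checker per anchor type; resolves true when the anchor still resolves.
type AnchorChecker = (value: string) => Promise<boolean>;

// Check every anchored memory and flag the ones whose anchor no longer resolves.
async function verifyAnchors(
  memories: AnchoredMemory[],
  checkers: Map<string, AnchorChecker>,
): Promise<AnchoredMemory[]> {
  return Promise.all(
    memories.map(async (memory) => {
      if (memory.anchor === null) return memory;
      const check = checkers.get(memory.anchor.type);
      // No checker registered for this anchor type: leave the memory untouched.
      if (check === undefined) return memory;
      // Assumption: a thrown error (e.g. a network failure) counts as a broken anchor.
      const ok = await check(memory.anchor.value).catch(() => false);
      return ok ? memory : { ...memory, needs_review: true };
    }),
  );
}
```

Flagging rather than archiving on failure matches the text: a broken anchor prompts a human review instead of changing the memory automatically.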
Technical Detail — Freshness Implementation
The planned design uses bitemporal storage — memories are never deleted, only superseded:
// Planned freshness fields — not yet implemented
{
  last_confirmed_at: number,                      // reset on revalidation
  superseded_at: number | null,                   // set when replaced
  superseded_by: Id<"facts"> | null,
  anchor: { type: string, value: string } | null, // URL, file path, etc.
}
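A minimal sketch of "never deleted, only superseded", using the fields above: replacing a fact closes out the old row and appends a new one. Plain string ids stand in for Convex `Id<"facts">` values, and `VersionedFact` and `supersedeFact` are hypothetical names.

```typescript
// Fact row carrying the planned freshness fields; ids are plain strings here.
interface VersionedFact {
  id: string;
  entity: string;
  key: string;
  value: string;
  last_confirmed_at: number;
  superseded_at: number | null;
  superseded_by: string | null;
}

// Replace a fact's value without deleting history. The input array is not
// mutated: the old row is returned closed out and a new current row is appended.
function supersedeFact(
  rows: VersionedFact[],
  oldId: string,
  newId: string,
  newValue: string,
  now: number,
): VersionedFact[] {
  const old = rows.find((r) => r.id === oldId && r.superseded_at === null);
  if (old === undefined) {
    throw new Error("no current fact with id " + oldId);
  }
  const replacement: VersionedFact = {
    ...old,
    id: newId,
    value: newValue,
    last_confirmed_at: now,
    superseded_at: null,
    superseded_by: null,
  };
  return rows
    .map((r) => (r.id === oldId ? { ...r, superseded_at: now, superseded_by: newId } : r))
    .concat(replacement);
}
```

Because the old row keeps its value and gains a `superseded_by` pointer, the history of a fact can be read by following the chain of replacements.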
Confidence decay triggers would be configurable per scope level. Hive-scope memories would have longer decay windows (90 days) since they’ve been validated by multiple users. User-scope memories would decay at 30 days.
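The per-scope decay windows could be expressed as a simple lookup plus an age check. The 90-day and 30-day values come from the paragraph above; the project-scope window is not specified there, so the 30 days used below is an assumption, as are the names `DECAY_WINDOW_DAYS` and `needsReconfirmation`.

```typescript
type MemoryScope = "hive" | "user" | "project";

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// hive and user windows are from the text; the project window is assumed.
const DECAY_WINDOW_DAYS: Record<MemoryScope, number> = {
  hive: 90,
  user: 30,
  project: 30,
};

interface DecayingMemory {
  scope: MemoryScope;
  last_confirmed_at: number; // ms since epoch
  superseded_at: number | null;
}

// True when a current memory has gone unconfirmed for longer than its scope's window.
function needsReconfirmation(memory: DecayingMemory, now: number): boolean {
  if (memory.superseded_at !== null) return false; // archived memories are not re-surfaced
  const ageMs = now - memory.last_confirmed_at;
  return ageMs > DECAY_WINDOW_DAYS[memory.scope] * MS_PER_DAY;
}
```

A heartbeat job could run this check over all current memories and batch the results into a single "Are these still accurate?" prompt.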