03 — System design¶

Components¶

Component	File	Responsibility
`Memnex`	client.py	Public facade
`IdentityResolver`	identity/resolver.py	Resolve, link, check match
`MemoryManager`	memory/manager.py	Write, read, search
`Extractor`	memory/extractor.py	Rule-based + LLM fact extraction
`SalienceScorer`	memory/salience.py	0.0–1.0 scoring
`ConflictResolver`	memory/conflict.py	Detect & resolve contradictions
`Compressor`	memory/compressor.py	Fit into token budget
`ChannelAdapter`	channels/	Extract + format per channel
`GDPRCoordinator`	privacy/gdpr.py	Forget + export
`PIIMasker`	privacy/masker.py	Mask at write
`Workers`	workers/	Compaction, merges, TTL

Storage tiers¶

┌──────────────────────────────────────────────────────────────┐
│  HOT — Redis / in-memory                                     │
│    working_memory per customer      (last N facts)           │
│    identifier_cache                 (channel,id → customer)  │
│    TTL: 24h (configurable)                                   │
├──────────────────────────────────────────────────────────────┤
│  WARM — Postgres / in-memory                                 │
│    customer_identities              (root)                   │
│    channel_identifiers              (one → many)             │
│    memories                         (facts with metadata)    │
│    candidate_links                  (unconfirmed fuzzy)      │
│    Retention: per-tenant policy                              │
├──────────────────────────────────────────────────────────────┤
│  SEMANTIC — Qdrant / in-memory                               │
│    one collection per tenant                                 │
│    vector + {customer_id, fact_type, channel, salience}      │
└──────────────────────────────────────────────────────────────┘

Data flow¶

agent ──┐
        │ SDK / REST / MCP
        ▼
   ┌─────────────┐
   │  Memnex     │
   │   ├── IdentityResolver ─→ hot + warm
   │   ├── MemoryManager    ─→ hot + warm + semantic
   │   ├── ChannelAdapter   ─→ in-process
   │   └── GDPRCoordinator  ─→ hot + warm + semantic
   └─────────────┘
        │
        ▼
   workers (compaction, identity merge, TTL) — optional

Why three tiers¶

Hot: sub-5 ms reads for active sessions. Redis is the default; in-memory works for tests.
Warm: durable, queryable, RLS-protected. Postgres JSONB covers 90% of queries; indexes cover the rest.
Semantic: Qdrant handles similarity search only. Kept optional — many deployments don't need it.

Pluggability¶

Storage is behind HotStore, WarmStore, SemanticStore protocols (storage/base.py). The in-memory default means tests run with zero dependencies and production swaps in the real backends by setting env vars.

Async-first¶

Everything is async. No blocking calls on the request path. FastAPI, asyncpg, redis-async, qdrant async client.

Scale notes¶

Dimension	Strategy
Concurrent agents	Async I/O + connection pools (asyncpg, redis)
Multi-tenant fairness	Per-tenant rate limiter (Redis-backed in prod)
Cold reads	Working-memory cache warms on first hit
Multi-instance cache	Redis PUBSUB invalidation (auto when `MEMNEX_REDIS_URL` set)
Bulk writes	`max_facts_per_write` cap + salience drop