03 — System design

Components

Component          File                    Responsibility
Memnex             client.py               Public facade
IdentityResolver   identity/resolver.py    Resolve, link, check match
MemoryManager      memory/manager.py       Write, read, search
Extractor          memory/extractor.py     Rule-based + LLM fact extraction
SalienceScorer     memory/salience.py      0.0–1.0 scoring
ConflictResolver   memory/conflict.py      Detect & resolve contradictions
Compressor         memory/compressor.py    Fit into token budget
ChannelAdapter     channels/               Extract + format per channel
GDPRCoordinator    privacy/gdpr.py         Forget + export
PIIMasker          privacy/masker.py       Mask at write
Workers            workers/                Compaction, merges, TTL

Storage tiers

┌──────────────────────────────────────────────────────────────┐
│  HOT — Redis / in-memory                                     │
│    working_memory per customer      (last N facts)           │
│    identifier_cache                 (channel,id → customer)  │
│    TTL: 24h (configurable)                                   │
├──────────────────────────────────────────────────────────────┤
│  WARM — Postgres / in-memory                                 │
│    customer_identities              (root)                   │
│    channel_identifiers              (one → many)             │
│    memories                         (facts with metadata)    │
│    candidate_links                  (unconfirmed fuzzy)      │
│    Retention: per-tenant policy                              │
├──────────────────────────────────────────────────────────────┤
│  SEMANTIC — Qdrant / in-memory                               │
│    one collection per tenant                                 │
│    vector + {customer_id, fact_type, channel, salience}      │
└──────────────────────────────────────────────────────────────┘

Data flow

agent ──┐
        │ SDK / REST / MCP
   ┌─────────────┐
   │  Memnex     │
   │   ├── IdentityResolver ─→ hot + warm
   │   ├── MemoryManager    ─→ hot + warm + semantic
   │   ├── ChannelAdapter   ─→ in-process
   │   └── GDPRCoordinator  ─→ hot + warm + semantic
   └─────────────┘
   workers (compaction, identity merge, TTL) — optional

Why three tiers

  • Hot: sub-5 ms reads for active sessions. Redis is the default; in-memory works for tests.
  • Warm: durable, queryable, RLS-protected. Postgres JSONB covers 90% of queries; indexes cover the rest.
  • Semantic: Qdrant handles similarity search only. Kept optional — many deployments don't need it.
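One way the hot and warm tiers could cooperate on a read is a read-through cache: serve working memory from the hot tier when it is present, otherwise load from the warm tier and populate the hot tier. The sketch below is illustrative only. The class name `ReadThroughMemory`, the `working_memory` method, and the `last_n` parameter are assumptions, and plain dicts stand in for Redis and Postgres.

```python
import asyncio


class ReadThroughMemory:
    """Illustrative only: dicts stand in for Redis (hot) and Postgres (warm)."""

    def __init__(self, warm: dict[str, list[str]], last_n: int = 3) -> None:
        self.hot: dict[str, list[str]] = {}
        self.warm = warm
        self.last_n = last_n
        self.warm_reads = 0  # counts how often the slower tier is hit

    async def working_memory(self, customer_id: str) -> list[str]:
        cached = self.hot.get(customer_id)
        if cached is not None:
            return cached
        # Cache miss: fall back to the durable tier and keep only the last N facts.
        self.warm_reads += 1
        facts = self.warm.get(customer_id, [])[-self.last_n:]
        self.hot[customer_id] = facts
        return facts


async def demo() -> tuple[list[str], int]:
    mem = ReadThroughMemory({"cust-1": ["f1", "f2", "f3", "f4"]})
    await mem.working_memory("cust-1")          # first hit warms the cache
    facts = await mem.working_memory("cust-1")  # second hit is served from hot
    return facts, mem.warm_reads
```

Running `asyncio.run(demo())` touches the warm tier once; the second read is answered from the hot dict. A real hot tier would also apply the TTL described in the storage-tier diagram.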

Pluggability

Storage sits behind three protocols, HotStore, WarmStore, and SemanticStore (storage/base.py). Because the defaults are in-memory, tests run with zero external dependencies; production swaps in the real backends by setting environment variables.
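As a rough sketch of what protocol-based pluggability can look like in Python, the example below uses `typing.Protocol` for structural typing. The `HotStore` name comes from the text above, but the `get`/`set` method signatures, `InMemoryHotStore`, and `remember_channel` are assumptions made for illustration, not the contents of storage/base.py.

```python
import asyncio
import time
from typing import Optional, Protocol


class HotStore(Protocol):
    """Hypothetical shape of the hot-tier protocol; method names are assumed."""

    async def get(self, key: str) -> Optional[str]: ...
    async def set(self, key: str, value: str, ttl_seconds: float) -> None: ...


class InMemoryHotStore:
    """Dict-backed stand-in that satisfies HotStore structurally (no inheritance)."""

    def __init__(self) -> None:
        self._data: dict[str, tuple[str, float]] = {}

    async def get(self, key: str) -> Optional[str]:
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            # Expired entries are dropped lazily on read.
            del self._data[key]
            return None
        return value

    async def set(self, key: str, value: str, ttl_seconds: float) -> None:
        self._data[key] = (value, time.monotonic() + ttl_seconds)


async def remember_channel(store: HotStore, channel: str, ident: str, customer_id: str) -> None:
    # Callers depend only on the protocol, so any conforming backend can be passed in.
    await store.set(f"{channel}:{ident}", customer_id, ttl_seconds=24 * 3600)


async def demo() -> Optional[str]:
    store = InMemoryHotStore()
    await remember_channel(store, "email", "a@example.com", "cust-1")
    return await store.get("email:a@example.com")
```

Because the protocol is structural, a Redis-backed class with the same two methods could be passed to `remember_channel` without changing the caller.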

Async-first

Everything is async, with no blocking calls on the request path. The stack is FastAPI, asyncpg, redis-async, and the Qdrant async client.
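To illustrate why an async request path helps, the sketch below awaits two independent tier lookups concurrently with `asyncio.gather`. The functions `fetch_hot`, `fetch_semantic`, and `read_context` are hypothetical, and `asyncio.sleep` stands in for real network round trips.

```python
import asyncio


async def fetch_hot(customer_id: str) -> list[str]:
    await asyncio.sleep(0.01)  # stands in for a Redis round trip
    return [f"{customer_id}:recent-fact"]


async def fetch_semantic(customer_id: str, query: str) -> list[str]:
    await asyncio.sleep(0.01)  # stands in for a vector search
    return [f"{customer_id}:similar-to:{query}"]


async def read_context(customer_id: str, query: str) -> list[str]:
    # Both lookups are awaited concurrently; neither blocks the event loop.
    hot, semantic = await asyncio.gather(
        fetch_hot(customer_id),
        fetch_semantic(customer_id, query),
    )
    return hot + semantic
```

`asyncio.gather` returns results in the order the awaitables were passed, so the hot facts always precede the semantic matches here regardless of which lookup finishes first.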

Scale notes

Dimension              Strategy
Concurrent agents      Async I/O + connection pools (asyncpg, redis)
Multi-tenant fairness  Per-tenant rate limiter (Redis-backed in prod)
Cold reads             Working-memory cache warms on first hit
Multi-instance cache   Redis PUBSUB invalidation (auto when MEMNEX_REDIS_URL set)
Bulk writes            max_facts_per_write cap + salience drop
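The bulk-write strategy in the last row could be sketched as follows: when a write carries more facts than `max_facts_per_write`, the lowest-salience facts are dropped. The `Fact` dataclass and `cap_facts` function are hypothetical names, and the tie-breaking and ordering rules are assumptions rather than documented behaviour.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    text: str
    salience: float  # 0.0 to 1.0, as produced by a salience scorer


def cap_facts(facts: list[Fact], max_facts_per_write: int) -> list[Fact]:
    """Keep the highest-salience facts, preserving their original order."""
    if max_facts_per_write <= 0:
        return []
    if len(facts) <= max_facts_per_write:
        return list(facts)
    # Rank indices by salience; the stable sort lets earlier facts win ties.
    ranked = sorted(range(len(facts)), key=lambda i: -facts[i].salience)
    # Restore input order among the survivors.
    keep = sorted(ranked[:max_facts_per_write])
    return [facts[i] for i in keep]
```

For example, capping facts with saliences 0.2, 0.9, 0.5, 0.9 at two keeps the two 0.9 facts in their original order.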