Memory Scoring & Decay

ZeroMemory uses a blended scoring algorithm to rank memories during recall. This ensures the most relevant, important, and recent memories surface first.

Blended Scoring Formula

final_score = (similarity × W_sim) + (importance × W_imp) + (recency × W_rec)

Default weights:

  • W_sim = 0.5 (semantic similarity)
  • W_imp = 0.3 (importance)
  • W_rec = 0.2 (recency)
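The formula above can be sketched in a few lines of Python. This is a minimal illustration: the constant and function names are ours, not ZeroMemory's actual API.

```python
# Default weights from the docs: similarity 0.5, importance 0.3, recency 0.2.
W_SIM, W_IMP, W_REC = 0.5, 0.3, 0.2

def blended_score(similarity: float, importance: float, recency: float) -> float:
    """Blend the three component scores into one final ranking score."""
    return similarity * W_SIM + importance * W_IMP + recency * W_REC

# A memory that maxes out every component scores exactly 1.0.
print(blended_score(1.0, 1.0, 1.0))  # → 1.0
```

Because the weights sum to 1.0 and each component is in [0.0, 1.0], the final score is also bounded to [0.0, 1.0].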

Similarity Score (0.0–1.0)

Cosine similarity between the query embedding and each memory's embedding (higher means closer). Powered by pgvector HNSW indexes for sub-millisecond search.
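For intuition, here is a dependency-free sketch of what cosine similarity computes. The real search runs inside Postgres via pgvector, so this is purely illustrative:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """dot(a, b) / (|a| * |b|): identical directions score 1.0, orthogonal ones 0.0."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # → 1.0
```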

Importance Score (0.0–1.0)

Set at storage time via the importance parameter. Can also be adjusted:

  • Auto-boost: Importance increases by 0.05 each time a memory is accessed (capped at 1.0)
  • Manual: Update via the API
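The auto-boost rule can be sketched as follows. The +0.05 step and the 1.0 cap come from the docs; the function name is hypothetical:

```python
ACCESS_BOOST = 0.05  # documented boost applied on each access

def boost_importance(importance: float) -> float:
    """Apply the per-access importance boost, capped at 1.0."""
    return min(1.0, importance + ACCESS_BOOST)

print(boost_importance(0.98))  # → 1.0 (capped)
```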

Recency Score (0.0–1.0)

Decays over time using exponential decay:

recency = exp(-decay_rate × hours_since_creation)

  • Fresh memories (< 1 hour): ~1.0
  • 24 hours old: ~0.6
  • 7 days old: ~0.2
  • 30 days old: ~0.01

Accessing a memory resets its recency clock.
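A sketch of the decay curve. The decay rate here is an assumption chosen so that a 24-hour-old memory scores roughly 0.6, matching the table above; ZeroMemory's actual default rate is not stated in this doc. Because access resets the recency clock, the input is hours since the last access (or since creation, if never accessed):

```python
import math

# Assumed rate for illustration (not a documented default): ~0.021/hour
# makes a 24-hour-old, never-accessed memory score about 0.6.
DECAY_RATE = 0.021

def recency_score(hours_since_last_access: float) -> float:
    """Exponential decay; accessing a memory resets the clock to 0."""
    return math.exp(-DECAY_RATE * hours_since_last_access)

print(round(recency_score(24), 2))  # → 0.6
```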

Memory Tiers & Consolidation

TierTypical LifespanConsolidation
WorkingHoursAuto-consolidates to episodic after session ends
EpisodicDays–weeksFrequently accessed episodic memories promote to semantic
SemanticPermanentCore knowledge — highest importance, slowest decay

Consolidation happens automatically based on:

  1. Access frequency: Memories accessed 3+ times promote faster
  2. Importance threshold: Memories with importance > 0.7 promote faster
  3. Time in tier: Working memories consolidate after ~4 hours of inactivity
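The three triggers can be sketched as a simple predicate. Treating each signal as a hard boolean is a simplification (the doc says these make promotion "faster", not that they are strict cutoffs), and the function name is illustrative:

```python
def should_promote(access_count: int, importance: float, hours_inactive: float) -> bool:
    """Return True if any documented consolidation trigger fires."""
    return (
        access_count >= 3        # 1. access frequency
        or importance > 0.7      # 2. importance threshold
        or hours_inactive >= 4   # 3. time in tier (working memories)
    )

print(should_promote(access_count=1, importance=0.9, hours_inactive=0))  # → True
```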

Example: How Scoring Works

A user asks: "What programming language does Alice prefer?"

| Memory | Similarity | Importance | Recency | Final Score |
| --- | --- | --- | --- | --- |
| "Alice prefers Python for backend" | 0.92 | 0.8 | 0.9 | 0.88 |
| "Alice mentioned Rust is interesting" | 0.75 | 0.4 | 0.3 | 0.56 |
| "Backend team uses Python and FastAPI" | 0.70 | 0.5 | 0.7 | 0.64 |

The first memory wins because it combines high similarity, high importance, and recent access.
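The ranking can be reproduced with the default weights in plain Python (no ZeroMemory API involved):

```python
# (text, similarity, importance, recency) tuples from the table above.
memories = [
    ("Alice prefers Python for backend",     0.92, 0.8, 0.9),
    ("Alice mentioned Rust is interesting",  0.75, 0.4, 0.3),
    ("Backend team uses Python and FastAPI", 0.70, 0.5, 0.7),
]

ranked = sorted(
    memories,
    key=lambda m: m[1] * 0.5 + m[2] * 0.3 + m[3] * 0.2,  # blended score
    reverse=True,
)
print(ranked[0][0])  # → Alice prefers Python for backend
```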

Tuning Tips

  • Set importance explicitly for critical facts (preferences, decisions, constraints)
  • Use memory_type wisely: working for current task context, episodic for interactions, semantic for permanent knowledge
  • Tag memories so you can filter during recall for better precision
  • Reflect periodically: The /reflect endpoint consolidates and synthesizes patterns

GraphRAG Scoring

When using the GraphRAG endpoint, an additional graph proximity score is blended:

graphrag_score = (vector_score × (1 - graph_weight)) + (graph_proximity × graph_weight)

The default graph_weight is 0.3 (70% vector, 30% graph). Raise it for relationship-centric queries; lower it for purely topical searches.
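The blend maps directly to code; this sketch mirrors the formula above, with an illustrative signature:

```python
def graphrag_score(vector_score: float, graph_proximity: float,
                   graph_weight: float = 0.3) -> float:
    """Blend vector and graph scores; graph_weight=0.3 means 70% vector, 30% graph."""
    return vector_score * (1 - graph_weight) + graph_proximity * graph_weight

print(round(graphrag_score(0.8, 0.6), 2))  # → 0.74
```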