Open Specification · v2 · March 2026

The Engram Specification

An open standard for persistent AI agent memory. Engrams are atomic units of learned knowledge — individually addressable, activation-weighted, and decay-aware. They give agents the ability to remember corrections, learn preferences, and improve over time. This document specifies the data format, activation model, feedback loop, and search pipeline so that anyone can implement compatible agent memory.

Introduction

AI agents are stateless by default. Every session starts from zero — no memory of corrections, no recall of preferences, no knowledge of what tools exist. Users repeat themselves. Agents make the same mistakes. The problem is not intelligence but amnesia.

An engram is the smallest unit of persistent memory. The term comes from neuroscience — Richard Semon's 1904 hypothesis that experiences leave physical traces (engrams) in the brain. We borrow the concept: each engram is a single learned fact, correction, preference, or behavioral pattern that an agent can recall and apply.

The engram model draws on three lines of cognitive science research: ACT-R's base-level learning equation (frequently and recently accessed items have higher activation), Ebbinghaus's forgetting curve (unreinforced memories decay over time), and Hebbian learning (items that activate together become associated). Each is operationalized in the sections that follow.

This is an open standard. The specification defines the data format and behavioral model — not a specific implementation. Any agent framework can implement engram-compatible memory. PLUR is one implementation; there should be others.

The benchmark results demonstrate that agents with engram memory outperform stateless agents 68% of the time, with the strongest gains in convention adherence (6/6 wins) and tool discoverability (6/7 wins). The engram specification explains the data structures and algorithms behind those results.

Engram schema

An engram is a YAML object with a fixed set of fields. Every field has a defined type, default value, and purpose. The schema is designed to be human-readable (YAML, not binary), individually addressable (unique ID per engram), and self-describing (each engram carries its own metadata).

# A complete engram example
- id: ENG-2026-0131-001
  version: 2
  status: active
  type: behavioral
  scope: agent:dip-preparer
  statement: |
    Validate org-mode syntax before writing to .org files.
    Three incidents of malformed entries in January 2026 caused
    data loss. Always parse and validate before write.
  rationale: "Three incidents of malformed entries in January 2026."
  confidence: 9
  tags: [org-mode, validation]
  domain: "gtd/org-mode"
  activation:
    retrieval_strength: 0.85
    storage_strength: 0.6
    frequency: 5
    last_accessed: 2026-01-30
  associations:
    - target: "ENG-2026-0302-001"
      strength: 0.8
      type: semantic
  metadata:
    created: 2026-01-31
    source: "user/personal"
    feedback_signals: {positive: 3, negative: 0, neutral: 2}

Core fields

| Field | Type | Description |
|-------|------|-------------|
| id | string | Unique identifier. Format: ENG-YYYY-MMDD-NNN |
| version | integer | Schema version (currently 2) |
| status | enum | active \| dormant \| retired \| candidate |
| type | enum | behavioral \| correction \| preference \| convention \| procedural \| factual |
| scope | string | Hierarchical namespace: agent:X \| command:X \| global \| space:X |
| statement | string | The learned knowledge, written as actionable guidance (25–60 words) |
| rationale | string | Why this engram exists — the incident or pattern that prompted it |
| confidence | 1–10 | How certain this pattern holds. Affects injection priority. Default: 5 |
| tags | string[] | Freeform tags for keyword matching during retrieval |
| domain | string | SKOS-style hierarchical domain (e.g., gtd/org-mode, dev/typescript) |

Engram types

The type field classifies the kind of knowledge an engram encodes. This affects how agents apply it and how it surfaces during retrieval.

| Type | Description | Example |
|------|-------------|---------|
| behavioral | How to act in a situation | "Validate org-mode syntax before writing" |
| correction | A mistake to avoid | "The API returns snake_case, not camelCase" |
| preference | User's preferred approach | "Use tabs not spaces in this project" |
| convention | System-wide rule or pattern | "Tags use :colon: format in org-mode, #hash in PKM" |
| procedural | Step-by-step process | "Deploy via git pull then systemctl restart" |
| factual | A specific fact to remember | "Nightshift server IP is 100.115.67.71" |

Activation fields

The activation block tracks how an engram decays and strengthens over time. These fields drive the ACT-R-inspired retrieval model described in the next section.

| Field | Type | Description |
|-------|------|-------------|
| retrieval_strength | 0.0–1.0 | How easily the engram can be recalled. Decays over time, increases on access. |
| storage_strength | 0.0–1.0 | How deeply encoded the engram is. Only increases (reinforcement, consolidation). |
| frequency | integer | Total number of times this engram has been accessed (injected, referenced). |
| last_accessed | date | When the engram was last injected into an agent session. |

Associations

Engrams are connected to each other via weighted, typed links. These links form the graph structure that enables spreading activation during retrieval — finding related engrams that weren't directly matched by keyword search.

associations:
  - target: "ENG-2026-0302-001"
    strength: 0.8        # [0, 0.95]
    type: semantic      # semantic | temporal | causal | co_accessed

Strength is capped at 0.95. The co_accessed type is learned automatically: when two engrams are injected in the same session, their association strengthens by +0.05 (created at 0.1 for new pairs). All co_accessed associations decay by 0.95x daily, pruned below 0.05 — giving a half-life of about 14 days.
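The co-access rules above are small enough to state directly. A minimal sketch in Python, holding associations as a dict keyed by engram-ID pairs (the storage layout here is illustrative, not normative):

```python
def update_co_access(assoc: dict, injected: list) -> None:
    """Strengthen co_accessed links for every pair injected in the same session.
    New pairs start at 0.1; existing pairs gain +0.05, capped at 0.95."""
    for i, a in enumerate(injected):
        for b in injected[i + 1:]:
            pair = (a, b)
            assoc[pair] = min(0.95, assoc.get(pair, 0.05) + 0.05)

def decay_co_access(assoc: dict) -> None:
    """Daily decay: multiply by 0.95 and prune links below 0.05."""
    for pair in list(assoc):
        assoc[pair] *= 0.95
        if assoc[pair] < 0.05:
            del assoc[pair]
```

Running the daily decay fourteen times multiplies a strength by 0.95^14 ≈ 0.49, which is where the roughly 14-day half-life comes from.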

Metadata & provenance

| Field | Type | Description |
|-------|------|-------------|
| created | date | When the engram was created |
| source | string | Origin identifier (e.g., user/personal) |
| feedback_signals | object | Counters: {positive, negative, neutral} |
| emotional_weight | 1–10 | Significance of this learning. Higher = slower decay. Default: 5 |
| trigger_context | string | What situation prompted this learning |
| derivation_count | integer | How many times this pattern was independently captured |

Engrams with high derivation_count are the most valuable — patterns independently re-derived across sessions represent convergent evolution. If three separate sessions captured the same insight, it is almost certainly important.

Activation model

The activation model determines which engrams are injected into an agent session and which are left dormant. It is inspired by ACT-R's base-level learning equation: memory items that are accessed frequently and recently have higher activation.

Retrieval strength decay

An engram's retrieval_strength decays over time based on days since last access. This implements Ebbinghaus's forgetting curve — without reinforcement, memories fade.

The emotional_weight field modifies the decay rate. Painful or significant lessons persist longer:

effective_decay = base_decay × (1 - emotional_weight / 20)

An engram with emotional_weight = 10 decays at half the normal rate. An engram with the default weight of 5 decays at 75% of the base rate.
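A worked version of the decay modifier (the base decay constant itself is implementation-defined; the values used below are examples only):

```python
def effective_decay(base_decay: float, emotional_weight: int = 5) -> float:
    """Scale the base decay rate by emotional_weight (1-10).
    Weight 10 halves decay; the default weight of 5 gives 75% of base."""
    return base_decay * (1 - emotional_weight / 20)
```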

Activation thresholds

Retrieval strength determines an engram's lifecycle status and whether it gets injected into agent context:

| Retrieval strength | Status | Effect |
|--------------------|--------|--------|
| > 0.5 | Active | Injected into agent sessions when relevant |
| 0.3 – 0.5 | Fading | Injected only if context budget allows |
| 0.1 – 0.3 | Dormant | Retained in storage, not injected |
| < 0.1 | Retirement candidate | Flagged for review — retire or reinforce |
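The thresholds translate directly into a status function. A sketch, assuming inclusive lower bounds (the table leaves exact boundary handling to implementors):

```python
def lifecycle_status(retrieval_strength: float) -> str:
    """Map retrieval strength to lifecycle status per the threshold table."""
    if retrieval_strength > 0.5:
        return "active"
    if retrieval_strength >= 0.3:
        return "fading"
    if retrieval_strength >= 0.1:
        return "dormant"
    return "retirement_candidate"
```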

Reinforcement

When an engram is accessed — injected into an agent session, referenced during review, or explicitly reinforced by the user — its retrieval_strength increases and frequency increments. The storage_strength only increases, never decreases, representing the depth of encoding.
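A sketch of the reinforcement update. The increment sizes here are illustrative assumptions, not normative: the specification requires only that retrieval_strength rises on access, frequency increments, and storage_strength never decreases.

```python
def reinforce(engram: dict, boost: float = 0.1) -> None:
    """Reinforce an engram on access. Boost sizes are illustrative."""
    act = engram["activation"]
    act["retrieval_strength"] = min(1.0, act["retrieval_strength"] + boost)
    # storage_strength is monotonically non-decreasing (depth of encoding)
    act["storage_strength"] = min(1.0, act["storage_strength"] + boost / 2)
    act["frequency"] += 1
```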

Reconsolidation

When a new learning contradicts an existing engram, both surface as a pair during review. The user resolves the contradiction — keep one, merge them, or mark both as valid in different contexts. The survivor gains consolidated: true and a 2x storage_strength boost, making it more resistant to future decay.

Forgetting is not failure — it is curation. An unbounded memory that retains everything becomes a search problem worse than having no memory at all. The activation model ensures that frequently useful knowledge stays accessible while irrelevant knowledge gracefully fades.

Feedback loop

Engrams improve through use. Every session generates feedback signals that train the system on what to inject next time. This is the core learning loop that makes memory better over time, not just bigger.

Session lifecycle

1. session.start(task_description)
   # Agent describes what it's working on
   # System injects relevant engrams based on task + scope
2. work
   # Agent performs tasks with engram context loaded
   # New patterns captured as raw observations
3. feedback(engram_id, signal)
   # Agent or user rates injected engrams: positive | negative | neutral
   # Positive: engram was useful for this task
   # Negative: engram was irrelevant or misleading
4. learn(statement, type, scope)
   # Capture new knowledge discovered during work
   # Creates candidate engram for review
5. session.end(summary)
   # Co-access associations updated (Hebbian learning)
   # Decay recalculated for all engrams

Relevance scoring

Feedback signals directly affect future injection priority. An engram with consistently positive feedback scores higher during retrieval. An engram with negative feedback is deprioritized — even if its retrieval strength is high.

| Signal | Effect on retrieval | When to use |
|--------|---------------------|-------------|
| Positive | Boost | Engram directly helped with the task |
| Neutral | No change | Engram was present but neither helped nor hurt |
| Negative | Penalize | Engram was irrelevant or misleading for this context |

Promotion pipeline

Not every observation becomes an engram. Raw patterns pass through a quality-gate pipeline before promotion.

The expected pass rate is 30–50% of raw patterns. If the rate is consistently above 50%, the gates are too loose; if it is below 30%, they are too strict.

Search pipeline

When an agent starts a session, the system must find the most relevant engrams from potentially thousands of candidates. The search pipeline combines keyword matching, semantic similarity, and graph traversal to surface the right knowledge.

Enriched schema text

Before indexing, each engram is expanded into a searchable text representation that includes fields beyond the statement:

# What gets indexed for search
searchable_text =
  statement
  + entities (extracted from statement)
  + temporal validity markers
  + rationale
  + tags (space-separated)
  + domain
  + knowledge anchor snippets

This enrichment means a search for "trading" will find engrams whose statement doesn't mention trading but whose domain is trading/risk or whose anchor snippet references a trading journal entry.
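The enrichment step is plain concatenation once the pieces exist. A sketch, assuming entity extraction and anchor-snippet lookup happen upstream (temporal validity markers are omitted for brevity):

```python
def searchable_text(engram: dict, entities=(), anchors=()) -> str:
    """Build the enriched index text for one engram. `entities` and
    `anchors` are assumed to come from upstream extraction steps."""
    parts = [
        engram.get("statement", ""),
        " ".join(entities),
        engram.get("rationale", ""),
        " ".join(engram.get("tags", [])),
        engram.get("domain", ""),
        " ".join(anchors),
    ]
    return " ".join(p for p in parts if p)
```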

Hybrid retrieval: BM25 + embedding + RRF

The search pipeline uses two retrieval strategies and fuses their results:

BM25 (keyword)

Classic term-frequency matching via SQLite FTS5. Fast, exact, and good at matching specific identifiers like server names or tool names. Zero external dependencies.

Embedding (semantic)

Dense vector search using BGE embeddings. Finds conceptually related engrams even when keywords don't overlap. "Deploy to production" matches "push to live server."

Results from both strategies are fused using Reciprocal Rank Fusion (RRF) — a parameter-free method that combines ranked lists without needing score normalization. Each result's final score is:

RRF(d) = Σ_i 1 / (k + rank_i(d))

Where k is a constant (typically 60) and rank_i(d) is the document's position in retrieval strategy i's ranked results. Documents that rank highly in both keyword and semantic search surface to the top.
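RRF is a few lines in practice. A sketch that fuses any number of ranked ID lists:

```python
def rrf_fuse(rankings: list, k: int = 60) -> list:
    """Reciprocal Rank Fusion: score(d) = sum over lists of 1 / (k + rank),
    with rank starting at 1. Parameter-free apart from k (typically 60)."""
    scores = {}
    for ranked in rankings:
        for rank, doc in enumerate(ranked, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```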

This is a zero-cost hybrid search. BM25 runs on SQLite (already present for storage). Embeddings are generated locally using a small model. No external API calls, no cloud dependencies, no per-query costs. The entire search pipeline runs on the user's machine.

Injection scoring

After retrieval, engrams are scored for injection priority using a composite formula:

Score = keyword_match       # [0, 10] normalized relevance
      + anchor_boost        # [0, 2] keyword overlap with anchor snippets
      + schema_boost        # [0, 2] added when other members of the same schema are active

The top-scoring engrams are selected as directives (up to 10, within a token budget). Next-best become consider items (up to 5). Finally, a spreading activation pass traverses the association graph to discover related engrams not matched by search:

spread_score = (source_score / max_first_pass) × association_strength

This adds up to 3 more engrams from the association graph. Combined ceiling: 18 engrams per injection (10 directives + 5 consider + 3 spreading).
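The spreading pass is a one-hop traversal from first-pass hits. A sketch, with associations given as (source, target, strength) triples (the data layout is illustrative):

```python
def spread(first_pass: dict, assoc: list, limit: int = 3) -> list:
    """Score neighbors of first-pass hits by
    (source_score / max_first_pass) * association_strength,
    keeping the top `limit` engrams not already retrieved."""
    if not first_pass:
        return []
    max_score = max(first_pass.values())
    candidates = {}
    for source, target, strength in assoc:
        if source in first_pass and target not in first_pass:
            s = (first_pass[source] / max_score) * strength
            candidates[target] = max(candidates.get(target, 0.0), s)
    return sorted(candidates, key=candidates.get, reverse=True)[:limit]
```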

Implementation notes

The engram specification is deliberately storage-agnostic, but we document the reference implementation choices for implementors.

Storage format

Engrams are stored as YAML arrays in flat files. One file per scope or space. YAML was chosen over SQLite or JSON for three reasons: human-readability (engrams should be editable by hand), git-friendliness (meaningful diffs), and simplicity (no database driver required).

# File: .plur/engrams.yaml (or .datacore/learning/engrams.yaml)
- id: ENG-2026-0131-001
  statement: "Validate org-mode syntax before writing..."
  # ... full engram fields
- id: ENG-2026-0201-001
  statement: "Always rsync --dry-run first..."
  # ...

Search index

The YAML files are the source of truth. A SQLite database serves as the search index — rebuilt from YAML on demand. The index contains FTS5 tables for BM25 search and a vector table for embedding search. If the index is lost or corrupted, it is regenerated from the YAML files with zero data loss.
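A sketch of the rebuild step for the BM25 side, using Python's built-in sqlite3 module (the column layout is illustrative, not normative; FTS5 must be compiled into the SQLite build, as it is in most Python distributions):

```python
import sqlite3

def rebuild_index(engrams: list, db_path: str = ":memory:") -> sqlite3.Connection:
    """Rebuild the FTS5 search index from engrams loaded out of YAML.
    The YAML files remain the source of truth; this index is disposable."""
    con = sqlite3.connect(db_path)
    con.execute("DROP TABLE IF EXISTS engram_fts")
    con.execute("CREATE VIRTUAL TABLE engram_fts USING fts5(id, statement, tags, domain)")
    con.executemany(
        "INSERT INTO engram_fts VALUES (?, ?, ?, ?)",
        [(e["id"], e["statement"], " ".join(e.get("tags", [])), e.get("domain", ""))
         for e in engrams],
    )
    con.commit()
    return con
```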

Tool interface

Implementations expose engram operations through whatever tool protocol their agent framework supports. The reference implementation uses MCP (Model Context Protocol). The core operations:

| Operation | Description |
|-----------|-------------|
| plur.learn | Create a candidate engram from a statement, type, and scope |
| plur.recall | Search engrams by keyword or semantic similarity |
| plur.inject | Context-aware injection at session start (returns directives + consider) |
| plur.feedback | Rate an injected engram: positive, negative, or neutral |
| plur.forget | Retire an engram (sets status to retired) |
| plur.session.start | Begin a session — describes task, triggers injection |
| plur.session.end | End a session — updates co-access associations, runs decay |

Minimum viable implementation

An implementation is engram-compatible if it supports the four parts this document specifies: the core schema fields, the activation model (decay and reinforcement), the feedback loop, and a search pipeline for retrieval at session start.

Everything else — associations, spreading activation, co-access learning, schema emergence — is optional. Start minimal. Add complexity when the simple version stops being sufficient.