
Why Aegis Memory?

The Agent Memory Problem

Building production AI agents reveals a harsh truth: memory is the bottleneck.

What Happens Without Proper Memory

  • Expensive, finite context. LLM context windows are costly: a full 128K context window runs roughly $0.50 per call with GPT-4, which becomes prohibitive at scale. Result: developers truncate context, and agents forget important details.
  • Lost state on reset. When a context window resets (timeout, crash, new session), all learned context is lost. Result: multi-hour agent tasks restart from zero, and users repeat themselves endlessly.
  • No shared memory. When multiple agents work together, they have no shared memory; the planner can’t tell the executor what it learned. Result: agents duplicate work, contradict each other, or drop tasks.
  • No learning loop. Agents make the same mistakes repeatedly because there’s no mechanism to remember what worked. Result: error patterns repeat, and good strategies aren’t reused.

The Current Landscape

What Others Offer

| Solution | Approach | Limitation |
| --- | --- | --- |
| Vector DBs (Pinecone, Weaviate) | Store embeddings | No agent coordination, no structure |
| Mem0 | Personal AI memory | Single-agent focused, no ACE patterns |
| Rolling Context | Keep recent N messages | Loses important old context |
| RAG | Retrieve documents | Documents aren’t agent memories |

What’s Missing

None of these solve the agent-native requirements:
  1. Scoped Access Control - Private vs shared vs global memories
  2. Effectiveness Tracking - Which memories actually help?
  3. Session Continuity - Resume work after context resets
  4. Structured Coordination - Handoffs between agents
  5. Self-Improvement - Agents that learn from outcomes

The Aegis Approach

ACE Patterns (Agentic Context Engineering)

Based on research from Stanford/SambaNova and Anthropic, we implement patterns that make agents actually useful:

Memory Voting

Agents vote on memory usefulness. Query only effective strategies.
client.vote(memory_id, "helpful", context="Worked!")
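The idea behind voting can be sketched with a minimal in-memory stand-in (the `MemoryStore` class and its method names are hypothetical illustrations of the pattern, not the Aegis API):

```python
class MemoryStore:
    """Toy in-memory model of vote-based effectiveness tracking."""

    def __init__(self):
        self.memories = {}  # memory_id -> {"text", "helpful", "unhelpful"}

    def add(self, memory_id, text):
        self.memories[memory_id] = {"text": text, "helpful": 0, "unhelpful": 0}

    def vote(self, memory_id, verdict):
        # Each agent's vote nudges the memory's effectiveness score.
        self.memories[memory_id][verdict] += 1

    def effectiveness(self, memory_id):
        m = self.memories[memory_id]
        total = m["helpful"] + m["unhelpful"]
        return m["helpful"] / total if total else 0.0


store = MemoryStore()
store.add("m1", "Use cursor-based pagination for large tables")
store.vote("m1", "helpful")
store.vote("m1", "helpful")
store.vote("m1", "unhelpful")
print(round(store.effectiveness("m1"), 2))  # → 0.67
```

Queries can then rank or filter memories by this score, so agents retrieve only strategies that have actually helped.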

Session Progress

Track completed/in-progress/blocked items across sessions.
client.update_session("build-api",
  completed=["auth"], in_progress="endpoints")
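The merge semantics behind a call like this can be sketched as follows (a minimal stand-in with an in-memory `SESSIONS` dict in place of real persistent storage; the function name mirrors the call above but the implementation is illustrative):

```python
SESSIONS = {}  # session name -> progress state (stand-in for persistent storage)

def update_session(name, completed=None, in_progress=None, blocked=None):
    """Merge a progress update so a later session can resume where this one stopped."""
    state = SESSIONS.setdefault(
        name, {"completed": [], "in_progress": None, "blocked": []}
    )
    for item in completed or []:
        if item not in state["completed"]:
            state["completed"].append(item)
    if in_progress is not None:
        state["in_progress"] = in_progress
    for item in blocked or []:
        if item not in state["blocked"]:
            state["blocked"].append(item)
    return state

# First session makes progress, then the context window resets…
update_session("build-api", completed=["auth"], in_progress="endpoints")
# …and a fresh session resumes from the stored state instead of starting over.
resumed = update_session("build-api", completed=["endpoints"], in_progress="tests")
print(resumed["completed"])  # → ['auth', 'endpoints']
```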

Reflections

Store lessons learned from failures as global knowledge.
client.add_reflection("Always validate input types",
  error_pattern="TypeError in API calls")
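Conceptually, reflections are a global index from error patterns to lessons. A minimal sketch (the matching heuristic and helper names here are hypothetical, not the Aegis implementation):

```python
REFLECTIONS = {}  # error pattern -> lessons; global scope, visible to every agent

def add_reflection(lesson, error_pattern):
    """File a lesson learned under the error pattern that triggered it."""
    REFLECTIONS.setdefault(error_pattern, []).append(lesson)

def lessons_for(error_message):
    """Before retrying, surface lessons whose pattern matches the new error."""
    return [
        lesson
        for pattern, lessons in REFLECTIONS.items()
        if pattern.split()[0] in error_message  # crude match on the error class
        for lesson in lessons
    ]

add_reflection("Always validate input types", error_pattern="TypeError in API calls")
print(lessons_for("TypeError: expected str, got int"))  # → ['Always validate input types']
```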

Playbooks

Query proven strategies before starting tasks.
strategies = client.query_playbook("pagination",
  min_effectiveness=0.5)
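The filter-and-rank behavior of a playbook query can be sketched like this (the strategy dicts and function body are an illustrative stand-in, assuming the effectiveness scores produced by voting above):

```python
def query_playbook(strategies, topic, min_effectiveness=0.5):
    """Return strategies tagged with `topic` that clear the effectiveness bar, best first."""
    hits = [
        s for s in strategies
        if topic in s["tags"] and s["effectiveness"] >= min_effectiveness
    ]
    return sorted(hits, key=lambda s: s["effectiveness"], reverse=True)

playbook = [
    {"text": "Use keyset (cursor) pagination", "tags": ["pagination"], "effectiveness": 0.9},
    {"text": "OFFSET/LIMIT on every page", "tags": ["pagination"], "effectiveness": 0.3},
    {"text": "Retry with exponential backoff", "tags": ["http"], "effectiveness": 0.8},
]

for s in query_playbook(playbook, "pagination", min_effectiveness=0.5):
    print(s["text"])  # only the 0.9 strategy clears the bar
```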

Three-Tier Scoping

┌─────────────────────────────────────────┐
│              GLOBAL                      │
│  (Company-wide: style guides, patterns) │
├─────────────────────────────────────────┤
│           AGENT-SHARED                   │
│  (Team-level: project context)           │
├──────────────────┬──────────────────────┤
│  AGENT-PRIVATE   │   AGENT-PRIVATE      │
│  (Planner only)  │   (Executor only)    │
└──────────────────┴──────────────────────┘
  • agent-private: Only the creating agent can see it
  • agent-shared: Explicitly shared with specific agents
  • global: All agents can access (best practices, company knowledge)
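The three tiers boil down to a simple visibility rule. A minimal sketch (the `can_read` function and memory-dict shape are hypothetical, for illustration only):

```python
def can_read(memory, agent_id):
    """Three-tier visibility: global > agent-shared > agent-private."""
    scope = memory["scope"]
    if scope == "global":
        return True  # best practices, company knowledge: everyone reads
    if scope == "agent-shared":
        return agent_id == memory["owner"] or agent_id in memory.get("shared_with", [])
    return agent_id == memory["owner"]  # agent-private: creator only

plan = {"scope": "agent-private", "owner": "planner"}
handoff = {"scope": "agent-shared", "owner": "planner", "shared_with": ["executor"]}
style = {"scope": "global", "owner": "admin"}

assert can_read(plan, "planner") and not can_read(plan, "executor")
assert can_read(handoff, "executor") and not can_read(handoff, "reviewer")
assert all(can_read(style, a) for a in ["planner", "executor", "reviewer"])
```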

Performance at Scale

| Operation | Aegis | Typical Vector DB |
| --- | --- | --- |
| Query 1M memories | 30-80ms | 100-500ms |
| Semantic dedup | 1ms | 50-200ms |
| Batch insert (50 items) | 300ms | 2-5s |
Powered by PostgreSQL + pgvector with HNSW indexing.

When to Use Aegis

  • Multi-agent systems (CrewAI, LangGraph teams)
  • Long-running agent tasks that span sessions
  • Agents that need to learn from past interactions
  • User-facing bots that should remember preferences
  • Self-hosted requirements (data sovereignty)

When NOT to Use Aegis

  • Simple single-turn chatbots (just use context window)
  • Document Q&A (use RAG instead)
  • You need sub-10ms latency (we’re 30-80ms)

Next Steps