
Production Deployment Guide

Production deployment requires significant infrastructure expertise. This guide covers the essentials, but your team should be comfortable with Docker, PostgreSQL operations, and Kubernetes (if using).

Architecture Overview

┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   Agents    │────▶│  Aegis API  │────▶│ PostgreSQL  │
│  (Clients)  │     │  (FastAPI)  │     │ + pgvector  │
└─────────────┘     └──────┬──────┘     └─────────────┘

                    ┌──────▼──────┐
                    │    Redis    │
                    │  (Optional) │
                    └─────────────┘

Deployment Options

Docker Compose

Suitable for: Small teams, single-server deployments

Production docker-compose.yml

version: '3.8'

services:
  aegis-api:
    image: ghcr.io/quantifylabs/aegis-memory:latest
    ports:
      - "8000:8000"
    environment:
      - DATABASE_URL=postgresql+asyncpg://aegis:${DB_PASSWORD}@postgres:5432/aegis
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - AEGIS_API_KEY=${AEGIS_API_KEY}
      - RATE_LIMIT_PER_MINUTE=100
    depends_on:
      postgres:
        condition: service_healthy
    restart: unless-stopped
    deploy:
      resources:
        limits:
          memory: 1G

  postgres:
    image: pgvector/pgvector:pg16
    environment:
      - POSTGRES_USER=aegis
      - POSTGRES_PASSWORD=${DB_PASSWORD}
      - POSTGRES_DB=aegis
    volumes:
      - postgres_data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U aegis"]
      interval: 5s
      timeout: 5s
      retries: 5
    restart: unless-stopped

volumes:
  postgres_data:

Required Environment Variables

Create a .env file. Note that docker-compose does not execute shell substitutions like $(...) inside .env files, so generate the values in your shell first and write the results:
# Generate secure values and append them to .env
echo "DB_PASSWORD=$(openssl rand -base64 32)" >> .env
echo "AEGIS_API_KEY=$(openssl rand -base64 32)" >> .env
echo "OPENAI_API_KEY=sk-your-production-key" >> .env
Security: Never commit .env files. Use a secrets manager in production.

Production Checklist

Database

1. Enable Connection Pooling

Configure DB_POOL_SIZE and DB_MAX_OVERFLOW:
DB_POOL_SIZE=20
DB_MAX_OVERFLOW=10
For high-traffic deployments, consider PgBouncer.
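
The pool settings interact with Postgres's max_connections: each API instance can open up to DB_POOL_SIZE + DB_MAX_OVERFLOW connections. A quick back-of-envelope check (the helper is illustrative, not part of Aegis):

```python
def max_client_connections(pool_size: int, max_overflow: int, instances: int) -> int:
    """Worst-case connections opened by all API instances combined."""
    return (pool_size + max_overflow) * instances

# Three instances with the settings above: (20 + 10) * 3 = 90 connections,
# which must stay below Postgres's max_connections (default: 100).
print(max_client_connections(20, 10, 3))
```

If this number approaches max_connections, that is the point where PgBouncer earns its keep.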
2. Set Up Backups

Schedule regular pg_dump backups:
# Daily backup cron job
0 2 * * * pg_dump -U aegis aegis | gzip > /backups/aegis-$(date +\%Y\%m\%d).sql.gz
Test your restore procedure regularly!
3. Configure Read Replicas (Optional)

For read-heavy workloads, add read replicas:
DATABASE_READ_REPLICA_URL=postgresql+asyncpg://aegis:pass@replica:5432/aegis
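
How reads actually get routed depends on the application layer. As an illustration of the idea only (a hypothetical router, not Aegis code), read-only statements go to the replica whenever DATABASE_READ_REPLICA_URL is set:

```python
import os

def pick_database_url(statement: str, environ=os.environ) -> str:
    """Route read-only statements to the replica when one is configured."""
    primary = environ["DATABASE_URL"]
    replica = environ.get("DATABASE_READ_REPLICA_URL")
    is_read = statement.lstrip().lower().startswith("select")
    return replica if (replica and is_read) else primary
```

Writes must always hit the primary; remember that replicas lag, so a read issued immediately after a write may not see it.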
4. Monitor Query Performance

Enable slow query logging:
ALTER SYSTEM SET log_min_duration_statement = 100;
SELECT pg_reload_conf();

Security

1. Rotate API Keys

Generate strong API keys and rotate regularly:
# Generate new key
NEW_KEY=$(openssl rand -base64 32)

# Update environment
# Restart service
# Update all clients
You must update all agent clients when rotating keys. There’s no built-in key rotation mechanism.
2. Enable TLS

Always use HTTPS in production. Configure your reverse proxy (nginx, Traefik, etc.):
server {
    listen 443 ssl;
    server_name aegis.yourdomain.com;

    ssl_certificate /etc/ssl/certs/aegis.crt;
    ssl_certificate_key /etc/ssl/private/aegis.key;

    location / {
        proxy_pass http://localhost:8000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
3. Network Isolation

  • Place PostgreSQL in a private subnet
  • Use VPC security groups to restrict access
  • Never expose PostgreSQL to the internet
4. Secrets Management

Use a secrets manager instead of environment files:
  • AWS Secrets Manager
  • HashiCorp Vault
  • GCP Secret Manager
You’ll need to write custom code to fetch secrets at startup.
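
That startup code usually boils down to: fetch a JSON secret payload, then load its keys into the environment before the app reads them. A sketch of the loading half (with boto3, the payload would come from client("secretsmanager").get_secret_value(SecretId=...)["SecretString"]; here the fetch is left to the caller so the helper stays backend-agnostic):

```python
import json
import os

def load_secret_into_env(secret_json: str, environ=os.environ) -> list[str]:
    """Copy keys from a secrets-manager JSON payload into the environment.

    Returns the sorted key names that were loaded, for logging (never log
    the values themselves).
    """
    secrets = json.loads(secret_json)
    for key, value in secrets.items():
        environ[key] = str(value)
    return sorted(secrets)
```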

Monitoring

1. Health Checks

Monitor the /health endpoint:
curl https://aegis.yourdomain.com/health
Set up alerts for non-200 responses.
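
If you do not already have an uptime monitor, even a tiny stdlib poller covers the basics. An illustrative sketch (any non-200 status or connection failure counts as unhealthy):

```python
import urllib.request

def check_health(url: str, timeout: float = 5.0) -> bool:
    """Return True only for an HTTP 200 response from the health endpoint."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Covers connection refused, DNS failure, timeouts, and HTTP errors.
        return False

# Example (requires a running instance):
# check_health("https://aegis.yourdomain.com/health")
```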
2. Prometheus Metrics

Scrape /metrics endpoint:
# prometheus.yml
scrape_configs:
  - job_name: 'aegis'
    static_configs:
      - targets: ['aegis:8000']
Available metrics:
  • aegis_memory_add_total
  • aegis_memory_query_total
  • aegis_memory_query_latency_seconds
3. Log Aggregation

Ship logs to your log aggregation system:
# Example: Docker with Loki
docker run -d \
  --log-driver=loki \
  --log-opt loki-url="http://loki:3100/loki/api/v1/push" \
  aegis-api
4. Alerting

Set up alerts for:
  • API error rate > 1%
  • Query latency p99 > 500ms
  • Database connections > 80% of pool
  • Disk usage > 80%
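
These thresholds would normally live in your alerting stack (Prometheus alert rules, CloudWatch alarms, etc.). As a minimal illustration of the logic, a hypothetical checker over current readings:

```python
def breached_alerts(metrics: dict) -> list[str]:
    """Return the names of readings that exceed the thresholds above."""
    thresholds = {
        "error_rate": 0.01,     # API error rate > 1%
        "p99_latency_ms": 500,  # query latency p99 > 500 ms
        "db_pool_used": 0.80,   # connections > 80% of pool
        "disk_used": 0.80,      # disk usage > 80%
    }
    return [name for name, limit in thresholds.items()
            if metrics.get(name, 0) > limit]
```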

Scaling

The Aegis API is stateless and can be horizontally scaled:
# docker-compose scale
docker-compose up -d --scale aegis-api=3
Use a load balancer (nginx, HAProxy, cloud LB) in front.
PostgreSQL with pgvector benefits from:
  • More RAM (for HNSW index caching)
  • Faster SSDs
  • More CPU cores for parallel queries
Start with 4GB RAM, 2 vCPUs minimum.
For large memory counts (1M+), tune the HNSW index:
-- ef_construction is an index *build* parameter, not a runtime setting;
-- higher values improve recall at the cost of slower index builds.
-- (Table and column names here are illustrative.)
CREATE INDEX ON memories USING hnsw (embedding vector_cosine_ops)
  WITH (m = 16, ef_construction = 128);

-- Increase ef_search for better query accuracy (slower queries)
SET hnsw.ef_search = 64;
See pgvector tuning guide for details.

Disaster Recovery

Backup Strategy

#!/bin/bash
# backup.sh - Run daily via cron

BACKUP_DIR=/backups
DATE=$(date +%Y%m%d_%H%M%S)

# Full database backup
pg_dump -U aegis -Fc aegis > $BACKUP_DIR/aegis_$DATE.dump

# Upload to S3 (or your cloud storage)
aws s3 cp $BACKUP_DIR/aegis_$DATE.dump s3://your-bucket/backups/

# Retain 30 days locally
find $BACKUP_DIR -name "*.dump" -mtime +30 -delete

Restore Procedure

# 1. Stop the API
docker-compose stop aegis-api

# 2. Restore database
pg_restore -U aegis -d aegis --clean /backups/aegis_20250115.dump

# 3. Restart API
docker-compose start aegis-api

# 4. Verify
curl http://localhost:8000/health

Cost Estimation

| Component         | Small (Dev) | Medium (Startup) | Large (Enterprise) |
|-------------------|-------------|------------------|--------------------|
| Compute           | $20/mo      | $100/mo          | $500+/mo           |
| PostgreSQL        | $15/mo      | $100/mo          | $500+/mo           |
| OpenAI Embeddings | $10/mo      | $50/mo           | $200+/mo           |
| Total             | ~$45/mo     | ~$250/mo         | ~$1,200+/mo        |
Costs scale primarily with:
  • Number of memories stored (database size)
  • Query volume (compute + embeddings)
  • Embedding calls (OpenAI API costs)
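
Those drivers can be folded into a rough estimator. The per-unit rates below are placeholders chosen to illustrate the shape of the calculation, not quoted prices; substitute your provider's actual rates:

```python
def estimate_monthly_cost(memories: int, queries_per_month: int,
                          compute_base: float = 20.0,
                          storage_per_million: float = 15.0,
                          embedding_per_thousand_queries: float = 0.05) -> float:
    """Very rough monthly cost: fixed compute + storage by memory count
    + embedding API calls driven by query volume."""
    storage = storage_per_million * (memories / 1_000_000)
    embeddings = embedding_per_thousand_queries * (queries_per_month / 1_000)
    return round(compute_base + storage + embeddings, 2)

# 1M memories, 100k queries/month with the placeholder rates:
print(estimate_monthly_cost(1_000_000, 100_000))
```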

Need Help?

Production deployment requires careful planning; if you're facing challenges, a managed option is on the way.
Coming Soon: We're working on a managed platform that handles all of this for you. Join the waitlist to be notified when it launches.