Production Deployment Guide
Production deployment requires significant infrastructure expertise. This guide covers the essentials, but your team should be comfortable with Docker, PostgreSQL operations, and, if you deploy on it, Kubernetes.
Architecture Overview
┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   Agents    │────▶│  Aegis API  │────▶│ PostgreSQL  │
│  (Clients)  │     │  (FastAPI)  │     │ + pgvector  │
└─────────────┘     └──────┬──────┘     └─────────────┘
                           │
                    ┌──────▼──────┐
                    │    Redis    │
                    │  (Optional) │
                    └─────────────┘
Deployment Options
Docker Compose
Kubernetes
Cloud VMs
Docker Compose
Suitable for: Small teams, single-server deployments
Production docker-compose.yml
version: '3.8'
services:
  aegis-api:
    image: ghcr.io/quantifylabs/aegis-memory:latest
    ports:
      - "8000:8000"
    environment:
      - DATABASE_URL=postgresql+asyncpg://aegis:${DB_PASSWORD}@postgres:5432/aegis
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - AEGIS_API_KEY=${AEGIS_API_KEY}
      - RATE_LIMIT_PER_MINUTE=100
    depends_on:
      postgres:
        condition: service_healthy
    restart: unless-stopped
    deploy:
      resources:
        limits:
          memory: 1G
  postgres:
    image: pgvector/pgvector:pg16
    environment:
      - POSTGRES_USER=aegis
      - POSTGRES_PASSWORD=${DB_PASSWORD}
      - POSTGRES_DB=aegis
    volumes:
      - postgres_data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U aegis"]
      interval: 5s
      timeout: 5s
      retries: 5
    restart: unless-stopped
volumes:
  postgres_data:
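Required Environment Variables
Create a .env file. Docker Compose reads .env values literally and does not run command substitution, so generate the values in a shell and write the results to the file. Hex output keeps DB_PASSWORD free of characters such as / and + that may need URL-encoding inside DATABASE_URL:
# Generate secure values!
echo "DB_PASSWORD=$(openssl rand -hex 32)" >> .env
echo "AEGIS_API_KEY=$(openssl rand -base64 32)" >> .env
echo "OPENAI_API_KEY=sk-your-production-key" >> .env
chmod 600 .env
Security: Never commit .env files. Use a secrets manager in production.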
Kubernetes
Suitable for: Teams needing auto-scaling, high availability
Basic Kubernetes Manifests
You’ll need to create:
Deployment for Aegis API
StatefulSet for PostgreSQL (or use managed PostgreSQL)
Services for internal communication
Ingress for external access
Secrets for credentials
PersistentVolumeClaims for data
Full Kubernetes manifests are beyond this guide’s scope. See our examples repository for templates.
Recommended: Managed PostgreSQL
For production Kubernetes deployments, use managed PostgreSQL:
AWS: RDS for PostgreSQL with pgvector
GCP: Cloud SQL for PostgreSQL
Azure: Azure Database for PostgreSQL
# Example: Using AWS RDS
apiVersion: v1
kind: Secret
metadata:
  name: aegis-db-credentials
stringData:
  DATABASE_URL: postgresql+asyncpg://aegis:PASSWORD@your-rds-endpoint:5432/aegis
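Cloud VMs
Suitable for: Simple cloud deployments without containers
Ubuntu/Debian Setup
# 1. Install PostgreSQL 16 with pgvector
# (if your release does not ship these packages, add the PostgreSQL APT repository first)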
sudo apt update
sudo apt install -y postgresql-16 postgresql-16-pgvector
# 2. Configure PostgreSQL
sudo -u postgres psql << EOF
CREATE USER aegis WITH PASSWORD 'your-secure-password';
CREATE DATABASE aegis OWNER aegis;
\c aegis
CREATE EXTENSION vector;
EOF
# 3. Install Python and Aegis
sudo apt install -y python3.11 python3.11-venv
sudo python3.11 -m venv /opt/aegis
sudo /opt/aegis/bin/pip install 'aegis-memory[server]'
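# 4. Create the service user and systemd service
# The unit below runs as "aegis" and reads /etc/aegis/environment, so create the user
# and populate that file (DATABASE_URL, OPENAI_API_KEY, AEGIS_API_KEY) before starting
sudo useradd --system --no-create-home --shell /usr/sbin/nologin aegis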
sudo tee /etc/systemd/system/aegis.service << EOF
[Unit]
Description=Aegis Memory API
After=postgresql.service
[Service]
Type=simple
User=aegis
WorkingDirectory=/opt/aegis
ExecStart=/opt/aegis/bin/uvicorn server.main:app --host 0.0.0.0 --port 8000
Restart=always
EnvironmentFile=/etc/aegis/environment
[Install]
WantedBy=multi-user.target
EOF
# 5. Start service
sudo systemctl enable aegis
sudo systemctl start aegis
Production Checklist
Database
Enable Connection Pooling
Configure DB_POOL_SIZE and DB_MAX_OVERFLOW:
DB_POOL_SIZE=20
DB_MAX_OVERFLOW=10
For high-traffic deployments, consider PgBouncer.
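As a rough sizing check for the pool settings above: assuming they behave like a SQLAlchemy-style pool (each API process may open up to pool size plus overflow connections), the combined worst case should stay below PostgreSQL's max_connections, which defaults to 100 with 3 connections reserved for superusers. The helper names below are illustrative and not part of Aegis.

```python
def max_db_connections(processes: int, pool_size: int = 20, max_overflow: int = 10) -> int:
    """Worst-case number of connections opened by all API processes combined."""
    return processes * (pool_size + max_overflow)


def fits_postgres(processes: int, max_connections: int = 100, reserved: int = 3, **pool: int) -> bool:
    """True if the worst case still leaves the reserved superuser connections free."""
    return max_db_connections(processes, **pool) <= max_connections - reserved


if __name__ == "__main__":
    # Compare a few replica counts against the PostgreSQL defaults.
    for n in (1, 3, 4):
        print(n, max_db_connections(n), fits_postgres(n))
```

With the values shown, three API processes can reach 90 connections, which fits; a fourth pushes the worst case to 120. At that point, raise max_connections, shrink the pool, or put PgBouncer in front.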
Set Up Backups
Schedule regular pg_dump backups:
# Daily backup cron job
0 2 * * * pg_dump -U aegis aegis | gzip > /backups/aegis-$(date +\%Y\%m\%d).sql.gz
Test your restore procedure regularly!
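A freshness check can complement restore tests, though it does not replace them, by catching a cron job that has silently stopped. The sketch below assumes the /backups directory and the aegis-YYYYMMDD.sql.gz naming from the cron example. The function names and the 26-hour threshold are illustrative choices.

```python
import time
from pathlib import Path
from typing import Optional


def newest_backup(directory: str, pattern: str = "aegis-*.sql.gz") -> Optional[Path]:
    """Return the most recently modified backup file, or None if there is none."""
    files = list(Path(directory).glob(pattern))
    return max(files, key=lambda p: p.stat().st_mtime, default=None)


def backup_is_fresh(directory: str, max_age_hours: float = 26.0,
                    pattern: str = "aegis-*.sql.gz") -> bool:
    """True if the newest backup is non-empty and younger than max_age_hours."""
    latest = newest_backup(directory, pattern)
    if latest is None or latest.stat().st_size == 0:
        return False
    age_hours = (time.time() - latest.stat().st_mtime) / 3600
    return age_hours <= max_age_hours
```

Calling backup_is_fresh("/backups") from a monitoring job and alerting on False gives early warning. A full restore into a scratch database is still the only real proof that a backup is usable.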
Configure Read Replicas (Optional)
For read-heavy workloads, add read replicas:
DATABASE_READ_REPLICA_URL=postgresql+asyncpg://aegis:pass@replica:5432/aegis
Monitor Query Performance
Enable slow query logging (the threshold is in milliseconds):
ALTER SYSTEM SET log_min_duration_statement = 100;
SELECT pg_reload_conf();
Security
Rotate API Keys
Generate strong API keys and rotate regularly:
# Generate new key
NEW_KEY=$(openssl rand -base64 32)
# Update environment
# Restart service
# Update all clients
You must update all agent clients when rotating keys. There’s no built-in key rotation mechanism.
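If your tooling is in Python, the key can be generated with the standard library instead of openssl. The second function is a hypothetical sketch of an overlap window, in which a proxy or middleware you control accepts both the old and new key while clients are being updated. Aegis itself is not stated to support this.

```python
import hmac
import secrets


def new_api_key(nbytes: int = 32) -> str:
    """Return a URL-safe random key carrying nbytes of entropy."""
    return secrets.token_urlsafe(nbytes)


def key_is_valid(presented: str, accepted: list[str]) -> bool:
    """Compare the presented key against every accepted key in constant time."""
    results = [hmac.compare_digest(presented.encode(), key.encode()) for key in accepted]
    return any(results)
```

During a rotation, the accepted list would hold the old and new keys; once every agent client has switched, drop the old key from the list.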
Enable TLS
Always use HTTPS in production. Configure your reverse proxy (nginx, Traefik, etc.):
server {
    listen 443 ssl;
    server_name aegis.yourdomain.com;
    ssl_certificate /etc/ssl/certs/aegis.crt;
    ssl_certificate_key /etc/ssl/private/aegis.key;
    location / {
        proxy_pass http://localhost:8000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
Network Isolation
Place PostgreSQL in a private subnet
Use VPC security groups to restrict access
Never expose PostgreSQL to the internet
Secrets Management
Use a secrets manager instead of environment files:
AWS Secrets Manager
HashiCorp Vault
GCP Secret Manager
You’ll need to write custom code to fetch secrets at startup.
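One possible shape for that startup code is sketched below. It assumes the three variables from the Docker Compose example and takes the secrets-manager call as a plain function, so the same loader works with any of the services listed. The `fetch` callable and `load_secrets` name are hypothetical and not part of Aegis.

```python
import os
from typing import Callable, Iterable, MutableMapping, Optional

REQUIRED = ("DATABASE_URL", "OPENAI_API_KEY", "AEGIS_API_KEY")


def load_secrets(
    fetch: Callable[[str], Optional[str]],
    names: Iterable[str] = REQUIRED,
    environ: MutableMapping[str, str] = os.environ,
) -> None:
    """Fetch each named secret and export it before the app starts.

    `fetch` wraps your secrets manager's client: it takes a secret name
    and returns its value, or None/empty if the secret is not set.
    """
    missing = []
    for name in names:
        value = fetch(name)
        if value:
            environ[name] = value
        else:
            missing.append(name)
    if missing:
        # Fail fast so the API never starts with partial credentials.
        raise RuntimeError(f"missing secrets: {', '.join(missing)}")
```

Run the loader before the server process imports its settings, for example from a small wrapper script that then starts uvicorn.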
Monitoring
Health Checks
Monitor the /health endpoint:
curl https://aegis.yourdomain.com/health
Set up alerts for non-200 responses.
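If you do not yet have an uptime monitor, a minimal probe might look like the following. It uses only the standard library and treats anything other than HTTP 200, including timeouts and refused connections, as unhealthy. The function name and the injectable `opener` parameter are illustrative.

```python
import urllib.request


def is_healthy(url: str, timeout: float = 5.0, opener=urllib.request.urlopen) -> bool:
    """Return True only if the endpoint answers with HTTP 200."""
    try:
        with opener(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # URLError, HTTPError (4xx/5xx), timeouts and connection errors
        # are all OSError subclasses.
        return False
```

A scheduler can call is_healthy("https://aegis.yourdomain.com/health") every minute and page someone after a few consecutive failures.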
Prometheus Metrics
Scrape the /metrics endpoint:
# prometheus.yml
scrape_configs:
  - job_name: 'aegis'
    static_configs:
      - targets: ['aegis:8000']
Available metrics:
aegis_memory_add_total
aegis_memory_query_total
aegis_memory_query_latency_seconds
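To confirm a deployment actually exposes the metrics listed above, the scrape output can be checked with a few lines of Python. This sketch parses the Prometheus text format and matches by prefix, since a latency metric is typically exported with suffixes such as _bucket, _sum, and _count. The function names are illustrative.

```python
EXPECTED = (
    "aegis_memory_add_total",
    "aegis_memory_query_total",
    "aegis_memory_query_latency_seconds",
)


def metric_names(exposition: str) -> set[str]:
    """Collect sample names from Prometheus text-format output."""
    names = set()
    for line in exposition.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and HELP/TYPE comments
        # A sample line is "name{labels} value" or "name value".
        names.add(line.split("{", 1)[0].split(" ", 1)[0])
    return names


def missing_metrics(exposition: str, expected=EXPECTED) -> list[str]:
    """Return expected metrics that appear neither exactly nor as a prefix."""
    found = metric_names(exposition)
    return [
        m for m in expected
        if not any(n == m or n.startswith(m + "_") for n in found)
    ]
```

Feeding it the body of a GET to /metrics after a deploy returns an empty list when all three metrics are present.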
Log Aggregation
Ship logs to your log aggregation system:
# Example: Docker with Loki
docker run -d \
  --log-driver=loki \
  --log-opt loki-url="http://loki:3100/loki/api/v1/push" \
  aegis-api
Alerting
Set up alerts for:
API error rate > 1%
Query latency p99 > 500ms
Database connections > 80% of pool
Disk usage > 80%
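The thresholds above can be written as a single evaluation function, which is handy for unit-testing alert logic before encoding it in your alerting system. The `Snapshot` fields are hypothetical; in practice the values would come from your metrics backend.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    error_rate: float          # fraction of failed requests, 0.0 to 1.0
    p99_latency_ms: float      # 99th percentile query latency
    db_connections: int        # connections currently in use
    db_pool_limit: int         # pool size plus overflow
    disk_used_fraction: float  # 0.0 to 1.0


def firing_alerts(s: Snapshot) -> list[str]:
    """Return the names of every threshold the snapshot exceeds."""
    alerts = []
    if s.error_rate > 0.01:
        alerts.append("error_rate")
    if s.p99_latency_ms > 500:
        alerts.append("query_latency_p99")
    if s.db_connections > 0.8 * s.db_pool_limit:
        alerts.append("db_connections")
    if s.disk_used_fraction > 0.8:
        alerts.append("disk_usage")
    return alerts
```

For example, a snapshot with a 2% error rate and 25 of 30 pool connections in use trips the error-rate and connection alerts but not the other two.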
Scaling
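The Aegis API is stateless and can be horizontally scaled:
# docker-compose scale
# Replicas cannot all bind host port 8000: remove the fixed "8000:8000" mapping
# from aegis-api and publish the port on the load balancer instead
docker-compose up -d --scale aegis-api=3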
Use a load balancer (nginx, HAProxy, cloud LB) in front.
Vertical Scaling (Database)
PostgreSQL with pgvector benefits from:
More RAM (for HNSW index caching)
Faster SSDs
More CPU cores for parallel queries
Start with 4GB RAM, 2 vCPUs minimum.
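For large memory counts (1M+), tune the HNSW index. In pgvector, ef_construction is an index build option rather than a session setting, so it is set when the index is created; ef_search can be changed per session:
-- Increase ef_construction for better recall (slower index builds and inserts)
-- The index, table, column, and operator class here are placeholders; match them to your schema
CREATE INDEX memories_embedding_hnsw ON memories USING hnsw (embedding vector_cosine_ops) WITH (m = 16, ef_construction = 128);
-- Increase ef_search for better query accuracy (slower queries)
SET hnsw.ef_search = 64;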
See pgvector tuning guide for details.
Disaster Recovery
Backup Strategy
#!/bin/bash
# backup.sh - Run daily via cron
# Stop on the first failed command so a broken dump is never uploaded
set -euo pipefail
BACKUP_DIR=/backups
DATE=$(date +%Y%m%d_%H%M%S)
# Full database backup
pg_dump -U aegis -Fc aegis > "$BACKUP_DIR/aegis_$DATE.dump"
# Upload to S3 (or your cloud storage)
aws s3 cp "$BACKUP_DIR/aegis_$DATE.dump" s3://your-bucket/backups/
# Retain 30 days locally
find "$BACKUP_DIR" -name "*.dump" -mtime +30 -delete
Restore Procedure
# 1. Stop the API
docker-compose stop aegis-api
# 2. Restore database (file names follow the aegis_YYYYMMDD_HHMMSS.dump pattern from backup.sh)
pg_restore -U aegis -d aegis --clean --if-exists /backups/aegis_20250115_020000.dump
# 3. Restart API
docker-compose start aegis-api
# 4. Verify
curl http://localhost:8000/health
Cost Estimation
Component            Small (Dev)   Medium (Startup)   Large (Enterprise)
Compute              $20/mo        $100/mo            $500+/mo
PostgreSQL           $15/mo        $100/mo            $500+/mo
OpenAI Embeddings    $10/mo        $50/mo             $200+/mo
Total                ~$45/mo       ~$250/mo           ~$1,200+/mo
Costs scale primarily with:
Number of memories stored (database size)
Query volume (compute + embeddings)
Embedding calls (OpenAI API costs)
Need Help?
Production deployment requires careful planning. If you’re facing challenges:
Coming Soon: We’re working on a managed platform that handles all of this for you. Join the waitlist to be notified when it launches.