Performance Benchmarks

Overview

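This document records performance benchmarks for the Cortex AI system: search latency and throughput, indexing, embedding generation, and Qdrant. It also describes the test environment and how to run the benchmarks.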

Search Performance

| Metric      | Value     | Target     |
|-------------|-----------|------------|
| P50 latency | 45ms      | <100ms     |
| P95 latency | 120ms     | <200ms     |
| P99 latency | 250ms     | <500ms     |
| Throughput  | 500 req/s | >200 req/s |

Indexing Performance

| Repository Size | Index Time | Memory Usage |
|-----------------|------------|--------------|
| 10K files       | 2 min      | 1 GB         |
| 50K files       | 8 min      | 3 GB         |
| 100K files      | 18 min     | 6 GB         |
| 500K files      | 90 min     | 15 GB        |

Embedding Generation

| Model                  | Speed         | Quality Score |
|------------------------|---------------|---------------|
| Voyage Code 3          | 1000 tokens/s | 0.92          |
| text-embedding-3-large | 1500 tokens/s | 0.89          |
| Local Ollama (nomic)   | 500 tokens/s  | 0.85          |

Qdrant Performance

| Collection Size | Search Latency | Memory |
|-----------------|----------------|--------|
| 1M vectors      | 15ms           | 4 GB   |
| 10M vectors     | 35ms           | 35 GB  |
| 50M vectors     | 80ms           | 175 GB |
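
To make the memory scaling in the table above easier to read, the following Go sketch divides each memory figure by its vector count. It assumes decimal gigabytes (1 GB = 10^9 bytes), which the table does not specify, and `bytesPerVector` is a hypothetical helper. The result is a rough average that includes whatever index and payload overhead the measurement captured.

```go
package main

import "fmt"

// bytesPerVector estimates the average memory per stored vector.
// Assumption: memoryGB is in decimal gigabytes (1 GB = 1e9 bytes).
func bytesPerVector(memoryGB float64, vectors int64) float64 {
	return memoryGB * 1e9 / float64(vectors)
}

func main() {
	// Rows copied from the Qdrant Performance table.
	rows := []struct {
		vectors  int64
		memoryGB float64
	}{
		{1_000_000, 4},
		{10_000_000, 35},
		{50_000_000, 175},
	}
	for _, r := range rows {
		fmt.Printf("%d vectors: ~%.0f bytes/vector\n",
			r.vectors, bytesPerVector(r.memoryGB, r.vectors))
	}
}
```

Under that assumption the per-vector cost comes out to roughly 4 KB at 1M vectors and 3.5 KB at 10M and 50M, which suggests memory grows close to linearly with collection size over this range.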

Test Environment

  • Hardware: 3-node Proxmox cluster, 64 GB RAM each
  • Qdrant: 3-node cluster with replication
  • PostgreSQL: HA cluster with streaming replication
  • Network: 10 Gbps internal

Running Benchmarks

# Run search benchmarks
go test -bench=. ./cortex-context/...

# Run indexing benchmarks
go test -bench=. ./cortex-indexer/...
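
For readers unfamiliar with Go benchmarks, the commands above run every function whose name starts with `Benchmark` and takes a `*testing.B`, normally defined in a `_test.go` file. The sketch below shows that shape. The `search` function and its corpus are hypothetical stand-ins, not the actual Cortex code, and `testing.Benchmark` is used only so the example can run as a standalone program.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// search is a hypothetical stand-in for the real search call:
// it counts the documents that contain the query string.
func search(corpus []string, query string) int {
	hits := 0
	for _, doc := range corpus {
		if strings.Contains(doc, query) {
			hits++
		}
	}
	return hits
}

// BenchmarkSearch has the signature that `go test -bench=.` discovers.
// In a real package it would live in a file ending in _test.go.
func BenchmarkSearch(b *testing.B) {
	// Build the fixture before timing starts.
	corpus := make([]string, 1000)
	for i := range corpus {
		corpus[i] = fmt.Sprintf("func handler%d() {}", i)
	}
	b.ResetTimer() // exclude fixture setup from the measurement

	for i := 0; i < b.N; i++ {
		search(corpus, "handler42")
	}
}

func main() {
	// testing.Benchmark runs a single benchmark outside of `go test`.
	res := testing.Benchmark(BenchmarkSearch)
	fmt.Println(res.String())
}
```

Adding `-benchmem` to the `go test` commands also reports allocations per operation, which can help when comparing runs.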

Historical Data

Benchmarks are recorded in Prometheus and visualized in Grafana. Dashboard: https://grafana.emshvac.co/d/cortex-perf