
Benchmarks

Synapse uses Criterion.rs for benchmarking: 19 benchmark files organized across 7 categories, with 306 benchmarks in total.

Running Benchmarks

```sh
cd apps/synapse-pingora

# All benchmarks
cargo bench

# Specific suite
cargo bench --bench detection
cargo bench --bench pipeline
```

Criterion generates HTML reports in target/criterion/. Open target/criterion/report/index.html for an interactive dashboard.

Build Profile

| Setting | Value |
|---------|-------|
| LTO | thin |
| codegen-units | 1 |
| opt-level | 3 |
| Warm-up | 3 seconds |
| Measurement | 5 seconds |
| Noise threshold | 5% (10% for sustained) |
| Sample size | 100 |
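These settings map onto a Cargo profile along these lines (a sketch: the exact profile name is an assumption, and the timing-related rows are Criterion options configured in the benchmark code, not in Cargo.toml):

```toml
# Cargo-side build settings used by cargo bench
[profile.bench]
lto = "thin"
codegen-units = 1
opt-level = 3

# Warm-up (3 s), measurement (5 s), noise threshold (5% / 10%),
# and sample size (100) are Criterion settings; they live in the
# benchmark harness code, not in this file.
```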

Benchmark Suites

| # | Suite | File | Category | Benchmarks |
|---|-------|------|----------|------------|
| 1 | Detection Engine | detection.rs | Core WAF | 29 |
| 2 | Pipeline | pipeline.rs | Request chain | 17 |
| 3 | Goblins (DLP) | goblins.rs | Data protection | 22 |
| 4 | Contention | contention.rs | Concurrency | 60 |
| 5 | Risk Scoring | risk_scoring.rs | Threat intel | 18 |
| 6 | Correlation | correlation.rs | Threat intel | 14 |
| 7 | API Profiler | profiler_bench.rs | Behavioral | 23 |
| 8 | Schema | schema_bench.rs | API learning | 19 |
| 9 | Bot Detection | bot_bench.rs | Security | 16 |
| 10 | Header Profiler | header_profiler_bench.rs | Behavioral | 12 |
| 11 | Sustained | sustained_bench.rs | Load testing | 9 |
| 12 | Escalation | escalation_bench.rs | Active defense | 11 |
| 13 | Captcha | captcha_bench.rs | Challenges | 8 |
| 14 | Hot Path | hot_path_bench.rs | Infrastructure | 15 |
| 15–19 | Profiler unit tests | tests/profiler/*.rs | Validation | 33 |

Full Stack Latency Budget

Worst-case per-request latency with all features enabled (~450 μs):

| Subsystem | Cost | % of Total |
|-----------|------|------------|
| Pipeline (ACL → Rate Limit → WAF → Entity) | 73 μs | 16.4% |
| Trends | 97 μs | 21.8% |
| Campaign Correlation | 78 μs | 17.5% |
| DLP Scan (4 KB) | 35 μs | 7.9% |
| Crawler Detection | 3 μs | 0.7% |
| Session Management | 7 μs | 1.6% |
| Profiler | 1 μs | 0.2% |
| Proxy Overhead (Pingora I/O) | ~150 μs | 33.7% |
| Total (worst case) | ~450 μs | 100% |

Optimization targets

Trends (21.8%) and Proxy I/O (33.7%) together account for over half of total latency. Proxy overhead is largely TCP/TLS negotiation (irreducible). Trends is an active optimization target — background aggregation could recover ~97 μs.
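As a sanity check, the per-subsystem costs above sum to roughly the quoted worst case, and Trends plus Proxy I/O indeed account for over half of it:

```rust
/// Sum the per-subsystem worst-case costs (in μs, from the table above)
/// and compute the combined share of Trends + Proxy I/O.
fn budget() -> (f64, f64) {
    let costs = [73.0, 97.0, 78.0, 35.0, 3.0, 7.0, 1.0, 150.0];
    let total: f64 = costs.iter().sum();
    // Trends (97 μs) + Proxy I/O (150 μs) as a share of the total.
    let share = 100.0 * (97.0 + 150.0) / total;
    (total, share)
}

fn main() {
    let (total, share) = budget();
    // Prints 444 μs, quoted in the doc as ~450 μs.
    println!("total = {total} μs, Trends + Proxy I/O = {share:.1}%");
}
```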

Detection Engine

| Operation | Latency |
|-----------|---------|
| Simple GET (no params) | ~10 μs |
| SQLi detection (avg) | ~27 μs |
| XSS detection (avg) | ~23 μs |
| Evasive attacks (hex, unicode, polyglot) | ~25–33 μs |
| Full rule set (237 rules) | ~72 μs |

WAF Rule Scaling

| Active Rules | Analyze Time | Notes |
|--------------|--------------|-------|
| 10 | 3.7 μs | |
| 50 | 25.4 μs | |
| 100 | 34.8 μs | |
| 237 (full production) | 71.8 μs | Sub-linear scaling via rule indexing |
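The sub-linear growth is consistent with an index that narrows the candidate rule set before any rules are evaluated. The sketch below illustrates the idea only; the RuleIndex type, token keys, and rule IDs are invented for illustration and are not Synapse's actual structures:

```rust
use std::collections::HashMap;

/// Hypothetical pre-filter index: map a coarse token that may appear in a
/// request (e.g. "select", "<script") to the subset of rule IDs that could
/// possibly match. Only that subset is evaluated, so analyze time grows
/// with the number of plausible rules, not the full rule count.
struct RuleIndex {
    by_token: HashMap<&'static str, Vec<u32>>,
}

impl RuleIndex {
    /// Collect the candidate rule IDs for a raw request string.
    fn candidates(&self, request: &str) -> Vec<u32> {
        let mut out = Vec::new();
        for (token, rule_ids) in &self.by_token {
            if request.contains(token) {
                out.extend_from_slice(rule_ids);
            }
        }
        out.sort_unstable();
        out.dedup();
        out
    }
}

fn build_index() -> RuleIndex {
    RuleIndex {
        by_token: HashMap::from([
            ("select", vec![1, 2]), // SQLi-flavored rules
            ("<script", vec![3]),   // XSS rule
            ("../", vec![4]),       // path traversal rule
        ]),
    }
}

fn main() {
    let index = build_index();
    // Only the SQLi-flavored rules are considered for this request.
    println!("{:?}", index.candidates("GET /items?q=select+1")); // [1, 2]
}
```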

Evasion Technique Detection

All evasion techniques are detected in under 34 μs:

| Technique | Time |
|-----------|------|
| XSS — hex / double / unicode encoding | 26–28 μs |
| SQLi — comment / case / concat evasion | 30–32 μs |
| Path traversal (all variants) | 10–12 μs |
| Command injection | 31–34 μs |
| Polyglot (XSS + SQLi combined) | 26.2 μs |

Per-Request Hot Path

Sub-microsecond components that run on every request:

| Component | Time | Notes |
|-----------|------|-------|
| Rate limit check (1M RPS capacity) | 60 ns | Token bucket lookup |
| Rate limit (exhausted bucket) | 70 ns | Reject path |
| ACL — 5 rules (first hit) | 6 ns | Early match exit |
| ACL — 100 rules (last match) | 151 ns | Worst case linear scan |
| IPv6 CIDR match | 5.5 ns | Bitwise prefix comparison |
| Tarpit peek | 36–51 ns | Check if IP is tarpitted |
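For scale, a token-bucket admission check is only a few arithmetic operations on cached state, which is why it fits in tens of nanoseconds. A minimal single-threaded sketch (the TokenBucket type and its lazy refill policy are illustrative assumptions, not Synapse's implementation):

```rust
use std::time::Instant;

/// Minimal token bucket: refill lazily based on elapsed time, then try to
/// spend one token. Illustrative sketch only.
struct TokenBucket {
    capacity: f64,
    tokens: f64,
    refill_per_sec: f64,
    last: Instant,
}

impl TokenBucket {
    fn new(capacity: f64, refill_per_sec: f64) -> Self {
        Self { capacity, tokens: capacity, refill_per_sec, last: Instant::now() }
    }

    /// Returns true if the request is admitted.
    fn try_acquire(&mut self) -> bool {
        let now = Instant::now();
        let elapsed = now.duration_since(self.last).as_secs_f64();
        self.last = now;
        // Lazy refill, capped at capacity.
        self.tokens = (self.tokens + elapsed * self.refill_per_sec).min(self.capacity);
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            true
        } else {
            false // reject path (the "exhausted bucket" case above)
        }
    }
}

fn main() {
    // No refill: two admits, then reject.
    let mut bucket = TokenBucket::new(2.0, 0.0);
    println!("{} {} {}", bucket.try_acquire(), bucket.try_acquire(), bucket.try_acquire());
    // prints: true true false
}
```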

DLP Scanning

| Payload Size | Clean | With PII | Throughput (clean) |
|--------------|-------|----------|--------------------|
| 4 KB | 20.9 μs | 34.4 μs | 187 MiB/s |
| 8 KB | 41.6 μs | 67.2 μs | 188 MiB/s |
| 32 KB | 49.3 μs | | 635 MiB/s |
| 128 KB | 665 μs | 966 μs | 188 MiB/s |
| 512 KB | 2.6 ms | 3.8 ms | 192 MiB/s |
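The throughput column follows directly from dividing payload size by the clean-scan time, for example:

```rust
/// Throughput in MiB/s for a payload of `bytes` scanned in `scan_micros` μs.
fn throughput_mib(bytes: f64, scan_micros: f64) -> f64 {
    bytes / (scan_micros * 1e-6) / (1024.0 * 1024.0)
}

fn main() {
    // 4 KiB scanned clean in 20.9 μs (first row of the table above).
    println!("{:.0} MiB/s", throughput_mib(4.0 * 1024.0, 20.9)); // 187 MiB/s
}
```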

A 42% faster DLP scanning mode is available for high-throughput scenarios with reduced pattern coverage.

Intelligence Layer

Per-request overhead for behavioral features:

| Operation | Time |
|-----------|------|
| apply_rule_risk (first hit) | 223 ns |
| apply_rule_risk (repeat) | 242 ns |
| check_block (below threshold) | 155 ns |
| check_block (above threshold) | 824 ns |
| Fingerprint register (new) | 1.30 μs |
| Fingerprint lookup (5 IPs) | 327 ns |
| Fingerprint lookup (100 IPs) | 5.24 μs |
| Campaign record_attack | 561 ns |
| Campaign calculate_score | 3.2 ns |
| update_profile (simple GET) | 130 ns |
| analyze_request (normal) | 366 ns |
| Session validate (existing) | 300 ns |
| Session create | 6.68 μs |
| Schema learn (small) | 1.85 μs |
| Schema validate (conforming) | 958 ns |
| Bot check (hit) | 747 ns |
| Bot check (miss / full scan) | 3.77 μs |
| Cookie generate (HMAC-SHA256) | 4.06 μs |
| Cookie validate (valid) | 2.73 μs |
| JA4 parse | 1.20 μs |
| Config reload | 236 μs |

Realistic Payloads

End-to-end latency for production-representative request shapes:

| Scenario | Time | Characteristics |
|----------|------|-----------------|
| Simple GET | 10.5 μs | No body, 2–3 headers |
| Clean request w/ 8 headers | 151 μs | Typical browser request |
| E-commerce POST | 2.4 ms | JSON body with cart/payment data |
| GraphQL mutation | 2.8 ms | Complex nested query |
| Healthcare claim | 4.4 ms | Large structured data with PII |
| Heavy — 14 KB + 20 headers | 4.4 ms | Worst-case production request |

Mixed traffic (95/5 clean/attack): 20 requests in 277 μs — averaging 13.9 μs per request. Sustained throughput: 72,000 req/s (single thread).
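The per-request average and the single-thread throughput figure are consistent with each other:

```rust
/// Derive the average per-request latency (μs) and the implied single-thread
/// rate (req/s) from a batch measurement of `n` requests in `batch_us` μs.
fn implied_rate(batch_us: f64, n: u32) -> (f64, f64) {
    let avg = batch_us / n as f64;
    (avg, 1_000_000.0 / avg)
}

fn main() {
    // 20 mixed requests in 277 μs (from the benchmark above).
    let (avg, rps) = implied_rate(277.0, 20);
    // avg ≈ 13.85 μs, implied rate ≈ 72,000 req/s.
    println!("{avg:.2} μs avg, ~{rps:.0} req/s");
}
```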

Contention Scaling

Concurrent load behavior across shared data structures (10,000 iterations per thread):

| Benchmark | 1 Thread | 4 Threads | 8 Threads | Scaling Factor (8T / 1T) |
|-----------|----------|-----------|-----------|--------------------------|
| Token bucket | 160 μs | 531 μs | 1.02 ms | 6.4x |
| Entity mgr (90/10 read/write) | 314 μs | 952 μs | 1.45 ms | 4.6x |
| Entity mgr (50/50 read/write) | 351 μs | 1.16 ms | 1.84 ms | 5.2x |
| Tarpit mixed | 215 μs | 777 μs | 1.42 ms | 6.6x |
| DLP scanner | 296 μs | 555 μs | 925 μs | 3.1x |

DLP achieves the best scaling (3.1x degradation at 8 threads) because scanning is embarrassingly parallel — no shared mutable state.
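That behavior is easy to reproduce with any stateless scan: threads that each own their input share nothing mutable, so there is no lock contention to amplify. A toy sketch, where a trivial pattern count stands in for the real DLP scanner:

```rust
use std::thread;

/// Trivial stand-in for a DLP scan: count occurrences of a pattern.
/// Stateless: each call reads only its own input slice.
fn scan(haystack: &str, needle: &str) -> usize {
    haystack.matches(needle).count()
}

fn main() {
    let payloads: Vec<String> = (0..8)
        .map(|i| format!("ssn=000-00-000{i} name=test ssn=111-11-1111"))
        .collect();

    // Each thread scans its own payload: no locks, no shared mutable state,
    // so adding threads divides the work instead of serializing on a lock.
    let handles: Vec<_> = payloads
        .into_iter()
        .map(|p| thread::spawn(move || scan(&p, "ssn=")))
        .collect();

    let total: usize = handles.into_iter().map(|h| h.join().unwrap()).sum();
    println!("matched {total} fields across 8 payloads"); // 16
}
```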

Comparison

| Implementation | Detection Latency | Throughput | Notes |
|----------------|-------------------|------------|-------|
| Synapse (Pingora) | ~75 μs | 13.8K req/s | Pure Rust, no FFI boundary |
| libsynapse (NAPI) | ~73 μs | 14K eval/s | Node.js + Rust FFI overhead |
| Batch 128 (FlatBuffers) | 9.9 μs/req | 101K eval/s | Amortized serialization |
| ModSecurity | 100–500 μs | | Depends on ruleset |
| AWS WAF | 50–200 μs | | Cloud service |
| ThreatX SaaS (8-hop) | 1–2+ seconds | | Sensor → Kafka → MongoDB → HackerMind |

Known Gaps

  • Contention at 8 threads — stateful components show 4.6–6.6x degradation under concurrent write load. Acceptable for current targets but limits single-node throughput at very high concurrency.
  • Large payloads — DLP scan time enters the millisecond range for payloads exceeding 8 KB. Consider body size caps for latency-sensitive endpoints.
  • Sustained load — all benchmarks are burst-mode (Criterion.rs). No 10+ minute sustained load tests have been performed. Memory growth under extended load is unmeasured.

Test environment

Apple M3 Pro, 36 GB RAM. Rust release build with LTO. 237 production rules, 500+ bot signatures, 22+ DLP patterns. February 2026. All benchmarks are reproducible: cargo bench from the synapse-waf workspace.

Load Testing

For end-to-end load testing beyond micro-benchmarks:

  • apps/synapse-pingora/docs/performance/TUNNEL_LOAD_TEST.md — WebSocket tunnel load testing
  • apps/synapse-pingora/docs/performance/BENCHMARK_METHODOLOGY.md — testing methodology and reproducibility

Licensed under AGPL-3.0 · atlascrew.dev