
H1 CPU Runtime Benchmark Baselines

This document records the first runnable baseline set for the H1 replay-path benchmark requirement in docs/hpc_contracts.md.

Environment and Scope

Commands

Rust criterion baselines:

cd rust-runtime
cargo bench --bench runtime_bench -- --noplot

Python smoke wrapper (thread controls and memory deltas; requires local pymars runtime dependencies):

python3 scripts/benchmark_runtime_threads.py \
  --mode predict \
  --rows 1024,8192 \
  --threads 1,4 \
  --repeats 3
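
The shape of such a rows-by-threads sweep can be sketched as below. This is a minimal illustration only, not the contents of scripts/benchmark_runtime_threads.py: the function names, the stand-in workload, and the use of tracemalloc as a memory proxy are all assumptions. tracemalloc sees only Python-level allocations, so memory held by a native runtime would need a different probe.

```python
import statistics
import time
import tracemalloc


def parse_int_list(text):
    """Parse a comma-separated flag value such as "1024,8192" into a list of ints."""
    return [int(part) for part in text.split(",") if part.strip()]


def stand_in_predict(rows, threads):
    """Hypothetical placeholder workload; a real harness would call the runtime here."""
    # `threads` is accepted but unused: this stand-in has no thread pool to configure.
    values = [0.5 * i for i in range(rows)]
    return sum(values)


def measure(workload, rows, threads, repeats):
    """Return (median wall-clock seconds, peak traced bytes) for one grid cell."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        workload(rows, threads)
        timings.append(time.perf_counter() - start)
    # One extra traced call, kept out of the timed loop because tracing slows execution.
    tracemalloc.start()
    workload(rows, threads)
    _, peak_bytes = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return statistics.median(timings), peak_bytes


def run_grid(rows_arg, threads_arg, repeats, workload=stand_in_predict):
    """Measure every (rows, threads) combination and return one record per cell."""
    results = []
    for rows in parse_int_list(rows_arg):
        for threads in parse_int_list(threads_arg):
            median_s, peak_bytes = measure(workload, rows, threads, repeats)
            results.append({
                "rows": rows,
                "threads": threads,
                "median_s": median_s,
                "peak_bytes": peak_bytes,
            })
    return results


if __name__ == "__main__":
    for record in run_grid("1024,8192", "1,4", repeats=3):
        print(record)
```

Reporting the median rather than the mean keeps a single slow repeat from skewing a three-sample run.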

Recorded Rust Criterion Results (2026-05-11)

Each workload was benchmarked at both thread_count=1 and thread_count=4. The medians below come from the output of the criterion command above.

Workload     Operation      Threads=1 Median  Threads=4 Median  Delta (Threads=4 vs Threads=1)
64 rows      design_matrix  6.4 µs            81.4 µs           12.7x slower
1,024 rows   design_matrix  93.3 µs           413.7 µs          4.4x slower
8,192 rows   design_matrix  712.0 µs          1.90 ms           2.7x slower
64 rows      predict        2.4 µs            77.1 µs           32.0x slower
1,024 rows   predict        26.7 µs           133.3 µs          5.0x slower
8,192 rows   predict        215.7 µs          201.3 µs          0.93x (7% faster)
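
The Delta column is the Threads=4 median divided by the Threads=1 median. The following minimal sketch shows that arithmetic; the helper names are illustrative, not part of the benchmark tooling. Because the medians in the table are rounded, a ratio recomputed from them can differ in the last digit from one computed on raw criterion output.

```python
def slowdown(single_thread_us, multi_thread_us):
    """Ratio of the 4-thread median to the 1-thread median (above 1.0 means slower)."""
    return multi_thread_us / single_thread_us


def describe(ratio):
    """Format a ratio in the style used by the Delta column."""
    if ratio >= 1.0:
        return f"{ratio:.1f}x slower"
    return f"{ratio:.2f}x ({(1.0 - ratio) * 100:.0f}% faster)"


if __name__ == "__main__":
    # Medians in microseconds for the two 8,192-row workloads, copied from the table.
    print(describe(slowdown(712.0, 1900.0)))  # design_matrix: 2.7x slower
    print(describe(slowdown(215.7, 201.3)))   # predict: 0.93x (7% faster)
```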

Interpretation:

Memory and Regression Tracking