Profiling and Performance
This guide documents the Rust-core profiling workflow around the committed scalar CPU baseline. It is intentionally narrow: the goal is to describe the measured baseline, the JSON artifact contract, and the criteria for deciding when SIMD or accelerator work deserves a follow-on track.
Baseline scaffold
The current benchmark scaffold lives in bindings/rust/benches/:
scalar_cpu_baseline.rs runs the deterministic EVPI workload
scalar_cpu_baseline.json records the workload, expected value, and regression policy
README.md explains the local entrypoint and the current comparison rule
The baseline is intentionally scalar and deterministic:
workload: a fixed two-strategy EVPI matrix
expected result: 3.0
metric type: scalar_cpu
comparison rule: exact workload/value match
regression policy: ci-contract-only
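To make the expected value concrete, here is a minimal sketch of an EVPI calculation. It is not the code in scalar_cpu_baseline.rs. The function name `evpi` and the matrix orientation (each row is one sampled scenario, each column is one strategy) are assumptions; that orientation is the one under which the matrix [[10.0, 1.0], [2.0, 8.0]] yields 3.0.

```rust
/// EVPI = E[max over strategies] - max over strategies of E[net benefit].
/// Assumption: `net_benefits` is rectangular, rows are samples, columns are strategies.
pub fn evpi(net_benefits: &[Vec<f64>]) -> f64 {
    if net_benefits.is_empty() || net_benefits[0].is_empty() {
        return 0.0;
    }
    let n_samples = net_benefits.len() as f64;
    let n_strategies = net_benefits[0].len();

    // Value with perfect information: pick the best strategy inside each sample.
    let mean_of_row_max = net_benefits
        .iter()
        .map(|row| row.iter().copied().fold(f64::NEG_INFINITY, f64::max))
        .sum::<f64>()
        / n_samples;

    // Value with current information: pick the single best strategy on average.
    let max_of_col_mean = (0..n_strategies)
        .map(|j| net_benefits.iter().map(|row| row[j]).sum::<f64>() / n_samples)
        .fold(f64::NEG_INFINITY, f64::max);

    mean_of_row_max - max_of_col_mean
}
```

For the baseline matrix the per-sample maxima are 10 and 8 (mean 9), the per-strategy means are 6 and 4.5 (maximum 6), and the difference is 3.0.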
Recommended local check:
cargo test --benches scalar_cpu_baseline -- --nocapture
Workflow
Use the baseline in three steps:
Run the scalar benchmark and confirm the EVPI result matches the committed artifact.
Record timing, memory, or throughput measurements in the same artifact family using the same workload identity.
Compare the new artifact against the committed baseline before promoting a new optimization claim.
The key rule is that the workload seed and EVPI value remain stable unless the track explicitly re-baselines them.
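The comparison step can be sketched as a small check that rejects a candidate whose workload identity or EVPI value has drifted. The types `Workload` and `BaselineCheck` and the function `check_against_baseline` are hypothetical names for illustration; the field names follow the JSON artifact, and the exact floating-point comparison mirrors the "exact" comparison rule.

```rust
/// Workload identity, mirroring the `workload` object of the artifact.
#[derive(Debug, Clone, PartialEq)]
pub struct Workload {
    pub seed: u64,
    pub repeats: u64,
    pub net_benefits: Vec<Vec<f64>>,
}

/// Outcome of comparing a candidate run with the committed baseline.
#[derive(Debug, PartialEq)]
pub enum BaselineCheck {
    Match,
    WorkloadChanged,
    ValueChanged { expected: f64, observed: f64 },
}

/// Hypothetical helper: the workload must be identical and the EVPI must match exactly.
pub fn check_against_baseline(
    committed: &Workload,
    committed_evpi: f64,
    candidate: &Workload,
    observed_evpi: f64,
) -> BaselineCheck {
    if committed != candidate {
        return BaselineCheck::WorkloadChanged;
    }
    // Exact comparison on purpose: the baseline is deterministic.
    if committed_evpi != observed_evpi {
        return BaselineCheck::ValueChanged {
            expected: committed_evpi,
            observed: observed_evpi,
        };
    }
    BaselineCheck::Match
}
```

Under this sketch, any result other than `Match` means the run is either a regression or a re-baseline that the track has to declare explicitly.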
Artifact format
The committed artifact is JSON and should stay small enough to review in code review or CI logs.
Example:
{
  "benchmark_name": "scalar_cpu_baseline",
  "metric_type": "scalar_cpu",
  "workload": {
    "seed": 42,
    "repeats": 10000,
    "net_benefits": [[10.0, 1.0], [2.0, 8.0]]
  },
  "expected": {
    "evpi": 3.0,
    "comparison_rule": "exact",
    "regression_policy": "ci-contract-only"
  },
  "metadata": {
    "phase": "phase-1-scalar-cpu-baseline",
    "notes": [
      "Deterministic baseline for the Rust core performance track.",
      "Timing comparisons are deferred until a stable baseline artifact exists."
    ]
  }
}
How to read the outputs
The current artifact family is correctness-first. Timing is observed, but the committed contract only enforces the scalar workload and value.
metric_type identifies the measurement family.
expected.evpi is the correctness anchor.
expected.comparison_rule states how strict the comparison is.
expected.regression_policy says whether CI only records the artifact or enforces a threshold.
metadata.phase records which profiling phase produced the artifact.
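One way a consumer might read the two policy strings is to parse them into enums and reject anything it does not recognize. This is a sketch only: the enum and function names are hypothetical, the only values modeled are the two that appear in the example artifact ("exact" and "ci-contract-only"), and treating "ci-contract-only" as "record, do not enforce a threshold" is an interpretation of the description above.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ComparisonRule {
    Exact,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RegressionPolicy {
    CiContractOnly,
}

/// Hypothetical parser for `expected.comparison_rule`.
pub fn parse_comparison_rule(raw: &str) -> Result<ComparisonRule, String> {
    match raw {
        "exact" => Ok(ComparisonRule::Exact),
        other => Err(format!("unrecognized comparison_rule: {}", other)),
    }
}

/// Hypothetical parser for `expected.regression_policy`.
pub fn parse_regression_policy(raw: &str) -> Result<RegressionPolicy, String> {
    match raw {
        "ci-contract-only" => Ok(RegressionPolicy::CiContractOnly),
        other => Err(format!("unrecognized regression_policy: {}", other)),
    }
}

impl RegressionPolicy {
    /// Assumed reading: contract-only means CI checks workload and value, not timing.
    pub fn enforces_threshold(self) -> bool {
        match self {
            RegressionPolicy::CiContractOnly => false,
        }
    }
}
```

Failing on unknown strings keeps a later policy value from being silently treated as the current one.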
When memory and throughput artifacts arrive, they should keep the same JSON family and add measured fields for the new metric rather than replacing the scalar contract. The scalar workload and expected EVPI remain the baseline reference unless the track explicitly re-baselines them.
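A possible shape for that extension, sketched with plain structs: the scalar contract stays a fixed sub-structure and new measurements are attached as optional fields. All type names here are hypothetical, and so are the measurement field names (`wall_time_ns`, `peak_memory_bytes`, `evaluations_per_second`); the source does not define them.

```rust
/// The part of the artifact that must not change without a re-baseline.
#[derive(Debug, Clone, PartialEq)]
pub struct ScalarContract {
    pub seed: u64,
    pub repeats: u64,
    pub net_benefits: Vec<Vec<f64>>,
    pub evpi: f64,
}

/// Hypothetical measured fields added by later artifacts.
#[derive(Debug, Clone, PartialEq)]
pub struct Measurements {
    pub wall_time_ns: Option<u64>,
    pub peak_memory_bytes: Option<u64>,
    pub evaluations_per_second: Option<f64>,
}

#[derive(Debug, Clone, PartialEq)]
pub struct Artifact {
    pub metric_type: String,
    pub contract: ScalarContract,
    /// `None` for the current correctness-only baseline.
    pub measurements: Option<Measurements>,
}

/// A new artifact stays in the family only if it carries the same scalar contract.
pub fn keeps_scalar_contract(baseline: &Artifact, candidate: &Artifact) -> bool {
    baseline.contract == candidate.contract
}
```

With this layout, a throughput artifact differs from the baseline only in `metric_type` and `measurements`, so the contract comparison is unaffected by the added fields.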
Promotion criteria for SIMD or accelerators
SIMD, Rayon, and accelerator work should be promoted only when the scalar baseline and artifact layer already exist.
Open a follow-on track when all of the following are true:
the scalar baseline is stable and reproducible
the hot path shows a measurable gain from vectorization or parallelism
the proposed change preserves the same result semantics and tolerance policy
the memory/throughput artifacts show a repeatable improvement, not a one-off
the optimization can be described as an internal execution change rather than a new public contract
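The checklist above can be expressed as a gate that passes only when every criterion holds. This is an illustrative sketch: `PromotionEvidence` and `improvement_is_repeatable` are hypothetical names, and the repeatability rule used here (at least two candidate runs, with the slowest candidate run still faster than the fastest baseline run) is one assumed definition of "not a one-off", not a rule stated by the track.

```rust
/// One flag per promotion criterion, in the order listed above.
pub struct PromotionEvidence {
    pub baseline_stable: bool,
    pub hot_path_gain_measured: bool,
    pub semantics_preserved: bool,
    pub improvement_repeatable: bool,
    pub internal_change_only: bool,
}

impl PromotionEvidence {
    /// All criteria must hold; a single missing one blocks the follow-on track.
    pub fn ready_for_follow_on_track(&self) -> bool {
        self.baseline_stable
            && self.hot_path_gain_measured
            && self.semantics_preserved
            && self.improvement_repeatable
            && self.internal_change_only
    }
}

/// Assumed rule: the slowest candidate run must beat the fastest baseline run,
/// and there must be at least two candidate runs.
pub fn improvement_is_repeatable(baseline_ns: &[u64], candidate_ns: &[u64]) -> bool {
    match (baseline_ns.iter().min(), candidate_ns.iter().max()) {
        (Some(best_baseline), Some(worst_candidate)) => {
            candidate_ns.len() >= 2 && worst_candidate < best_baseline
        }
        _ => false,
    }
}
```

A stricter or statistical rule could replace the min/max comparison; the point of the sketch is only that a single fast run does not satisfy the criterion.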
Practical order:
Scalar CPU baseline
Memory and throughput measurement
Rayon or equivalent multithreading feasibility
SIMD feasibility
GPU or other accelerator feasibility only if the earlier data justify it
If a candidate optimization needs a different workload, a different result shape, or a different correctness policy, it belongs in a follow-on track rather than in the baseline profiling contract.