Profiling and Performance ========================= This guide documents the Rust-core profiling workflow around the committed scalar CPU baseline. It is intentionally narrow: the goal is to describe the measured baseline, the JSON artifact contract, and the criteria for deciding when SIMD or accelerator work deserves a follow-on track. Baseline scaffold ----------------- The current benchmark scaffold lives in ``bindings/rust/benches/``: - ``scalar_cpu_baseline.rs`` runs the deterministic EVPI workload - ``scalar_cpu_baseline.json`` records the workload, expected value, and regression policy - ``README.md`` explains the local entrypoint and the current comparison rule The baseline is intentionally scalar and deterministic: - workload: a fixed two-strategy EVPI matrix - expected result: ``3.0`` - metric type: ``scalar_cpu`` - comparison rule: exact workload/value match - regression policy: ``ci-contract-only`` Recommended local check: .. code-block:: bash cargo test --benches scalar_cpu_baseline -- --nocapture Workflow -------- Use the baseline in three steps: 1. Run the scalar benchmark and confirm the EVPI result matches the committed artifact. 2. Record timing, memory, or throughput measurements in the same artifact family using the same workload identity. 3. Compare the new artifact against the committed baseline before promoting a new optimization claim. The key rule is that the workload seed and EVPI value remain stable unless the track explicitly re-baselines them. Artifact format --------------- The committed artifact is JSON and should stay small enough to review in code review or CI logs. Example: .. code-block:: json { "benchmark_name": "scalar_cpu_baseline", "metric_type": "scalar_cpu", "workload": { "seed": 42, "repeats": 10000, "net_benefits": [[10.0, 1.0], [2.0, 8.0]] }, "expected": { "evpi": 3.0, "comparison_rule": "exact", "regression_policy": "ci-contract-only" }, "metadata": { "phase": "phase-1-scalar-cpu-baseline", "notes": [ "Deterministic baseline for the Rust core performance track.", "Timing comparisons are deferred until a stable baseline artifact exists." ] } } How to read the outputs ----------------------- The current artifact family is correctness-first. Timing is observed, but the committed contract only enforces the scalar workload and value. - ``metric_type`` identifies the measurement family. - ``expected.evpi`` is the correctness anchor. - ``expected.comparison_rule`` states how strict the comparison is. - ``expected.regression_policy`` says whether CI only records the artifact or enforces a threshold. - ``metadata.phase`` records which profiling phase produced the artifact. When memory and throughput artifacts arrive, they should keep the same JSON family and add measured fields for the new metric rather than replacing the scalar contract. The scalar workload and expected EVPI remain the baseline reference unless the track explicitly re-baselines them. Promotion criteria for SIMD or accelerators ------------------------------------------- SIMD, Rayon, and accelerator work should be promoted only when the scalar baseline and artifact layer already exist. Open a follow-on track when all of the following are true: - the scalar baseline is stable and reproducible - the hot path shows a measurable gain from vectorization or parallelism - the proposed change preserves the same result semantics and tolerance policy - the memory/throughput artifacts show a repeatable improvement, not a one-off - the optimization can be described as an internal execution change rather than a new public contract Practical order: 1. Scalar CPU baseline 2. Memory and throughput measurement 3. Rayon or equivalent multithreading feasibility 4. SIMD feasibility 5. GPU or other accelerator feasibility only if the earlier data justify it If a candidate optimization needs a different workload, a different result shape, or a different correctness policy, it belongs in a follow-on track rather than in the baseline profiling contract. Related guidance ---------------- See the companion pages for the Rust-core migration boundary and the acceleration decision policy: * :doc:`rust_core_handoff` * :doc:`rust_parallelism` * :doc:`rust_accelerators`