Policy Optimisation: Parameter Scanning
This document describes the Parameter Scanning tool, which is the first phase of the Policy Optimisation module. This tool allows researchers and policy analysts to efficiently run the microsimulation model across multiple variations of policy parameters and evaluate the results.
Overview
The core of the tool is the run_parameter_scan function, located in src.optimisation. It is designed to answer questions like:
- "What would be the effect on total tax revenue if we changed the top tax rate?"
- "How does increasing a tax credit affect government expenditure compared to the baseline policy?"
It works by taking a baseline population and a set of policy "scenarios". For each scenario, it adjusts the policy parameters, runs a simulation, and calculates a set of key metrics defined by the user.
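The loop described above can be sketched roughly as follows. This is a simplified illustration, not the actual implementation of run_parameter_scan: toy_simulate, the flat parameter dictionary, the flat_rate parameter, and the list-of-dicts result are all assumptions made so that the snippet runs without the model or pandas.

```python
from typing import Callable, Dict, List

# Hypothetical stand-ins: a population is a list of incomes and a
# "simulation" applies a single flat tax rate to each of them.
Population = List[float]
Metric = Callable[[List[dict]], float]


def toy_simulate(population: Population, params: dict) -> List[dict]:
    """Toy model: one output record per person."""
    return [
        {"income": income, "tax_liability": income * params["flat_rate"]}
        for income in population
    ]


def sketch_parameter_scan(
    population: Population,
    default_params: dict,
    scan_config: dict,
    metrics: Dict[str, Metric],
) -> List[dict]:
    """Run one toy simulation per scenario and collect the metrics."""
    rows = []
    for scenario in scan_config["scenarios"]:
        # Start from the defaults, then apply this scenario's overrides.
        params = {**default_params, **(scenario.get("parameters") or {})}
        output = toy_simulate(population, params)
        row = {"scenario_id": scenario["id"]}
        for name, metric in metrics.items():
            row[name] = metric(output)
        rows.append(row)
    return rows


config = {
    "scenarios": [
        {"id": "baseline", "parameters": {}},
        {"id": "higher_rate", "parameters": {"flat_rate": 0.25}},
    ]
}
metrics = {"Total Tax Revenue": lambda out: sum(r["tax_liability"] for r in out)}
results = sketch_parameter_scan([100.0, 200.0], {"flat_rate": 0.2}, config, metrics)
for row in results:
    print(row)
```

Each scenario starts from the same defaults, so an empty parameters mapping reproduces the baseline.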
How to Use
Using the tool involves three main steps:
1. Prepare a population dataset: This is a standard pandas DataFrame, the same kind used for single simulation runs.
2. Create a YAML configuration file: This file defines the different policy scenarios you want to test.
3. Define metric functions: These are Python functions that calculate the specific outcomes you are interested in (e.g., revenue, inequality).
Example Script
An example script is provided in examples/run_parameter_scan.py. It demonstrates the full workflow. To run it, execute:
python examples/run_parameter_scan.py
1. The Configuration File
The scenarios are defined in a YAML file. The file must have a top-level key called scenarios, which contains a list of individual scenario configurations.
Each scenario needs:
- id: A unique string to identify the scenario in the results.
- description: A brief explanation of what the scenario does.
- parameters: A dictionary where keys are the parameters to change, and values are the new values.
Parameter keys are dot-separated paths that correspond to the structure of the Parameters model in src.parameters.
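To make the path convention concrete, here is a minimal sketch of how a dot-separated path could be resolved. It assumes the parameters are held in a plain nested dictionary, which may not match how the real Parameters model stores them; set_by_path is a hypothetical helper, not a function from src.parameters.

```python
import copy
from typing import Any


def set_by_path(params: dict, path: str, value: Any) -> dict:
    """Return a copy of `params` with the value at `path` replaced.

    Path segments are separated by dots. A segment is treated as a list
    index when the container reached so far is a list.
    """
    updated = copy.deepcopy(params)  # leave the caller's defaults untouched
    *parents, last = path.split(".")
    node = updated
    for segment in parents:
        node = node[int(segment)] if isinstance(node, list) else node[segment]
    if isinstance(node, list):
        node[int(last)] = value
    else:
        if last not in node:
            raise KeyError(f"Unknown parameter path: {path}")
        node[last] = value
    return updated


# Illustrative defaults only; the real Parameters model may differ.
defaults = {
    "tax_brackets": {"rates": [0.105, 0.175, 0.30, 0.33, 0.39]},
    "ietc": {"ent": 520},
}
scenario = set_by_path(defaults, "ietc.ent", 624)
scenario = set_by_path(scenario, "tax_brackets.rates.4", 0.42)
print(scenario["ietc"]["ent"], scenario["tax_brackets"]["rates"][-1])
```

Copying before writing keeps every scenario independent of the others, and rejecting unknown keys catches typos in a configuration file early.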
Example: examples/scan_config.yaml
scenarios:
  - id: "baseline"
    description: "The current policy settings for 2024-2025."
    parameters:
      # No overrides, use the default parameters for the base year.
      {}
  - id: "higher_top_tax_rate"
    description: "Increase the top tax rate from 39% to 42%."
    parameters:
      tax_brackets.rates: [0.105, 0.175, 0.30, 0.33, 0.42]
  - id: "increase_ietc"
    description: "Increase the Independent Earner Tax Credit entitlement by 20%."
    parameters:
      ietc.ent: 624  # Original is 520
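Before running a scan it can help to check that a loaded configuration has the shape described above. The sketch below works on the dictionary you would get from parsing the YAML file (for example with PyYAML's yaml.safe_load). validate_scan_config is a hypothetical helper; whether run_parameter_scan performs similar checks itself is not stated here.

```python
def validate_scan_config(config: dict) -> list:
    """Return a list of problems found in a loaded scan configuration."""
    scenarios = config.get("scenarios")
    if not isinstance(scenarios, list):
        return ["top-level 'scenarios' key must contain a list"]

    problems = []
    seen_ids = set()
    for index, scenario in enumerate(scenarios):
        # Every scenario needs the three documented keys.
        for key in ("id", "description", "parameters"):
            if key not in scenario:
                problems.append(f"scenario {index}: missing '{key}'")
        scenario_id = scenario.get("id")
        if scenario_id is not None:
            if scenario_id in seen_ids:
                problems.append(f"scenario {index}: duplicate id '{scenario_id}'")
            seen_ids.add(scenario_id)
        if not isinstance(scenario.get("parameters", {}), dict):
            problems.append(f"scenario {index}: 'parameters' must be a mapping")
    return problems


good = {
    "scenarios": [
        {"id": "baseline", "description": "Defaults.", "parameters": {}},
        {"id": "increase_ietc", "description": "IETC +20%.", "parameters": {"ietc.ent": 624}},
    ]
}
bad = {
    "scenarios": [
        {"id": "a", "description": "First.", "parameters": {}},
        {"id": "a", "parameters": []},
    ]
}
print(validate_scan_config(good))
print(validate_scan_config(bad))
```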
2. Defining Metric Functions
You can measure any outcome from a simulation by defining a "metric function". This is a Python function that accepts a pandas DataFrame (the output of a simulation) and returns a single number.
Example from the script:
import pandas as pd

def total_tax_revenue(df: pd.DataFrame) -> float:
    """Calculates the total income tax paid by the population."""
    return df["tax_liability"].sum()

def total_wff_paid(df: pd.DataFrame) -> float:
    """Calculates the total Working for Families credits paid."""
    wff_columns = ["FTCcalc", "IWTCcalc", "BSTCcalc", "MFTCcalc"]
    # Only sum the WFF columns that are present in this simulation output.
    existing_cols = [col for col in wff_columns if col in df.columns]
    return df[existing_cols].sum().sum()

# These are then passed to the scanner in a dictionary:
metrics_to_run = {
    "Total Tax Revenue": total_tax_revenue,
    "Total WFF Paid": total_wff_paid,
}
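Metric functions are not limited to totals. As an illustration of an inequality metric, the sketch below computes a Gini coefficient. The column name disposable_income is an assumption, not a column the simulation is known to produce, and the formula assumes non-negative values.

```python
import pandas as pd


def gini_coefficient(df: pd.DataFrame, column: str = "disposable_income") -> float:
    """Gini coefficient of one column (0 means perfect equality)."""
    values = df[column].sort_values().to_numpy(dtype=float)
    n = len(values)
    total = values.sum()
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted formula on ascending-sorted values, ranks starting at 1.
    weighted = sum(rank * value for rank, value in zip(range(1, n + 1), values))
    return float(2 * weighted / (n * total) - (n + 1) / n)


sample = pd.DataFrame({"disposable_income": [0, 0, 0, 1]})
print(gini_coefficient(sample))
```

Because the column has a default value, the function can still be called with a single DataFrame argument and so fits the metric function signature described above.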
Function Reference
run_parameter_scan(base_df, base_year, scan_config, metrics)
- base_df: The initial population DataFrame.
- base_year: The base year for the simulation (e.g., "2024-2025"). The tool will load the default parameters for this year before applying any scenario overrides.
- scan_config: A dictionary loaded from your YAML configuration file.
- metrics: A dictionary mapping metric names (strings) to your metric functions.
The function returns a pandas DataFrame where each row contains the results for one scenario, making it easy to compare the impacts of different policy choices.
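One common follow-up is to express every scenario as a change from the baseline. The sketch below assumes the results have a scenario_id column plus one numeric column per metric. The exact column names returned by run_parameter_scan are not specified here, and the figures are made up purely for illustration.

```python
import pandas as pd

# Made-up results in the assumed shape: one row per scenario.
results = pd.DataFrame(
    {
        "scenario_id": ["baseline", "higher_top_tax_rate"],
        "Total Tax Revenue": [1000.0, 1050.0],
    }
)


def change_from_baseline(results: pd.DataFrame, baseline_id: str = "baseline") -> pd.DataFrame:
    """Subtract the baseline row's metrics from every scenario's metrics."""
    # Keep numeric metric columns only, so text columns do not break the subtraction.
    numeric = results.set_index("scenario_id").select_dtypes(include="number")
    return numeric - numeric.loc[baseline_id]


print(change_from_baseline(results))
```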
Phase 2: Advanced Policy Optimisation with Optuna
Building on the parameter scanning tool, the advanced policy optimisation feature uses the Optuna library to intelligently search for optimal policy parameters. Instead of testing a pre-defined grid of scenarios, this tool explores a continuous search space to find parameter combinations that maximize or minimize a specific objective function.
How it Works
The core of this feature is the run_policy_optimisation function. It wraps the simulation logic in an "objective function" that Optuna repeatedly calls with different parameter values. Optuna's samplers use the history of previous results to guide the search towards more promising areas of the parameter space.
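The wrapping idea can be sketched as a closure that builds the objective function. This is an illustration rather than the actual run_policy_optimisation code: make_objective, the toy simulate function, and the bounds are hypothetical. The trial methods used (suggest_float and set_user_attr) are real Optuna Trial methods, and StubTrial exists only so the snippet runs without Optuna installed.

```python
from typing import Callable, Dict


def make_objective(
    simulate: Callable[[dict], dict],
    metrics: Dict[str, Callable[[dict], float]],
    objective_name: str,
) -> Callable:
    """Build the function that the optimiser calls once per trial."""

    def objective(trial) -> float:
        # Ask the sampler for a candidate value (illustrative bounds).
        params = {"top_tax_rate": trial.suggest_float("top_tax_rate", 0.38, 0.45)}
        output = simulate(params)
        scores = {name: fn(output) for name, fn in metrics.items()}
        # Keep the non-objective metrics so they can be inspected afterwards.
        for name, score in scores.items():
            if name != objective_name:
                trial.set_user_attr(name, score)
        return scores[objective_name]

    return objective


class StubTrial:
    """Minimal stand-in for optuna.trial.Trial; always picks the midpoint."""

    def __init__(self):
        self.params, self.user_attrs = {}, {}

    def suggest_float(self, name, low, high):
        self.params[name] = (low + high) / 2
        return self.params[name]

    def set_user_attr(self, key, value):
        self.user_attrs[key] = value


def simulate(params: dict) -> dict:
    """Toy model: tax a fixed income of 1000 at the suggested rate."""
    rate = params["top_tax_rate"]
    return {"revenue": 1000 * rate, "take_home": 1000 * (1 - rate)}


metrics = {
    "total_tax_revenue": lambda out: out["revenue"],
    "take_home_income": lambda out: out["take_home"],
}
objective = make_objective(simulate, metrics, "total_tax_revenue")
trial = StubTrial()
value = objective(trial)
print(value, trial.user_attrs)
```

With Optuna installed, the same objective could be passed to a real study, for example optuna.create_study(direction="maximize") followed by study.optimize(objective, n_trials=50).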
How to Use
The workflow is similar to the parameter scanner, but the configuration file is different.
Example Script:
An example is provided in examples/run_policy_optimisation.py. To run it, execute:
python examples/run_policy_optimisation.py
1. The Optimisation Configuration File
The study is defined in a YAML file. It has three main sections: objective, n_trials, and search_space.
- objective: Defines the goal of the study.
  - name: The metric to be optimised. This must correspond to a key in the metrics dictionary you provide.
  - direction: Can be "maximize" or "minimize".
- n_trials: The total number of simulations the optimiser will run.
- search_space: A list of parameters to vary. Each entry defines:
  - name: A unique name for the parameter in the study.
  - path: The dot-separated path to the parameter in the Parameters model.
  - type: The type of value to search for (float, int, or categorical).
  - The range of values (low and high for float/int, or choices for categorical).
Example: examples/opt_config.yaml
objective:
  name: "total_tax_revenue"
  direction: "maximize"
n_trials: 50
search_space:
  - name: "top_tax_rate"
    path: "tax_brackets.rates.4"
    type: "float"
    low: 0.38
    high: 0.45
  - name: "ietc_entitlement"
    path: "ietc.ent"
    type: "int"
    low: 520
    high: 700
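Each search_space entry maps naturally onto one of Optuna's suggestion methods. The sketch below shows one plausible translation. suggest_from_spec and suggest_overrides are hypothetical helpers, not functions from src.optimisation. The suggest_float, suggest_int, and suggest_categorical methods are real Optuna Trial methods, while LowestValueTrial is a stand-in so the snippet runs without Optuna installed.

```python
def suggest_from_spec(trial, spec: dict):
    """Translate one search_space entry into the matching trial.suggest_* call."""
    kind = spec["type"]
    if kind == "float":
        return trial.suggest_float(spec["name"], spec["low"], spec["high"])
    if kind == "int":
        return trial.suggest_int(spec["name"], spec["low"], spec["high"])
    if kind == "categorical":
        return trial.suggest_categorical(spec["name"], spec["choices"])
    raise ValueError(f"Unsupported search space type: {kind!r}")


def suggest_overrides(trial, search_space: list) -> dict:
    """Map each entry's parameter path to the value suggested for this trial."""
    return {spec["path"]: suggest_from_spec(trial, spec) for spec in search_space}


class LowestValueTrial:
    """Stand-in for optuna.trial.Trial: returns the lower bound or first choice."""

    def suggest_float(self, name, low, high):
        return low

    def suggest_int(self, name, low, high):
        return low

    def suggest_categorical(self, name, choices):
        return choices[0]


search_space = [
    {"name": "top_tax_rate", "path": "tax_brackets.rates.4",
     "type": "float", "low": 0.38, "high": 0.45},
    {"name": "ietc_entitlement", "path": "ietc.ent",
     "type": "int", "low": 520, "high": 700},
]
overrides = suggest_overrides(LowestValueTrial(), search_space)
print(overrides)
```

The resulting dictionary has the same path-to-value shape as a scenario's parameters mapping in the scanning tool, so the same override logic could in principle serve both phases.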
2. Interpreting the Results
The run_policy_optimisation function returns a completed Optuna study object, which records every trial in the optimisation process. The example script shows how to access the most important results:
- study.best_trial: The trial that resulted in the best objective value.
- study.best_params: A dictionary of the parameter values that yielded the best result.
- study.best_value: The best objective score achieved.
You can also access the user_attrs of the best trial to see the values of all other metrics for that trial, which is useful for understanding trade-offs.
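These attributes can be gathered into a single summary. The sketch below reads only best_value, best_params, and best_trial.user_attrs, which are real attributes of an Optuna study. summarise_study is a hypothetical helper, and fake_study is a stand-in holding made-up numbers so the snippet runs without Optuna.

```python
from types import SimpleNamespace


def summarise_study(study, objective_name: str) -> dict:
    """Collect the best trial's objective value, parameters and other metrics."""
    return {
        "objective": objective_name,
        "best_value": study.best_value,
        "best_params": dict(study.best_params),
        "other_metrics": dict(study.best_trial.user_attrs),
    }


# Stand-in with made-up numbers, shaped like a finished optuna.study.Study.
fake_study = SimpleNamespace(
    best_value=1.0,
    best_params={"top_tax_rate": 0.45, "ietc_entitlement": 520},
    best_trial=SimpleNamespace(user_attrs={"total_wff_paid": 2.0}),
)
summary = summarise_study(fake_study, "total_tax_revenue")
print(summary)
```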
For more advanced analysis, Optuna provides visualization tools (e.g., optuna.visualization.plot_optimization_history) that can be used with the returned study object.