innovate.utils package
Submodules
innovate.adopt.categorization
- innovate.adopt.categorization.categorize_adopters(model: DiffusionModel, t: Sequence[float]) → DataFrame
Categorizes adopters based on the fitted diffusion model.
- Parameters:
model – A fitted diffusion model.
t – A sequence of time points.
- Returns:
A pandas DataFrame with the adopter categories for each time point.
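The categorization rule itself is not stated in this reference. One widely used convention is Rogers' five adopter categories, separated at cumulative adoption shares of 2.5%, 16%, 50%, and 84%. The sketch below assumes that convention and uses a hypothetical logistic curve (`logistic_share`) in place of a fitted model; it illustrates the idea and is not the library's implementation.

```python
import math
from bisect import bisect_right

# Rogers' cumulative cut-offs (an assumed convention, not confirmed by the library).
CUTOFFS = [0.025, 0.16, 0.50, 0.84]
LABELS = ["Innovators", "Early Adopters", "Early Majority", "Late Majority", "Laggards"]


def logistic_share(t, midpoint=5.0, rate=1.0):
    """Hypothetical stand-in for a fitted model: cumulative adoption share in [0, 1]."""
    return 1.0 / (1.0 + math.exp(-rate * (t - midpoint)))


def categorize(share):
    """Map a cumulative adoption share to the category that is adopting at that point."""
    return LABELS[bisect_right(CUTOFFS, share)]


times = [0, 2, 4, 6, 8, 10]
categories = [categorize(logistic_share(t)) for t in times]
```

With these illustrative parameters the curve passes through all five categories over the ten time units.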
innovate.utils.metrics module
- innovate.utils.metrics.calculate_aic(n_params: int, n_samples: int, rss: float) → float
Calculates the Akaike Information Criterion (AIC).
Assumes errors are normally distributed.
- innovate.utils.metrics.calculate_bic(n_params: int, n_samples: int, rss: float) → float
Calculates the Bayesian Information Criterion (BIC).
Assumes errors are normally distributed.
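Under normally distributed errors, both criteria can be computed from the residual sum of squares. The sketch below uses the common least-squares forms, AIC = n·ln(RSS/n) + 2k and BIC = n·ln(RSS/n) + k·ln(n), with additive constants dropped. The helper names are hypothetical, and the library's own functions may differ by a constant term.

```python
import math


def aic_gaussian(n_params, n_samples, rss):
    """AIC for a least-squares fit: n * ln(RSS / n) + 2k (constants dropped)."""
    return n_samples * math.log(rss / n_samples) + 2 * n_params


def bic_gaussian(n_params, n_samples, rss):
    """BIC for a least-squares fit: n * ln(RSS / n) + k * ln(n) (constants dropped)."""
    return n_samples * math.log(rss / n_samples) + n_params * math.log(n_samples)


# Two hypothetical fits to the same 50 observations.
aic_small = aic_gaussian(n_params=3, n_samples=50, rss=12.0)
aic_large = aic_gaussian(n_params=6, n_samples=50, rss=11.5)
bic_small = bic_gaussian(n_params=3, n_samples=50, rss=12.0)
```

Lower values are better for both criteria. Here the three extra parameters reduce the RSS only slightly, so the smaller model has the lower AIC. BIC penalizes each parameter more heavily than AIC once ln(n) exceeds 2, that is, from 8 samples upward.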
- innovate.utils.metrics.calculate_mae(y_true: Sequence[float], y_pred: Sequence[float]) → float
Calculates the Mean Absolute Error (MAE).
- innovate.utils.metrics.calculate_mape(y_true: Sequence[float], y_pred: Sequence[float]) → float
Calculates the Mean Absolute Percentage Error (MAPE).
- innovate.utils.metrics.calculate_mse(y_true: Sequence[float], y_pred: Sequence[float]) → float
Calculates the Mean Squared Error (MSE).
- innovate.utils.metrics.calculate_r_squared(y_true: Sequence[float], y_pred: Sequence[float]) → float
Calculates the R-squared (coefficient of determination).
- innovate.utils.metrics.calculate_rmse(y_true: Sequence[float], y_pred: Sequence[float]) → float
Calculates the Root Mean Squared Error (RMSE).
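For reference, the textbook definitions of these error metrics can be written in a few lines of plain Python. The function names below are hypothetical and the sketch is independent of the library. In particular, it is an assumption that MAPE is expressed as a percentage rather than a fraction.

```python
import math


def mae(y_true, y_pred):
    """Mean of absolute errors."""
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean of squared errors."""
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Square root of the MSE, in the same units as y."""
    return math.sqrt(mse(y_true, y_pred))


def mape(y_true, y_pred):
    """Mean absolute percentage error; undefined when any true value is zero."""
    return 100.0 * sum(abs((a - b) / a) for a, b in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """1 - SS_res / SS_tot, where SS_tot is measured around the mean of y_true."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    ss_tot = sum((a - mean) ** 2 for a in y_true)
    return 1.0 - ss_res / ss_tot


y_true = [10.0, 20.0, 30.0, 40.0]
y_pred = [12.0, 18.0, 33.0, 39.0]
```

Because MAPE divides by the true values, it is unsuitable for series that contain zeros, which is common at the start of a cumulative adoption curve.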
innovate.utils.model_evaluation module
- innovate.utils.model_evaluation.compare_models(models: Dict[str, DiffusionModel], t_true: Sequence[float], y_true: Sequence[float]) → DataFrame
Compares multiple diffusion models based on various goodness-of-fit metrics.
- Parameters:
models – A dictionary where keys are model names (str) and values are fitted DiffusionModel instances.
t_true – The true time points.
y_true – The true cumulative adoption values.
- Returns:
A pandas DataFrame containing the comparison metrics for each model.
- innovate.utils.model_evaluation.compute_residuals(model: DiffusionModel, t: Sequence[float], y: Sequence[float]) → ndarray
Return the residuals for a fitted model.
- innovate.utils.model_evaluation.find_best_model(comparison_df: DataFrame, metric: str = 'RMSE', minimize: bool = True) → Tuple[str, Dict[str, Any]]
Identifies the best performing model from a comparison DataFrame.
- Parameters:
comparison_df – The DataFrame returned by compare_models.
metric – The metric to use for comparison (e.g., ‘RMSE’, ‘R-squared’).
minimize – If True, the best model has the minimum value for the metric. If False, the best model has the maximum value.
- Returns:
A tuple containing the name of the best model and its full results row.
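The selection logic described above can be sketched with pandas as follows. The table layout is an assumption (model names as the index, one column per metric), the model names and values are made up for illustration, and `pick_best` is a hypothetical helper rather than the library function.

```python
import pandas as pd

# Hypothetical comparison table; the real column names and index layout may differ.
comparison_df = pd.DataFrame(
    {"RMSE": [4.2, 3.1, 5.0], "R-squared": [0.95, 0.97, 0.91]},
    index=["Bass", "Gompertz", "Logistic"],
)


def pick_best(df, metric="RMSE", minimize=True):
    """Return (model name, results row as a dict) for the best value of `metric`."""
    best_name = df[metric].idxmin() if minimize else df[metric].idxmax()
    return best_name, df.loc[best_name].to_dict()


best_by_rmse = pick_best(comparison_df)
best_by_r2 = pick_best(comparison_df, metric="R-squared", minimize=False)
```

Metrics where higher is better, such as R-squared, need `minimize=False`. Leaving the default in place would select the worst-fitting model.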
- innovate.utils.model_evaluation.get_fit_metrics(model: DiffusionModel, t: Sequence[float], y: Sequence[float]) → Dict[str, float]
Calculates various goodness-of-fit metrics for a model.
- Parameters:
model – The fitted diffusion model.
t – The time points.
y – The true cumulative adoption values.
- Returns:
A dictionary containing the calculated metrics.
- innovate.utils.model_evaluation.model_aic(model: DiffusionModel, t: Sequence[float], y: Sequence[float]) → float
Return the Akaike Information Criterion for a fitted model.
- innovate.utils.model_evaluation.model_bic(model: DiffusionModel, t: Sequence[float], y: Sequence[float]) → float
Return the Bayesian Information Criterion for a fitted model.
- innovate.utils.model_evaluation.residual_acf(model: DiffusionModel, t: Sequence[float], y: Sequence[float], nlags: int = 40) → ndarray
Return the autocorrelation function of model residuals.
- innovate.utils.model_evaluation.residual_pacf(model: DiffusionModel, t: Sequence[float], y: Sequence[float], nlags: int = 40) → ndarray
Return the partial autocorrelation function of model residuals.
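To make the residual diagnostics concrete, the sketch below computes a sample autocorrelation function in plain Python. It assumes residuals are defined as observed minus predicted values, which this reference does not state. `sample_acf` is a hypothetical helper, and the library may rely on a different estimator.

```python
def sample_acf(residuals, nlags):
    """Sample autocorrelation at lags 0..nlags (biased estimator, lag 0 equals 1)."""
    n = len(residuals)
    mean = sum(residuals) / n
    centered = [r - mean for r in residuals]
    denom = sum(c * c for c in centered)
    return [
        sum(centered[i] * centered[i + lag] for i in range(n - lag)) / denom
        for lag in range(nlags + 1)
    ]


# Illustrative residuals that alternate in sign, i.e. strongly autocorrelated.
residuals = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
acf_values = sample_acf(residuals, nlags=2)
```

Values far from zero at lags above 0 suggest structure the model has not captured. With short adoption series, `nlags` should stay well below the number of observations, so the default of 40 may need lowering.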
innovate.utils.preprocessing module
- innovate.utils.preprocessing.aggregate_time_series(data: Series | DataFrame, freq: str) → Series | DataFrame
Aggregates time series data to a specified frequency (e.g., ‘D’, ‘W’, ‘M’).
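Frequency aggregation of this kind is typically done with pandas resampling. The sketch below assumes the aggregation is a sum per period, which suits adoption counts. This reference does not say which aggregate the function applies, so treat that choice as illustrative. Note also that pandas 2.2 and later deprecate the 'M' alias in favour of 'ME' for month-end frequency.

```python
import pandas as pd

# Fourteen days of one adoption per day, starting on a Monday.
daily = pd.Series(1, index=pd.date_range("2024-01-01", periods=14, freq="D"))

# Aggregate to weekly totals (summing is an assumption, not the documented behaviour).
weekly = daily.resample("W").sum()
```

The 'W' alias produces weeks ending on Sunday, so the fourteen days fall into two complete weekly bins.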
- innovate.utils.preprocessing.apply_rolling_average(data: Series, window: int) → Series
Applies a rolling average to a time series.
- Parameters:
data – A pandas Series.
window – The size of the rolling window.
- Returns:
A pandas Series with the rolling average applied.
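A rolling average over a pandas Series can be sketched as below. Whether the library centres the window or sets `min_periods` is not documented here, so this sketch uses the pandas defaults (trailing window, full window required) as an assumption.

```python
import pandas as pd

data = pd.Series([2.0, 4.0, 6.0, 8.0, 10.0])

# Trailing 3-point mean; the first window - 1 entries are NaN under pandas defaults.
smoothed = data.rolling(window=3).mean()
```

Under these defaults the output has the same length as the input, with missing values at the start that may need to be dropped before fitting a model.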
- innovate.utils.preprocessing.apply_sarima(data: Series, order: Tuple[int, int, int], seasonal_order: Tuple[int, int, int, int]) → Series
Fits a SARIMA model to a time series and returns the fitted values.
- Parameters:
data – A pandas Series.
order – The (p, d, q) order of the model: the number of AR parameters, differences, and MA parameters.
seasonal_order – The (P,D,Q,s) seasonal order of the model.
- Returns:
A pandas Series with the fitted values from the SARIMA model.
- innovate.utils.preprocessing.apply_stl_decomposition(data: Series, period: int = None, robust: bool = True) → Tuple[Series, Series, Series]
Applies Seasonal-Trend decomposition using Loess (STL) to a time series.
- Parameters:
data – A pandas Series with a DatetimeIndex.
period – The seasonal period. If None, the function attempts to infer it from the data.
robust – Whether to use robust fitting (less sensitive to outliers).
- Returns:
A tuple of (trend, seasonal, residuals) as pandas Series.