geoprior.utils.forecast_utils

Forecast utilities.

Functions

add_forecast_times(df, *[, forecast_times, ...])

Map each 1‑based forecast_step into an explicit calendar time.

adjust_time_predictions(df, time_col, ...[, ...])

Adjusts time predictions by adding the forecast horizon to the inverse-normalized time.

apply_extra_metrics(*, dest, y_true, y_pred, ...)

Apply extra metrics into dest using robust calling heuristics.

compute_quantile_coverage(df, quantiles, ...)

For each nominal quantile q in quantiles, compute the fraction of samples where actual <= predicted q‑quantile.

detect_forecast_type(df[, value_prefixes])

Auto-detects whether a DataFrame contains deterministic or quantile forecasts, supporting both long and wide formats.

evaluate_forecast(eval_data, *[, ...])

Evaluate forecast diagnostics from an evaluation DataFrame.

format_and_forecast(y_pred, y_true, *[, ...])

Format PINN forecasts into evaluation and future DataFrames.

format_forecast_dataframe(df[, to_wide, ...])

Auto-detects DataFrame format and conditionally pivots to wide format.

get_step_names(forecast_steps[, step_names, ...])

Build a step → label mapping for multi‑horizon plots.

get_test_data_from(df, time_col, time_steps)

Prepares the test data for forecasting by ensuring there is enough future data.

get_value_prefixes(df[, exclude_cols, ...])

Automatically detects the prefixes of value columns from a DataFrame.

get_value_prefixes_in(df[, exclude_cols])

Automatically detects the prefixes of value columns from a DataFrame.

increment_dates_by_horizon(df, time_col, ...)

Increments the values in a datetime column by the forecast horizon.

normalize_for_pinn(df, time_col, coord_x, ...)

Apply Min-Max normalization to spatial-temporal coordinates and optionally to other numeric columns.

pivot_forecast(df, *[, index_col, ...])

Pivot a long-format forecast DataFrame into a wide one.

pivot_forecast_dataframe(data, id_vars, ...)

Transforms a long-format forecast DataFrame to a wide format.

plot_reliability_diagram(models_data[, ...])

Plot a reliability diagram for one or multiple models.

plot_reliability_diagram_in(coverage_df[, ...])

Plot nominal vs empirical probabilities.

stack_quantile_predictions(q_lower, ...)

Stack three quantile trajectories into a single y_pred array of shape (n_samples, 3, n_timesteps), ready for PSS.

geoprior.utils.forecast_utils.detect_forecast_type(df, value_prefixes=None)

Auto-detects whether a DataFrame contains deterministic or quantile forecasts, supporting both long and wide formats.

This utility inspects column names to determine the nature of the predictions.

  • It identifies a ‘quantile’ forecast if it finds columns containing a _qXX pattern (e.g., ‘subsidence_q10’, ‘GWL_2022_q50’).

  • It identifies a ‘deterministic’ forecast if no quantile columns are found, but columns ending in _pred, _actual, or matching a base prefix exist (e.g., ‘subsidence_pred’, ‘subsidence_2022_actual’, ‘GWL’).
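
The two naming rules above can be sketched with a regular expression. This is a minimal illustration, not the geoprior implementation: the helper name sketch_detect_type is hypothetical, and the sketch only treats a bare base name such as 'GWL' as deterministic when value_prefixes is passed explicitly, since the prefix inference step is not reproduced here.

```python
import re

# Hypothetical helper for illustration; it only mirrors the naming rules
# described above and is not the library's code.
_QUANTILE_RE = re.compile(r"_q\d+")


def sketch_detect_type(columns, value_prefixes=None):
    """Classify column names as 'quantile', 'deterministic', or 'unknown'."""
    cols = [str(c) for c in columns]
    if value_prefixes:
        # Restrict the search to columns that start with a requested prefix.
        cols = [c for c in cols if any(c.startswith(p) for p in value_prefixes)]
    # Rule 1: any '_qXX' token marks a quantile forecast.
    if any(_QUANTILE_RE.search(c) for c in cols):
        return "quantile"
    # Rule 2: '_pred' / '_actual' endings mark a deterministic forecast.
    if any(c.endswith(("_pred", "_actual")) for c in cols):
        return "deterministic"
    # Rule 2 (continued): a column equal to an explicitly given base prefix.
    if value_prefixes and any(c in value_prefixes for c in cols):
        return "deterministic"
    return "unknown"


print(sketch_detect_type(["subsidence_q50", "GWL_q90"]))   # quantile
print(sketch_detect_type(["subsidence_pred", "GWL"]))      # deterministic
print(sketch_detect_type(["sample_idx", "coord_x"]))       # unknown
```

Checking the quantile pattern first matches the description: a frame is only considered deterministic when no quantile columns are found.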

Parameters:
  • df (pd.DataFrame) – The DataFrame to inspect.

  • value_prefixes (list of str, optional) – A list of value prefixes (e.g., [‘subsidence’, ‘GWL’]) to focus the search on. If None, prefixes are inferred from column names.

Returns:

One of ‘quantile’, ‘deterministic’, or ‘unknown’.

Return type:

str

Examples

>>> import pandas as pd
>>> from geoprior.utils.forecast_utils import detect_forecast_type
>>> # Long format quantile
>>> df_quant_long = pd.DataFrame(columns=['subsidence_q50', 'GWL_q90'])
>>> detect_forecast_type(df_quant_long)
'quantile'
>>> # Wide format quantile
>>> df_quant_wide = pd.DataFrame(columns=['subsidence_2022_q50'])
>>> detect_forecast_type(df_quant_wide)
'quantile'
>>> # Deterministic forecast
>>> df_determ = pd.DataFrame(columns=['subsidence_pred', 'GWL'])
>>> detect_forecast_type(df_determ)
'deterministic'
geoprior.utils.forecast_utils.format_forecast_dataframe(df, to_wide=True, time_col='coord_t', spatial_cols=('coord_x', 'coord_y'), value_prefixes=None, _logger=None, **pivot_kwargs)

Auto-detects DataFrame format and conditionally pivots to wide format.

This function serves as a smart wrapper. It first determines if the input DataFrame is in a ‘long’ or ‘wide’ forecast format based on its column structure. If to_wide is True and the format is ‘long’, it calls pivot_forecast_dataframe() to perform the transformation.

Parameters:
  • df (pd.DataFrame) – The input DataFrame to check and potentially transform.

  • to_wide (bool, default True) –

    • If True, the function’s goal is to return a wide-format DataFrame. It will pivot a long-format frame or return a wide-format frame as is.

    • If False, the function only performs detection and returns a string (‘wide’, ‘long’, or ‘unknown’).

  • time_col (str, default 'coord_t') – The name of the column that indicates the time step. Its presence is a primary indicator of a long-format DataFrame.

  • value_prefixes (list of str, optional) – A list of prefixes for the value columns (e.g., [‘subsidence’, ‘GWL’]). If None, the function will attempt to infer them from column names that do not match common ID columns.

  • **pivot_kwargs – Additional keyword arguments to pass down to the pivot_forecast_dataframe() function if it is called. Common arguments include id_vars, static_actuals_cols, verbose, etc.

  • spatial_cols (tuple[str])

  • _logger (Logger | Callable[[str], None] | None)

Returns:

  • If to_wide is True, returns the (potentially pivoted) wide-format pd.DataFrame.

  • If to_wide is False, returns a string: ‘wide’, ‘long’, or ‘unknown’.

Return type:

pd.DataFrame or str
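
A rough sketch of the long/wide detection step, assuming column names like those used in the examples on this page. The helper sketch_detect_layout is hypothetical. The 'long' rule follows the time_col description above; the 'wide' rule (a 4-digit time token right after the value prefix) is an assumption made for illustration and may differ from what the library checks.

```python
import re

# Assumed wide-format pattern: '<prefix>_<4 digits>' optionally followed by a
# suffix, e.g. 'subsidence_2022_q50'. This rule is a guess, not library code.
_WIDE_RE = re.compile(r"^[A-Za-z]\w*?_\d{4}(_|$)")


def sketch_detect_layout(columns, time_col="coord_t"):
    """Return 'long', 'wide', or 'unknown' from column names only."""
    cols = [str(c) for c in columns]
    if time_col in cols:
        # Presence of the time column is the primary long-format indicator.
        return "long"
    if any(_WIDE_RE.match(c) for c in cols):
        return "wide"
    return "unknown"


print(sketch_detect_layout(["sample_idx", "coord_t", "subsidence_q50"]))   # long
print(sketch_detect_layout(["sample_idx", "subsidence_2022_q50"]))         # wide
print(sketch_detect_layout(["sample_idx", "coord_x"]))                     # unknown
```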

See also

pivot_forecast_dataframe

The underlying function that performs the pivot operation.

Examples

>>> # df_long is a typical long-format forecast output
>>> df_long.columns
Index(['sample_idx', 'forecast_step', 'coord_t', 'coord_x', ...])
>>> # Detect format
>>> format_str = format_forecast_dataframe(df_long, to_wide=False)
>>> print(format_str)
long
>>>
>>> # Convert to wide format
>>> df_wide = format_forecast_dataframe(
...     df_long,
...     to_wide=True,
...     id_vars=['sample_idx', 'coord_x', 'coord_y'],
...     value_prefixes=['subsidence', 'GWL'],
...     static_actuals_cols=['subsidence_actual']
... )
>>> # print(df_wide.columns)
# Index(['sample_idx', 'coord_x', 'coord_y', 'subsidence_actual',
#        'GWL_2018_q50', ...], dtype='object')
geoprior.utils.forecast_utils.get_value_prefixes(df, exclude_cols=None, spatial_cols=('coord_x', 'coord_y'), time_col='coord_t')

Automatically detects the prefixes of value columns from a DataFrame.

This utility inspects the column names to infer the base names of the metrics being forecasted (e.g., ‘subsidence’, ‘GWL’), excluding common ID and coordinate columns. It works with both long and wide format forecast DataFrames.

Parameters:
  • df (pd.DataFrame) – The DataFrame from which to detect value prefixes.

  • exclude_cols (list of str, optional) – A list of columns to explicitly ignore during detection. If None, a default list of common ID/coordinate columns is used (e.g., ‘sample_idx’, ‘coord_x’, ‘coord_t’, etc.).

  • spatial_cols (tuple[str, str])

  • time_col (str)

Returns:

A sorted list of unique prefixes found in the column names.

Return type:

list of str
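
One simple way to picture the prefix inference is to drop the ID and coordinate columns and keep the text before the first underscore of each remaining name. This is only a sketch under that assumption: the helper sketch_value_prefixes and the DEFAULT_ID_COLS set are hypothetical, and the library may use a different rule (for example, a multi-part prefix such as 'subsidence_cum' would be cut short here).

```python
# Assumed default exclusion list for this sketch; the real default is only
# described above as "common ID/coordinate columns".
DEFAULT_ID_COLS = {"sample_idx", "forecast_step", "coord_t", "coord_x", "coord_y"}


def sketch_value_prefixes(columns, exclude_cols=None):
    """Return the sorted, unique text before the first '_' of value columns."""
    excluded = set(exclude_cols) if exclude_cols is not None else DEFAULT_ID_COLS
    prefixes = {str(c).split("_", 1)[0] for c in columns if c not in excluded}
    return sorted(prefixes)


print(sketch_value_prefixes(["sample_idx", "coord_t", "subsidence_q50", "GWL_q50"]))
# ['GWL', 'subsidence']
```

The uppercase 'GWL' sorts before 'subsidence' because Python's default string ordering places uppercase letters first.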

Examples

>>> import pandas as pd
>>> from geoprior.utils.forecast_utils import get_value_prefixes
>>> # For a long-format DataFrame
>>> long_cols = ['sample_idx', 'coord_t', 'subsidence_q50', 'GWL_q50']
>>> df_long = pd.DataFrame(columns=long_cols)
>>> get_value_prefixes(df_long)
['GWL', 'subsidence']
>>> # For a wide-format DataFrame
>>> wide_cols = ['sample_idx', 'coord_x', 'subsidence_2022_q90', 'GWL_2022_q50']
>>> df_wide = pd.DataFrame(columns=wide_cols)
>>> get_value_prefixes(df_wide)
['GWL', 'subsidence']
geoprior.utils.forecast_utils.get_value_prefixes_in(df, exclude_cols=None)

Automatically detects the prefixes of value columns from a DataFrame. (This is a dependency for the function below)

Parameters:
  • df

  • exclude_cols

Return type:

list[str]

geoprior.utils.forecast_utils.pivot_forecast_dataframe(data, id_vars, time_col, value_prefixes, static_actuals_cols=None, time_col_is_float_year='auto', round_time_col=False, verbose=0, savefile=None, _logger=None, **kws)

Transforms a long-format forecast DataFrame to a wide format.

This utility reshapes time series prediction data from a “long” format, where each row represents a single time step for a given sample, to a “wide” format, where each row represents a single sample and columns correspond to values at different time steps.

Parameters:
  • data (pd.DataFrame) – The input long-format DataFrame. It must contain the columns specified in id_vars and time_col, as well as value columns that start with the strings in value_prefixes.

  • id_vars (list of str) – A list of column names that uniquely identify each sample or group. These columns will be preserved in the wide-format output. For example: ['sample_idx', 'coord_x', 'coord_y'].

  • time_col (str) – The name of the column that represents the time step or year of the forecast (e.g., ‘coord_t’ or ‘forecast_step’). This column’s values will become part of the new column names.

  • value_prefixes (list of str) – A list of prefixes for the value columns that need to be pivoted. The function identifies columns starting with these prefixes. For instance, ['subsidence', 'GWL'] would match ‘subsidence_q10’, ‘GWL_q50’, etc.

  • static_actuals_cols (list of str, optional) – A list of columns containing static “actual” or ground truth values for each sample. These values are assumed to be constant for each unique sample_idx and are merged back into the wide DataFrame after pivoting. Example: ['subsidence_actual'].

  • time_col_is_float_year (bool or 'auto', default 'auto') –

    Controls how the time_col values are formatted into new column names.

    • If 'auto', automatically detects if time_col has a float dtype.

    • If True, treats time_col values (e.g., 2018.0) as years and converts them to integer strings (‘2018’).

    • If False, uses the string representation of the value as is.

  • round_time_col (bool, default False) – If True and time_col is a float type, its values will be rounded to the nearest integer before being used in column names. This is useful for cleaning up float years (e.g., 2018.0001 -> 2018).

  • verbose (int, default 0) – Controls the verbosity of logging messages. 0 is silent. Higher values print more details about the process.

  • savefile (str, optional) – If a file path is provided, the final wide-format DataFrame will be saved as a CSV file to that location.

  • _logger (Logger | Callable[[str], None] | None)

Returns:

A wide-format DataFrame with one row per unique combination of id_vars. New columns are created in the format {prefix}_{time_str}{_suffix} (e.g., ‘subsidence_2018_q10’).

Return type:

pd.DataFrame
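
The naming pattern {prefix}_{time_str}{_suffix} can be illustrated for a single column. This is a sketch, not the library code: sketch_wide_name is a hypothetical helper, and it truncates float years with int() rather than reproducing the round_time_col option.

```python
def sketch_wide_name(column, time_value, prefixes, float_year=True):
    """Build '{prefix}_{time_str}{_suffix}' for one long-format value column."""
    # Find the value prefix this column starts with (raises if none matches).
    prefix = next(p for p in prefixes if column.startswith(p))
    suffix = column[len(prefix):]           # e.g. '_q10', or '' when absent
    if float_year and isinstance(time_value, float):
        time_str = str(int(time_value))     # 2018.0 -> '2018'
    else:
        time_str = str(time_value)          # used as is
    return f"{prefix}_{time_str}{suffix}"


print(sketch_wide_name("subsidence_q10", 2018.0, ["subsidence", "GWL"]))
# subsidence_2018_q10
print(sketch_wide_name("GWL_q50", 2019.0, ["subsidence", "GWL"], float_year=False))
# GWL_2019.0_q50
```

The second call shows why float years are usually converted: without the conversion, the decimal part ends up in the column name.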

See also

pandas.pivot_table

The core function used for reshaping data.

pandas.merge

Used to re-join static columns after pivoting.

Notes

  • The combination of columns in id_vars and time_col must uniquely identify each row in data for the pivot to succeed without data loss.

  • If using static_actuals_cols, the id_vars list must contain ‘sample_idx’ to correctly merge the static data back.

Examples

>>> import pandas as pd
>>> from geoprior.utils.forecast_utils import pivot_forecast_dataframe
>>> data = {
...     'sample_idx':      [0, 0, 1, 1],
...     'coord_t':         [2018.0, 2019.0, 2018.0, 2019.0],
...     'coord_x':         [0.1, 0.1, 0.5, 0.5],
...     'coord_y':         [0.2, 0.2, 0.6, 0.6],
...     'subsidence_q50':  [-8, -9, -13, -14],
...     'subsidence_actual': [-8.5, -8.5, -13.2, -13.2],
...     'GWL_q50':         [1.2, 1.3, 2.2, 2.3],
... }
>>> df_long_example = pd.DataFrame(data)
>>> df_wide = pivot_forecast_dataframe(
...     data=df_long_example,
...     id_vars=['sample_idx', 'coord_x', 'coord_y'],
...     time_col='coord_t',
...     value_prefixes=['subsidence', 'GWL'],
...     static_actuals_cols=['subsidence_actual'],
...     verbose=0
... )
>>> print(df_wide.columns)
Index(['sample_idx', 'coord_x', 'coord_y', 'subsidence_actual',
       'GWL_2018_q50', 'GWL_2019_q50', 'subsidence_2018_q50',
       'subsidence_2019_q50'],
      dtype='object')
geoprior.utils.forecast_utils.get_step_names(forecast_steps, step_names=None, default_name='')

Build a step → label mapping for multi‑horizon plots.

The helper reconciles an integer list forecast_steps with an optional alias container (dict or sequence) and returns a dictionary whose keys are the integer steps and whose values are human‑readable labels.

Matching is case‑insensitive and tolerant to common delimiters—e.g. "Step 1", "step‑1", or "forecast step 1" will all map to integer step 1.

Parameters:
  • forecast_steps (Iterable[int]) – Ordered steps, e.g. [1, 2, 3].

  • step_names (dict | list | tuple | None, default None) –

    Custom labels. Accepted forms

    • dict – keys may be int or any string representation of the step.

    • sequence – positional, where the k‑th element labels step k+1.

    • None – no custom mapping.

  • default_name (str, default "") – Fallback label for steps missing from step_names. If empty, the step number itself is used (as a string).

Returns:

Mapping {step : label} for every element of forecast_steps.

Return type:

dict[int, str]

Notes

  • Dictionary keys are normalised with int(re.sub(r"[^0-9]", "", str(key))) before matching.

  • Duplicate keys in step_names are resolved by last‐one wins semantics.
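
The key normalisation and fallback behaviour described above can be sketched as follows. The helper sketch_step_names is hypothetical and only reproduces the documented rules (digit extraction, positional sequences, last-one-wins, and default_name fallback); skipping keys that contain no digits is an assumption of this sketch.

```python
import re


def sketch_step_names(forecast_steps, step_names=None, default_name=""):
    """Map each integer step to a label, following the rules in the Notes."""
    lookup = {}
    if isinstance(step_names, dict):
        for key, label in step_names.items():
            digits = re.sub(r"[^0-9]", "", str(key))
            if digits:  # assumption: keys without digits are ignored
                # Later keys overwrite earlier ones (last-one-wins).
                lookup[int(digits)] = str(label)
    elif step_names is not None:
        # Sequence form: the element at 0-based position k labels step k + 1.
        lookup = {k + 1: str(label) for k, label in enumerate(step_names)}
    # Fall back to default_name, or to the step number when it is empty.
    return {int(s): lookup.get(int(s), default_name or str(s)) for s in forecast_steps}


print(sketch_step_names([1, 2, 3], {"1": "Year 2021", 2: "2022", "step 3": "2023"}))
# {1: 'Year 2021', 2: '2022', 3: '2023'}
print(sketch_step_names([1, 2, 3, 4], {"1": "2021", "2": "2022"}))
# {1: '2021', 2: '2022', 3: '3', 4: '4'}
```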

Examples

>>> from geoprior.utils.forecast_utils import get_step_names
>>> get_step_names(
...     forecast_steps=[1, 2, 3],
...     step_names={"1": "Year 2021", 2: "2022", "step 3": "2023"},
... )
{1: 'Year 2021', 2: '2022', 3: '2023'}
>>> get_step_names(
...     forecast_steps=[1, 2, 3, 4],
...     step_names={"1": "2021", "2": "2022"},
... )
{1: '2021', 2: '2022', 3: '3', 4: '4'}
>>> get_step_names(
...     [1, 2, 3, 4],
...     step_names=None,
...     default_name="step with no name",
... )
{1: 'step with no name', 2: 'step with no name',
 3: 'step with no name', 4: 'step with no name'}
geoprior.utils.forecast_utils.stack_quantile_predictions(q_lower, q_median, q_upper)

Stack three quantile trajectories into a single y_pred array of shape (n_samples, 3, n_timesteps), ready for PSS.

Parameters:
  • q_lower (array-like) – Lower-quantile trajectory. Either 1D with shape (n_timesteps,), interpreted as a single sample, or 2D with shape (n_samples, n_timesteps).

  • q_median (array-like) – Median trajectory, with the same accepted shapes as q_lower.

  • q_upper (array-like) – Upper-quantile trajectory, with the same accepted shapes as q_lower.

Returns:

y_pred – Where axis=1 indexes [lower, median, upper].

Return type:

np.ndarray, shape (n_samples, 3, n_timesteps)

Raises:

ValueError – If the three inputs (after promotion) do not share the same shape.
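
The promotion and stacking steps can be sketched with NumPy. This is an illustration of the documented behaviour, not the library code; sketch_stack_quantiles is a hypothetical name.

```python
import numpy as np


def sketch_stack_quantiles(q_lower, q_median, q_upper):
    """Stack three trajectories into shape (n_samples, 3, n_timesteps)."""
    # Promote 1D inputs (n_timesteps,) to a single-sample array (1, n_timesteps).
    arrays = [
        np.atleast_2d(np.asarray(q, dtype=float))
        for q in (q_lower, q_median, q_upper)
    ]
    # All three inputs must share one shape after promotion.
    if len({a.shape for a in arrays}) != 1:
        raise ValueError(f"shape mismatch: {[a.shape for a in arrays]}")
    # axis=1 indexes [lower, median, upper].
    return np.stack(arrays, axis=1)


y_pred = sketch_stack_quantiles([1.0, 2.0], [1.5, 2.5], [2.0, 3.0])
print(y_pred.shape)  # (1, 3, 2)
```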

geoprior.utils.forecast_utils.adjust_time_predictions(df, time_col, forecast_horizon, coord_scaler=None, inverse_transformed=False, verbose=1)

Adjusts time predictions by adding the forecast horizon to the inverse-normalized time. If the time column has already been inverse-transformed, the inverse transformation is skipped.

Parameters:
  • df (pd.DataFrame) – The DataFrame containing the time predictions (inverse scaled). The time column specified by time_col should contain the time values that need to be adjusted.

  • time_col (str) – The name of the time column in the DataFrame. This column will be adjusted by adding the forecast horizon.

  • forecast_horizon (int) – The forecast horizon (e.g., number of years or time steps) that will be added to the time predictions. This value shifts the time predictions forward.

  • coord_scaler (MinMaxScaler, optional) – The scaler that was used for the coordinates. It is necessary to reverse the scaling for the time column if it was previously normalized. If not provided, the time column should already be inverse-transformed.

  • inverse_transformed (bool, default False) – If True, skips the inverse transformation of the time column and directly adds the forecast horizon. This is useful when the time column has already been inverse-transformed, and you only need to adjust the time by the forecast horizon.

  • verbose (int, default 1) – Verbosity level for logging. Higher values (e.g., verbose=2) provide more detailed information about the operation.

Returns:

The adjusted DataFrame with the time column updated to reflect the forecast horizon. The time predictions are adjusted by adding the forecast_horizon to each entry in the time column.

Return type:

pd.DataFrame

Raises:

ValueError – If the time column is not found in the DataFrame or if the scaler is not available when necessary.

Examples

>>> import pandas as pd
>>> from sklearn.preprocessing import MinMaxScaler
>>> from geoprior.utils.forecast_utils import adjust_time_predictions
>>> # Sample data for illustration
>>> df = pd.DataFrame({
...     'year': [0.0, 0.5, 1.0],
...     'subsidence': [0.1, 0.2, 0.3]
... })
>>> scaler = MinMaxScaler()
>>> df_scaled = df.copy()
>>> df_scaled['year'] = scaler.fit_transform(df_scaled[['year']])
>>> adjusted_df = adjust_time_predictions(
...     df_scaled,
...     time_col='year',
...     forecast_horizon=4,
...     coord_scaler=scaler,
...     inverse_transformed=False,
...     verbose=2
... )
>>> adjusted_df['year']  # inverse-transformed, then shifted forward by forecast_horizon

Notes

  • The time column must be in a normalized scale if not already inverse-transformed.

  • If inverse_transformed=True, the time values will directly be adjusted by the forecast_horizon without applying the inverse transformation.

  • The forecast horizon is added directly to the time values after the necessary inverse transformation (if applicable).
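
The arithmetic behind these notes can be written out for a single time column. The sketch below assumes the time values were Min-Max scaled to [0, 1] with known minimum and maximum; sketch_adjust_time, data_min, and data_max are names introduced here for illustration, whereas the real function takes a fitted scaler object.

```python
def sketch_adjust_time(times, data_min, data_max, forecast_horizon,
                       inverse_transformed=False):
    """Undo Min-Max scaling (unless already done), then add the horizon."""
    if inverse_transformed:
        # Values are already in physical units; skip the inverse transform.
        physical = list(times)
    else:
        # Inverse of Min-Max scaling to [0, 1]: t = t_scaled * (max - min) + min.
        physical = [t * (data_max - data_min) + data_min for t in times]
    return [t + forecast_horizon for t in physical]


print(sketch_adjust_time([0.0, 0.5, 1.0], 2018, 2022, forecast_horizon=4))
# [2022.0, 2024.0, 2026.0]
print(sketch_adjust_time([2018, 2020], 2018, 2022, 4, inverse_transformed=True))
# [2022, 2024]
```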

See also

sklearn.preprocessing.MinMaxScaler

Scales features to [0,1].

geoprior.utils.forecast_utils.add_forecast_times(df, *, forecast_times=None, start=None, freq='YS', step_col='forecast_step', time_col='coord_t', error='raise', inplace=False, savefile=None, verbose=0)

Map each 1‑based forecast_step into an explicit calendar time.

You may either:
  1. Pass forecast_times of length H (one per step), or

  2. Pass a single start plus a pandas‐style freq to generate H dates.

If any entry in forecast_times is an integer of exactly 4 digits, it will be interpreted as January 1 of that year.
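
The step-to-time mapping for the explicit forecast_times case can be sketched with the standard library. The helpers sketch_to_date and sketch_map_steps are hypothetical; the sketch handles only 4-digit integers, date objects, and ISO 'YYYY-MM-DD' strings, and it implements only the 'raise' policy for missing times.

```python
from datetime import date


def sketch_to_date(value):
    """Convert one forecast_times entry to a date (limited input types)."""
    if isinstance(value, int) and 1000 <= value <= 9999:
        # A 4-digit integer is read as January 1 of that year.
        return date(value, 1, 1)
    if isinstance(value, date):
        return value
    return date.fromisoformat(str(value))


def sketch_map_steps(steps, forecast_times):
    """Map 1-based forecast steps to calendar dates."""
    times = [sketch_to_date(v) for v in forecast_times]
    if max(steps) > len(times):
        raise ValueError("not enough forecast times for the largest step")
    # Step k (1-based) takes the k-th provided time.
    return [times[s - 1] for s in steps]


print(sketch_map_steps([1, 2, 3, 1, 2, 3], [2022, 2023, 2024])[:3])
# [datetime.date(2022, 1, 1), datetime.date(2023, 1, 1), datetime.date(2024, 1, 1)]
```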

Parameters:
  • df (pd.DataFrame) – Long‐format forecast table. Must contain an integer column step_col with values 1..H.

  • forecast_times (sequence, optional) –

    Explicit sequence of length H specifying the target times. Each entry may be:

    • int (interpreted as January 1 of that year)

    • str / pd.Timestamp / datetime.date

  • start (int or str or date or Timestamp, optional) – Only used if forecast_times is None. The first time in the sequence; subsequent times will be generated via pd.date_range. If int, treated as a year at Jan 1.

  • freq (str, default "YS") – Pandas offset alias for frequency (e.g. “YS”=year start, “MS”=month start, “D”=day, etc.). Only used when start is set.

  • step_col (str, default "forecast_step") – Name of the 1‑based step index in df.

  • time_col (str, default "coord_t") – Name of the new column to create with mapped times.

  • error ({'raise','warn','ignore'}, default 'raise') –

    Policy if df[step_col].max() > number of provided times:

    • ‘raise’: raise ValueError

    • ‘warn’: issue a warning, then still map what you can (truncate)

    • ‘ignore’: silently truncate to available times

  • inplace (bool, default False) – If True, modify df in place; otherwise return a new DataFrame.

  • savefile (str, optional) – If provided, path to CSV where the resulting DataFrame will be saved.

  • verbose (int, default 0) – Passed to vlog for debug logging.

Returns:

DataFrame with an added column time_col of dtype datetime64.

Return type:

pd.DataFrame

Raises:

ValueError – If neither forecast_times nor start is provided, or if error=’raise’ and there aren’t enough times.

Examples

>>> import pandas as pd
>>> from geoprior.utils.forecast_utils import add_forecast_times
>>> df = pd.DataFrame({
...     "sample_idx": [0]*3 + [1]*3,
...     "forecast_step": [1,2,3]*2
... })
>>> add_forecast_times(df,
...     forecast_times=[2022,2023,2024])
   sample_idx  forecast_step     coord_t
0           0              1  2022-01-01
1           0              2  2023-01-01
2           0              3  2024-01-01
3           1              1  2022-01-01
4           1              2  2023-01-01
5           1              3  2024-01-01
>>> # Or generate from a start + yearly freq:
>>> add_forecast_times(df, start="2022-06-15", freq="YS")
geoprior.utils.forecast_utils.pivot_forecast(df, *, index_col='sample_idx', pivot_col=None, step_col='forecast_step', time_col='coord_t', value_cols=None, spatial_cols=None, aggfunc='first', fill_value=nan, sep='_', time_formatter=<function <lambda>>, inplace=False, savefile=None, verbose=0)

Pivot a long-format forecast DataFrame into a wide one.

This will take rows identified by index_col + a step_col (or datetime time_col) and spread each forecast step/time into its own set of columns for each value in value_cols, then re-attach the spatial_cols.

Parameters:
  • df (DataFrame) – Long-format forecasts. Must include index_col and at least one of step_col or time_col.

  • index_col (str) – Column that identifies each sample (e.g. “sample_idx”).

  • pivot_col (str | None) – If provided, pivot on this column instead of auto-detecting. Must be either step_col or time_col.

  • step_col (str) – Name of the integer 1‑based forecast step column.

  • time_col (str) – Name of the datetime column (e.g. “coord_t”).

  • value_cols (str | Sequence[str] | None) – Which forecast columns to pivot (e.g. “subsidence_q50” or [“subsidence_q10”,”subsidence_q50”,”subsidence_q90”]). If None, will auto-pick all numeric columns except index/pivot/spatial.

  • spatial_cols (Sequence[str] | None) – List of columns holding static spatial info (e.g. [“longitude”,”latitude”]) to join back once pivoted.

  • aggfunc (str | Callable) – Aggregation function for pivot (default “first”).

  • fill_value (Any) – What to put where a sample/step is missing (default NaN).

  • sep (str) – Separator between value name and step/time in the new column names (default “_”).

  • time_formatter (Callable[[Any], str]) – How to turn a datetime/timestamp into a string for column names (default “%Y-%m-%d”).

  • inplace (bool) – If True, modifies df instead of copying.

  • savefile (str | None) – If given, writes the resulting wide DataFrame to CSV at this path.

  • verbose (int) – Passed to vlog for logging.

Returns:

Wide-format DataFrame with one row per index_col and columns like <value><sep><step> or <value><sep><formatted time>.

Return type:

pd.DataFrame
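
The column naming <value><sep><formatted time> and the 'first' aggregation can be pictured with plain dictionaries. This is a conceptual sketch rather than the library's pandas-based code: sketch_pivot is a hypothetical helper, it operates on a list of row dicts, and it leaves missing cells absent instead of applying fill_value.

```python
from datetime import date


def sketch_pivot(rows, index_col, pivot_col, value_cols, sep="_", time_formatter=str):
    """Spread long-format rows into one dict per index value."""
    wide = {}
    for row in rows:
        key = row[index_col]
        out = wide.setdefault(key, {index_col: key})
        label = time_formatter(row[pivot_col])
        for col in value_cols:
            # Analogue of aggfunc='first': keep the first value seen per cell.
            out.setdefault(f"{col}{sep}{label}", row[col])
    return list(wide.values())


long_rows = [
    {"sample_idx": 0, "coord_t": date(2022, 1, 1), "subsidence_q50": -8.0},
    {"sample_idx": 0, "coord_t": date(2023, 1, 1), "subsidence_q50": -9.0},
    {"sample_idx": 1, "coord_t": date(2022, 1, 1), "subsidence_q50": -13.0},
]
wide_rows = sketch_pivot(
    long_rows, "sample_idx", "coord_t", ["subsidence_q50"],
    time_formatter=lambda t: f"{t.year}",
)
print(wide_rows[0])
# {'sample_idx': 0, 'subsidence_q50_2022': -8.0, 'subsidence_q50_2023': -9.0}
```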

Example

>>> from geoprior.utils.forecast_utils import pivot_forecast
>>> dff = pivot_forecast(
...    df_,
...    index_col="sample_idx",
...    pivot_col="coord_t",                # ← force pivot on the datetime
...    value_cols=["subsidence_q10","subsidence_q50","subsidence_q90"],
...    spatial_cols=["longitude","latitude"],
...    sep="_",                            # you’ll get subsidence_q50_2022 etc.
...    time_formatter=lambda t: f"{t.year}",
...    verbose=1
... )
[INFO] Pivoting on 'coord_t' for values ['subsidence_q10', 'subsidence_q50', 'subsidence_q90']
[INFO] Joining back spatial cols ['longitude', 'latitude']
>>> dff.columns
Index(['sample_idx', 'subsidence_q10_2022', 'subsidence_q10_2023',
       'subsidence_q10_2024', 'subsidence_q50_2022', 'subsidence_q50_2023',
       'subsidence_q50_2024', 'subsidence_q90_2022', 'subsidence_q90_2023',
       'subsidence_q90_2024', 'longitude', 'latitude'],
      dtype='object')
geoprior.utils.forecast_utils.plot_reliability_diagram(models_data, y_true=None, prefix='subsidence', figsize=(8, 8), title='Reliability Diagram', plot_style='seaborn-whitegrid', verbose=None, _logger=None)

Plot a reliability diagram for one or multiple models.

Parameters:
  • models_data (dict) – Mapping of model names to forecast data. Each value can be a pandas.DataFrame or a nested dict with keys ‘forecasts’, ‘color’, ‘marker’, and ‘style’.

  • y_true (pandas.Series, optional) – Observed values for empirical coverage calculations. Required when forecasts need processing.

  • prefix (str, default 'subsidence') – Column prefix for quantile forecast fields.

  • figsize (tuple of int, default (8, 8)) – Figure size (width, height) in inches.

  • title (str, default 'Reliability Diagram') – Text title displayed at the top of the plot.

  • plot_style (str, default 'seaborn-whitegrid') – Matplotlib style sheet name to apply.

  • verbose (int, optional) – Verbosity level passed to geoprior.utils.generic_utils.vlog.

  • _logger (Logger or callable, optional) – Function or logger instance for internal messages.

Returns:

Displays the calibration plot and returns nothing.

Return type:

None

Notes

This function draws a diagonal baseline (perfect calibration) and computes empirical coverage for probabilistic intervals using specified quantiles. It wraps simple DataFrame inputs into the required nested format and uses vlog for conditional logging.
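
As a point of reference for what "empirical coverage" means here, the sketch below computes the fraction of observations that fall inside a lower/upper quantile band. It is an assumption-laden illustration: sketch_interval_coverage is a hypothetical helper, and the exact coverage definition used by the plotting function is not shown on this page.

```python
def sketch_interval_coverage(y_true, lower, upper):
    """Fraction of observations with lower <= y <= upper."""
    hits = sum(lo <= y <= hi for y, lo, hi in zip(y_true, lower, upper))
    return hits / len(y_true)


y_obs = [0.0, 1.0, 2.0, 3.0]
q10 = [-0.5, 0.5, 2.5, 2.5]
q90 = [0.5, 1.5, 3.5, 3.5]
print(sketch_interval_coverage(y_obs, q10, q90))  # 0.75
```

For a q10 to q90 band the nominal coverage is 0.8, so a well-calibrated model would plot close to the diagonal at that point.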

Examples

>>> import pandas as pd
>>> import numpy as np
>>> from geoprior.utils.forecast_utils import plot_reliability_diagram
>>> # Create dummy true time series
>>> dates = pd.date_range('2020-01-01', periods=100)
>>> y_true = pd.Series(
...     np.random.randn(100), index=dates
... )
>>> # Create forecasts for ModelA
>>> dfA = pd.DataFrame({
...     'subsidence_q10': y_true - 0.5,
...     'subsidence_q90': y_true + 0.5
... }, index=dates)
>>> # Simple usage with one model
>>> plot_reliability_diagram(
...     models_data={'ModelA': dfA},
...     y_true=y_true,
...     verbose=2
... )
>>> # Create forecasts for ModelB
... # with custom styling
>>> dfB = pd.DataFrame({
...     'subsidence_q10': y_true - 1.0,
...     'subsidence_q90': y_true + 1.0
... }, index=dates)
>>> custom_logger = print
>>> # Custom styling and logger
>>> plot_reliability_diagram(
...     models_data={
...         'ModelA': {
...             'forecasts': dfA,
...             'color': 'C0',
...             'marker': 'x'
...         },
...         'ModelB': {
...             'forecasts': dfB,
...             'color': 'C1',
...             'marker': 'o'
...         }
...     },
...     y_true=y_true,
...     verbose=4,
...     _logger=custom_logger
... )
geoprior.utils.forecast_utils.format_and_forecast(y_pred, y_true, *, coords=None, quantiles=None, target_name='subsidence', output_target_name=None, scaler_target_name=None, target_key_pred='subs_pred', component_index=0, scaler_info=None, coord_scaler=None, coord_columns=('coord_t', 'coord_x', 'coord_y'), train_end_time=None, forecast_start_time=None, forecast_horizon=None, future_time_grid=None, eval_forecast_step=None, eval_export='all', value_mode='rate', input_value_mode='rate', rate_first='cum_over_dtref', absolute_baseline=None, sample_index_offset=0, city_name=None, model_name=None, dataset_name=None, csv_eval_path=None, csv_future_path=None, time_as_datetime=False, time_format=None, calibration=False, calibration_kwargs=None, calibration_save_stats=None, eval_metrics=False, metrics_column_map=None, metrics_quantile_interval=(0.1, 0.9), metrics_per_horizon=False, metrics_extra=None, metrics_extra_kwargs=None, metrics_savefile=None, metrics_save_format='.json', metrics_time_as_str=True, output_unit=None, output_unit_from='m', output_unit_mode='overwrite', output_unit_suffix='_mm', output_unit_col=None, verbose=1, logger=None, **kws)

Format PINN forecasts into evaluation and future DataFrames.

This helper takes the raw model outputs (already split into y_pred['subs_pred'] / y_pred['gwl_pred']), the matching ground-truth dictionary (y_true), and optional coordinate and scaler information, and returns two DataFrames:

  • df_eval: predictions + actuals for an evaluation year (typically the last training year, e.g. 2022).

  • df_future: predictions for the future horizon (e.g. 2023–2025), without actuals.

Parameters:
  • y_pred (dict) –

    Dictionary of model predictions, as returned by GeoPriorSubsNet.predict post-processed into {'subs_pred': ..., 'gwl_pred': ...}.

    For subsidence, the expected shapes are:

    • Quantile mode: (B, H, Q, O) where: B = batch size, H = horizon steps, Q = number of quantiles, O = output dim.

    • Point mode: (B, H, O).

  • y_true (dict or None) –

    Dictionary of true targets, typically

    {'subsidence': ..., 'gwl': ...} or {'subs_pred': ..., 'gwl_pred': ...}.

    If None, evaluation DataFrame is still created but without the actual-value column.

  • coords (ndarray, optional) – Optional coordinates array aligned with predictions. Commonly shaped (B, H, 3) with columns [t_scaled, x_scaled, y_scaled]. Only x and y are used when inverse-transforming spatial coordinates; time is overwritten by the provided temporal config if given.

  • quantiles (list of float or None, optional) – List of quantiles (e.g. [0.1, 0.5, 0.9]) if the model was trained in probabilistic mode. If None, a single prediction column is emitted instead.

  • target_name (str, default 'subsidence') –

    Logical target identifier used as the default key for locating the target scaler in scaler_info and as a fallback for resolving truth arrays in y_true.

    Column naming is controlled by output_target_name (or the auto-derived output prefix when it is None).

  • output_target_name (str or None, optional) –

    Output prefix used when creating DataFrame columns for predictions and actuals.

    This controls the column naming only (e.g. the function will emit f"{output_target_name}_q10", f"{output_target_name}_pred", and f"{output_target_name}_actual").

    If None (default), the function derives the output prefix from target_name and applies a small convenience rule: if target_name ends with "_cum" or "_cumulative", that suffix is stripped for output naming.

    This keeps downstream tooling consistent (many plotting and metrics utilities expect names like subsidence_q10 rather than subsidence_cum_q10), while still allowing the scaler lookup to use the true target key. For example, with target_name="subsidence_cum" and output_target_name=None, output columns become subsidence_q10, subsidence_q50, and subsidence_actual. If output_target_name="subsidence_cum", the output columns keep the suffix such as subsidence_cum_q10.
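
The auto-strip rule for the output prefix can be sketched in a few lines. The helper sketch_output_prefix is hypothetical and only mirrors the rule as described in this entry.

```python
def sketch_output_prefix(target_name, output_target_name=None):
    """Derive the column prefix used for predictions and actuals."""
    if output_target_name is not None:
        # An explicit prefix is used unchanged.
        return output_target_name
    # Otherwise strip a trailing '_cumulative' or '_cum' from target_name.
    for suffix in ("_cumulative", "_cum"):
        if target_name.endswith(suffix):
            return target_name[: -len(suffix)]
    return target_name


prefix = sketch_output_prefix("subsidence_cum")
print(f"{prefix}_q10", f"{prefix}_actual")  # subsidence_q10 subsidence_actual
print(sketch_output_prefix("subsidence_cum", "subsidence_cum"))  # subsidence_cum
```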

  • scaler_target_name (str or None, optional) –

    Name used to locate the target scaling block inside scaler_info and to perform inverse-transform for predictions and actuals.

    This controls the scaler key and inverse scaling, not the output column naming.

    If None (default), the scaler key is assumed to be target_name. This is important when you want clean output columns but the scaler was fitted/stored under the original target name.

    A common pattern is to keep target_name="subsidence_cum" so the scaler lookup matches the Stage-1 schema, while letting output_target_name=None produce clean output columns. In that setup, inverse transform still uses the subsidence_cum scaler key, while output columns use the subsidence_ prefix because of the auto-strip rule.

  • target_key_pred (str, default 'subs_pred') – Key inside y_pred that holds the subsidence forecasts.

  • component_index (int, default 0) – Index along the output dimension O to use when output_subsidence_dim > 1. For scalar subsidence this is 0.

  • scaler_info (dict, optional) – Optional Stage-1 scaler_info mapping containing a target scaler under keys such as 'targets' or 'target'. The target block is expected to provide an sklearn-like transformer under 'scaler' together with column names under 'columns' or 'cols'. If present and consistent, subsidence values (predicted and actual) are inverse-transformed for target_name.

  • coord_scaler (object, optional) – Optional scaler used for coordinates. If provided, it is only used to inverse-transform coord_x and coord_y when coords is given and coord_columns can be matched. Time is not taken from the inverse transform; it is controlled by the temporal config.

  • coord_columns (tuple of str, default ('coord_t', 'coord_x', 'coord_y')) – Logical names of the time, x, and y coordinate columns. These are used for DataFrame column naming and for mapping into coord_scaler if its block carries column names.

  • train_end_time (scalar or str or datetime, optional) – Physical time associated with the evaluation year (e.g. 2022). If eval_forecast_step is not given, the last horizon step is assumed to correspond to this time.

  • forecast_start_time (scalar or str or datetime, optional) – First time in the future forecast horizon (e.g. 2023).

  • forecast_horizon (int, optional) – Number of forecast steps in the future horizon (e.g. 3). If future_time_grid is not given, this is used together with forecast_start_time to build a regular grid.

  • future_time_grid (array-like, optional) – Explicit physical times for each forecast step, length H. For yearly data this might be [2023, 2024, 2025]. If provided, it overrides any automatic construction from forecast_start_time and forecast_horizon.

  • eval_forecast_step (int or None, optional) – Horizon step index (1-based) to use for evaluation. If None, defaults to the last horizon step H.

  • eval_export ({"all", "last"} or str or int or sequence, optional) –

    Controls which evaluation rows are exported in df_eval and written to csv_eval_path. By default ("all"), the function exports the multi-horizon evaluation DataFrame (df_eval_all), which contains one row per sample and forecast step (e.g. years 2020, 2021, 2022 for H=3).

    Accepted values are:

    • "all" or "full" or "horizons" : export all horizons from df_eval_all.

    • "last" or "single" or "default" : export only the single evaluation step specified by eval_forecast_step (backwards-compatible behaviour).

    • Other str (e.g. "2022") : interpreted as a time value for coord_t; only rows of df_eval_all whose time column matches this value are exported.

    • int or other non-string scalar : interpreted as a single time value (e.g. 2022).

    • sequence of values (e.g. [2021, 2022]) : interpreted as a set of time values; only rows whose coord_t belongs to this set are exported.

    If time_as_datetime=True, the selection values are converted with pandas.to_datetime using time_format before filtering. If df_eval_all is not available (e.g. no ground truth was provided), the function falls back to exporting the single-step df_eval regardless of eval_export.

  • value_mode ({"rate", "cumulative", "absolute_cumulative"}, optional) –

    Controls how forecast values are interpreted along the temporal horizon for each sample. The default is "rate", which treats each forecast step as an incremental rate (e.g. annual subsidence rate) and leaves predictions unchanged.

    Supported modes are:

    • "rate" : keep per-step predictions as provided by the model (current behaviour).

    • "cumulative" or "cum" : convert per-step rates into relative cumulative values by applying a cumulative sum over forecast_step for each sample_idx. For example, for years 2023–2025, the value at 2024 is the sum of the 2023 and 2024 rates.

    • "absolute_cumulative" or "abs_cum" or "absolute" : same as "cumulative", then add an absolute baseline provided by absolute_baseline (e.g. cumulative subsidence at the end of the training period), yielding absolute cumulative trajectories.

    Cumulative transforms are applied consistently to:

    • the future forecast DataFrame (df_future),

    • the multi-horizon evaluation DataFrame (df_eval_all),

    • and the single-step evaluation DataFrame (df_eval, which is regenerated from df_eval_all after the transformation).

    When an unsupported string is given, the function logs a warning and falls back to "rate".

  • absolute_baseline (float or Mapping[int, float], optional) –

    Baseline value to use when value_mode requests absolute cumulative outputs ("absolute_cumulative", "abs_cum", "absolute"). This baseline is interpreted as the pre-forecast cumulative level for each sample, for example, cumulative subsidence at train_end_time (e.g. end of 2022), and is added after applying the cumulative sum over the forecast horizon.

    If a scalar float is provided, the same baseline value is added to all samples. If a mapping is provided, it must map sample_idx (integers) to baseline values, allowing per-sample baselines:

    • absolute_baseline = {sample_idx: baseline_value, ...}

    Only prediction columns for target_name are shifted (e.g. "subsidence_q10", "subsidence_q50", "subsidence_q90" or "subsidence_pred"). When df_eval_all is present, the corresponding "<target_name>_actual" column is shifted as well, so evaluation metrics operate on absolute cumulative values.

    If value_mode is an absolute cumulative variant but absolute_baseline is None, the function logs a warning and degrades gracefully to relative cumulative mode (i.e. no baseline shift is applied).

  • sample_index_offset (int, default 0) – Offset added to sample_idx (useful when concatenating multiple tiles).

  • city_name (str, optional) – Optional metadata used only for logging.

  • model_name (str, optional) – Optional metadata used only for logging.

  • dataset_name (str, optional) – Optional metadata used only for logging.

  • csv_eval_path (str, optional) – If provided, df_eval is written to this path (directories are created if needed).

  • csv_future_path (str, optional) – If provided, df_future is written to this path.

  • time_as_datetime (bool, default False) – If True, time values are converted using pandas.to_datetime() with the provided time_format (if any).

  • time_format (str or None, optional) – Optional format string passed to pandas.to_datetime() when time_as_datetime=True.

  • eval_metrics (bool, default False) – If True, automatically call evaluate_forecast() on the resulting df_eval to compute diagnostics. Metrics are not returned by this function; they are either written to disk (if metrics_savefile is provided) or discarded. For programmatic access to the metrics dictionary, call evaluate_forecast() directly.

  • metrics_column_map (mapping, optional) – Optional column mapping forwarded to evaluate_forecast() (see its documentation for details). If None, default column names such as 'coord_t', 'forecast_step', f'{target_name}_q10', and f'{target_name}_actual' are assumed.

  • metrics_quantile_interval (tuple of float, default (0.1, 0.9)) – Interval used for coverage and sharpness diagnostics in quantile mode, forwarded to evaluate_forecast().

  • metrics_per_horizon (bool, default False) – If True, per-horizon MAE/MSE/R² are computed by evaluate_forecast() and included in the diagnostics.

  • metrics_extra (sequence or mapping, optional) –

    Optional additional metrics to compute, forwarded to evaluate_forecast(). Can be:

    • A sequence of metric names (resolved via geoprior.metrics._registry.get_metric).

    • A mapping {name: func} where func is a callable taking (y_true, y_pred, **kwargs).

  • metrics_extra_kwargs (mapping, optional) – Optional per-metric keyword arguments, forwarded to evaluate_forecast(). Keys must match metric names in metrics_extra.

  • metrics_savefile (str, path-like, bool, or None) – If truthy, diagnostics from evaluate_forecast() are written to disk. Behavior matches the savefile argument of evaluate_forecast(). When True, a filename is auto-generated near the evaluation CSV (if any) or in the current working directory.

  • metrics_save_format ({'.json', 'json', '.csv', 'csv'}, default '.json') – Output format for diagnostics written by evaluate_forecast(). JSON preserves the nested metric structure; CSV flattens it into a tall table.

  • metrics_time_as_str (bool, default True) – If True, time keys in the diagnostics written by evaluate_forecast() are converted to strings (useful for JSON serialization).

  • verbose (int, default 1) – Verbosity level passed to vlog().

  • logger (logging.Logger, optional) – Logger instance; if None, a module-level LOG is used.

  • input_value_mode (str)

  • rate_first (str)

  • calibration (str | bool)

  • calibration_kwargs (Mapping[str, Any] | None)

  • calibration_save_stats (str | PathLike | None)

  • output_unit (str | None)

  • output_unit_from (str)

  • output_unit_mode (str)

  • output_unit_suffix (str)

  • output_unit_col (str | None)
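The value_mode and absolute_baseline options described above can be pictured with a small pandas sketch. This is an illustration under stated assumptions, not the library's implementation: apply_value_mode is a hypothetical helper, the toy rates and baselines are made up, and only the documented rules are modeled (cumulative sum over forecast_step per sample_idx, optional baseline shift, fallback to "rate" for unsupported strings).

```python
from collections.abc import Mapping

import pandas as pd


def apply_value_mode(df, value_cols, mode="rate", baseline=None):
    """Hypothetical helper: reinterpret per-step rates along the horizon."""
    aliases = {"cum": "cumulative", "abs_cum": "absolute_cumulative",
               "absolute": "absolute_cumulative"}
    mode = aliases.get(mode, mode)
    if mode not in {"rate", "cumulative", "absolute_cumulative"}:
        mode = "rate"  # documented fallback for unsupported strings
    if mode == "rate":
        return df.copy()

    out = df.sort_values(["sample_idx", "forecast_step"]).copy()
    # Relative cumulative values: running sum over the horizon, per sample.
    out[value_cols] = out.groupby("sample_idx")[value_cols].cumsum()

    if mode == "absolute_cumulative" and baseline is not None:
        if isinstance(baseline, Mapping):
            # Per-sample baseline; samples missing from the mapping become NaN here.
            shift = out["sample_idx"].map(baseline)
            out[value_cols] = out[value_cols].add(shift, axis=0)
        else:
            out[value_cols] = out[value_cols] + float(baseline)
    return out


# Toy data: two samples, three forecast steps of annual rates.
df_future = pd.DataFrame({
    "sample_idx": [0, 0, 0, 1, 1, 1],
    "forecast_step": [1, 2, 3, 1, 2, 3],
    "subsidence_q50": [1.0, 2.0, 3.0, 10.0, 20.0, 30.0],
})
relative = apply_value_mode(df_future, ["subsidence_q50"], mode="cumulative")
absolute = apply_value_mode(df_future, ["subsidence_q50"], mode="abs_cum",
                            baseline={0: 100.0, 1: 200.0})
```

In this sketch, passing an absolute mode without a baseline simply returns the relative cumulative values, which mirrors the documented graceful degradation.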

Returns:

  • df_eval_to_write (pandas.DataFrame) – DataFrame containing predictions and actuals for the evaluation time. Columns include:

    • 'sample_idx'

    • 'forecast_step'

    • quantile columns (e.g. subsidence_q10) or subsidence_pred

    • 'subsidence_actual' (if y_true given)

    • coord_t, coord_x, coord_y (names from coord_columns).

  • df_future (pandas.DataFrame) – DataFrame containing predictions for the future horizon, without actuals. Same structure as df_eval but without the actual-value column.

Return type:

tuple[DataFrame, DataFrame]
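Which rows end up in the first returned DataFrame depends on eval_export. The sketch below mimics the documented selection rules on a toy df_eval_all. The helper select_eval_rows is hypothetical, matching time values by their string form is an assumption made so that "2022" matches 2022, and the datetime conversion applied when time_as_datetime=True is left out for brevity.

```python
import pandas as pd


def select_eval_rows(df_eval_all, eval_export="all", eval_forecast_step=None,
                     time_col="coord_t"):
    """Hypothetical sketch of the eval_export selection rules."""
    if isinstance(eval_export, str):
        key = eval_export.lower()
        if key in {"all", "full", "horizons"}:
            return df_eval_all
        if key in {"last", "single", "default"}:
            # Default evaluation step is the last horizon step H.
            step = (df_eval_all["forecast_step"].max()
                    if eval_forecast_step is None else eval_forecast_step)
            return df_eval_all[df_eval_all["forecast_step"] == step]
        wanted = [eval_export]          # any other string: a single time value
    elif isinstance(eval_export, (list, tuple, set)):
        wanted = list(eval_export)      # a set of time values
    else:
        wanted = [eval_export]          # a single non-string scalar

    # Assumption: compare string forms so "2022" and 2022 select the same rows.
    wanted_str = {str(v) for v in wanted}
    return df_eval_all[df_eval_all[time_col].astype(str).isin(wanted_str)]


df_eval_all = pd.DataFrame({
    "sample_idx": [0, 0, 0, 1, 1, 1],
    "forecast_step": [1, 2, 3, 1, 2, 3],
    "coord_t": [2020, 2021, 2022, 2020, 2021, 2022],
})
last_only = select_eval_rows(df_eval_all, "last")
two_years = select_eval_rows(df_eval_all, [2021, 2022])
by_string = select_eval_rows(df_eval_all, "2022")
```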

Notes

This function separates scaler lookup (scaler_target_name) from output column naming (output_target_name). This is useful when the stored scaler key carries a suffix such as "_cum" but downstream tools expect canonical column names with the subsidence_ prefix.
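The separation between scaler key and output prefix can be summarized in a few lines. This is a minimal sketch of the documented defaults only; resolve_names is a hypothetical name, and the real function may handle further cases.

```python
def resolve_names(target_name, output_target_name=None, scaler_target_name=None):
    """Hypothetical sketch: return (scaler_key, output_prefix)."""
    # Scaler lookup defaults to the true target key.
    scaler_key = target_name if scaler_target_name is None else scaler_target_name

    if output_target_name is not None:
        # An explicit output name is used as given, suffix included.
        return scaler_key, output_target_name

    # Auto-strip rule: drop a trailing "_cum" or "_cumulative" for output naming.
    prefix = target_name
    for suffix in ("_cumulative", "_cum"):
        if prefix.endswith(suffix):
            prefix = prefix[: -len(suffix)]
            break
    return scaler_key, prefix


scaler_key, prefix = resolve_names("subsidence_cum")
columns = [f"{prefix}_q10", f"{prefix}_q50", f"{prefix}_actual"]
```

With target_name="subsidence_cum" and no overrides, the scaler key stays subsidence_cum while the output columns use the subsidence_ prefix.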

geoprior.utils.forecast_utils.evaluate_forecast(eval_data, *, target_name='subsidence', column_map=None, quantile_interval=(0.1, 0.9), per_horizon=False, extra_metrics=None, extra_metric_kwargs=None, overall_key='__overall__', savefile=None, save_format='.json', time_as_str=True, verbose=1, logger=None)[source]#

Evaluate forecast diagnostics from an evaluation DataFrame.

This helper consumes the df_eval output from format_and_forecast() (or a compatible DataFrame) and computes aggregate metrics such as MAE, MSE, \(R^2\), coverage, and sharpness. It can also optionally evaluate metrics per forecast horizon and apply additional user-defined metrics.

By default it expects the following columns:

  • 'sample_idx'

  • 'forecast_step'

  • 'coord_t' (time)

  • Quantile or point-prediction columns for the target, e.g.:

    • Quantile mode: f'{target_name}_q10', f'{target_name}_q50', f'{target_name}_q90', …

    • Point mode: f'{target_name}_pred'.

  • Actual column: f'{target_name}_actual'.

A flexible column_map allows remapping these logical roles to arbitrary column names, e.g.:

column_map = {
    'coord_t': 'date',
    'actual': 'true_subs',
    'pred': 'subs_predicted',
}

or, for quantile columns:

column_map = {
    'coord_t': 'date',
    'quantiles': {
        0.1: 'subs_q10',
        0.5: 'subs_q50',
        0.9: 'subs_q90',
    },
}
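
One way to think about column_map is as an overlay on the default column names. The sketch below is illustrative only: resolve_columns is a hypothetical helper, and the assumption that a user-supplied 'quantiles' entry replaces the default quantile mapping wholesale is not taken from the library.

```python
def resolve_columns(target_name="subsidence", column_map=None,
                    quantiles=(0.1, 0.5, 0.9)):
    """Hypothetical sketch: merge user overrides into default column names."""
    cols = {
        "sample_idx": "sample_idx",
        "forecast_step": "forecast_step",
        "coord_t": "coord_t",
        "actual": f"{target_name}_actual",
        "pred": f"{target_name}_pred",
        # Default quantile names follow the <target>_q<percent> pattern.
        "quantiles": {q: f"{target_name}_q{int(round(q * 100))}" for q in quantiles},
    }
    cols.update(column_map or {})  # user entries take precedence
    return cols


resolved = resolve_columns(column_map={
    "coord_t": "date",
    "actual": "true_subs",
    "pred": "subs_predicted",
})
```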
Parameters:
  • eval_data (str, path-like, or pandas.DataFrame) – Either a path to a CSV file containing the evaluation DataFrame (as saved by format_and_forecast()) or an in-memory DataFrame.

  • target_name (str, default 'subsidence') – Base name for the target columns. Used to infer default column names such as f'{target_name}_q10', f'{target_name}_pred', and f'{target_name}_actual'.

  • column_map (dict, optional) –

    Optional mapping to override default column names. The following keys are recognized:

    • 'sample_idx' : sample index column name (default 'sample_idx').

    • 'forecast_step' : horizon index column name (default 'forecast_step').

    • 'coord_t' : time coordinate column (default 'coord_t').

    • 'actual' : name or list of names for the actual target column(s). Currently a single column is supported; default f'{target_name}_actual'.

    • 'pred' : point prediction column for non-quantile mode, default f'{target_name}_pred'.

    • 'quantiles' :

      • If a mapping: {q: col_name} for quantile levels, where q is a float in (0, 1).

      • If a sequence of column names, the quantile value will be inferred from suffix patterns like f'{target_name}_q{int(q*100):d}'.

  • quantile_interval (tuple of float, default (0.1, 0.9)) – Interval (lower, upper) used for coverage and sharpness metrics, typically corresponding to an 80% interval between Q10 and Q90.

  • per_horizon (bool, default False) – If True, compute per-horizon MAE/MSE/R² grouped by the forecast_step column.

  • extra_metrics (sequence of str or mapping, optional) –

    Optional additional metrics to compute.

    • If a sequence of strings (e.g. ['pss', 'pit']), each name is resolved via geoprior.metrics._registry.get_metric(). If the name is not present in the registry, an error is raised, prompting the user to pass a callable instead.

    • If a mapping {name: func}, each func is called as:

      func(y_true, y_pred, **extra_metric_kwargs.get(name, {}))
      

      where y_pred is the median (Q50) or point forecast.

    For more complex metrics that require full quantile structure or temporal sequences, pass a suitable wrapper function that internally uses the DataFrame as needed.

  • extra_metric_kwargs (mapping, optional) – Optional mapping of per-metric keyword arguments. Keys must match the names in extra_metrics. Each value is a dict of kwargs forwarded to the corresponding metric function.

  • savefile (str, path-like, or bool, optional) –

    If provided, metrics are saved to disk.

    • If True: a filename is auto-generated near eval_data (if it is a path) or in the current working directory.

    • If a string/path without extension: the extension is taken from save_format.

    • If a string/path with extension: that extension takes precedence over save_format.

  • save_format ({'.json', 'json', '.csv', 'csv'}, default '.json') –

    Output format when savefile is truthy. JSON preserves nested structure; CSV is flattened into a tall table.

    • For JSON, the function returns the metrics dictionary.

    • For CSV, the function returns the metrics DataFrame.

  • time_as_str (bool, default True) – If True, time keys in the result dictionary are converted to strings (useful for JSON serialization). If there is only a single time value, the result is flattened and the time key is omitted.

  • verbose (int, default 1) – Verbosity level passed to vlog().

  • logger (logging.Logger, optional) – Optional logger instance used by vlog().

  • overall_key (str | None)
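
The mapping form of extra_metrics, together with extra_metric_kwargs, follows the calling convention func(y_true, y_pred, **kwargs) stated above. The sketch below shows that dispatch in isolation. run_extra_metrics and both metric functions are hypothetical examples, and name resolution through the metric registry is not covered.

```python
import numpy as np


def run_extra_metrics(y_true, y_pred, extra_metrics, extra_metric_kwargs=None):
    """Hypothetical sketch: call each metric with its own keyword arguments."""
    extra_metric_kwargs = extra_metric_kwargs or {}
    results = {}
    for name, func in extra_metrics.items():
        results[name] = func(y_true, y_pred, **extra_metric_kwargs.get(name, {}))
    return results


def max_abs_error(y_true, y_pred):
    """Largest absolute deviation between actual and predicted values."""
    return float(np.max(np.abs(np.asarray(y_true) - np.asarray(y_pred))))


def within_tolerance(y_true, y_pred, tol=1.0):
    """Fraction of predictions within +/- tol of the actual value."""
    return float(np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred)) <= tol))


y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([1.5, 2.0, 2.0, 6.0])  # median (Q50) or point forecast
scores = run_extra_metrics(
    y_true,
    y_pred,
    extra_metrics={"max_abs_error": max_abs_error, "within_tol": within_tolerance},
    extra_metric_kwargs={"within_tol": {"tol": 0.5}},
)
```

Metrics without an entry in extra_metric_kwargs are called with no extra keyword arguments.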

Returns:

results – If save_format is JSON (default), returns a dict:

  • Single time value:

    {
        "overall_mae": ...,
        "overall_mse": ...,
        "overall_r2": ...,
        "coverage80": ...,
        "sharpness80": ...,
        "per_horizon_mae": {1: ..., 2: ..., ...},
        ...
    }
    
  • Multiple time values:

    {
        "2021": { ...metrics... },
        "2022": { ...metrics... },
    }
    

If save_format is CSV, returns a DataFrame with flattened rows:

  • Columns include: coord_t, metric, horizon, and value.

Return type:

dict or pandas.DataFrame
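
The relation between the nested dictionary and the tall table can be sketched as follows. This is an assumption-laden illustration: flatten_metrics is a hypothetical helper, the metric values are placeholders rather than real results, and leaving horizon empty for overall metrics is a guess at a reasonable layout.

```python
import pandas as pd


def flatten_metrics(results, time_key="coord_t"):
    """Hypothetical sketch: turn {time: {metric: value}} into a tall table."""
    rows = []
    for t, metrics in results.items():
        for name, value in metrics.items():
            if isinstance(value, dict):
                # Per-horizon metrics: one row per horizon index.
                for horizon, v in value.items():
                    rows.append({time_key: t, "metric": name,
                                 "horizon": horizon, "value": v})
            else:
                rows.append({time_key: t, "metric": name,
                             "horizon": None, "value": value})
    return pd.DataFrame(rows, columns=[time_key, "metric", "horizon", "value"])


# Placeholder values, for shape only.
nested = {
    "2021": {"overall_mae": 1.0, "per_horizon_mae": {1: 0.5, 2: 1.5}},
    "2022": {"overall_mae": 2.0},
}
table = flatten_metrics(nested)
```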

Notes

  • Default metrics in quantile mode:

    • overall_mae, overall_mse, overall_r2

    • coverage80 and sharpness80 (using the requested interval, e.g., Q10–Q90)

    If per_horizon=True, also:

    • per_horizon_mae, per_horizon_mse, per_horizon_r2 (each a mapping from horizon index to score).

  • Default metrics in point mode (no quantiles):

    • mae, mse, r2

    And optionally, if per_horizon=True:

    • per_horizon_mae, per_horizon_mse, per_horizon_r2.
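
For orientation, the default quantile-mode metrics can be reproduced by hand on a small DataFrame. The sketch assumes the usual definitions (errors measured against the Q50 column, coverage as the fraction of actuals inside the interval, sharpness as the mean interval width); the library's exact definitions may differ, basic_quantile_metrics is a hypothetical helper, and the data are made up.

```python
import numpy as np
import pandas as pd


def basic_quantile_metrics(df, target="subsidence", interval=(0.1, 0.9)):
    """Hypothetical sketch of overall and per-horizon metrics in quantile mode."""
    lo = df[f"{target}_q{int(round(interval[0] * 100))}"].to_numpy(float)
    hi = df[f"{target}_q{int(round(interval[1] * 100))}"].to_numpy(float)
    y = df[f"{target}_actual"].to_numpy(float)
    med = df[f"{target}_q50"].to_numpy(float)

    err = y - med
    ss_res = float(np.sum(err ** 2))
    ss_tot = float(np.sum((y - y.mean()) ** 2))
    # Per-horizon MAE: group absolute errors by forecast_step.
    per_h = (df.assign(abs_err=np.abs(err))
               .groupby("forecast_step")["abs_err"].mean().to_dict())
    return {
        "overall_mae": float(np.mean(np.abs(err))),
        "overall_mse": float(np.mean(err ** 2)),
        "overall_r2": 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan"),
        # Key names assume the default 80% interval (Q10 to Q90).
        "coverage80": float(np.mean((y >= lo) & (y <= hi))),
        "sharpness80": float(np.mean(hi - lo)),
        "per_horizon_mae": per_h,
    }


df_eval = pd.DataFrame({
    "sample_idx": [0, 0, 1, 1],
    "forecast_step": [1, 2, 1, 2],
    "subsidence_q10": [0.0, 1.0, 3.5, 3.0],
    "subsidence_q50": [1.0, 2.5, 4.0, 4.0],
    "subsidence_q90": [2.0, 3.0, 5.0, 5.0],
    "subsidence_actual": [1.0, 2.0, 3.0, 4.0],
})
metrics = basic_quantile_metrics(df_eval)
```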