geoprior.models.utils.pinn

Physics-Informed Neural Network (PINN) Utility functions.

Functions

check_and_rename_keys(inputs, y)

Helper function to check and rename keys in the inputs and target dictionaries.

check_required_input_keys(inputs[, y, ...])

Validate presence of required keys in inputs and y.

extract_txy(inputs[, coord_slice_map, ...])

Extracts t, x, y tensors from various input formats.

extract_txy_in(inputs[, coord_slice_map, ...])

Extracts t, x, y tensors from various input formats.

format_pihalnet_predictions([...])

Formats PIHALNet/GeoPriorSubsNet predictions into a structured pandas DataFrame, handling inversion, quantiles, and coordinates.

format_pinn_predictions([predictions, ...])

Formats PINN model predictions into a structured pandas DataFrame.

format_preds([pihalnet_outputs, model, ...])

Main function orchestrating all helper steps.

plot_hydraulic_head(model, t_slice, ...[, ...])

Generate and plot a 2D contour map of a hydraulic head solution.

prepare_pinn_data_sequences(df, time_col, ...)

process_pde_modes(pde_mode[, ...])

Normalize and validate PDE mode selection.

geoprior.models.utils.pinn.process_pde_modes(pde_mode, enforce_consolidation=False, pde_mode_config=None, solo_return=False)

Normalize and validate PDE mode selection.

Parameters:
  • pde_mode (str, sequence of str, or None) –

    Requested PDE mode(s).

    Accepted canonical values are "none", "consolidation", "gw_flow", and "both".

    Accepted aliases: None and "off" resolve to "none"; "on" resolves to "both".

  • enforce_consolidation (bool, default False) –

    If True, any resolved mode other than exactly ["consolidation"] is coerced to ["consolidation"] and a warning is emitted.

    This includes ["none"], ["gw_flow"], and ["consolidation", "gw_flow"].

  • pde_mode_config (str, sequence of str, or None, optional) – Optional override. If provided, this value takes precedence over pde_mode.

  • solo_return (bool, default False) –

    If False, return a canonical list of active modes.

    If True, return a single canonical label: "none", "consolidation", "gw_flow", or "both".

Returns:

Canonical PDE mode(s), either as a list or a single label.

Return type:

list of str or str

Raises:
  • TypeError – If the input type is invalid.

  • ValueError – If a token is unsupported or the mode selection is ambiguous.

Examples

>>> process_pde_modes(None)
['none']
>>> process_pde_modes("off")
['none']
>>> process_pde_modes("on")
['consolidation', 'gw_flow']
>>> process_pde_modes("both", solo_return=True)
'both'
>>> process_pde_modes("gw_flow", enforce_consolidation=True)
['consolidation']
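
The rules documented above can be sketched in plain Python. This is an illustrative stand-in, not the geoprior implementation: the name sketch_process_pde_modes is hypothetical, tokens are assumed to be matched case-insensitively, and mixing "none" with an active mode is assumed to be what "ambiguous" means.

```python
import warnings

_ALIASES = {"off": "none", "on": "both"}
_CANONICAL = ("none", "consolidation", "gw_flow", "both")
_ORDER = ("consolidation", "gw_flow")


def sketch_process_pde_modes(pde_mode, enforce_consolidation=False,
                             pde_mode_config=None, solo_return=False):
    """Illustrative stand-in, not the geoprior implementation."""
    # The optional override takes precedence over pde_mode.
    if pde_mode_config is not None:
        pde_mode = pde_mode_config

    if pde_mode is None or isinstance(pde_mode, str):
        tokens = [pde_mode]
    elif isinstance(pde_mode, (list, tuple)):
        tokens = list(pde_mode)
    else:
        raise TypeError(f"Unsupported pde_mode type: {type(pde_mode).__name__}")

    resolved = []
    for token in tokens:
        if token is None:
            name = "none"
        elif isinstance(token, str):
            # Assumption: tokens are matched case-insensitively.
            name = token.strip().lower()
            name = _ALIASES.get(name, name)
        else:
            raise TypeError(f"Unsupported token type: {type(token).__name__}")
        if name not in _CANONICAL:
            raise ValueError(f"Unsupported PDE mode: {token!r}")
        resolved.append(name)

    # Expand "both" and collect the active modes.
    active = set()
    for name in resolved:
        if name == "both":
            active.update(_ORDER)
        elif name != "none":
            active.add(name)
    # Assumption: mixing "none" with an active mode counts as ambiguous.
    if "none" in resolved and active:
        raise ValueError("Ambiguous PDE mode selection.")

    modes = [m for m in _ORDER if m in active] or ["none"]
    if enforce_consolidation and modes != ["consolidation"]:
        warnings.warn("Coercing PDE mode to ['consolidation'].")
        modes = ["consolidation"]

    if solo_return:
        return "both" if len(modes) == 2 else modes[0]
    return modes
```

The list form always uses the order consolidation, then gw_flow, which matches the documented result for "on".
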
geoprior.models.utils.pinn.format_pinn_predictions(predictions=None, model=None, model_inputs=None, y_true_dict=None, target_mapping=None, include_gwl=True, include_coords=True, quantiles=None, forecast_horizon=None, output_dims=None, ids_data_array=None, ids_cols=None, ids_cols_indices=None, scaler_info=None, coord_scaler=None, evaluate_coverage=False, coverage_quantile_indices=(0, -1), savefile=None, _logger=None, name=None, model_name=None, stop_check=None, verbose=0, **kwargs)

Formats PINN model predictions into a structured pandas DataFrame.

This is a general-purpose utility for transforming raw model outputs (from models like PIHALNet or TransFlowSubsNet) into a long-format DataFrame suitable for analysis, visualization, or export. It handles multi-target outputs (e.g., subsidence and GWL) and point or quantile forecasts, and can optionally include true values, coordinate information, and other metadata. It also supports inverse-scaling of predictions and evaluation of quantile coverage.

Parameters:
  • predictions (dict of Tensors, optional) – The dictionary of prediction tensors, typically returned by a model’s .predict() method. Keys should match the model’s output layer names (e.g., 'subs_pred', 'gwl_pred'). If None, predictions are generated internally using the model and model_inputs arguments. Default is None.

  • model (keras.Model, optional) – A compiled Keras model instance used to generate predictions if the predictions dictionary is not provided. Default is None.

  • model_inputs (dict of Tensors, optional) – A dictionary of input tensors matching the model’s signature, required only if predictions is None. Default is None.

  • y_true_dict (dict, optional) – A dictionary containing the ground-truth target arrays, keyed by their base names (e.g., 'subsidence', 'gwl'). If provided, an <target>_actual column will be added to the output DataFrame for comparison. Default is None.

  • target_mapping (dict, optional) – A custom mapping from model output keys to desired base names in the DataFrame columns. For example: {'subs_pred': 'subsidence_mm', 'gwl_pred': 'head_m'}. Default is None.

  • include_gwl (bool, default True) – Toggles the inclusion of groundwater level (GWL) predictions in the final DataFrame.

  • include_coords (bool, default True) – Toggles the inclusion of the spatio-temporal coordinate columns (coord_t, coord_x, coord_y) in the final DataFrame.

  • quantiles (list of float, optional) – The list of quantile levels (e.g., [0.1, 0.5, 0.9]) that the model predicted. This is crucial for correctly parsing probabilistic forecasts. Default is None.

  • forecast_horizon (int, optional) – The length of the forecast horizon. If None, it is inferred from the shape of the prediction tensors. Default is None.

  • output_dims (dict of str, optional) – A dictionary specifying the feature dimension of each target, e.g., {'subs_pred': 1, 'gwl_pred': 1}. If None, it’s inferred from the tensor shapes. Default is None.

  • ids_data_array (np.ndarray or pd.DataFrame, optional) – An array or DataFrame containing static identifiers (e.g., well IDs, site categories) for each sample. Its length must match the number of samples in the prediction. Default is None.

  • ids_cols (list of str, optional) – A list of column names for the ids_data_array. Required if ids_data_array is a NumPy array. Default is None.

  • ids_cols_indices (list of int, optional) – A list of column indices to select from ids_data_array if it is a NumPy array. Default is None.

  • scaler_info (dict, optional) – A dictionary providing the necessary information to perform inverse scaling on a per-target basis. Each key should be a target name (e.g., ‘subsidence’) and its value a dictionary containing {'scaler': obj, 'all_features': list, 'idx': int}. Default is None.

  • coord_scaler (object, optional) – A fitted scikit-learn-like scaler object used to perform an inverse transform on the coordinate columns. Default is None.

  • evaluate_coverage (bool, default False) – If True and quantile predictions are present, calculates the unconditional coverage of the prediction interval.

  • coverage_quantile_indices (tuple of (int, int), default (0, -1)) – The indices of the lower and upper quantiles in the sorted quantiles list to use for the coverage calculation. The default (0, -1) selects the lowest and highest quantiles, i.e., the widest available interval.

  • savefile (str, optional) – If a file path is provided, the final DataFrame is saved to a CSV file at this location. Default is None.

  • name (str or None) – Name of the prediction run. It is used when formatting the output data and, if applicable, the coverage result.

  • model_name (str or None) – Name of the model.

  • verbose (int, default 0) – The verbosity level, from 0 (silent) to 5 (trace every step).

  • **kwargs (dict) – Additional keyword arguments for future extensions.

  • _logger (Logger | Callable[[str], None] | None)

  • stop_check (Callable[[], bool])

Returns:

A long-format DataFrame where each row represents a single forecast step for a single sample. Columns include sample and step identifiers, coordinates, predictions, and optionally actuals and metadata.

Return type:

pd.DataFrame

Notes

  • The function returns a column-aligned DataFrame, which simplifies subsequent analysis and plotting.

  • For quantile forecasts, prediction columns are named using the pattern <target_name>_q<quantile*100>, e.g., subsidence_q5, subsidence_q50, subsidence_q95.

  • For point forecasts, the column is named <target_name>_pred.
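
The naming pattern in these notes can be illustrated with a small helper. The function prediction_columns is hypothetical, and rounding quantile*100 to the nearest integer is an assumption; the library may format non-integer percentiles differently.

```python
def prediction_columns(target_name, quantiles=None):
    """Return the prediction column names implied by the notes above."""
    if not quantiles:
        # Point forecast: a single <target_name>_pred column.
        return [f"{target_name}_pred"]
    # Quantile forecast: one <target_name>_q<quantile*100> column per level.
    # Assumption: the percentile is rounded to the nearest integer.
    return [f"{target_name}_q{int(round(q * 100))}" for q in quantiles]


point_cols = prediction_columns("subsidence")
quantile_cols = prediction_columns("subsidence", [0.05, 0.5, 0.95])
```
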

See also

geoprior.plot.forecast.plot_forecasts

A powerful utility for visualizing the DataFrame produced by this function.
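
The scaler_info parameter describes a scaler fitted on several features, of which only one is the target. A common way to invert a single column with such a scaler is to place the values in a dummy matrix at column idx, call inverse_transform, and read the column back. The sketch below assumes that approach; MinMaxLike is a hypothetical stand-in for a fitted scikit-learn-like scaler, and inverse_scale_target is not a geoprior function.

```python
import numpy as np


class MinMaxLike:
    """Hypothetical stand-in for a fitted scikit-learn-like scaler."""

    def __init__(self, data_min, data_max):
        self.data_min_ = np.asarray(data_min, dtype=float)
        self.data_max_ = np.asarray(data_max, dtype=float)

    def inverse_transform(self, X):
        # Map values from [0, 1] back to the original feature ranges.
        return X * (self.data_max_ - self.data_min_) + self.data_min_


def inverse_scale_target(values, info):
    """Invert one target column using a scaler fitted on all_features."""
    values = np.asarray(values, dtype=float)
    dummy = np.zeros((values.size, len(info["all_features"])))
    dummy[:, info["idx"]] = values.ravel()
    restored = info["scaler"].inverse_transform(dummy)[:, info["idx"]]
    return restored.reshape(values.shape)


scaler_info = {
    "subsidence": {
        "scaler": MinMaxLike(data_min=[0.0, -50.0], data_max=[10.0, 0.0]),
        "all_features": ["rainfall", "subsidence"],
        "idx": 1,
    }
}
restored = inverse_scale_target([[0.0, 0.5], [1.0, 0.25]],
                                scaler_info["subsidence"])
```
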

geoprior.models.utils.pinn.format_pihalnet_predictions(pihalnet_outputs=None, model=None, model_inputs=None, y_true_dict=None, target_mapping=None, include_gwl=True, include_coords=True, quantiles=None, forecast_horizon=None, output_dims=None, ids_data_array=None, ids_cols=None, ids_cols_indices=None, scaler_info=None, coord_scaler=None, evaluate_coverage=False, coverage_quantile_indices=(0, -1), savefile=None, name=None, model_name=None, apply_mask=False, mask_values=None, mask_fill_value=None, verbose=0, _logger=None, stop_check=None, **kwargs)

Formats PIHALNet/GeoPriorSubsNet predictions into a structured pandas DataFrame, handling inversion, quantiles, and coordinates.

This function is the core formatter. It:

  1. Gets model outputs (or uses provided ones).

  2. Unpacks ‘data_final’ if model_name is ‘geoprior’.

  3. Inverse-transforms all prediction and actual arrays using scaler_info.

  4. Builds a long-format DataFrame with sample_idx and forecast_step.

  5. Appends inverted quantile/point predictions.

  6. Appends inverted actual values.

  7. Appends inverted coordinates.

  8. Appends static/ID columns.

  9. Evaluates coverage on the inverted data.
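
Steps 4 and 5 can be sketched without pandas for point predictions shaped (B, H). The helper to_long_rows is hypothetical, and numbering forecast steps from 1 is an assumption rather than documented behavior.

```python
def to_long_rows(predictions, target_name="subsidence"):
    """Flatten point predictions shaped (B, H) into long-format rows."""
    rows = []
    for sample_idx, series in enumerate(predictions):
        # Assumption: forecast steps are numbered from 1.
        for forecast_step, value in enumerate(series, start=1):
            rows.append({
                "sample_idx": sample_idx,
                "forecast_step": forecast_step,
                f"{target_name}_pred": value,
            })
    return rows


# Two samples (B=2) with a three-step horizon (H=3).
rows = to_long_rows([[0.1, 0.2, 0.3], [1.1, 1.2, 1.3]])
```

Passing rows to pandas.DataFrame would yield one row per (sample, step) pair, which is the layout the remaining steps append columns to.
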

Parameters:
  • pihalnet_outputs (dict, optional) – Raw output from model.predict(). If None, model and model_inputs must be provided.

  • model (tf.keras.Model, optional) – Trained model instance (if pihalnet_outputs is None).

  • model_inputs (dict, optional) – Inputs for the model to generate predictions (if pihalnet_outputs is None).

  • y_true_dict (dict, optional) – Dictionary of true target arrays (e.g., {‘subs_pred’: y_true_s}). Required for including actuals and evaluating coverage.

  • target_mapping (dict, optional) – Maps prediction keys to base names for DataFrame columns. Default: {‘subs_pred’: ‘subsidence’, ‘gwl_pred’: ‘gwl’}.

  • include_gwl (bool, default True) – Whether to include ‘gwl_pred’ in the final DataFrame.

  • include_coords (bool, default True) – Whether to include ‘coord_t’, ‘coord_x’, ‘coord_y’ columns.

  • quantiles (list[float], optional) – List of quantiles (e.g., [0.1, 0.5, 0.9]). If provided, quantile columns (e.g., ‘subsidence_q10’) are created.

  • forecast_horizon (int, optional) – The forecast horizon length (H). If not provided, it’s inferred from the prediction array’s shape.

  • output_dims (dict, optional) – Maps prediction keys to their output dimension (O). E.g., {‘subs_pred’: 1, ‘gwl_pred’: 1}. Crucial for correctly splitting GeoPrior outputs and reshaping.

  • ids_data_array (np.ndarray or pd.DataFrame, optional) – Static/ID data (e.g., original coordinates) to merge. Must have the same number of samples (B) as predictions.

  • ids_cols (list[str], optional) – Column names if ids_data_array is a DataFrame.

  • ids_cols_indices (list[int], optional) – Column indices if ids_data_array is a NumPy array.

  • scaler_info (dict, optional) – Dictionary for inverse scaling. Each target entry should provide a fitted scaler, the target index inside that scaler, and the feature-name ordering used when the scaler was fit.

  • coord_scaler (sklearn.preprocessing.Scaler, optional) – A fitted scaler object for inverse transforming the ‘coords’ tensor.

  • evaluate_coverage (bool, default False) – If True, calculates coverage percentage for quantiles.

  • coverage_quantile_indices (tuple[int, int], default (0, -1)) – Indices of the low and high quantiles in the quantiles list to use for coverage (e.g., 0 and -1 select the 10th and 90th percentiles when quantiles is [0.1, 0.5, 0.9]).

  • savefile (str, optional) – If provided, saves the final DataFrame to this path.

  • model_name (str, optional) – Specifies the model type. If ‘geoprior’ or ‘geopriorsubsnet’, triggers unpacking of the ‘data_final’ output.

  • apply_mask (bool, default False) – If True, masks predictions based on mask_values in the first target’s _actual column.

  • mask_values (float or int, optional) – The value in the _actual column to trigger masking.

  • mask_fill_value (float, optional) – The value to replace masked predictions with (e.g., np.nan).

  • verbose (int, default 0) – Logging verbosity.

  • _logger (logging.Logger or callable, optional) – Logger object.

  • stop_check (callable, optional) – Function to check for early stopping.

  • name (str | None)

Returns:

A long-format DataFrame with predictions, actuals, and coordinates.

Return type:

pd.DataFrame
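
The coverage evaluation controlled by evaluate_coverage and coverage_quantile_indices can be approximated as the fraction of actual values that fall inside the selected quantile interval. The sketch below is a plain-Python illustration under that assumption; coverage_from_quantiles is a hypothetical name and treating the bounds as inclusive is a guess.

```python
def coverage_from_quantiles(actuals, quantile_preds, quantiles, indices=(0, -1)):
    """Fraction of actuals inside the interval picked by `indices`.

    quantile_preds holds one row per observation, with one value per
    entry of `quantiles` (same order).
    """
    # Resolve the indices against the sorted quantile levels.
    order = sorted(range(len(quantiles)), key=lambda i: quantiles[i])
    lo_col, hi_col = order[indices[0]], order[indices[1]]
    hits = 0
    for actual, row in zip(actuals, quantile_preds):
        # Assumption: interval bounds are inclusive.
        if row[lo_col] <= actual <= row[hi_col]:
            hits += 1
    return hits / len(actuals)


actuals = [1.0, 2.0, 3.0, 10.0]
quantile_preds = [
    [0.5, 1.0, 1.5],
    [1.5, 2.0, 2.5],
    [3.5, 4.0, 4.5],   # the actual value 3.0 falls below this interval
    [9.0, 10.0, 11.0],
]
coverage = coverage_from_quantiles(actuals, quantile_preds, [0.1, 0.5, 0.9])
```
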

geoprior.models.utils.pinn.format_preds(pihalnet_outputs=None, model=None, model_inputs=None, y_true_dict=None, target_mapping=None, include_gwl=True, include_coords=True, quantiles=None, forecast_horizon=None, output_dims=None, ids_data_array=None, ids_cols=None, ids_cols_indices=None, scaler_info=None, coord_scaler=None, evaluate_coverage=False, coverage_quantile_indices=(0, -1), savefile=None, name=None, apply_mask=False, mask_values=None, mask_fill_value=None, verbose=0, _logger=None, stop_check=None, **kwargs)

Main function orchestrating all helper steps.

Return type:

DataFrame

geoprior.models.utils.pinn.prepare_pinn_data_sequences(df, time_col, subsidence_col, gwl_col, dynamic_cols, static_cols=None, future_cols=None, spatial_cols=None, h_field_col=None, lon_col=None, lat_col=None, group_id_cols=None, time_steps=12, forecast_horizon=3, output_subsidence_dim=1, output_gwl_dim=1, datetime_format=None, normalize_coords=True, cols_to_scale=None, lock_physics_cols=True, protect_si_suffix='__si', return_coord_scaler=False, coord_scaler=None, fit_coord_scaler=True, mode=None, model=None, savefile=None, progress_hook=None, stop_check=None, verbose=0, _logger=None, **kws)
Return type:

tuple[dict[str, ndarray], dict[str, ndarray]] | tuple[dict[str, ndarray], dict[str, ndarray], MinMaxScaler | None]

geoprior.models.utils.pinn.check_and_rename_keys(inputs, y)

Helper function to check and rename keys in the inputs and target dictionaries.

This function ensures that the necessary keys are present in both the inputs and y dictionaries. If the keys for ‘subsidence’ or ‘gwl’ are not found, it attempts to rename them from possible alternatives like ‘subs_pred’ or ‘gwl_pred’.

Parameters:
  • inputs (dict) – A dictionary containing the input data. The keys ‘coords’ and ‘dynamic_features’ are expected.

  • y (dict) – A dictionary containing the target values. The keys ‘subsidence’ and ‘gwl’ are expected, but they could also appear as ‘subs_pred’ or ‘gwl_pred’.

Raises:

ValueError – If required keys are missing in inputs or y, or if renaming does not result in valid keys for ‘subsidence’ and ‘gwl’.
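
The check-and-rename behavior described above can be sketched as follows. This is a minimal illustration, not the library code: sketch_check_and_rename is a hypothetical name, and returning new dictionaries (rather than modifying the arguments in place) is an assumption.

```python
def sketch_check_and_rename(inputs, y):
    """Validate required keys and map prediction-style names to targets."""
    # Required input keys named in the parameter description.
    missing_inputs = [k for k in ("coords", "dynamic_features") if k not in inputs]
    if missing_inputs:
        raise ValueError(f"Missing input keys: {missing_inputs}")

    aliases = {"subs_pred": "subsidence", "gwl_pred": "gwl"}
    # Assumption: work on a copy so the caller's dictionary is untouched.
    targets = dict(y)
    for alias, canonical in aliases.items():
        if canonical not in targets and alias in targets:
            targets[canonical] = targets.pop(alias)

    missing_targets = [k for k in ("subsidence", "gwl") if k not in targets]
    if missing_targets:
        raise ValueError(f"Missing target keys: {missing_targets}")
    return dict(inputs), targets


inputs = {"coords": [[0.0, 1.0, 2.0]], "dynamic_features": [[0.3]]}
_, renamed = sketch_check_and_rename(inputs, {"subs_pred": [1.0], "gwl": [2.0]})
```
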

geoprior.models.utils.pinn.check_required_input_keys(inputs, y=None, message=None, model_name=None, do_rename=True)

Validate presence of required keys in inputs and y. Optionally canonicalize keys via reverse alias mapping.

This function ensures that the necessary keys are present in both the inputs and y dictionaries. If the keys for ‘subsidence’ or ‘gwl’ are not found, it attempts to rename them from possible alternatives like ‘subs_pred’ or ‘gwl_pred’.

Parameters:
  • inputs (dict) – A dictionary containing the input data. The keys ‘coords’ and ‘dynamic_features’ are expected.

  • y (dict) – A dictionary containing the target values. The keys ‘subsidence’ and ‘gwl’ are expected, but they could also appear as ‘subs_pred’ or ‘gwl_pred’.

  • message (str, optional) – Error message to use when inputs or y is not a dictionary.

  • model_name (str | None)

  • do_rename (bool)

Raises:

ValueError – If required keys are missing in inputs or y, or if renaming does not result in valid keys for ‘subsidence’ and ‘gwl’.

Return type:

tuple[dict[str, Any] | None, dict[str, Any] | None]

geoprior.models.utils.pinn.extract_txy_in(inputs, coord_slice_map=None, expect_dim=None, verbose=0, _logger=None, **kws)

Extracts t, x, y tensors from various input formats.

This utility standardizes coordinate inputs, accepting a single tensor or a dictionary, and handling both 2D (spatial/static) and 3D (spatio-temporal) data. It ensures a consistent 3D output format for robust downstream processing.

Parameters:
  • inputs (tf.Tensor, np.ndarray, or dict) – The input data containing coordinates. A single tensor or array may be 2D with shape (batch, 3) or 3D with shape (batch, time_steps, 3). A dictionary may contain a 'coords' key with the coordinate tensor, or separate 't', 'x', and 'y' keys.

  • coord_slice_map (dict, optional) – Mapping for ‘t’, ‘x’, ‘y’ to their index in the last dimension of a coordinate tensor. Defaults to {‘t’: 0, ‘x’: 1, ‘y’: 2}.

  • expect_dim ({'2d', '3d'}, optional) – If provided, enforces that the input resolves to the specified dimension. '2d' requires input shaped like (batch, 3) or a dictionary of (batch, 1) tensors. '3d' requires input shaped like (batch, time, 3) or a dictionary of (batch, time, 1) tensors. If None, both are accepted and 2D inputs are expanded to 3D.

  • verbose (int, default 0) – Controls the verbosity of logging messages. 0 is silent, 1 provides basic info, and higher values provide more detail.

  • _logger (Logger | Callable[[str], None] | None)

Returns:

t, x, y – The extracted t, x, and y coordinate tensors, each reshaped to be 3D with a singleton last dimension, e.g., (batch, time_steps, 1).

Return type:

Tuple[tf.Tensor, tf.Tensor, tf.Tensor]

Raises:

ValueError – If input format is unsupported, dimensions are inconsistent, or expect_dim constraint is violated.
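
For a single coordinate array, the documented behavior (expand 2D input to 3D, then slice t, x, y with a singleton last dimension) can be sketched with NumPy. The helper split_txy is hypothetical and covers only array input, not the dictionary forms or expect_dim checks.

```python
import numpy as np


def split_txy(coords, coord_slice_map=None):
    """Split a (batch, 3) or (batch, time, 3) array into t, x, y."""
    index = coord_slice_map or {"t": 0, "x": 1, "y": 2}
    coords = np.asarray(coords)
    if coords.ndim == 2:
        # Expand (batch, 3) to (batch, 1, 3) so the output is always 3D.
        coords = coords[:, np.newaxis, :]
    if coords.ndim != 3 or coords.shape[-1] < 3:
        raise ValueError(f"Unsupported coordinate shape: {coords.shape}")
    # Slicing with i:i+1 keeps the singleton last dimension.
    return tuple(coords[..., index[k]:index[k] + 1] for k in ("t", "x", "y"))


t2, x2, y2 = split_txy(np.arange(12.0).reshape(4, 3))
t3, x3, y3 = split_txy(np.zeros((2, 5, 3)))
```
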

geoprior.models.utils.pinn.extract_txy(inputs, coord_slice_map=None, expect_dim=None, verbose=0, _logger=None, **kws)

Extracts t, x, y tensors from various input formats.

This utility standardizes coordinate inputs, accepting a single tensor or a dictionary, and handling both 2D (spatial/static) and 3D (spatio-temporal) data with flexible dimension validation.

Parameters:
  • inputs (tf.Tensor, np.ndarray, or dict) – The input data containing coordinates. Can be a single tensor or a dictionary with ‘coords’ or ‘t’, ‘x’, ‘y’ keys.

  • coord_slice_map (dict, optional) – Mapping for ‘t’, ‘x’, ‘y’ to their index in the last dimension of a coordinate tensor. Defaults to {‘t’: 0, ‘x’: 1, ‘y’: 2}.

  • expect_dim ({'2d', '3d', '3d_only'}, optional) – Enforces a constraint on the input’s dimension. '2d' requires input shaped like (batch, 3). '3d' accepts 3D input and expands 2D input to 3D with a time dimension of 1. '3d_only' requires 3D input and raises an error for 2D input. None accepts both 2D and 3D inputs without changing their rank.

  • verbose (int, default 0) – Controls logging verbosity.

  • _logger (Logger | Callable[[str], None] | None)

Returns:

t, x, y – The extracted t, x, and y coordinate tensors. Their rank (2D or 3D) depends on the input and the expect_dim mode.

Return type:

Tuple[tf.Tensor, tf.Tensor, tf.Tensor]

Raises:

ValueError – If input format is unsupported, dimensions are inconsistent, or expect_dim constraint is violated.
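
The four expect_dim modes can be summarized as a rule mapping the input rank to the output rank. The function below encodes only what the parameter description states; resolve_output_rank is a hypothetical helper, not part of geoprior.

```python
def resolve_output_rank(input_rank, expect_dim=None):
    """Return the rank of t, x, y for a given input rank and expect_dim."""
    if input_rank not in (2, 3):
        raise ValueError(f"Unsupported input rank: {input_rank}")
    if expect_dim is None:
        # Both ranks are accepted and left unchanged.
        return input_rank
    if expect_dim == "2d":
        if input_rank != 2:
            raise ValueError("expect_dim='2d' requires 2D input.")
        return 2
    if expect_dim == "3d":
        # 2D input is expanded to 3D with a time dimension of 1.
        return 3
    if expect_dim == "3d_only":
        if input_rank != 3:
            raise ValueError("expect_dim='3d_only' requires 3D input.")
        return 3
    raise ValueError(f"Unknown expect_dim: {expect_dim!r}")
```
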

geoprior.models.utils.pinn.plot_hydraulic_head(model, t_slice, x_bounds, y_bounds, resolution=100, ax=None, title=None, cmap='viridis', colorbar_label='Hydraulic Head (h)', save_path=None, show_plot=True, **contourf_kwargs)

Generate and plot a 2D contour map of a hydraulic head solution.

This utility visualizes the output of a Physics-Informed Neural Network (PINN) that solves for the hydraulic head \(h(t, x, y)\). It automates the process of creating a spatial grid, running model predictions, and generating a publication-quality contour plot for a specific slice in time.

Parameters:
  • model (tf.keras.Model) – The trained PINN model. It is expected to have a .predict() method that accepts a dictionary of tensors with keys {'t', 'x', 'y'}.

  • t_slice (float) – The specific point in time \(t\) for which to plot the 2D spatial solution.

  • x_bounds (tuple of float) – A tuple (x_min, x_max) defining the spatial domain for the x-axis.

  • y_bounds (tuple of float) – A tuple (y_min, y_max) defining the spatial domain for the y-axis.

  • resolution (int, optional) – The number of points to sample along each spatial axis, creating a grid of resolution x resolution points for prediction. Higher values result in a smoother plot. Default is 100.

  • ax (matplotlib.axes.Axes, optional) – A pre-existing Matplotlib Axes object to plot on. If None, a new figure and axes are created internally. This is useful for embedding this plot within a larger figure arrangement. Default is None.

  • title (str, optional) – A custom title for the plot. If None, a default title is generated using the value of t_slice. Default is None.

  • cmap (str, optional) – The name of the Matplotlib colormap to use for the contour plot. Default is 'viridis'.

  • colorbar_label (str, optional) – The text label for the color bar. Default is 'Hydraulic Head (h)'.

  • save_path (str, optional) – If provided, the path (including filename and extension) where the generated plot will be saved. This is only active when the function creates its own figure (i.e., when ax is None). Default is None.

  • show_plot (bool, optional) – If True, calls plt.show() to display the plot. This is only active when the function creates its own figure. Default is True.

  • **contourf_kwargs (any) – Additional keyword arguments that are passed directly to the matplotlib.pyplot.contourf function. This allows for advanced customization (e.g., levels=20, extend='both').

Returns:

  • ax (matplotlib.axes.Axes) – The Matplotlib Axes object on which the contour plot was drawn.

  • contour (matplotlib.cm.ScalarMappable) – The contour plot object, which can be used for further customizations, such as modifying the color bar.

Return type:

tuple[Axes, _ScalarMappable]

See also

geoprior.models.pinn.PiTGWFlow

The PINN model this function is designed to visualize.

Notes

The core mechanism of this function involves creating a 2D meshgrid of \((x, y)\) coordinates. These grid points are then “flattened” into a long list of points, as the PINN model expects a batch of individual coordinates for prediction, not a grid.

The prediction process is as follows:

  1. A grid of shape (resolution, resolution) is created for \(x\) and \(y\).

  2. These grids are reshaped into column vectors of shape (resolution*resolution, 1).

  3. A time vector of the same shape, filled with t_slice, is created.

  4. The model’s .predict() method is called on these flat tensors.

  5. The resulting flat prediction vector is reshaped back to the original (resolution, resolution) grid shape for plotting.

If a custom ax is provided, the user is responsible for calling plt.show() or saving the parent figure.
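
The five prediction steps above can be sketched with NumPy alone. Here predict_on_grid and analytic_head are hypothetical names; analytic_head stands in for model.predict and uses the same analytical function as the mock model in the examples.

```python
import numpy as np


def predict_on_grid(predict_fn, t_slice, x_bounds, y_bounds, resolution=100):
    """Evaluate predict_fn on a flattened (x, y) grid at a fixed time."""
    # Step 1: build the (resolution, resolution) grids.
    x = np.linspace(x_bounds[0], x_bounds[1], resolution)
    y = np.linspace(y_bounds[0], y_bounds[1], resolution)
    X, Y = np.meshgrid(x, y)
    # Step 2: flatten to column vectors of shape (resolution*resolution, 1).
    x_flat = X.reshape(-1, 1)
    y_flat = Y.reshape(-1, 1)
    # Step 3: a time vector of the same shape, filled with t_slice.
    t_flat = np.full_like(x_flat, t_slice)
    # Step 4: predict on the flat batch of points.
    h_flat = predict_fn({"t": t_flat, "x": x_flat, "y": y_flat})
    # Step 5: reshape back to the grid for plotting.
    return X, Y, np.asarray(h_flat).reshape(X.shape)


def analytic_head(inputs):
    """Stand-in for model.predict (an assumption for this sketch)."""
    return (np.sin(np.pi * inputs["x"]) * np.cos(np.pi * inputs["y"])
            * np.exp(-inputs["t"]))


X, Y, H = predict_on_grid(analytic_head, 0.5, (-1, 1), (-1, 1), resolution=50)
```

The arrays X, Y, and H have matching shapes, so they can be passed directly to a Matplotlib contourf call.
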

Examples

>>> import numpy as np
>>> import tensorflow as tf
>>> import matplotlib.pyplot as plt
>>> # This is a mock model for demonstration purposes.
>>> # In practice, you would use a trained PiTGWFlow model.
>>> class MockPINN(tf.keras.Model):
...     def call(self, inputs):
...         # A simple analytical function for demonstration
...         t, x, y = inputs['t'], inputs['x'], inputs['y']
...         return tf.sin(np.pi * x) * tf.cos(np.pi * y) * tf.exp(-t)
...
>>> mock_model = MockPINN()

1. Simple Plotting Example

This example creates a single plot and saves it to a file.

>>> ax, contour = plot_hydraulic_head(
...     model=mock_model,
...     t_slice=0.5,
...     x_bounds=(-1, 1),
...     y_bounds=(-1, 1),
...     resolution=50,
...     save_path="hydraulic_head_t0.5.png",
...     show_plot=False  # Do not display interactively
... )
Plot saved to hydraulic_head_t0.5.png

2. Advanced Example with Subplots

This example shows how to use the ax parameter to draw the solution at two different times side-by-side in one figure.

>>> fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(14, 6))
>>> fig.suptitle('Hydraulic Head at Different Times', fontsize=16)
...
>>> # Plot solution at t = 0.1
>>> plot_hydraulic_head(
...     model=mock_model, t_slice=0.1, x_bounds=(-1, 1),
...     y_bounds=(-1, 1), ax=ax1, show_plot=False
... )
...
>>> # Plot solution at t = 1.0
>>> plot_hydraulic_head(
...     model=mock_model, t_slice=1.0, x_bounds=(-1, 1),
...     y_bounds=(-1, 1), ax=ax2, show_plot=False
... )
...
>>> plt.tight_layout(rect=[0, 0, 1, 0.96])
>>> plt.show()