geoprior.utils.audit_utils

Audit helpers for stage handshakes and scaling artifacts.

Functions

`audit_stage1_scaling`(*, df_train, ...[, ...])	Stage-1 audit of coordinate and feature scaling (raw and model-fed coords, coord scaler, SI channels, targets, feature split); saves a machine-readable JSON if save_dir is provided.
`audit_stage1_stage2_coord_consistency`(*, ...)	Cross-check coordinate semantics between Stage-1 scaler and Stage-2 NPZ coords.
`audit_stage2_handshake`(*, X_train, X_val, ...)
`audit_stage3_run`(*, manifest_path, manifest, ...)	Stage-3 audit: tuned artifacts + eval sanity.
`resolve_audit_stages`(audit_stages, *[, ...])	Resolve cfg["AUDIT_STAGES"] into a canonical set like {"stage1","stage2"}.
`should_audit`(audit_stages, *, stage[, default])	Convenience: should we audit this stage?

geoprior.utils.audit_utils.resolve_audit_stages(audit_stages, *, known=('stage1', 'stage2', 'stage3'), default=None)

Resolve cfg["AUDIT_STAGES"] into a canonical set like {"stage1", "stage2"}.

Parameters:

audit_stages (Any)
known (Iterable[str])
default (Any)

Return type:

set[str]

geoprior.utils.audit_utils.should_audit(audit_stages, *, stage, default=None)

Convenience check: returns whether the given stage should be audited.

Parameters:

audit_stages (Any)
stage (str)
default (Any)

Return type:

bool
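
The two helpers above suggest a resolve-then-gate pattern. The block below is a minimal sketch of that pattern, not the library's implementation: `resolve_stages_sketch` and `should_audit_sketch` are hypothetical stand-ins, and the accepted input forms (None, a bool, a comma-separated string, "all"/"*", or an iterable of names) are assumptions made for illustration. The source documents only the parameter types and the return types.

```python
from typing import Any, Iterable, Set

# Mirrors the documented default for `known`.
KNOWN_STAGES = ("stage1", "stage2", "stage3")


def resolve_stages_sketch(
    audit_stages: Any,
    *,
    known: Iterable[str] = KNOWN_STAGES,
    default: Any = None,
) -> Set[str]:
    """Hypothetical stand-in: normalize a config value into a set of stage names."""
    known_set = {str(k).lower() for k in known}
    if audit_stages is None:
        # Assumption: fall back to `default`; audit nothing if that is None too.
        if default is None:
            return set()
        return resolve_stages_sketch(default, known=known_set)
    if isinstance(audit_stages, bool):
        # Assumption: True means "all known stages", False means "none".
        return set(known_set) if audit_stages else set()
    if isinstance(audit_stages, str):
        text = audit_stages.strip().lower()
        if text in ("all", "*"):
            return set(known_set)
        items = [part.strip() for part in text.split(",")]
    else:
        items = [str(part).strip().lower() for part in audit_stages]
    # Unknown names are silently dropped in this sketch.
    return {item for item in items if item in known_set}


def should_audit_sketch(audit_stages: Any, *, stage: str, default: Any = None) -> bool:
    """Hypothetical stand-in: is `stage` in the resolved set?"""
    return stage.lower() in resolve_stages_sketch(audit_stages, default=default)


cfg = {"AUDIT_STAGES": "stage1, stage2"}
stages = resolve_stages_sketch(cfg.get("AUDIT_STAGES"))
run_stage3_audit = should_audit_sketch(cfg.get("AUDIT_STAGES"), stage="stage3")
```

Resolving once into a set keeps each stage script's gate to a single membership test, which is presumably why the module exposes both a resolver and a convenience predicate.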

geoprior.utils.audit_utils.audit_stage1_scaling(*, df_train, inputs_train, targets_train, coord_scaler=None, coord_ranges=None, coord_mode='auto', coords_in_degrees=False, coord_epsg_used=None, coord_x_col_used='x', coord_y_col_used='y', x_col_used='x', y_col_used='y', time_col_used='t', normalize_coords=True, keep_coords_raw=False, shift_raw_coords=False, subs_model_col=None, gwl_dyn_col=None, gwl_target_col=None, h_field_col=None, dynamic_features=None, static_features=None, future_features=None, scaled_ml_numeric_cols=None, main_scaler_path=None, scaler_info=None, save_dir=None, table_width=110, title_prefix='COORDINATE + FEATURE SCALING AUDIT (Stage-1)', city='Unknown', model_name='Model', sample_rows=5, log_fn=None)

Stage-1 audit. Reports:

raw df_train coordinate stats (t/x/y) plus heuristic units
model-fed coordinate stats from inputs_train["coords"] (flattened)
coord scaler min/max plus coord_ranges
SI channel sanity for physics columns (if present)
target array sanity
split of features: scaled ML vs __si vs other

Saves a machine-readable JSON if save_dir is provided.

Parameters:

inputs_train (dict[str, Any])
targets_train (dict[str, Any])
coord_scaler (Any)
coord_ranges (dict[str, float] | None)
coord_mode (str)
coords_in_degrees (bool)
coord_epsg_used (Any)
coord_x_col_used (str)
coord_y_col_used (str)
x_col_used (str)
y_col_used (str)
time_col_used (str)
normalize_coords (bool)
keep_coords_raw (bool)
shift_raw_coords (bool)
subs_model_col (str | None)
gwl_dyn_col (str | None)
gwl_target_col (str | None)
h_field_col (str | None)
dynamic_features (Iterable[str] | None)
static_features (Iterable[str] | None)
future_features (Iterable[str] | None)
scaled_ml_numeric_cols (Iterable[str] | None)
main_scaler_path (str | None)
scaler_info (dict | None)
save_dir (str | None)
table_width (int)
title_prefix (str)
city (str)
model_name (str)
sample_rows (int)

Return type:

str | None
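
To illustrate the "scaled ML vs __si vs other" feature split mentioned above, here is a minimal sketch. It assumes that SI channels are identified by a `__si` name suffix and that scaled ML columns are exactly those listed in `scaled_ml_numeric_cols`. The helper name `split_features_sketch` and the example column names are made up for illustration and do not come from the library.

```python
from typing import Dict, Iterable, List, Optional


def split_features_sketch(
    feature_names: Iterable[str],
    scaled_ml_numeric_cols: Optional[Iterable[str]] = None,
    si_suffix: str = "__si",  # assumption: SI channels carry this suffix
) -> Dict[str, List[str]]:
    """Hypothetical helper: bucket feature names into three groups."""
    scaled = set(scaled_ml_numeric_cols or [])
    groups: Dict[str, List[str]] = {"scaled_ml": [], "si": [], "other": []}
    for name in feature_names:
        if name.endswith(si_suffix):
            # SI-unit physics channels are checked before the scaled list.
            groups["si"].append(name)
        elif name in scaled:
            groups["scaled_ml"].append(name)
        else:
            groups["other"].append(name)
    return groups


# Made-up column names, for illustration only.
split = split_features_sketch(
    ["rainfall_mm", "head_m__si", "lithology_class"],
    scaled_ml_numeric_cols=["rainfall_mm"],
)
```

A split like this makes it easy to spot a physics column that was accidentally scaled, or an ML feature that reached the model unscaled.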

geoprior.utils.audit_utils.audit_stage2_handshake(*, X_train, X_val, y_train, y_val, time_steps, forecast_horizon, mode, dyn_names, fut_names, sta_names, coord_scaler=None, sk_final, save_dir, table_width=100, title_prefix='STAGE-2 HANDSHAKE AUDIT', city='Unkown', model_name='Model', log_fn=None)[source]#

Parameters:

X_train (dict)
X_val (dict)
y_train (dict)
y_val (dict)
time_steps (int)
forecast_horizon (int)
mode (str)
dyn_names (list)
fut_names (list)
sta_names (list)
sk_final (dict)
save_dir (str)
table_width (int)
title_prefix (str)

geoprior.utils.audit_utils.audit_stage1_stage2_coord_consistency(*, X_train, coord_scaler, sk_final, time_steps, forecast_horizon, time_units='year', save_dir=None, table_width=110, title_prefix='STAGE-1 <-> STAGE-2 COORD CONSISTENCY', city='Unknown', model_name='Model', log_fn=None)

Cross-check coordinate semantics between Stage-1 scaler and Stage-2 NPZ coords.

Key facts for GeoPrior Stage-2:

coords have shape (N, H, 3) and correspond to the target horizon times, not the full dynamic history, so t has exactly H unique values.
x/y typically cover the full normalized [0, 1] range when the data has spatial coverage (often min=0 and max=1).

This audit:

computes normalized min/max for t/x/y in X_train["coords"]
derives implied raw min/max using MinMaxScaler data_min_ / data_max_
checks that the implied raw ranges are within the Stage-1 scaler bounds
checks t_unique count == H and t_raw_unique spacing (≈1 year)
provides a UTM plausibility hint if the EPSG code is UTM-like
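
The checks listed above can be sketched in plain Python. This is an illustration under stated assumptions, not the audit's actual code: coords are a nested list of shape (N, H, 3) with columns ordered (t, x, y), and the scaler is a MinMaxScaler with the default feature_range of (0, 1), so the implied raw value is data_min + normalized * (data_max - data_min). The function name `coord_consistency_sketch`, its tolerances, and the example numbers are hypothetical.

```python
from typing import Any, Dict, Sequence


def coord_consistency_sketch(
    coords: Sequence[Sequence[Sequence[float]]],  # shape (N, H, 3), columns (t, x, y)
    data_min: Sequence[float],  # stands in for scaler.data_min_
    data_max: Sequence[float],  # stands in for scaler.data_max_
    forecast_horizon: int,
    tol: float = 1e-6,
) -> Dict[str, Any]:
    """Hypothetical helper: normalized range, implied raw range, and t checks."""
    report: Dict[str, Any] = {}
    for k, name in enumerate(("t", "x", "y")):
        values = [step[k] for sample in coords for step in sample]
        norm_min, norm_max = min(values), max(values)
        span = data_max[k] - data_min[k]
        # Invert min-max scaling (valid for feature_range=(0, 1)).
        raw_min = data_min[k] + norm_min * span
        raw_max = data_min[k] + norm_max * span
        report[name] = {
            "norm_min": norm_min,
            "norm_max": norm_max,
            "raw_min": raw_min,
            "raw_max": raw_max,
            "within_bounds": data_min[k] - tol <= raw_min
            and raw_max <= data_max[k] + tol,
        }
    t_span = data_max[0] - data_min[0]
    t_raw_unique = sorted(
        {round(data_min[0] + step[0] * t_span, 6) for sample in coords for step in sample}
    )
    gaps = [b - a for a, b in zip(t_raw_unique, t_raw_unique[1:])]
    report["t_unique_count"] = len(t_raw_unique)
    report["t_count_matches_horizon"] = len(t_raw_unique) == forecast_horizon
    # Spacing check assumes time_units='year', i.e. consecutive steps about 1 apart.
    report["t_spacing_ok"] = all(abs(gap - 1.0) < 1e-3 for gap in gaps)
    return report


# Made-up example: scaler fitted on years 2015..2025, horizon of 3 target years.
example_coords = [
    [[0.8, 0.0, 0.0], [0.9, 0.0, 0.0], [1.0, 0.0, 0.0]],
    [[0.8, 1.0, 1.0], [0.9, 1.0, 1.0], [1.0, 1.0, 1.0]],
]
report = coord_consistency_sketch(
    example_coords,
    data_min=[2015.0, 500000.0, 4000000.0],
    data_max=[2025.0, 510000.0, 4010000.0],
    forecast_horizon=3,
)
```

In this example the normalized t values map back to three consecutive years, so both the unique-count and the spacing checks pass, while x and y span the full scaler range.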

Parameters:

X_train (dict)
sk_final (dict)
time_steps (int)
forecast_horizon (int)
time_units (str)
save_dir (str | None)
table_width (int)
title_prefix (str)
city (str)
model_name (str)

geoprior.utils.audit_utils.audit_stage3_run(*, manifest_path, manifest, cfg, fixed_params, best_hps, run_dir, best_model_path, best_weights_path, use_tf_savedmodel, quantiles, forecast_horizon, mode, pred_shapes=None, eval_results=None, phys_diag=None, calibrator_factors=None, forecast_csv_eval=None, forecast_csv_future=None, metrics_json_path=None, physics_payload_path=None, save_dir=None, table_width=100, title_prefix='STAGE-3 AUDIT', city='Unknown', model_name='Model', log_fn=None)

Stage-3 audit: tuned artifacts + eval sanity.

Parameters:

manifest_path (str | None)
manifest (dict[str, Any])
cfg (dict[str, Any])
fixed_params (dict[str, Any])
best_hps (dict[str, Any] | None)
run_dir (str)
best_model_path (str | None)
best_weights_path (str | None)
use_tf_savedmodel (bool)
quantiles (Any)
forecast_horizon (int)
mode (str)
pred_shapes (dict[str, Any] | None)
eval_results (dict[str, Any] | None)
phys_diag (dict[str, Any] | None)
calibrator_factors (Any)
forecast_csv_eval (str | None)
forecast_csv_future (str | None)
metrics_json_path (str | None)
physics_payload_path (str | None)
save_dir (str | None)
table_width (int)
title_prefix (str)
city (str)
model_name (str)

Return type:

str | None