Diagnostics

This gallery focuses on workflow diagnostics, split credibility, training behavior, tuning interpretation, and physics-consistency checks in GeoPrior.

The forecasting gallery explains how forecast outputs are built and read, and the uncertainty gallery explains how probabilistic forecasts are calibrated and interpreted. The pages collected here address a different practical question:

How should a user check whether the workflow itself is behaving sensibly before trusting later forecast and physics results?

The emphasis is therefore on diagnostic credibility. These examples show how GeoPrior turns preprocessing logic, training histories, tuning artifacts, and physics summaries into readable workflow diagnostics such as:

  • group-validity masks,

  • holdout split designs,

  • training-curve summaries,

  • hyperparameter tuning summaries,

  • bridge diagnostics from training to physics inspection.

In other words, this gallery is about checking the workflow. It does not train the final model, and it does not yet produce the final forecast or physics figures.

Module guide

| Module | Main output | Purpose |
| --- | --- | --- |
| holdout_group_masks.py | Group-validity masks | Compute which spatial groups are valid for training and which remain usable only for forecasting, then filter the raw table accordingly. |
| spatial_block_holdout.py | Holdout split design | Compare random and spatial-block train/validation/test group splits and explain why spatial-block holdout is often more credible for geospatial forecasting. |
| plot_stage1_data_checks.py | Stage-1 data-validity and split diagnostics | Check which spatial groups are valid for training or only for forecasting, filter the raw table accordingly, and compare random versus spatial-block holdout credibility. |
| plot_stage2_training_curves.py | Training-history diagnostics | Read Stage-2 loss curves, physics-loss components, validation behavior, and warmup / scaling controls. |
| plot_stage3_tuning_summary.py | Tuning-summary diagnostics | Inspect Stage-3 hyperparameter search results, top trials, search progression, and parameter-versus-score structure. |
| physics_diagnostic_bridge.py | Physics bridge diagnostics | Connect training diagnostics to later physics inspection using timescale consistency, field distributions, residual summaries, and payload-based diagnostics. |
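As a rough illustration of the holdout_group_masks.py entry in the module guide, a group-validity check can be sketched in plain Python. Everything below is an assumption made for illustration and is not GeoPrior's API: the function names, the (group_id, year) row layout, and the rule that a group needs at least time_steps + horizon distinct years to be usable for training but only time_steps to be usable for forecasting.

```python
from collections import defaultdict


def group_validity_masks(rows, time_steps, horizon):
    """Classify each spatial group by its number of distinct years.

    Hypothetical rule (an assumption, not GeoPrior's documented logic):
    training needs history plus targets, forecasting needs history only.
    """
    years_by_group = defaultdict(set)
    for group_id, year in rows:
        years_by_group[group_id].add(year)

    masks = {}
    for group_id, years in years_by_group.items():
        n_years = len(years)
        masks[group_id] = {
            "valid_for_train": n_years >= time_steps + horizon,
            "valid_for_forecast": n_years >= time_steps,
        }
    return masks


def filter_rows(rows, masks, key):
    """Keep only the rows whose group passes the chosen mask."""
    return [row for row in rows if masks[row[0]][key]]


# Toy raw table: group A has 8 years, B has 5 years, C has 1 year.
rows = (
    [("A", year) for year in range(2015, 2023)]
    + [("B", year) for year in range(2018, 2023)]
    + [("C", 2022)]
)

masks = group_validity_masks(rows, time_steps=5, horizon=3)
train_rows = filter_rows(rows, masks, "valid_for_train")
forecast_rows = filter_rows(rows, masks, "valid_for_forecast")
```

In this toy table, group A passes both masks, group B is usable only for forecasting, and group C is dropped from both filtered tables.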

Reading path

A useful way to move through this gallery is to follow the logic of a complete workflow check:

  1. begin by checking whether the data groups are valid and whether the holdout split is credible,

  2. inspect Stage-2 training behavior,

  3. inspect Stage-3 tuning behavior,

  4. finish with the bridge from optimization diagnostics to physics diagnostics.

That is why the examples are grouped by workflow-diagnostic purpose rather than only by utility or plotting function.
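The first step of this path, judging whether a holdout split is credible, can be sketched without any GeoPrior code. The example below is a minimal illustration under stated assumptions: groups live on a regular grid, blocks are square cells of side block_size, and the "shared block fraction" is a hypothetical leakage proxy invented here (the share of test groups whose block also contains a training group). None of these names come from the library.

```python
import random
from math import floor


def block_id(x, y, block_size):
    """Map a coordinate to the square block that contains it."""
    return (floor(x / block_size), floor(y / block_size))


def split_units(units, val_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle units reproducibly and label them train/val/test."""
    ordered = sorted(units)
    random.Random(seed).shuffle(ordered)
    n_test = max(1, round(len(ordered) * test_frac))
    n_val = max(1, round(len(ordered) * val_frac))
    labels = {}
    for i, unit in enumerate(ordered):
        if i < n_test:
            labels[unit] = "test"
        elif i < n_test + n_val:
            labels[unit] = "val"
        else:
            labels[unit] = "train"
    return labels


def random_split(coords, **kwargs):
    """Split individual groups, ignoring where they are."""
    return split_units(coords.keys(), **kwargs)


def spatial_block_split(coords, block_size, **kwargs):
    """Split whole blocks, so neighbouring groups share a label."""
    blocks = {g: block_id(x, y, block_size) for g, (x, y) in coords.items()}
    block_labels = split_units(set(blocks.values()), **kwargs)
    return {g: block_labels[b] for g, b in blocks.items()}


def shared_block_fraction(labels, coords, block_size):
    """Share of test groups whose block also holds a training group."""
    train_blocks = {
        block_id(*coords[g], block_size)
        for g, split in labels.items()
        if split == "train"
    }
    test_groups = [g for g, split in labels.items() if split == "test"]
    if not test_groups:
        return 0.0
    hits = sum(block_id(*coords[g], block_size) in train_blocks for g in test_groups)
    return hits / len(test_groups)


# 100 groups on a 10 x 10 grid, grouped into 2 x 2 blocks.
coords = {10 * i + j: (float(i), float(j)) for i in range(10) for j in range(10)}

rand = random_split(coords, seed=0)
spatial = spatial_block_split(coords, block_size=2.0, seed=0)

random_leak = shared_block_fraction(rand, coords, block_size=2.0)
spatial_leak = shared_block_fraction(spatial, coords, block_size=2.0)
```

By construction the spatial-block split never places a test group in a block that also contains training groups, so its proxy is zero, while a random split usually leaves test groups next to training neighbours. That contrast is the intuition behind calling spatial-block holdout more credible for geospatial forecasting.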

Why this separation matters

This gallery deliberately keeps four concerns distinct:

  • Stage-1 data and split checks,

  • Stage-2 optimization diagnostics,

  • Stage-3 tuning diagnostics,

  • bridge diagnostics into physics inspection.

That separation makes the workflow easier to understand. It also helps users distinguish between:

  • helpers that validate which groups are usable,

  • helpers that define the credibility of the holdout split,

  • diagnostic summaries that explain how training and tuning behaved,

  • and bridge artifacts that prepare later physics inspection.
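To make the idea of a "diagnostic summary that explains how training behaved" concrete, here is a small sketch of a training-history summary. It assumes a Keras-style history dictionary with "loss" and "val_loss" lists, one value per epoch. That layout, the function name, and the chosen summary fields are assumptions for illustration, not the format GeoPrior writes.

```python
def summarize_history(history, monitor="val_loss"):
    """Summarize a training history: best epoch and train/validation gap."""
    val = history[monitor]
    train = history["loss"]
    # Index of the epoch with the lowest monitored value.
    best_epoch = min(range(len(val)), key=val.__getitem__)
    return {
        "best_epoch": best_epoch,
        "best_val": val[best_epoch],
        # A large positive gap at the end hints at overfitting.
        "final_gap": val[-1] - train[-1],
        # How long training continued without improving the monitor.
        "epochs_since_best": len(val) - 1 - best_epoch,
    }


# Toy history: training loss keeps falling, validation loss turns upward.
history = {
    "loss": [1.00, 0.60, 0.40, 0.30, 0.25],
    "val_loss": [1.10, 0.70, 0.55, 0.58, 0.62],
}

summary = summarize_history(history)
```

Here the validation loss is lowest at epoch index 2 and then rises while the training loss keeps falling, which is the kind of pattern a training-curve page is meant to help a reader spot.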

Notes

  • These examples are intentionally compact and lesson-oriented.

  • The pages in this section are diagnostics-first: they may print tables, split summaries, or metric summaries, but their main purpose is to explain how the workflow should be checked before later forecasts or physics figures are trusted.

  • A useful rule of thumb is:

    • forecasting/ explains how forecast outputs are built, read, and evaluated,

    • uncertainty/ explains calibration, reliability, and event-risk interpretation,

    • diagnostics/ explains workflow validity, training behavior, tuning behavior, and the bridge into physics diagnostics,

    • tables_and_summaries/ builds reusable analysis artifacts.

  • A practical reading sequence is:

    • first validate the groups,

    • then inspect the holdout strategy,

    • then inspect Stage-2 training curves,

    • then inspect Stage-3 tuning summaries,

    • then move to the physics diagnostic bridge before reading later model-inspection or figure-generation pages.
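The tuning step in this sequence usually comes down to two questions: which trials scored best, and how quickly the search improved. A minimal sketch follows, assuming each trial is a plain dictionary with a "score" where lower is better. The record layout, the field names, and both helper functions are hypothetical and are not taken from GeoPrior's tuning artifacts.

```python
def top_trials(trials, k=3, lower_is_better=True):
    """Return the k best trials ranked by score."""
    ranked = sorted(trials, key=lambda t: t["score"], reverse=not lower_is_better)
    return ranked[:k]


def running_best(trials):
    """Best (lowest) score seen so far after each trial, in search order."""
    best = float("inf")
    progression = []
    for trial in trials:
        best = min(best, trial["score"])
        progression.append(best)
    return progression


# Toy search log; values are made up for illustration.
trials = [
    {"trial_id": 0, "learning_rate": 1e-2, "score": 0.42},
    {"trial_id": 1, "learning_rate": 1e-3, "score": 0.31},
    {"trial_id": 2, "learning_rate": 5e-4, "score": 0.35},
    {"trial_id": 3, "learning_rate": 1e-4, "score": 0.29},
]

best_two = top_trials(trials, k=2)
progress = running_best(trials)
```

A running-best curve that flattens early suggests the search has saturated, while one that is still dropping at the last trial suggests the budget may have been too small.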

Group-validity masks for Stage-1 diagnostics

Stage-2 training curves and physics diagnostics

Physics diagnostics bridge: from evaluate_physics to payload inspection

Compare independent regression pairs with plot_r2_in

Understand regression agreement with plot_r2

Spatial-block holdout as a Stage-1 diagnostic

Stage-1 data checks with group masks and holdout splitting

Stage-2 training curves and physics-aware learning dynamics

Stage-3 tuning summary and best-trial diagnostics
