Workflow
========

GeoPrior-v3 is designed as a **staged scientific workflow**
rather than as a single monolithic command or a model class
used in isolation.

This design is intentional. In physics-guided geohazard
modeling, many problems do not come from model architecture
alone. They come from the interaction between:

- data preparation,
- scaling and unit conventions,
- feature assembly,
- physics-aware configuration,
- training or inference behavior,
- diagnostics,
- export logic,
- and reproducibility requirements.

For that reason, GeoPrior-v3 organizes work into explicit
stages and treats configuration, artifacts, and audits as
first-class workflow objects.

Why the workflow is staged
--------------------------

A staged workflow is useful because it makes scientific and
technical problems easier to isolate.

Instead of pushing everything through one long opaque run,
GeoPrior-v3 encourages a stepwise progression in which each
stage has a clear role, a clear set of inputs, and a clear
set of outputs. This helps with:

- debugging path or data issues early,
- validating units and scaling before physics-heavy runs,
- separating preprocessing from modeling,
- keeping intermediate artifacts inspectable,
- making experiments easier to reproduce,
- reducing the risk of silent workflow drift.

This is especially important for a framework like GeoPrior-v3,
where a numerically successful run is not automatically a
scientifically trustworthy run.

Core workflow philosophy
------------------------

The staged workflow is built around a few guiding ideas.

**1. Configuration drives the run**

GeoPrior-v3 favors explicit configuration over hidden
assumptions. A run should be defined primarily by its
configuration and inputs, not by scattered code edits.

**2. Stages communicate through artifacts**

Instead of passing everything in memory across an opaque
pipeline, stages typically hand off structured artifacts such
as tensors, metadata, manifests, scaling contracts, logs,
forecasts, or figures.

**3. Audits matter**

Shape checks, scaling checks, unit checks, and handshake
audits are not side details. They are part of the scientific
workflow and help catch common silent failures early.
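
As a minimal sketch of what such an audit could look like, the
example below compares a declared contract against what a stage
actually received. The function ``audit_handshake``, the contract
layout, and the ``target`` entry are hypothetical and are not
part of the GeoPrior-v3 API.

.. code-block:: python

   def audit_handshake(contract, received):
       """Return a list of mismatches (an empty list means OK).

       contract: name -> {"shape": tuple (None = any size), "unit": str}
       received: name -> {"shape": tuple, "unit": str}
       """
       problems = []
       for name, spec in contract.items():
           if name not in received:
               problems.append(f"{name}: missing")
               continue
           got = received[name]
           # Shape check: ranks must match; None is a wildcard.
           same_rank = len(got["shape"]) == len(spec["shape"])
           dims_ok = same_rank and all(
               want is None or want == have
               for want, have in zip(spec["shape"], got["shape"])
           )
           if not dims_ok:
               problems.append(
                   f"{name}: shape {got['shape']} != {spec['shape']}"
               )
           # Unit check: plain string comparison, no conversion.
           if got["unit"] != spec["unit"]:
               problems.append(
                   f"{name}: unit {got['unit']!r} != {spec['unit']!r}"
               )
       return problems


   # Hypothetical contract: any batch size, 12 steps, 1 channel, metres.
   contract = {"target": {"shape": (None, 12, 1), "unit": "m"}}
   received = {"target": {"shape": (64, 12, 1), "unit": "mm"}}
   problems = audit_handshake(contract, received)
   print(problems)

Returning a list of problems rather than raising on the first
one lets a stage report every mismatch in a single pass.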

**4. Reproducibility is part of the design**

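A run should leave behind enough configuration, artifacts, and
logs to be repeated and checked later. The workflow therefore
supports not only model execution, but also diagnostics,
export, plotting, and figure-oriented research pipelines.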

How the five-stage view is organized
------------------------------------

In the current documentation, GeoPrior-v3 is organized into
a **five-stage workflow view**:

1. **Stage-1**
   prepares the initial workflow state, including early data
   processing, feature assembly, and stage-ready inputs.

2. **Stage-2**
   moves into the modeling-facing pipeline, including
   training-ready preparation, scaling or handshake logic,
   and model bring-up.

3. **Stage-3**
   focuses on downstream evaluation-oriented workflow steps,
   which may include diagnostics, calibration, or related
   post-training analysis depending on the run design.

4. **Stage-4**
   covers inference- or build-oriented workflow actions and
   the generation of deployable or analysis-ready outputs.

5. **Stage-5**
   completes the workflow with final export, plotting,
   reporting, or reproducibility-facing outputs.

.. note::

   Earlier internal or inherited documentation may still
   describe a narrower **Stage-1 → Stage-2** pipeline view.
   The current GeoPrior-v3 documentation expands that into a
   clearer five-stage structure so the full application
   workflow can be documented consistently as the project
   evolves.

What stays constant across stages
---------------------------------

Although the exact role of each stage may differ, the same
workflow principles apply throughout.

Across the stages, GeoPrior-v3 generally expects:

- explicit configuration,
- traceable artifacts,
- inspectable logs and outputs,
- stable naming conventions,
- stage-local validation,
- and clear handoff into the next step.

This means that each stage should be understandable not only
as code execution, but also as part of a larger scientific
contract: what the stage consumes, what it produces, and which
checks its outputs have passed.

Typical stage handoff artifacts
-------------------------------

A GeoPrior-v3 stage may read from or write to several kinds
of workflow artifacts.

Common examples include:

- processed arrays or tensors,
- exported NPZ bundles,
- scalers or encoders,
- metadata manifests,
- scaling or unit contracts,
- diagnostics JSON files,
- trained model bundles,
- forecast CSV outputs,
- figures and plot assets.

The exact files depend on the stage and application mode, but
the principle is the same: **each stage should leave behind a
traceable artifact boundary**.
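
One way to make such a boundary concrete is to write a small
manifest next to the stage outputs. The sketch below is an
assumption for illustration only: the ``write_manifest`` helper,
the ``manifest.json`` file name, and the manifest keys are
hypothetical and do not describe the files GeoPrior-v3 actually
writes.

.. code-block:: python

   import hashlib
   import json
   import tempfile
   from pathlib import Path


   def write_manifest(stage_dir, run_id, stage):
       """Record each file in stage_dir with its size and SHA-256."""
       stage_dir = Path(stage_dir)
       entries = {}
       for path in sorted(stage_dir.iterdir()):
           # Skip the manifest itself so reruns stay consistent.
           if path.is_file() and path.name != "manifest.json":
               data = path.read_bytes()
               entries[path.name] = {
                   "bytes": len(data),
                   "sha256": hashlib.sha256(data).hexdigest(),
               }
       manifest = {"run_id": run_id, "stage": stage, "artifacts": entries}
       out = stage_dir / "manifest.json"
       out.write_text(json.dumps(manifest, indent=2))
       return manifest


   # Demo in a temporary directory with one stand-in artifact.
   with tempfile.TemporaryDirectory() as tmp:
       (Path(tmp) / "forecast.csv").write_bytes(b"t,value\n0,1.5\n")
       manifest = write_manifest(tmp, run_id="demo-001", stage="stage1")
       manifest_written = (Path(tmp) / "manifest.json").exists()

Recording a checksum per file gives a later stage a cheap way to
confirm that it is reading exactly what the earlier stage wrote.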

Why artifact boundaries matter
------------------------------

Artifact boundaries are important because they make the
workflow inspectable.

For example, instead of assuming that the next stage received
the right data, you can inspect:

- whether a tensor export exists,
- whether scaling metadata was written,
- whether the output timestamp matches the current run,
- whether diagnostics indicate a mismatch,
- whether the next stage is using the intended manifest.

That is much safer than treating the workflow as a black box.
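
The checks above can be sketched as a small pre-flight function
that a later stage runs before consuming anything. Everything
named here is hypothetical (``check_boundary``, the
``manifest.json`` layout), and the sketch matches runs by a run
identifier rather than by timestamp.

.. code-block:: python

   import hashlib
   import json
   import tempfile
   from pathlib import Path


   def check_boundary(stage_dir, expected_run_id):
       """Return a list of problems found at an artifact boundary."""
       stage_dir = Path(stage_dir)
       manifest_path = stage_dir / "manifest.json"
       if not manifest_path.exists():
           return ["manifest.json is missing"]
       manifest = json.loads(manifest_path.read_text())
       problems = []
       run_id = manifest.get("run_id")
       if run_id != expected_run_id:
           problems.append(
               f"run_id is {run_id!r}, expected {expected_run_id!r}"
           )
       for name, meta in manifest.get("artifacts", {}).items():
           path = stage_dir / name
           if not path.exists():
               problems.append(f"{name}: listed but not on disk")
               continue
           digest = hashlib.sha256(path.read_bytes()).hexdigest()
           if digest != meta["sha256"]:
               problems.append(f"{name}: checksum differs from manifest")
       return problems


   # Demo: build a tiny stage directory, then check it twice.
   with tempfile.TemporaryDirectory() as tmp:
       tmp = Path(tmp)
       payload = b"t,value\n0,1.5\n"
       (tmp / "forecast.csv").write_bytes(payload)
       digest = hashlib.sha256(payload).hexdigest()
       manifest = {
           "run_id": "demo-001",
           "artifacts": {"forecast.csv": {"sha256": digest}},
       }
       (tmp / "manifest.json").write_text(json.dumps(manifest))

       clean = check_boundary(tmp, "demo-001")  # matching run
       stale = check_boundary(tmp, "demo-002")  # different run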

Relationship between configuration and stages
---------------------------------------------

The workflow stages are controlled by configuration.

A properly initialized configuration should define the
practical context of the run, including items such as:

- local paths,
- dataset or case-study identifiers,
- feature and artifact settings,
- runtime toggles,
- output locations,
- stage-specific options.

In practice, this means that users should not think of the
stages as isolated scripts. They should think of them as
**configuration-driven workflow steps** within one coherent
project run.
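
To illustrate the idea of one configuration object shared by
all stages, here is a minimal sketch. The ``RunConfig`` class
and every field name in it are assumptions made for this
example; the real configuration schema is described in
:doc:`configuration`.

.. code-block:: python

   from dataclasses import dataclass, field
   from pathlib import Path


   @dataclass(frozen=True)
   class RunConfig:
       """Hypothetical run configuration; names are illustrative."""

       case_id: str
       data_root: Path
       output_root: Path
       stage_options: dict = field(default_factory=dict)

       def stage_dir(self, stage):
           # One directory per case and stage keeps runs separated.
           return self.output_root / self.case_id / stage

       def options_for(self, stage):
           # Stages without explicit options get an empty mapping.
           return self.stage_options.get(stage, {})


   cfg = RunConfig(
       case_id="demo_case",
       data_root=Path("data"),
       output_root=Path("results"),
       stage_options={"stage1": {"overwrite": False}},
   )
   stage1_dir = cfg.stage_dir("stage1")
   stage2_opts = cfg.options_for("stage2")

Because every stage derives its paths and options from the same
object, changing a run means changing the configuration rather
than editing stage code.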

How a typical run progresses
----------------------------

A typical GeoPrior-v3 run follows this pattern:

1. initialize or review configuration;
2. run the earliest stage that prepares the workflow inputs;
3. inspect generated artifacts and basic diagnostics;
4. continue into modeling-facing stages;
5. inspect results, diagnostics, and exported summaries;
6. move to inference, build, plotting, or reproducibility
   steps as needed.

This pattern is more robust than trying to jump directly to a
late stage before confirming that earlier workflow contracts
were satisfied.
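
A guard against jumping ahead can be sketched in a few lines.
The directory layout (one folder per stage, each holding a
``manifest.json``) and the ``missing_prerequisites`` helper are
assumptions for illustration, not GeoPrior-v3 behavior.

.. code-block:: python

   import tempfile
   from pathlib import Path

   STAGES = ["stage1", "stage2", "stage3", "stage4", "stage5"]


   def missing_prerequisites(run_dir, stage):
       """List earlier stages that left no manifest in run_dir."""
       run_dir = Path(run_dir)
       earlier = STAGES[: STAGES.index(stage)]
       return [
           name
           for name in earlier
           if not (run_dir / name / "manifest.json").exists()
       ]


   # Demo: only Stage-1 has completed in this run directory.
   with tempfile.TemporaryDirectory() as tmp:
       run_dir = Path(tmp)
       (run_dir / "stage1").mkdir()
       (run_dir / "stage1" / "manifest.json").write_text("{}")

       blockers_stage2 = missing_prerequisites(run_dir, "stage2")
       blockers_stage4 = missing_prerequisites(run_dir, "stage4")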

Recommended user mindset
------------------------

When using the workflow, it is best to think in terms of
**progressive validation**.

At each stage, ask:

- Did the stage read the intended config?
- Were the expected inputs found?
- Were outputs written where expected?
- Are shapes, units, and basic summaries plausible?
- Is the next stage reading the correct artifacts?

This mindset helps avoid one of the most common mistakes in
scientific workflow usage: assuming that command completion
automatically means scientific correctness.

How this connects to the CLI
----------------------------

GeoPrior-v3 exposes a command-line workflow surface through
dedicated entry points such as:

- ``geoprior``
- ``geoprior-run``
- ``geoprior-build``
- ``geoprior-plot``
- ``geoprior-init``

These entry points are part of the intended user experience.
The workflow is therefore documented not only as internal
Python code, but also as a command-driven application surface.
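
Before launching a stage, it can help to confirm that these
commands are installed in the active environment. The sketch
below uses only the standard-library ``shutil.which`` and makes
no assumption about the commands' arguments; the helper name
``find_entry_points`` is hypothetical.

.. code-block:: python

   import shutil

   ENTRY_POINTS = [
       "geoprior",
       "geoprior-run",
       "geoprior-build",
       "geoprior-plot",
       "geoprior-init",
   ]


   def find_entry_points(names=ENTRY_POINTS):
       """Map each command to its resolved path, or None if absent."""
       return {name: shutil.which(name) for name in names}


   available = find_entry_points()
   for name, path in available.items():
       print(f"{name:<16} {path or 'not found on PATH'}")

See :doc:`cli` for how each command is actually invoked.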

The stage pages in this section should be read together with:

- :doc:`cli`
- :doc:`configuration`

Those pages explain how the workflow is launched and how the
configuration layer controls it.

How this connects to the scientific foundations
-----------------------------------------------

The workflow is not independent of the scientific design.

In GeoPrior-v3, choices about:

- scaling,
- coordinates,
- units,
- physical residuals,
- and identifiability assumptions

can strongly affect what happens during later stages of the
workflow.

That is why users should not treat the workflow guide as
separate from the scientific foundations. A well-structured
run still depends on well-posed scientific assumptions.

In particular, the following pages become important once the
workflow reaches physics-guided execution:

- :doc:`../scientific_foundations/data_and_units`
- :doc:`../scientific_foundations/scaling`
- :doc:`../scientific_foundations/physics_formulation`
- :doc:`../scientific_foundations/identifiability`

Best practices for working stage by stage
-----------------------------------------

The most reliable way to use GeoPrior-v3 is to move through
the workflow incrementally.

Good practice includes:

- starting from a reviewed configuration;
- running one stage at a time when bringing up a project;
- inspecting artifacts before moving onward;
- keeping runs organized by output directory or manifest;
- avoiding ad hoc code edits when configuration is enough;
- using diagnostics and audits as part of the workflow, not
  as optional extras.

Bad practice includes:

- skipping directly to a late stage without checking earlier
  outputs;
- mixing artifacts from multiple incompatible runs;
- assuming old configs remain valid after workflow changes;
- interpreting forecasts before checking scaling and units.

A compact workflow map
----------------------

The GeoPrior-v3 workflow can be summarized like this:

.. code-block:: text

   initialize config
        ↓
   Stage-1: prepare inputs and early artifacts
        ↓
   Stage-2: bring up modeling-facing workflow
        ↓
   Stage-3: evaluate, diagnose, calibrate, refine
        ↓
   Stage-4: infer, build, or assemble final outputs
        ↓
   Stage-5: export, plot, and support reproducibility

This is a conceptual overview. The exact mechanics of each
stage are described on its dedicated page.

Read the stages next
--------------------

The best next step is to move from this overview into the
individual stage pages and the supporting guides.

.. grid:: 1 1 2 2
   :gutter: 3

   .. grid-item-card:: Stage-1
      :link: stage1
      :link-type: doc
      :class-card: sd-shadow-sm

      Learn how the workflow begins, how initial inputs are
      prepared, and what the first artifact boundary looks
      like.

   .. grid-item-card:: Stage-2
      :link: stage2
      :link-type: doc
      :class-card: sd-shadow-sm

      See how the workflow transitions into model-facing
      execution and stage-to-stage validation.

   .. grid-item-card:: Configuration
      :link: configuration
      :link-type: doc
      :class-card: sd-shadow-sm card--configuration

      Understand the configuration system that controls the
      staged workflow.

   .. grid-item-card:: CLI guide
      :link: cli
      :link-type: doc
      :class-card: sd-shadow-sm card--cli

      Move from the workflow concept into the actual command
      surface.

.. seealso::

   - :doc:`../getting_started/first_project_run`
   - :doc:`configuration`
   - :doc:`cli`
   - :doc:`stage1`
   - :doc:`stage2`
   - :doc:`../scientific_foundations/data_and_units`
   - :doc:`../scientific_foundations/scaling`