.. _cli-build-family:

Build family
============

The **build family** is where you materialize reusable artifacts.

Use this page when your goal is not to run one of the main staged
workflows directly, and not to render a figure, but to **create a data
product** that can be reused later.

In GeoPrior, build commands are used to create things such as:

- merged NPZ payloads,
- compact forecast-ready samples,
- spatially filtered or labeled tables,
- external validation artifacts,
- hotspot and exposure tables,
- ablation and metrics summaries,
- small geospatial side products.

You can invoke these commands either from the root dispatcher:

.. code-block:: bash

   geoprior build [args]

or from the family-specific entry point:

.. code-block:: bash

   geoprior-build [args]

GeoPrior also supports ``make`` as an alias of ``build`` at the root
dispatcher level.

This page is intentionally structured as a **guide first** and a
**reference scaffold second**:

- the table below helps users find the right build command quickly,
- the grouped sections afterwards explain how the family is organized,
- detailed command subsections can be appended gradually as the build
  modules are reviewed in batches.

How to choose a build command
-----------------------------

A practical way to choose a build command is to first decide **what kind
of artifact** you want:

- if you need a compact or derived dataset, start with the **sampling
  and spatial preparation** commands,
- if you need a model-facing NPZ or validation artifact, start with the
  **payload and validation** commands,
- if you need a summary table or downstream analysis product, start with
  the **tables and summaries** commands,
- if you need a lightweight derived boundary, grid, or zone-linked
  output, start with the **geospatial side products** commands.

Many build commands also reuse the shared tabular input layer, so they
can read one or many files, infer or force the input format, support
Excel ``PATH::SHEET`` syntax, and write common tabular outputs in CSV,
TSV, Parquet, Excel, JSON, Feather, or Pickle formats. That shared
reader/writer behavior is centralized in ``geoprior.cli.utils`` rather
than redefined in each command.

Build commands at a glance
--------------------------

.. list-table::
   :header-rows: 1
   :widths: 20 44 24 12

   * - Command
     - Use it when
     - Main outcome
     - Related guide
   * - ``forecast-ready-sample``
     - You want a compact panel sample prepared for forecasting-style
       workflows or demos.
     - Forecast-ready sample table.
     - :doc:`../user_guide/stage1`
   * - ``spatial-sampling``
     - You want a stratified spatial sample from one or more input
       tables.
     - Spatial sample table.
     - :doc:`shared_conventions`
   * - ``batch-spatial-sampling``
     - You want several non-overlapping spatial sampling batches rather
       than one sample.
     - Batched spatial sample outputs.
     - :doc:`shared_conventions`
   * - ``spatial-roi``
     - You want a region-of-interest subset table.
     - ROI-filtered table.
     - :doc:`shared_conventions`
   * - ``spatial-clusters``
     - You want spatial cluster labels added to a table.
     - Cluster-labeled table.
     - :doc:`../user_guide/diagnostics`
   * - ``extract-zones``
     - You want threshold-based zone extraction.
     - Zone extraction table.
     - :doc:`../user_guide/diagnostics`
   * - ``assign-boreholes``
     - You want nearest-city or city-assigned borehole tables.
     - Borehole assignment table.
     - :doc:`../user_guide/inference_and_export`
   * - ``add-zsurf-from-coords``
     - You want to enrich a dataset with ``z_surf`` derived from
       coordinates.
     - ``z_surf``-enriched dataset.
     - :doc:`../scientific_foundations/data_and_units`
   * - ``full-inputs-npz``
     - You want one merged ``full_inputs.npz`` assembled from split
       artifacts.
     - Full merged model-input NPZ.
     - :doc:`../user_guide/stage1`
   * - ``physics-payload-npz``
     - You want a physics payload NPZ for downstream physics-oriented
       workflows.
     - Physics payload NPZ.
     - :doc:`../scientific_foundations/physics_formulation`
   * - ``external-validation-fullcity``
     - You want full-city validation artifacts prepared for external
       validation workflows.
     - Full-city validation artifact set.
     - :doc:`../user_guide/inference_and_export`
   * - ``external-validation-metrics``
     - You want summarized external validation metrics.
     - External validation metrics table.
     - :doc:`../user_guide/inference_and_export`
   * - ``sm3-collect-summaries``
     - You want one combined SM3 summary table from multiple SM3 runs.
     - Combined SM3 summary table.
     - :doc:`run_family`
   * - ``brier-exceedance``
     - You want exceedance-oriented Brier results computed into a table.
     - Exceedance Brier table.
     - :doc:`../user_guide/inference_and_export`
   * - ``hotspots``
     - You want hotspot outputs computed from forecast or risk-style
       data.
     - Hotspot outputs.
     - :doc:`../user_guide/diagnostics`
   * - ``hotspots-summary``
     - You want an already computed hotspot result condensed into a
       summary table.
     - Hotspot summary table.
     - :doc:`../user_guide/diagnostics`
   * - ``model-metrics``
     - You want unified model metrics tables in tabular form.
     - Metrics tables in CSV/JSON style outputs.
     - :doc:`../user_guide/diagnostics`
   * - ``ablation-table``
     - You want a compact ablation table built from ablation records.
     - Ablation summary table.
     - :doc:`../user_guide/diagnostics`
   * - ``update-ablation-records``
     - You want to patch or enrich ablation record JSONL with metrics.
     - Updated ablation records.
     - :doc:`../user_guide/diagnostics`
   * - ``extend-forecast``
     - You want to extend a future forecast CSV by extrapolation.
     - Extended future forecast CSV.
     - :doc:`../user_guide/stage4`
   * - ``boundary``
     - You want to derive a boundary polygon from points.
     - Boundary geospatial output.
     - :doc:`../user_guide/inference_and_export`
   * - ``exposure``
     - You want an exposure proxy table derived from points.
     - Exposure CSV or equivalent table.
     - :doc:`../applications/subsidence_forecasting`
   * - ``district-grid``
     - You want a grid-based district layer.
     - District grid output.
     - :doc:`../user_guide/inference_and_export`
   * - ``clusters-with-zones``
     - You want hotspot clusters assigned to zone IDs.
     - Zone-tagged cluster table.
     - :doc:`../user_guide/diagnostics`

Shared patterns across many build commands
------------------------------------------

Although the build family covers many different artifacts, a large part
of the user experience is intentionally consistent.

Many build commands reuse the shared CLI data-loading layer. That gives
the family a familiar feel across commands:

- one or many input files can be accepted,
- simple glob expansion is supported,
- tabular formats can be inferred from file extensions or forced
  explicitly,
- Excel files can use ``PATH::SHEET`` syntax,
- loaded tables can be concatenated into one DataFrame,
- output formats can be inferred from the destination extension and
  written consistently.

Build commands also benefit from shared configuration and path helpers
such as optional config installation, repeated ``--set KEY=VALUE``
overrides, reusable output-directory arguments, and consistent creation
of destination folders.

One build family, two implementation paths
------------------------------------------

The build family is broader than one implementation style.

Some commands are implemented as dedicated wrappers under
``geoprior.cli``. Others are script-backed build commands exposed
through the public registry under ``geoprior.scripts.registry``.

From a user point of view, they still belong to one family, because the
whole purpose of the dispatcher is to make them behave like one coherent
public CLI.

That unified design also preserves a helpful compatibility path for
older reproducibility workflows.

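
One way to picture how two implementation paths can sit behind one
family is a single lookup table that the dispatcher consults. The sketch
below is only an illustration under stated assumptions: the handler
functions, the ``WRAPPER_COMMANDS`` and ``SCRIPT_COMMANDS`` tables, and
``dispatch`` are hypothetical names, and the assignment of commands to
each path is arbitrary. It does not reproduce the actual contents of
``geoprior.cli`` or ``geoprior.scripts.registry``.

.. code-block:: python

   from typing import Callable, Dict, List

   Handler = Callable[[List[str]], str]


   # Hypothetical stand-ins for the two implementation paths.
   def _wrapper_spatial_roi(args: List[str]) -> str:
       return " ".join(["wrapper:spatial-roi", *args])


   def _script_hotspots(args: List[str]) -> str:
       return " ".join(["script:hotspots", *args])


   # Assumed tables; which command lives on which path is illustrative only.
   WRAPPER_COMMANDS: Dict[str, Handler] = {"spatial-roi": _wrapper_spatial_roi}
   SCRIPT_COMMANDS: Dict[str, Handler] = {"hotspots": _script_hotspots}

   # The source states that ``make`` is an alias of ``build`` at the root level.
   FAMILY_ALIASES: Dict[str, str] = {"make": "build"}


   def build_registry() -> Dict[str, Handler]:
       """Merge both paths into one table, refusing duplicate command names."""
       overlap = WRAPPER_COMMANDS.keys() & SCRIPT_COMMANDS.keys()
       if overlap:
           raise ValueError(f"duplicate build commands: {sorted(overlap)}")
       return {**WRAPPER_COMMANDS, **SCRIPT_COMMANDS}


   def dispatch(argv: List[str]) -> str:
       """Route ``[family, command, *args]`` to the matching handler."""
       if len(argv) < 2:
           raise ValueError("expected: <family> <command> [args]")
       family = FAMILY_ALIASES.get(argv[0], argv[0])
       if family != "build":
           raise ValueError(f"unsupported family in this sketch: {argv[0]}")
       registry = build_registry()
       if argv[1] not in registry:
           raise KeyError(f"unknown build command: {argv[1]}")
       return registry[argv[1]](argv[2:])

In this sketch the caller never needs to know which path a command comes
from: ``dispatch(["make", "hotspots"])`` and
``dispatch(["build", "spatial-roi", "data.csv"])`` both go through the
same lookup.
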
Script-backed commands can still be run through the legacy interface:

.. code-block:: bash

   python -m scripts [args]

while the modern public interface remains:

.. code-block:: bash

   geoprior build [args]
   geoprior-build [args]

This means the documentation can teach one public build family without
hiding the fact that some commands also support legacy entry points.

How the build family is organized
---------------------------------

To make the build family easier to navigate, this guide groups the
commands into three practical sections.

The first group focuses on **sampling and spatial preparation**. These
commands help turn raw or harmonized tables into compact, filtered,
clustered, or region-specific datasets.

The second group focuses on **payloads and validation artifacts**. These
commands build model-facing or validation-facing artifacts that later
workflows can consume directly.

The third group focuses on **tables, summaries, and derived products**.
These commands produce compact downstream artifacts for comparison,
reporting, extrapolation, and lightweight geospatial analysis.

That structure is meant to help users think in terms of **what they are
trying to build**, not just in terms of command names.

Sampling and spatial preparation
--------------------------------

These commands help you transform raw or harmonized tables into compact,
filtered, clustered, or region-specific datasets.

.. _build-forecast-ready-sample:

``forecast-ready-sample``
~~~~~~~~~~~~~~~~~~~~~~~~~

Use ``forecast-ready-sample`` when you want to build a **compact
forecast-ready panel** from one or many tabular inputs.

This command is meant for cases where you want a smaller, structured
dataset that still respects the forecasting window logic.

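
As a rough illustration of that window logic: a group is only usable for
forecasting if its years contain a consecutive run of length
``time_steps + forecast_horizon``, which is the condition that
``--require-consecutive`` asks for. The helper below is a minimal sketch
of such a check. The name ``has_consecutive_run`` is hypothetical and is
not part of GeoPrior, and the sketch assumes integer years.

.. code-block:: python

   from typing import Iterable


   def has_consecutive_run(
       years: Iterable[int], time_steps: int, forecast_horizon: int
   ) -> bool:
       """Return True if ``years`` holds a long enough consecutive run.

       The required run length is ``time_steps + forecast_horizon``.
       Hypothetical helper for illustration only.
       """
       required = time_steps + forecast_horizon
       if required <= 0:
           raise ValueError("time_steps + forecast_horizon must be positive")

       run = 0
       previous = None
       # Duplicates are dropped so repeated rows per year do not inflate a run.
       for year in sorted(set(years)):
           if previous is not None and year == previous + 1:
               run += 1
           else:
               run = 1  # a gap (or the first year) starts a new run
           if run >= required:
               return True
           previous = year
       return False

With ``--time-steps 5`` and ``--forecast-horizon 2``, a group observed
every year from 2015 to 2021 would pass this check, while a group that
is missing 2018 would not, because its longest run is shorter than
seven years.
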
The wrapper describes it as a command that builds a compact
forecast-ready panel sample, and it passes through options that control
group sampling, window length, forecast horizon, year retention, and
whether groups must contain a consecutive run of the required length.

This is a good fit when you want to:

- create a lighter-weight dataset for demos or examples,
- test a forecasting workflow without using the full panel,
- preserve valid forecasting windows while reducing data volume,
- prepare a compact table for gallery lessons or downstream scripts.

Usage
^^^^^

Build a simple forecast-ready sample:

.. code-block:: bash

   geoprior build forecast-ready-sample data.csv -o forecast_sample.csv

or:

.. code-block:: bash

   geoprior-build forecast-ready-sample data.parquet -o forecast_sample.parquet

Use multiple inputs with a larger lookback window:

.. code-block:: bash

   geoprior-build forecast-ready-sample data/*.csv \
     -o forecast_sample.csv \
     --time-steps 5 \
     --forecast-horizon 2

Control the sampled group fraction and spatial stratification:

.. code-block:: bash

   geoprior-build forecast-ready-sample data.csv \
     -o forecast_sample.csv \
     --sample-size 0.10 \
     --spatial-bins 10 12 \
     --stratify-by city lithology_class

Keep only the latest years per sampled group:

.. code-block:: bash

   geoprior-build forecast-ready-sample data.csv \
     -o forecast_sample.csv \
     --keep-years 5 \
     --year-mode latest

Retain a fixed number of groups and a subset of columns:

.. code-block:: bash

   geoprior-build forecast-ready-sample data.csv \
     -o forecast_sample.csv \
     --sample-size 500 \
     --max-groups 500 \
     --columns-to-keep longitude latitude year subsidence city

Distinctive options to notice
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

``--out``
    Output table path. Its extension controls the written format.

``--time-steps`` and ``--forecast-horizon``
    Define the lookback window and forecast horizon used to decide which
    sampled groups remain valid.

``--sample-size``
    Sample size at the group level.
    A float means a fraction; an integer means an absolute group count.

``--keep-years`` and ``--year-mode``
    Control how many years are retained per sampled group and whether
    they are chosen as the latest, earliest, random, or all available
    years.

``--require-consecutive``
    Require each sampled group to contain a consecutive run of length
    ``time_steps + forecast_horizon``.

``--columns-to-keep``
    Trim the final output to a selected subset of columns.

**Related example:**

- :doc:`../auto_examples/tables_and_summaries/build_forecast_ready_sample`

.. _build-spatial-sampling:

``spatial-sampling``
~~~~~~~~~~~~~~~~~~~~

Use ``spatial-sampling`` when you want to build **one stratified spatial
sample table** from one combined input dataset.

The wrapper reads one or many input files into one DataFrame, then calls
``spatial_sampling`` from ``geoprior.utils.spatial_utils``. It supports
sampling by absolute count or fraction, optional stratification columns,
explicit spatial binning, and automatic or explicit spatial coordinate
selection.

This command is a good fit when you want one sampled table that still
tries to respect spatial structure rather than a purely random subset.

Usage
^^^^^

Build one sampled table from a single input:

.. code-block:: bash

   geoprior build spatial-sampling data.csv -o sampled.csv

or from several inputs:

.. code-block:: bash

   geoprior-build spatial-sampling data/*.csv -o sampled.parquet

Sample by fraction:

.. code-block:: bash

   geoprior-build spatial-sampling data.csv \
     -o sampled.csv \
     --sample-size 0.05

Sample by absolute count:

.. code-block:: bash

   geoprior-build spatial-sampling data.csv \
     -o sampled.csv \
     --sample-size 5000

Add stratification and explicit spatial bins:

.. code-block:: bash

   geoprior-build spatial-sampling data.csv \
     -o sampled.csv \
     --stratify-by city year \
     --spatial-bins 12 12 \
     --spatial-cols longitude latitude

Use relative sampling with a minimum ratio:

.. code-block:: bash

   geoprior-build spatial-sampling data.csv \
     -o sampled.csv \
     --method relative \
     --min-relative-ratio 0.02

Distinctive options to notice
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

``--sample-size``
    Accepts either a positive integer or a fraction in ``(0, 1)``.

``--stratify-by``
    Add non-spatial stratification columns on top of spatial sampling.

``--spatial-bins``
    Accept one integer or one value per spatial column. Internally, the
    wrapper normalizes this to either a scalar or a tuple.

``--spatial-cols``
    Override the coordinate columns used for sampling.

``--method`` and ``--min-relative-ratio``
    Control whether the sampler behaves in absolute or relative mode.

**Related example:**

- :doc:`../auto_examples/tables_and_summaries/build_spatial_sampling`

.. _build-batch-spatial-sampling:

``batch-spatial-sampling``
~~~~~~~~~~~~~~~~~~~~~~~~~~

Use ``batch-spatial-sampling`` when you want to build **many
non-overlapping spatial sample batches** rather than one sampled table.

The wrapper reads one combined input table, calls
``batch_spatial_sampling``, and then writes a stacked output with a
batch identifier column. It can also optionally write one file per batch
into a separate directory.

This command is useful when you want repeated sampled subsets for
comparisons, ablations, batch-style downstream processing, or repeated
small-batch experiments.

Usage
^^^^^

Build a stacked batch table:

.. code-block:: bash

   geoprior build batch-spatial-sampling data.csv -o batches.csv

Create more batches with a larger sample size:

.. code-block:: bash

   geoprior-build batch-spatial-sampling data.csv \
     -o batches.csv \
     --sample-size 0.2 \
     --n-batches 20

Control stratification and spatial bins:

.. code-block:: bash

   geoprior-build batch-spatial-sampling data.csv \
     -o batches.parquet \
     --stratify-by city year \
     --spatial-bins 10 12 \
     --spatial-cols longitude latitude

Also write one file per batch:

.. code-block:: bash

   geoprior-build batch-spatial-sampling data.csv \
     -o batches.csv \
     --split-dir sampled_batches \
     --split-prefix batch_ \
     --split-format csv

Change the batch identifier column in the stacked output:

.. code-block:: bash

   geoprior-build batch-spatial-sampling data.csv \
     -o batches.csv \
     --batch-col batch_id

Distinctive options to notice
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

``--n-batches``
    Number of non-overlapping sampled batches to create.

``--batch-col``
    Column name inserted into the stacked output to mark each batch.

``--split-dir``
    Optional directory where one separate file per batch is also
    written.

``--split-prefix`` and ``--split-format``
    Control the naming and format of the per-batch files.

**Related example:**

- :doc:`../auto_examples/tables_and_summaries/build_batch_spatial_sampling`

.. _build-spatial-roi:

``spatial-roi``
~~~~~~~~~~~~~~~

Use ``spatial-roi`` when you want to extract a **rectangular region of
interest** from one combined spatial table.

The wrapper merges one or many inputs into one DataFrame, then applies
``extract_spatial_roi``. It requires an ``x`` range and a ``y`` range,
and it can either use the exact requested bounds or snap them to the
nearest available coordinates.

This is a simple and useful command when you want to cut out one study
window, map tile, or local subregion before downstream analysis.

Usage
^^^^^

Extract one ROI from a single input:

.. code-block:: bash

   geoprior build spatial-roi data.csv \
     --x-range 113.4 113.8 \
     --y-range 22.6 22.9 \
     -o roi.csv

Use explicit coordinate column names:

.. code-block:: bash

   geoprior-build spatial-roi data.csv \
     --x-range 1090 1720 \
     --y-range 835 1220 \
     --x-col x_m \
     --y-col y_m \
     -o roi.parquet

Use exact bounds without snapping:

.. code-block:: bash

   geoprior-build spatial-roi data.csv \
     --x-range 113.4 113.8 \
     --y-range 22.6 22.9 \
     --no-snap-to-closest \
     -o roi.csv

Distinctive options to notice
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

``--x-range`` and ``--y-range``
    Required lower and upper bounds for the rectangular ROI.

``--x-col`` and ``--y-col``
    Coordinate column names used for filtering.

``--no-snap-to-closest``
    Disable snapping to the nearest available coordinates and instead
    use the exact bounds supplied on the command line.

**Related example:**

- :doc:`../auto_examples/tables_and_summaries/build_spatial_roi`

.. _build-spatial-clusters:

``spatial-clusters``
~~~~~~~~~~~~~~~~~~~~

Use ``spatial-clusters`` when you want to add **spatial cluster labels**
to one combined input table.

The wrapper reads one or many tabular files, merges them into one
DataFrame, and applies ``create_spatial_clusters``. It supports several
clustering backends, optional standard scaling, configurable output
column naming, and an optional diagnostic cluster plot.

This is a good fit when you want region labels, cluster-based grouping,
or spatial partitions that can later be summarized, plotted, or joined
to other analysis outputs.

Usage
^^^^^

Create cluster labels with default settings:

.. code-block:: bash

   geoprior build spatial-clusters data.csv -o clustered.csv

Choose explicit spatial columns and output label name:

.. code-block:: bash

   geoprior-build spatial-clusters data.csv \
     --spatial-cols longitude latitude \
     --cluster-col region \
     -o clustered.csv

Fix the number of clusters and backend:

.. code-block:: bash

   geoprior-build spatial-clusters data.csv \
     --n-clusters 8 \
     --algorithm kmeans \
     -o clustered.csv

Use a different clustering backend:

.. code-block:: bash

   geoprior-build spatial-clusters data.csv \
     --algorithm dbscan \
     -o clustered.csv

Display the diagnostic cluster plot:

.. code-block:: bash

   geoprior-build spatial-clusters data.csv \
     --view \
     --figsize 14 10 \
     --marker-size 60 \
     --cmap tab20 \
     -o clustered.csv

Distinctive options to notice
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

``--spatial-cols``
    The two coordinate columns used for clustering.

``--cluster-col``
    Name of the output column that stores the assigned cluster labels.

``--n-clusters``
    Optional number of clusters. For ``kmeans``, the helper may
    auto-detect this when omitted.

``--algorithm``
    Choose among ``kmeans``, ``dbscan``, or ``agglo``.

``--no-auto-scale``
    Disable standard scaling before clustering.

``--view``
    Display a diagnostic cluster plot using the plotting options exposed
    by the wrapper.

**Related example:**

- :doc:`../auto_examples/tables_and_summaries/build_spatial_clusters`

.. _build-extract-zones:

``extract-zones``
~~~~~~~~~~~~~~~~~

Use ``extract-zones`` when you want to extract rows that satisfy a
**threshold-based zone criterion**.

The wrapper reads one or many tables, merges them, and applies
``extract_zones_from``. The threshold can be ``auto``, one numeric
value, or two numeric values for a between-range filter, and the command
can optionally display a diagnostic plot of the extracted zone.

This command is useful when you want to isolate risk zones, anomaly
zones, hotspot candidates, or other threshold-defined subsets from a
larger table.

Usage
^^^^^

Extract rows above or below an automatically chosen threshold:

.. code-block:: bash

   geoprior build extract-zones data.csv \
     --z-col subsidence \
     --threshold auto \
     -o zones.csv

Use a single numeric threshold:

.. code-block:: bash

   geoprior-build extract-zones data.csv \
     --z-col subsidence \
     --threshold 10 \
     --condition above \
     -o zones.csv

Use a between-range filter:

.. code-block:: bash

   geoprior-build extract-zones data.csv \
     --z-col subsidence \
     --threshold 5 15 \
     --condition between \
     -o zones.csv

Use percentile-driven auto thresholding with positive criteria:

.. code-block:: bash

   geoprior-build extract-zones data.csv \
     --z-col subsidence \
     --threshold auto \
     --percentile 90 \
     --positive-criteria \
     -o zones.csv

Display the diagnostic plot with spatial coordinates:

.. code-block:: bash

   geoprior-build extract-zones data.csv \
     --z-col subsidence \
     --threshold auto \
     --x-col longitude \
     --y-col latitude \
     --view \
     --plot-type scatter \
     -o zones.csv

Distinctive options to notice
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

``--z-col``
    Required column used to define the zone criterion.

``--threshold``
    Accept ``auto``, one float, or two floats. The wrapper parses this
    into either an automatic threshold, a single threshold, or a
    between-range tuple.

``--condition``
    Choose among ``auto``, ``above``, ``below``, and ``between``.

``--percentile`` and ``--positive-criteria``
    Control how the automatic threshold behaves.

``--x-col`` / ``--y-col`` / ``--view``
    Enable a diagnostic spatial view of the extracted zone.

**Related example:**

- :doc:`../auto_examples/tables_and_summaries/build_extract_zones`

.. _build-assign-boreholes:

``assign-boreholes``
~~~~~~~~~~~~~~~~~~~~

Use ``assign-boreholes`` when you want to classify each borehole row to
the **nearest city point cloud**.

This command is more structured than the earlier table-only wrappers. It
can resolve city clouds from several sources:

1. explicit processed CSVs via ``--city-csv CITY=PATH``,
2. explicit Stage-1 directories via ``--city-stage1 CITY=DIR``,
3. repeated ``--stage1-dir DIR`` arguments with automatic city-name
   inference,
4. repeated city names via ``--cities`` together with ``--results-dir``
   and optionally ``--model``.

Its outputs include one combined classified CSV and, unless disabled
with ``--no-split-files``, per-city split CSVs.

Usage
^^^^^

Classify boreholes using explicit city processed CSVs:

.. code-block:: bash

   geoprior build assign-boreholes \
     --borehole-csv boreholes.csv \
     --city-csv nansha=results/nansha_proc.csv \
     --city-csv zhongshan=results/zhongshan_proc.csv

Use explicit Stage-1 directories instead:

.. code-block:: bash

   geoprior-build assign-boreholes \
     --borehole-csv boreholes.csv \
     --city-stage1 nansha=results/nansha_GeoPriorSubsNet_stage1 \
     --city-stage1 zhongshan=results/zhongshan_GeoPriorSubsNet_stage1

Resolve cities from a results layout:

.. code-block:: bash

   geoprior-build assign-boreholes \
     --cities nansha zhongshan \
     --results-dir results \
     --model GeoPriorSubsNet

Control output layout explicitly:

.. code-block:: bash

   geoprior-build assign-boreholes \
     --borehole-csv boreholes.csv \
     --city-csv nansha=results/nansha_proc.csv \
     --city-csv zhongshan=results/zhongshan_proc.csv \
     --outdir borehole_assignment \
     --output-stem boreholes

Skip the per-city split files:

.. code-block:: bash

   geoprior-build assign-boreholes \
     --borehole-csv boreholes.csv \
     --city-csv nansha=results/nansha_proc.csv \
     --city-csv zhongshan=results/zhongshan_proc.csv \
     --no-split-files

Distinctive options to notice
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

``--borehole-csv``
    Validation borehole CSV containing point coordinates. When omitted,
    the command can also fall back to configured validation paths.

``--city-csv`` / ``--city-stage1`` / ``--stage1-dir`` / ``--cities``
    Multiple ways to resolve the city clouds used for nearest-city
    classification.

``--borehole-x-col`` / ``--borehole-y-col`` and ``--city-x-col`` / ``--city-y-col``
    Separate coordinate-column controls for the borehole table and the
    city processed tables.

``--tie-label`` and ``--tie-tol``
    Control how ties in nearest-city distance are labeled.

``--output-stem`` / ``--classified-out`` / ``--no-split-files``
    Control whether the command writes only one combined CSV or also one
    file per assigned city.

**Related example:**

- :doc:`../auto_examples/tables_and_summaries/build_assign_boreholes`

.. _build-add-zsurf-from-coords:

``add-zsurf-from-coords``
~~~~~~~~~~~~~~~~~~~~~~~~~

Use ``add-zsurf-from-coords`` when you want to enrich one or more main
datasets with **surface elevation** derived from coordinate-based lookup
tables.

This command is more than a plain join. It merges a main tabular dataset
with a coordinate-to-elevation lookup on rounded longitude/latitude
pairs and can optionally compute hydraulic head when a compatible
depth-below-ground-surface column is available.

It supports several ways to resolve inputs:

- repeated city names with ``--city``,
- explicit ``CITY=PATH`` mappings for main datasets and elevation
  lookups,
- root directories plus filename patterns for main and elevation CSVs.

This command is a good fit when you want to harmonize datasets before
physics-aware analysis or to derive ``z_surf`` and optionally ``head_m``
in a reproducible way.

Usage
^^^^^

Process one city using explicit input files:

.. code-block:: bash

   geoprior build add-zsurf-from-coords \
     --city nansha \
     --main-csv nansha=data/nansha_final_main_std.harmonized.csv \
     --elev-csv nansha=data/nansha_coords_with_elevation.csv

Process several cities using repeated mappings:

.. code-block:: bash

   geoprior-build add-zsurf-from-coords \
     --city nansha \
     --city zhongshan \
     --main-csv nansha=data/nansha_main.csv \
     --main-csv zhongshan=data/zhongshan_main.csv \
     --elev-csv nansha=data/nansha_elev.csv \
     --elev-csv zhongshan=data/zhongshan_elev.csv

Resolve inputs from directory roots and filename patterns:

.. code-block:: bash

   geoprior-build add-zsurf-from-coords \
     --city nansha \
     --data-root data/main \
     --coords-root data/elevation \
     --main-pattern "{city}_final_main_std.harmonized.csv" \
     --elev-pattern "{city}_coords_with_elevation.csv"

Write outputs to a dedicated folder and emit diagnostics JSON:

.. code-block:: bash

   geoprior-build add-zsurf-from-coords \
     --city nansha \
     --main-csv nansha=data/nansha_main.csv \
     --elev-csv nansha=data/nansha_elev.csv \
     --outdir enriched \
     --summary-json enriched/zsurf_summary.json

Control coordinate rounding and duplicate reduction:

.. code-block:: bash

   geoprior-build add-zsurf-from-coords \
     --city nansha \
     --main-csv nansha=data/nansha_main.csv \
     --elev-csv nansha=data/nansha_elev.csv \
     --round-decimals 6 \
     --reducer median

Disable hydraulic-head computation:

.. code-block:: bash

   geoprior-build add-zsurf-from-coords \
     --city nansha \
     --main-csv nansha=data/nansha_main.csv \
     --elev-csv nansha=data/nansha_elev.csv \
     --no-head

Distinctive options to notice
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

``--main-csv`` and ``--elev-csv``
    Repeated ``CITY=PATH`` mappings for the main dataset and the
    elevation lookup CSV.

``--data-root`` / ``--coords-root`` plus ``--main-pattern`` / ``--elev-pattern``
    Let the command resolve city-specific files automatically from root
    directories and filename templates.

``--zsurf-col`` and ``--head-col``
    Control the output column names for surface elevation and hydraulic
    head.

``--depth-col``
    Provide one or more candidate depth-below-ground-surface column
    names used to compute hydraulic head.

``--round-decimals`` and ``--reducer``
    Control coordinate rounding before merging and how duplicate
    coordinate-elevation rows are reduced.

``--summary-json``
    Write merge diagnostics across all processed cities.

**Related example:**

- :doc:`../auto_examples/tables_and_summaries/build_add_zsurf_from_coords`

Payloads and validation artifacts
---------------------------------

These commands build model-facing or validation-facing artifacts that
other workflows can consume later.

.. _build-full-inputs-npz:

``full-inputs-npz``
~~~~~~~~~~~~~~~~~~~

Use ``full-inputs-npz`` when you want to build one **merged
``full_inputs.npz``** from the Stage-1 split input artifacts.

This command is the simplest payload builder in the family.
It resolves a Stage-1 ``manifest.json``, reads the split input NPZ
artifacts listed there, concatenates them in split order, and writes one
combined NPZ. By default it looks for the standard ``train``, ``val``,
and ``test`` input splits and writes the result under the Stage-1
``artifacts/`` directory as ``full_inputs.npz``.

This command is a good fit when you want one reusable full-city input
payload for downstream inference, physics export, or external
validation-style workflows.

Usage
^^^^^

Build the default merged input payload from a Stage-1 manifest:

.. code-block:: bash

   geoprior build full-inputs-npz \
     --manifest results/nansha_GeoPriorSubsNet_stage1/manifest.json

or from the Stage-1 directory directly:

.. code-block:: bash

   geoprior-build full-inputs-npz \
     --stage1-dir results/nansha_GeoPriorSubsNet_stage1

Resolve the manifest from a results layout:

.. code-block:: bash

   geoprior-build full-inputs-npz \
     --results-dir results \
     --city nansha \
     --model GeoPriorSubsNet

Choose a subset of splits explicitly:

.. code-block:: bash

   geoprior-build full-inputs-npz \
     --stage1-dir results/nansha_GeoPriorSubsNet_stage1 \
     --splits train val

Write to an explicit output path:

.. code-block:: bash

   geoprior-build full-inputs-npz \
     --manifest results/nansha_GeoPriorSubsNet_stage1/manifest.json \
     --output exports/nansha_full_inputs.npz

Relax strict key matching across splits:

.. code-block:: bash

   geoprior-build full-inputs-npz \
     --stage1-dir results/nansha_GeoPriorSubsNet_stage1 \
     --allow-missing-keys

Distinctive options to notice
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

``--manifest`` / ``--stage1-dir`` / ``--results-dir`` with ``--city`` and ``--model``
    Multiple ways to resolve the Stage-1 manifest used to discover the
    split NPZ files.

``--splits``
    Control which split input NPZ files are concatenated and in what
    order. The default is ``train val test``.

``--output`` and ``--output-name`` Either choose one explicit output path or let the command write under the Stage-1 ``artifacts/`` directory with a default name. ``--allow-missing-keys`` Relax strict key alignment when the split NPZ files do not all expose exactly the same arrays. **Related example:** - :doc:`../auto_examples/tables_and_summaries/build_full_inputs_npz` .. _build-physics-payload-npz: ``physics-payload-npz`` ~~~~~~~~~~~~~~~~~~~~~~~ Use ``physics-payload-npz`` when you want to export a **physics payload NPZ** from a trained GeoPrior model together with one resolved input payload. This command sits one step downstream from ``full-inputs-npz``. It resolves a Stage-1 manifest, selects or assembles the inputs to use, resolves a trained model, builds a TensorFlow dataset, and then calls the model’s ``export_physics_payload`` method to write a saved NPZ. When no explicit input NPZ is supplied, it can reuse an existing ``full_inputs.npz`` or assemble one from Stage-1 split artifacts. The output metadata includes the source inputs, manifest path, model path, and split label used for export. This command is a good fit when you want model-derived physics fields such as ``K``, ``Ss``, ``Hd``, ``H``, and related payload metrics in a compact artifact that later workflows can consume. Tests in the package also confirm that exported payloads include physics arrays such as ``tau``, ``tau_prior``, ``K``, ``Ss``, ``Hd``, and ``metrics``. Usage ^^^^^ Export a physics payload from a Stage-1 manifest and an explicit model: .. code-block:: bash geoprior build physics-payload-npz \ --manifest results/nansha_GeoPriorSubsNet_stage1/manifest.json \ --model-path results/nansha_GeoPriorSubsNet_stage1/train_20260330-101500/model_best.keras Let the command resolve the model and default inputs automatically: .. code-block:: bash geoprior-build physics-payload-npz \ --stage1-dir results/nansha_GeoPriorSubsNet_stage1 Use one explicit full input NPZ: .. 
code-block:: bash geoprior-build physics-payload-npz \ --manifest results/nansha_GeoPriorSubsNet_stage1/manifest.json \ --inputs-npz results/nansha_GeoPriorSubsNet_stage1/artifacts/full_inputs.npz Assemble the payload from selected Stage-1 splits instead: .. code-block:: bash geoprior-build physics-payload-npz \ --stage1-dir results/nansha_GeoPriorSubsNet_stage1 \ --splits train val test Write a custom output file and label the source explicitly: .. code-block:: bash geoprior-build physics-payload-npz \ --stage1-dir results/nansha_GeoPriorSubsNet_stage1 \ --output exports/nansha_phys_payload_full.npz \ --source-label full_city_union Limit export to a smaller number of batches: .. code-block:: bash geoprior-build physics-payload-npz \ --stage1-dir results/nansha_GeoPriorSubsNet_stage1 \ --batch-size 128 \ --max-batches 20 Distinctive options to notice ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ``--inputs-npz`` Use one explicit input NPZ instead of asking the command to reuse or assemble inputs from Stage-1 artifacts. ``--splits`` Control which Stage-1 split inputs are merged when no explicit input NPZ is supplied. ``--model-path`` Point directly to the trained ``.keras`` model used for export. Otherwise the command tries to resolve one automatically. ``--output`` / ``--output-name`` / ``--source-label`` Control the saved payload path and how the source payload is labeled in the output metadata and default filename logic. ``--batch-size`` and ``--max-batches`` Control how much data is processed per export pass and whether the export is limited to only part of the dataset. **Related example:** - :doc:`../auto_examples/figure_generation/plot_physics_fields` .. _build-external-validation-fullcity: ``external-validation-fullcity`` ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Use ``external-validation-fullcity`` when you want one **end-to-end external validation artifact workflow** starting from Stage-1 artifacts and a trained Stage-2 inference bundle. 
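
One step of this workflow, matching validation sites to the nearest model pixels, can be illustrated with a minimal sketch. It assumes projected coordinates in metres and uses a brute-force search. The function name ``match_sites_to_pixels`` and its ``max_distance_m`` argument are hypothetical and only loosely mirror the command's distance sanity check; the real implementation may use a different search strategy.

.. code-block:: python

   import math


   def match_sites_to_pixels(sites, pixels, max_distance_m=None):
       """Return one (pixel_index, distance) pair per site.

       pixel_index is None when the nearest pixel lies farther away than
       max_distance_m (hypothetical sanity threshold).
       """
       matches = []
       for sx, sy in sites:
           best_idx, best_dist = None, math.inf
           # Brute-force scan over all pixels; fine for a small illustration.
           for idx, (px, py) in enumerate(pixels):
               dist = math.hypot(sx - px, sy - py)
               if dist < best_dist:
                   best_idx, best_dist = idx, dist
           if max_distance_m is not None and best_dist > max_distance_m:
               best_idx = None
           matches.append((best_idx, best_dist))
       return matches


   # Illustrative coordinates only, not taken from any GeoPrior dataset.
   pixels = [(0.0, 0.0), (100.0, 0.0), (0.0, 100.0)]
   sites = [(10.0, 5.0), (95.0, 10.0), (5000.0, 5000.0)]
   matches = match_sites_to_pixels(sites, pixels, max_distance_m=3000.0)

In this sketch the third site is left unmatched because its nearest pixel is farther away than the chosen distance limit.
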
This command is the most orchestration-heavy builder in this group. Its implementation can: - resolve Stage-1 manifest and split input artifacts, - build ``full_inputs.npz`` when needed, - resolve the Stage-2 inference bundle from a model path, a Stage-2 manifest, or a Stage-2 run directory, - export a full-city physics payload, - match validation sites to the nearest model pixels, - compute site-level validation outputs and headline metrics. In other words, this is the convenience command you use when you want the **whole external validation pipeline** rather than only one piece of it. Usage ^^^^^ Run the full workflow with explicit Stage-1 and Stage-2 hints: .. code-block:: bash geoprior build external-validation-fullcity \ --stage1-dir results/nansha_GeoPriorSubsNet_stage1 \ --stage2-run-dir results/nansha_GeoPriorSubsNet_stage1/train_20260330-101500 \ --validation-csv data/nansha_validation.csv Resolve from manifests instead: .. code-block:: bash geoprior-build external-validation-fullcity \ --manifest results/nansha_GeoPriorSubsNet_stage1/manifest.json \ --stage2-manifest results/nansha_GeoPriorSubsNet_stage1/train_20260330-101500/manifest.json \ --validation-csv data/nansha_validation.csv Provide one explicit model file: .. code-block:: bash geoprior-build external-validation-fullcity \ --stage1-dir results/nansha_GeoPriorSubsNet_stage1 \ --model-path results/nansha_GeoPriorSubsNet_stage1/train_20260330-101500/model_best.keras \ --validation-csv data/nansha_validation.csv Reuse a prebuilt full-city input payload: .. code-block:: bash geoprior-build external-validation-fullcity \ --stage1-dir results/nansha_GeoPriorSubsNet_stage1 \ --full-inputs-npz results/nansha_GeoPriorSubsNet_stage1/artifacts/full_inputs.npz \ --validation-csv data/nansha_validation.csv Choose explicit output locations: .. 
code-block:: bash geoprior-build external-validation-fullcity \ --stage1-dir results/nansha_GeoPriorSubsNet_stage1 \ --stage2-run-dir results/nansha_GeoPriorSubsNet_stage1/train_20260330-101500 \ --validation-csv data/nansha_validation.csv \ --outdir exports/external_validation_nansha \ --out-payload exports/external_validation_nansha/fullcity_phys_payload.npz Control matching and validation column choices: .. code-block:: bash geoprior-build external-validation-fullcity \ --stage1-dir results/nansha_GeoPriorSubsNet_stage1 \ --stage2-run-dir results/nansha_GeoPriorSubsNet_stage1/train_20260330-101500 \ --validation-csv data/nansha_validation.csv \ --x-col x \ --y-col y \ --productivity-col step3_specific_capacity_Lps_per_m \ --thickness-col approx_compressible_thickness_m Distinctive options to notice ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ``--validation-csv`` External validation table used for site-to-pixel matching and metric computation. ``--full-inputs-npz`` and ``--out-payload`` Reuse or explicitly place the full-city input and payload artifacts involved in the workflow. ``--stage2-run-dir`` / ``--stage2-manifest`` / ``--model-path`` Multiple ways to resolve the trained inference bundle used for the full-city payload export. ``--x-col`` / ``--y-col`` / ``--productivity-col`` / ``--thickness-col`` Select the validation-table columns used for nearest-pixel matching and for the site-level metrics. ``--horizon-reducer`` / ``--site-reducer`` / ``--max-match-distance-m`` / ``--min-unique-pixels`` Control how horizon-level arrays are collapsed to site-level values and how strict the site-to-pixel sanity checks are. **Related example:** - :doc:`../auto_examples/figure_generation/plot_external_validation` .. 
_build-external-validation-metrics: ``external-validation-metrics`` ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Use ``external-validation-metrics`` when you already have the necessary artifacts and want to compute **external validation joins and headline metrics** without rerunning the full-city payload export workflow. This command focuses on the site-level validation step itself. It loads a validation CSV, resolves the Stage-1 manifest and the input NPZ used for matching, loads a saved physics payload, matches each validation site to the nearest model pixel, and writes both a site-level table and a metrics JSON. The top-level module docstring describes it as a command for computing borehole or pumping validation metrics by matching site coordinates to the nearest pixel in a Stage-1 grid and then joining model-derived fields from a saved physics payload. This is the right command when the payload already exists and you want to recompute metrics or experiment with different matching and reduction settings without rebuilding the payload first. Usage ^^^^^ Compute metrics from explicit Stage-1 and payload artifacts: .. code-block:: bash geoprior build external-validation-metrics \ --manifest results/nansha_GeoPriorSubsNet_stage1/manifest.json \ --physics-payload results/nansha_GeoPriorSubsNet_stage1/train_20260330-101500/nansha_phys_payload_full_city_union.npz \ --validation-csv data/nansha_validation.csv Resolve the Stage-1 side from the Stage-1 directory: .. code-block:: bash geoprior-build external-validation-metrics \ --stage1-dir results/nansha_GeoPriorSubsNet_stage1 \ --physics-payload results/nansha_GeoPriorSubsNet_stage1/train_20260330-101500/nansha_phys_payload_full_city_union.npz \ --validation-csv data/nansha_validation.csv Use a specific split input NPZ for matching: .. 
code-block:: bash geoprior-build external-validation-metrics \ --stage1-dir results/nansha_GeoPriorSubsNet_stage1 \ --split test \ --inputs-npz results/nansha_GeoPriorSubsNet_stage1/artifacts/test_inputs.npz \ --physics-payload results/nansha_GeoPriorSubsNet_stage1/train_20260330-101500/nansha_phys_payload_test.npz \ --validation-csv data/nansha_validation.csv Let the command resolve the payload from a Stage-2 manifest: .. code-block:: bash geoprior-build external-validation-metrics \ --stage1-dir results/nansha_GeoPriorSubsNet_stage1 \ --stage2-manifest results/nansha_GeoPriorSubsNet_stage1/train_20260330-101500/manifest.json \ --validation-csv data/nansha_validation.csv Write the metrics artifacts to a chosen directory: .. code-block:: bash geoprior-build external-validation-metrics \ --stage1-dir results/nansha_GeoPriorSubsNet_stage1 \ --physics-payload results/nansha_GeoPriorSubsNet_stage1/train_20260330-101500/nansha_phys_payload_full_city_union.npz \ --validation-csv data/nansha_validation.csv \ --outdir exports/external_validation_metrics Use different reducers and stricter sanity checks: .. code-block:: bash geoprior-build external-validation-metrics \ --stage1-dir results/nansha_GeoPriorSubsNet_stage1 \ --physics-payload results/nansha_GeoPriorSubsNet_stage1/train_20260330-101500/nansha_phys_payload_full_city_union.npz \ --validation-csv data/nansha_validation.csv \ --horizon-reducer mean \ --site-reducer median \ --max-distance-m 3000 \ --min-unique-pixels 5 Distinctive options to notice ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ``--validation-csv`` External site table used for the nearest-pixel join. ``--physics-payload`` Saved physics payload NPZ from which model-derived fields such as ``K`` and ``Hd`` are joined into the site-level validation table. ``--split`` / ``--inputs-npz`` / ``--coord-scaler`` Control which Stage-1 input coordinates are used for matching and how they are inverse-transformed into physical space. 
``--stage2-manifest`` Optional shortcut for resolving the payload path from a Stage-2 run. ``--horizon-reducer`` / ``--site-reducer`` / ``--max-distance-m`` / ``--min-unique-pixels`` Control the horizon reduction, the site aggregation, and the match sanity thresholds. **Related example:** - :doc:`../auto_examples/figure_generation/plot_external_validation` .. _build-sm3-collect-summaries: ``sm3-collect-summaries`` ~~~~~~~~~~~~~~~~~~~~~~~~~ Use ``sm3-collect-summaries`` when you want to collect **many per-run SM3 summary CSV files into one combined table**. This command scans an SM3 suite root, reads the per-regime summary CSV files found beneath it, injects the inferred regime and run directory into each collected table, concatenates everything, and writes both a combined CSV and a combined JSON output. If no explicit suite root is provided, it can discover the newest suite-like directory under a results root or resolve one from config. This is the natural build-side companion to the run-family ``sm3-suite`` command: the suite creates many regime runs, and ``sm3-collect-summaries`` condenses their summary outputs into one combined artifact. Usage ^^^^^ Collect one explicit suite root: .. code-block:: bash geoprior build sm3-collect-summaries \ --suite-root results/sm3_tau_suite_20260330-120000 or discover the newest suite under a results directory: .. code-block:: bash geoprior-build sm3-collect-summaries \ --results-dir results Write the combined outputs into a dedicated folder: .. code-block:: bash geoprior-build sm3-collect-summaries \ --suite-root results/sm3_tau_suite_20260330-120000 \ --outdir exports/sm3_summary_bundle Choose a custom output stem: .. code-block:: bash geoprior-build sm3-collect-summaries \ --suite-root results/sm3_tau_suite_20260330-120000 \ --output-stem tau_suite_combined Write explicit CSV and JSON paths: .. 
code-block:: bash geoprior-build sm3-collect-summaries \ --suite-root results/sm3_tau_suite_20260330-120000 \ --out-csv exports/tau_suite_combined.csv \ --out-json exports/tau_suite_combined.json Tighten the collection rules: .. code-block:: bash geoprior-build sm3-collect-summaries \ --suite-root results/sm3_tau_suite_20260330-120000 \ --summary-name sm3_synth_summary.csv \ --regime-pattern "sm3_(?:tau|both)_(.+?)_50$" \ --strict Distinctive options to notice ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ``--suite-root`` Explicit suite directory to scan. If omitted, the command can try to discover the newest suite under ``--results-dir`` or from config. ``--summary-name`` Name of the per-run summary CSV files to collect. ``--regime-pattern`` Regular expression used to infer the regime name from each run directory. ``--output-stem`` / ``--out-csv`` / ``--out-json`` Control whether the command writes to default combined outputs or to fully explicit CSV and JSON destinations. ``--strict`` Fail on malformed or unreadable summary files instead of skipping them. **Related example:** - :doc:`../auto_examples/figure_generation/plot_sm3_bounds_ridge_summary` Tables, summaries, and derived products ------------------------------------------ These commands build compact downstream artifacts for analysis, comparison, or reporting. .. _build-brier-exceedance: ``brier-exceedance`` ~~~~~~~~~~~~~~~~~~~~ Use ``brier-exceedance`` when you want to compute **Brier scores for subsidence exceedance events** from calibrated forecast CSV files. This command works from forecast outputs that contain ``subsidence_actual`` together with predictive quantiles such as ``subsidence_q10``, ``subsidence_q50``, and ``subsidence_q90``. It approximates exceedance probabilities by piecewise-linear interpolation of the quantile-based CDF, then computes tidy Brier results across one or more thresholds and years. 
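
The quantile-to-probability step can be sketched as follows. This is a minimal illustration, not the command's actual code: the function names are hypothetical, and holding the CDF flat outside the outermost quantiles is an assumption made here because the behavior beyond ``q10`` and ``q90`` is not specified above.

.. code-block:: python

   def exceedance_probability(threshold, quantiles):
       """Approximate P(Y > threshold) from (level, value) quantile pairs."""
       knots = sorted(quantiles)  # order by quantile level
       levels = [level for level, _ in knots]
       values = [value for _, value in knots]
       # Assumption: outside the outermost quantiles the CDF is held flat.
       if threshold <= values[0]:
           cdf = levels[0]
       elif threshold >= values[-1]:
           cdf = levels[-1]
       else:
           cdf = levels[-1]
           for i in range(len(values) - 1):
               lo, hi = values[i], values[i + 1]
               if lo <= threshold <= hi and hi > lo:
                   # Linear interpolation between two neighbouring quantiles.
                   frac = (threshold - lo) / (hi - lo)
                   cdf = levels[i] + frac * (levels[i + 1] - levels[i])
                   break
       return 1.0 - cdf


   def brier_score(probabilities, outcomes):
       """Mean squared difference between forecast probability and outcome."""
       pairs = list(zip(probabilities, outcomes))
       return sum((p - o) ** 2 for p, o in pairs) / len(pairs)


   # Illustrative rows using the column names mentioned above.
   rows = [
       {"subsidence_actual": 55.0, "subsidence_q10": 20.0,
        "subsidence_q50": 40.0, "subsidence_q90": 60.0},
       {"subsidence_actual": 25.0, "subsidence_q10": 10.0,
        "subsidence_q50": 20.0, "subsidence_q90": 30.0},
   ]
   threshold = 50.0
   probs = [
       exceedance_probability(
           threshold,
           [(0.1, r["subsidence_q10"]),
            (0.5, r["subsidence_q50"]),
            (0.9, r["subsidence_q90"])],
       )
       for r in rows
   ]
   events = [1.0 if r["subsidence_actual"] > threshold else 0.0 for r in rows]
   score = brier_score(probs, events)

The ``brier-exceedance`` command reports scores of this kind per threshold and year.
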
It supports test-first auto-discovery, explicit city CSV overrides, and a choice between test or validation sources. This command is a good fit when you want one compact table for event probability skill rather than a full forecast or uncertainty panel. Usage ^^^^^ Auto-discover the forecast CSVs under a results root: .. code-block:: bash geoprior build brier-exceedance \ --root results Force the command to use validation CSVs: .. code-block:: bash geoprior-build brier-exceedance \ --root results \ --source val Use explicit city CSVs: .. code-block:: bash geoprior-build brier-exceedance \ --ns-csv results/nansha_eval_calibrated.csv \ --zh-csv results/zhongshan_eval_calibrated.csv Evaluate several thresholds over selected years: .. code-block:: bash geoprior-build brier-exceedance \ --root results \ --thresholds 30,50,70 \ --years 2020,2021,2022 Write to an explicit output CSV: .. code-block:: bash geoprior-build brier-exceedance \ --root results \ --out external_brier_scores.csv Distinctive options to notice ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ``--source`` Choose ``auto``, ``test``, or ``val`` discovery mode for the input forecast CSVs. ``auto`` prefers test-set forecasts when available. ``--thresholds`` Comma-separated exceedance thresholds in mm/yr. ``--years`` Comma-separated years to evaluate, or ``all``. ``--ns-csv`` and ``--zh-csv`` Explicit city-level CSV overrides that bypass directory scanning. **Related example:** - :doc:`../auto_examples/tables_and_summaries/compute_brier_exceedance` .. _build-hotspots: ``hotspots`` ~~~~~~~~~~~~ Use ``hotspots`` when you want to compute a **city × year hotspot characteristics table** from evaluation and future forecast CSVs. This command compares forecast or annualized subsidence against a baseline year, computes anomaly magnitude, selects hotspots using a percentile threshold, and exports a compact summary table. 
It supports different subsidence interpretations (``cumulative``, ``rate``, or ``increment``), several future years, and both CSV and LaTeX export. The implementation summarizes hotspot counts, hotspot subsidence intensity, anomaly magnitude, baseline mean, and the threshold used for each city-year pair. This is the right command when you want a compact **hotspot table** rather than hotspot point clouds. Usage ^^^^^ Auto-discover city inputs and summarize the default future years: .. code-block:: bash geoprior build hotspots \ --ns-src results/nansha \ --zh-src results/zhongshan Use explicit eval and future CSVs: .. code-block:: bash geoprior-build hotspots \ --ns-eval results/nansha_eval.csv \ --ns-future results/nansha_future.csv \ --zh-eval results/zhongshan_eval.csv \ --zh-future results/zhongshan_future.csv Choose a different baseline year and percentile threshold: .. code-block:: bash geoprior-build hotspots \ --ns-src results/nansha \ --zh-src results/zhongshan \ --baseline-year 2022 \ --percentile 95 Use a different forecast quantile and baseline source: .. code-block:: bash geoprior-build hotspots \ --ns-src results/nansha \ --zh-src results/zhongshan \ --quantile q90 \ --baseline-source q50 Export both CSV and LaTeX: .. code-block:: bash geoprior-build hotspots \ --ns-src results/nansha \ --zh-src results/zhongshan \ --format both \ --out tab_hotspots Distinctive options to notice ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ``--baseline-year`` Reference year against which hotspot anomaly is computed. ``--percentile`` Percentile threshold used to define hotspot anomaly magnitude. ``--subsidence-kind`` Tell the command whether the CSV values are cumulative, annual rate, or annual increment. ``--baseline-source`` Choose whether the baseline comes from ``actual`` or ``q50`` in the evaluation CSV. ``--quantile`` Select the forecast quantile used for hotspot computation. ``--format`` Export as CSV, TeX, or both. 
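
The baseline-anomaly-percentile logic behind this command can be sketched roughly as below. The sketch makes several assumptions that are not confirmed by the description above: the anomaly is taken as the absolute difference from the baseline, the percentile is computed over all samples of one city-year, and the helper names ``percentile`` and ``select_hotspots`` are hypothetical.

.. code-block:: python

   def percentile(values, pct):
       """Percentile with linear interpolation between order statistics."""
       ordered = sorted(values)
       if not ordered:
           raise ValueError("percentile() needs at least one value")
       rank = (len(ordered) - 1) * pct / 100.0
       lower = int(rank)
       upper = min(lower + 1, len(ordered) - 1)
       frac = rank - lower
       return ordered[lower] + frac * (ordered[upper] - ordered[lower])


   def select_hotspots(baseline, forecast, pct=90.0):
       """Return (hotspot sample ids, threshold) for one city-year.

       baseline and forecast map sample_idx -> subsidence value.
       """
       # Assumption: anomaly magnitude is the absolute change from baseline.
       anomaly = {
           idx: abs(forecast[idx] - baseline[idx])
           for idx in forecast
           if idx in baseline
       }
       threshold = percentile(list(anomaly.values()), pct)
       hot = sorted(idx for idx, a in anomaly.items() if a >= threshold)
       return hot, threshold


   # Illustrative values only.
   baseline = {0: 10.0, 1: 12.0, 2: 11.0, 3: 9.0, 4: 10.0}
   forecast = {0: 12.0, 1: 13.0, 2: 21.0, 3: 10.0, 4: 30.0}
   hot, threshold = select_hotspots(baseline, forecast, pct=80.0)

With these numbers only sample ``4`` exceeds the 80th-percentile anomaly, so it is the single hotspot.
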
**Related example:** - :doc:`../auto_examples/tables_and_summaries/compute_hotspots` .. _build-hotspots-summary: ``hotspots-summary`` ~~~~~~~~~~~~~~~~~~~~ Use ``hotspots-summary`` when you already have a **hotspot point CSV** and want a grouped summary by city, year, and kind. This command is narrower than ``hotspots``. It does not recompute the hotspot logic from evaluation and future forecasts; instead, it reads a hotspot point cloud CSV and summarizes it into a tidy table with counts, min/mean/max hotspot value, and min/mean/max anomaly metric. When available, it also preserves grouped baseline and threshold summaries. This is the right command when the hotspot points already exist and you only want a reporting table. Usage ^^^^^ Summarize one hotspot point cloud CSV: .. code-block:: bash geoprior build hotspots-summary \ --hotspot-csv results/fig6_hotspot_points.csv Write to a custom output path: .. code-block:: bash geoprior-build hotspots-summary \ --hotspot-csv results/fig6_hotspot_points.csv \ --out fig6_hotspot_summary.csv Suppress console printing: .. code-block:: bash geoprior-build hotspots-summary \ --hotspot-csv results/fig6_hotspot_points.csv \ --quiet true Distinctive options to notice ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ``--hotspot-csv`` Required hotspot point CSV produced upstream by a hotspot or spatial figure workflow. ``--out`` Output CSV written under ``scripts/out/`` when relative. ``--quiet`` Disable console table printing. **Related example:** - :doc:`../auto_examples/tables_and_summaries/summarize_hotspots` .. _build-model-metrics: ``model-metrics`` ~~~~~~~~~~~~~~~~~ Use ``model-metrics`` when you want a **unified model metrics table** built from one results root or one run directory. This command scans ablation record JSONL files, preferring updated records when duplicates exist, normalizes legacy units when necessary, flattens per-horizon metrics, and writes a wide metrics table plus optional long-form horizon-level tables. 
It can filter by city or model, deduplicate by timestamp/city/model, and export both CSV and JSON variants. This is a good fit when you want a reusable metrics inventory across many runs rather than a paper-style ablation table. Usage ^^^^^ Scan the default results root: .. code-block:: bash geoprior build model-metrics Scan one specific run directory: .. code-block:: bash geoprior-build model-metrics \ --src results/nansha_GeoPriorSubsNet_stage1/train_20260330-101500 Restrict to selected cities or models: .. code-block:: bash geoprior-build model-metrics \ --src results \ --city Nansha,Zhongshan \ --models GeoPriorSubsNet Disable the long per-horizon export: .. code-block:: bash geoprior-build model-metrics \ --src results \ --include-long false Write to a custom output stem: .. code-block:: bash geoprior-build model-metrics \ --src results \ --out comparison/model_metrics_main Distinctive options to notice ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ``--src`` Accept either a results root or one run directory to scan. ``--include-long`` Control whether per-horizon long tables are also written. ``--dedupe`` Deduplicate on ``(timestamp, city, model)``, preferring updated records when duplicates exist. ``--city`` and ``--models`` Filter the collected runs before export. **Related example:** - :doc:`../auto_examples/tables_and_summaries/build_model_metrics` .. _build-ablation-table: ``ablation-table`` ~~~~~~~~~~~~~~~~~~ Use ``ablation-table`` when you want a **paper-oriented ablation or sensitivity table** built from ablation records. This command is broader than ``model-metrics`` in formatting and table logic. It can scan a results root, read explicit JSONL/JSON/CSV inputs, normalize metric units, flatten per-horizon blocks, sort by one metric, and write compact CSV, JSON, TeX, or text outputs. It also supports paper mode, best-per-city exports, and grouped supplementary outputs such as S6/S7-style tables.
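
As a rough illustration of what a best-per-city export involves, the sketch below keeps the best record for each city under one sort metric. The record fields (``city``, ``variant``, ``mae``) and the helper name ``best_per_city`` are hypothetical, and the sketch assumes a lower metric value is better; the actual command may order and break ties differently.

.. code-block:: python

   def best_per_city(records, metric="mae", lower_is_better=True):
       """Keep the best record per city under one metric."""
       best = {}
       for rec in records:
           value = rec.get(metric)
           if value is None:
               continue  # skip rows that do not report the metric
           current = best.get(rec["city"])
           if current is None:
               best[rec["city"]] = rec
               continue
           if lower_is_better:
               better = value < current[metric]
           else:
               better = value > current[metric]
           if better:
               best[rec["city"]] = rec
       # Return rows in a stable, city-sorted order.
       return [best[city] for city in sorted(best)]


   # Illustrative records only, not real results.
   records = [
       {"city": "Nansha", "variant": "A", "mae": 3.2},
       {"city": "Nansha", "variant": "B", "mae": 2.8},
       {"city": "Zhongshan", "variant": "A", "mae": 4.1},
       {"city": "Zhongshan", "variant": "B", "mae": 4.5},
       {"city": "Zhongshan", "variant": "C", "mae": None},
   ]
   winners = best_per_city(records, metric="mae")

Here ``winners`` holds variant ``B`` for Nansha and variant ``A`` for Zhongshan.
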
This is the right command when you want something closer to a publication-ready ablation table rather than a raw metrics inventory. Usage ^^^^^ Build a default ablation table from the results root: .. code-block:: bash geoprior build ablation-table Use explicit inputs instead of scanning: .. code-block:: bash geoprior-build ablation-table \ --input results/runA/ablation_record.updated.jsonl \ --input results/runB/ablation_record.updated.jsonl Sort by a chosen metric and export CSV plus TeX: .. code-block:: bash geoprior-build ablation-table \ --root results \ --sort-by mae \ --formats csv,tex \ --out table_ablations Use paper mode with one error metric: .. code-block:: bash geoprior-build ablation-table \ --root results \ --for-paper \ --err-metric rmse \ --keep-r2 Also export best-per-city rows: .. code-block:: bash geoprior-build ablation-table \ --root results \ --best-per-city Distinctive options to notice ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ``--input`` Repeatable explicit inputs that override the default root scan. Supports JSONL, JSON, and CSV. ``--formats`` Choose among CSV, JSON, TeX, and TXT style outputs. ``--for-paper`` / ``--err-metric`` / ``--keep-r2`` Switch to a more compact paper-oriented table layout. ``--best-per-city`` Also export the best row per city under the chosen sort metric. ``--group-cols`` / ``--s6-metrics`` / ``--s7-cols`` Support grouped supplementary outputs such as S6 and S7 style summaries. **Related example:** - :doc:`../auto_examples/tables_and_summaries/build_ablation_table` .. _build-update-ablation-records: ``update-ablation-records`` ~~~~~~~~~~~~~~~~~~~~~~~~~~~ Use ``update-ablation-records`` when you want to **patch or enrich ablation record JSONL files with metrics**. 
This command is part of the public build family and is registered with the description *“Patch ablation record JSONL with metrics.”* Usage ^^^^^ Inspect the public CLI help first: .. code-block:: bash geoprior build update-ablation-records --help or: .. code-block:: bash geoprior-build update-ablation-records --help Note ^^^^ This subsection is intentionally short. Typical inputs, update logic, output behavior, and distinctive options are not yet documented here; until they are, treat the ``--help`` output as the authoritative reference. **Related example:** - :doc:`../auto_examples/tables_and_summaries/index` .. _build-extend-forecast: ``extend-forecast`` ~~~~~~~~~~~~~~~~~~~ Use ``extend-forecast`` when you want to **extend a future forecast CSV by extrapolation** for one or more cities. This command resolves eval and future forecast CSVs, chooses a split, then extends the future trajectory by one or more years using either a linear-fit or linear-last trend. It also supports uncertainty widening rules across the added years and lets you control whether the output kind remains cumulative or is converted to rate form. This is a good fit when you want a longer future panel without rerunning the full Stage-4 forecasting workflow. Usage ^^^^^ Extend the default future horizon by one year: .. code-block:: bash geoprior build extend-forecast \ --ns-src results/nansha \ --zh-src results/zhongshan Use explicit eval and future CSVs: .. code-block:: bash geoprior-build extend-forecast \ --ns-eval results/nansha_eval.csv \ --ns-future results/nansha_future.csv \ --zh-eval results/zhongshan_eval.csv \ --zh-future results/zhongshan_future.csv Add explicit years instead of a count: ..
code-block:: bash geoprior-build extend-forecast \ --ns-src results/nansha \ --zh-src results/zhongshan \ --years 2026 2027 Use a different extrapolation rule and fit window: .. code-block:: bash geoprior-build extend-forecast \ --ns-src results/nansha \ --zh-src results/zhongshan \ --method linear_last \ --window 4 Control uncertainty growth: .. code-block:: bash geoprior-build extend-forecast \ --ns-src results/nansha \ --zh-src results/zhongshan \ --unc-growth linear \ --unc-scale 1.5 Distinctive options to notice ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ``--method`` Choose between ``linear_fit`` and ``linear_last`` extrapolation. ``--years`` and ``--add-years`` Either list the exact years to add or request N more years. ``--subsidence-kind`` and ``--out-kind`` Control whether the inputs and outputs are treated as cumulative, rate, or increment style series. ``--unc-growth`` and ``--unc-scale`` Control how predictive uncertainty widens into the added years. **Related example:** - :doc:`../auto_examples/tables_and_summaries/extend_forecast` .. _build-boundary: ``boundary`` ~~~~~~~~~~~~ Use ``boundary`` when you want to derive a **boundary polygon from forecast point clouds**. The public build command ``boundary`` is exposed from the legacy ``make-boundary`` script. It loads coordinate points from the eval and future CSVs, builds either a convex hull or a concave hull, and exports GeoJSON, Shapefile, or both. The implementation checks that the written boundary file actually exists after export. This is a good fit when you need a quick city outline or a lightweight polygon boundary for later spatial workflows. Usage ^^^^^ Create one default boundary export: .. code-block:: bash geoprior build boundary \ --ns-src results/nansha \ --zh-src results/zhongshan Use explicit eval and future CSVs: .. 
code-block:: bash geoprior-build boundary \ --ns-eval results/nansha_eval.csv \ --ns-future results/nansha_future.csv \ --zh-eval results/zhongshan_eval.csv \ --zh-future results/zhongshan_future.csv Choose a concave hull instead of the default convex hull: .. code-block:: bash geoprior-build boundary \ --ns-src results/nansha \ --zh-src results/zhongshan \ --method concave \ --alpha 0.2 Write both GeoJSON and Shapefile outputs: .. code-block:: bash geoprior-build boundary \ --ns-src results/nansha \ --zh-src results/zhongshan \ --format both Distinctive options to notice ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ``--method`` Choose ``convex`` or ``concave`` hull construction. ``--alpha`` Concave-hull ratio used when ``--method concave`` is selected. ``--format`` Export GeoJSON, Shapefile, or both. **Related example:** - :doc:`../auto_examples/tables_and_summaries/make_boundary` .. _build-exposure: ``exposure`` ~~~~~~~~~~~~ Use ``exposure`` when you want to build a **simple exposure proxy table** from forecast point clouds. The public build command ``exposure`` is exposed from the legacy ``make-exposure`` script. It loads spatial points, deduplicates by ``sample_idx``, and writes one exposure table per city set. It supports a uniform exposure mode as well as a density-based proxy computed from k-nearest-neighbor distance using scikit-learn. This is a good fit when you need a lightweight exposure field for downstream mapping, ranking, or derived risk proxies. Usage ^^^^^ Build the default exposure proxy: .. code-block:: bash geoprior build exposure \ --ns-src results/nansha \ --zh-src results/zhongshan Use explicit eval and future CSVs: .. code-block:: bash geoprior-build exposure \ --ns-eval results/nansha_eval.csv \ --ns-future results/nansha_future.csv \ --zh-eval results/zhongshan_eval.csv \ --zh-future results/zhongshan_future.csv Switch to a uniform exposure field: .. 
code-block:: bash geoprior-build exposure \ --ns-src results/nansha \ --zh-src results/zhongshan \ --mode uniform Control the density exposure neighborhood size: .. code-block:: bash geoprior-build exposure \ --ns-src results/nansha \ --zh-src results/zhongshan \ --mode density \ --k 50 Distinctive options to notice ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ``--mode`` Choose ``uniform`` or ``density`` exposure behavior. ``--k`` kNN size used by the density-style exposure proxy. **Related example:** - :doc:`../auto_examples/tables_and_summaries/make_exposure` .. _build-district-grid: ``district-grid`` ~~~~~~~~~~~~~~~~~ Use ``district-grid`` when you want to create a **grid-based district layer** from forecast point clouds. The public build command ``district-grid`` is exposed from the legacy ``make-district-grid`` script. It resolves eval and future CSVs, builds a rectangular grid across the spatial extent, optionally clips it to a boundary polygon, writes GeoJSON or Shapefile outputs, and can also write a sample-to-zone assignment CSV. This is a good fit when you want stable Zone IDs for later hotspot, exposure, or policy-style spatial aggregation. Usage ^^^^^ Create a default district grid: .. code-block:: bash geoprior build district-grid \ --ns-src results/nansha \ --zh-src results/zhongshan Change the grid resolution: .. code-block:: bash geoprior-build district-grid \ --ns-src results/nansha \ --zh-src results/zhongshan \ --nx 16 \ --ny 14 Use an external boundary and clip the grid: .. code-block:: bash geoprior-build district-grid \ --ns-src results/nansha \ --zh-src results/zhongshan \ --boundary exports/boundary_nansha.geojson \ --clip-boundary Also export sample-to-zone assignments: .. code-block:: bash geoprior-build district-grid \ --ns-src results/nansha \ --zh-src results/zhongshan \ --assign-samples Write both GeoJSON and Shapefile outputs: .. 
code-block:: bash geoprior-build district-grid \ --ns-src results/nansha \ --zh-src results/zhongshan \ --format both Distinctive options to notice ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ``--nx`` and ``--ny`` Set the grid resolution in east-west columns and north-south rows. ``--pad`` Expand the grid extent slightly beyond the raw point span. ``--boundary`` / ``--clip-boundary`` / ``--min-area-frac`` Control optional clipping of the grid to one boundary polygon and dropping very small clipped cells. ``--assign-samples`` Also write a CSV mapping ``sample_idx`` to ``zone_id``. **Related example:** - :doc:`../auto_examples/tables_and_summaries/make_district_grid` .. _build-clusters-with-zones: ``clusters-with-zones`` ~~~~~~~~~~~~~~~~~~~~~~~ Use ``clusters-with-zones`` when you want to assign **hotspot cluster centroids to Zone IDs** from a district grid. The public build command ``clusters-with-zones`` is exposed from the legacy ``tag-clusters-with-zones`` script. It reads a cluster-centroid CSV and a district-grid GeoJSON or Shapefile, optionally filters by city or year, then assigns each centroid to the containing polygon or, when requested, to the nearest polygon. The final output is a clean CSV with cluster identifiers, zone IDs, zone labels, centroids, and summary cluster fields. This is the natural follow-up to ``district-grid`` when you want to join cluster analysis back to stable zone labels. Usage ^^^^^ Assign all clusters to a district grid: .. code-block:: bash geoprior build clusters-with-zones \ --clusters results/hotspot_clusters.csv \ --grid exports/district_grid_nansha.geojson Restrict the assignment to one city and one year: .. code-block:: bash geoprior-build clusters-with-zones \ --clusters results/hotspot_clusters.csv \ --grid exports/district_grid_nansha.geojson \ --city Nansha \ --year 2027 Use nearest-zone fallback when a centroid falls outside polygons: .. 
code-block:: bash geoprior-build clusters-with-zones \ --clusters results/hotspot_clusters.csv \ --grid exports/district_grid_nansha.geojson \ --nearest Use non-default field names from the grid: .. code-block:: bash geoprior-build clusters-with-zones \ --clusters results/hotspot_clusters.csv \ --grid exports/custom_grid.geojson \ --zone-id-field zid \ --zone-label-field zlabel Distinctive options to notice ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ``--clusters`` Cluster centroid CSV containing at least ``centroid_x`` and ``centroid_y``. ``--grid`` District grid GeoJSON or Shapefile used for polygon assignment. ``--zone-id-field`` and ``--zone-label-field`` Select the grid fields used as the output zone identifier and label. ``--nearest`` Assign the nearest zone when no polygon contains the centroid. **Related example:** - :doc:`../auto_examples/tables_and_summaries/tag_clusters_with_zones` From here --------- A natural next reading path is: - :doc:`shared_conventions` - :doc:`../auto_examples/tables_and_summaries/index` - :doc:`../auto_examples/figure_generation/index`