.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/tables_and_summaries/extend_forecast.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_examples_tables_and_summaries_extend_forecast.py>`
        to download the full example code.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_tables_and_summaries_extend_forecast.py:


Extend forecast CSVs to later years
===================================

This example teaches you how to use GeoPrior's
``extend-forecast`` utility.

Unlike the plotting scripts, this command is a forecast-product
builder. It takes an existing future forecast CSV and extends it to
one or more later years by simple extrapolation.

Why this matters
----------------
In many workflows, the trained model only emits forecasts to a fixed
horizon, but downstream reporting still needs:

- one or two extra years,
- a quick scenario extension,
- or a compact artifact for later mapping and hotspot analysis.

This builder helps create those extended future CSVs directly from
existing forecast exports.

.. GENERATED FROM PYTHON SOURCE LINES 26-31

Imports
-------
We call the real production entrypoint from the project code.
Then we read the generated CSVs back in and build one compact
teaching preview.

.. GENERATED FROM PYTHON SOURCE LINES 31-45

.. code-block:: Python


    from __future__ import annotations

    import tempfile
    from pathlib import Path

    import matplotlib.pyplot as plt
    import numpy as np
    import pandas as pd

    from geoprior.scripts.extend_forecast import (
        extend_forecast_main,
    )


.. GENERATED FROM PYTHON SOURCE LINES 46-71

Build compact synthetic forecast archives
-----------------------------------------
The production builder resolves, per city:

- one eval CSV,
- one future CSV,

and then extends the future horizon.

For the lesson, we create synthetic cumulative forecast archives
for:

- Nansha
- Zhongshan

with:

- eval years   = 2020, 2021, 2022
- future years = 2023, 2024, 2025

The synthetic paths are designed so that:

- Zhongshan has a higher cumulative level,
- the final years contain a clean trend,
- and uncertainty widens gently with horizon.

.. GENERATED FROM PYTHON SOURCE LINES 71-193

.. code-block:: Python


    rng = np.random.default_rng(21)

    eval_years = [2020, 2021, 2022]
    future_years = [2023, 2024, 2025]
    n_points = 70


    def _city_forecasts(
        *,
        city: str,
        base_shift: float,
        trend_shift: float,
    ) -> tuple[pd.DataFrame, pd.DataFrame]:
        rows_eval: list[dict[str, object]] = []
        rows_future: list[dict[str, object]] = []

        for sample_idx in range(n_points):
            x = 100.0 + 0.55 * sample_idx
            y = 250.0 + 0.18 * sample_idx

            local = rng.normal(0.0, 1.2)
            slope = 14.0 + trend_shift + 0.03 * sample_idx + local
            start = 18.0 + base_shift + 0.35 * sample_idx

            # Annual increments.
            inc_2020 = start
            inc_2021 = start + 0.65 * slope
            inc_2022 = start + 1.00 * slope
            inc_2023 = start + 1.28 * slope
            inc_2024 = start + 1.55 * slope
            inc_2025 = start + 1.83 * slope

            # Cumulative q50 path.
            q50_2020 = inc_2020
            q50_2021 = q50_2020 + inc_2021
            q50_2022 = q50_2021 + inc_2022
            q50_2023 = q50_2022 + inc_2023
            q50_2024 = q50_2023 + inc_2024
            q50_2025 = q50_2024 + inc_2025

            # Eval actuals.
            act_2020 = max(0.1, q50_2020 + rng.normal(0.0, 2.0))
            act_2021 = max(0.1, q50_2021 + rng.normal(0.0, 2.5))
            act_2022 = max(0.1, q50_2022 + rng.normal(0.0, 3.0))

            def _band(center: float, year: int) -> tuple[float, float]:
                width = 8.0 + 0.7 * (year - 2020)
                return max(0.0, center - width), center + width

            for step, (year, q50, actual) in enumerate(
                [
                    (2020, q50_2020, act_2020),
                    (2021, q50_2021, act_2021),
                    (2022, q50_2022, act_2022),
                ],
                start=1,
            ):

                q10, q90 = _band(q50, year)
                rows_eval.append(
                    {
                        "city": city,
                        "sample_idx": sample_idx,
                        "forecast_step": step,
                        "coord_x": float(x),
                        "coord_y": float(y),
                        "coord_t": int(year),
                        "subsidence_actual": float(actual),
                        "subsidence_q10": float(q10),
                        "subsidence_q50": float(q50),
                        "subsidence_q90": float(q90),
                        "subsidence_unit": "mm",
                    }
                )
            
            # Future rows
            for step, (year, q50) in enumerate(
                [
                    (2023, q50_2023),
                    (2024, q50_2024),
                    (2025, q50_2025),
                ],
                start=1,
            ):
                q10, q90 = _band(q50, year)
                rows_future.append(
                    {
                        "city": city,
                        "sample_idx": sample_idx,
                        "forecast_step": step,
                        "coord_x": float(x),
                        "coord_y": float(y),
                        "coord_t": int(year),
                        "subsidence_q10": float(q10),
                        "subsidence_q50": float(q50),
                        "subsidence_q90": float(q90),
                        "subsidence_unit": "mm",
                    }
                )

        return pd.DataFrame(rows_eval), pd.DataFrame(rows_future)


    ns_eval_df, ns_future_df = _city_forecasts(
        city="Nansha",
        base_shift=0.0,
        trend_shift=0.0,
    )
    zh_eval_df, zh_future_df = _city_forecasts(
        city="Zhongshan",
        base_shift=8.0,
        trend_shift=1.8,
    )

    print("Nansha future preview")
    print(ns_future_df.head(6).to_string(index=False))

    print("")
    print("Zhongshan future preview")
    print(zh_future_df.head(6).to_string(index=False))


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Nansha future preview
      city  sample_idx  forecast_step  coord_x  coord_y  coord_t  subsidence_q10  subsidence_q50  subsidence_q90 subsidence_unit
    Nansha           0              1 100.0000 250.0000     2023        104.1814        114.2814        124.3814              mm
    Nansha           0              2 100.0000 250.0000     2024        143.8488        154.6488        165.4488              mm
    Nansha           0              3 100.0000 250.0000     2025        187.5566        199.0566        210.5566              mm
    Nansha           1              1 100.5500 250.1800     2023        104.2415        114.3415        124.4415              mm
    Nansha           1              2 100.5500 250.1800     2024        143.5500        154.3500        165.1500              mm
    Nansha           1              3 100.5500 250.1800     2025        186.7710        198.2710        209.7710              mm

    Zhongshan future preview
         city  sample_idx  forecast_step  coord_x  coord_y  coord_t  subsidence_q10  subsidence_q50  subsidence_q90 subsidence_unit
    Zhongshan           0              1 100.0000 250.0000     2023        136.6795        146.7795        156.8795              mm
    Zhongshan           0              2 100.0000 250.0000     2024        184.6103        195.4103        206.2103              mm
    Zhongshan           0              3 100.0000 250.0000     2025        236.6292        248.1292        259.6292              mm
    Zhongshan           1              1 100.5500 250.1800     2023        145.4446        155.5446        165.6446              mm
    Zhongshan           1              2 100.5500 250.1800     2024        197.6216        208.4216        219.2216              mm
    Zhongshan           1              3 100.5500 250.1800     2025        254.5905        266.0905        277.5905              mm


.. GENERATED FROM PYTHON SOURCE LINES 194-198

Write the synthetic CSV inputs
------------------------------
The production command works from CSV files, so the lesson keeps
the same workflow.

.. GENERATED FROM PYTHON SOURCE LINES 198-223

.. code-block:: Python


    tmp_dir = Path(
        tempfile.mkdtemp(prefix="gp_sg_extend_forecast_")
    )

    ns_eval_csv = tmp_dir / "nansha_eval.csv"
    ns_future_csv = tmp_dir / "nansha_future.csv"
    zh_eval_csv = tmp_dir / "zhongshan_eval.csv"
    zh_future_csv = tmp_dir / "zhongshan_future.csv"

    ns_eval_df.to_csv(ns_eval_csv, index=False)
    ns_future_df.to_csv(ns_future_csv, index=False)
    zh_eval_df.to_csv(zh_eval_csv, index=False)
    zh_future_df.to_csv(zh_future_csv, index=False)

    print("")
    print("Input files")
    for p in [
        ns_eval_csv,
        ns_future_csv,
        zh_eval_csv,
        zh_future_csv,
    ]:
        print(" -", p.name)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none


    Input files
     - nansha_eval.csv
     - nansha_future.csv
     - zhongshan_eval.csv
     - zhongshan_future.csv


.. GENERATED FROM PYTHON SOURCE LINES 224-236

Run the real forecast-extension builder
---------------------------------------
We ask the builder to:

- interpret the inputs as cumulative subsidence,
- keep the output in cumulative form,
- extend to the explicit years 2026 and 2027,
- use a short linear-fit window,
- and widen uncertainty with a sqrt rule.

Because we request both cities, the script writes one CSV per city
with a city suffix added to the output stem.

.. GENERATED FROM PYTHON SOURCE LINES 236-270

.. code-block:: Python


    out_stem = tmp_dir / "future_extended_gallery.csv"

    extend_forecast_main(
        [
            "--ns-eval",
            str(ns_eval_csv),
            "--ns-future",
            str(ns_future_csv),
            "--zh-eval",
            str(zh_eval_csv),
            "--zh-future",
            str(zh_future_csv),
            "--subsidence-kind",
            "cumulative",
            "--out-kind",
            "same",
            "--method",
            "linear_fit",
            "--window",
            "3",
            "--years",
            "2026",
            "2027",
            "--unc-growth",
            "sqrt",
            "--unc-scale",
            "1.0",
            "--out",
            str(out_stem),
        ],
        prog="extend-forecast",
    )


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    [OK] Nansha: wrote /tmp/gp_sg_extend_forecast_y2mw5jor/future_extended_gallery_nansha.csv (manual)
    [OK] Zhongshan: wrote /tmp/gp_sg_extend_forecast_y2mw5jor/future_extended_gallery_zhongshan.csv (manual)


.. GENERATED FROM PYTHON SOURCE LINES 271-274

Inspect the produced files
--------------------------
The builder writes one output CSV per city in multi-city mode.

.. GENERATED FROM PYTHON SOURCE LINES 274-288

.. code-block:: Python


    written = sorted(tmp_dir.glob("future_extended_gallery*"))
    if not written:
        written = sorted(
            (tmp_dir / "scripts" / "out").glob(
                "future_extended_gallery*"
            )
        )

    print("")
    print("Written files")
    for p in written:
        print(" -", p.name)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none


    Written files
     - future_extended_gallery_nansha.csv
     - future_extended_gallery_zhongshan.csv


.. GENERATED FROM PYTHON SOURCE LINES 289-293

Read the extended outputs
-------------------------
We read both city-level outputs back in and inspect the newly added
years.

.. GENERATED FROM PYTHON SOURCE LINES 293-323

.. code-block:: Python


    def _pick_city_output(paths: list[Path], city_slug: str) -> Path:
        for p in paths:
            if city_slug in p.name.lower():
                return p
        raise FileNotFoundError(city_slug)


    ns_out_csv = _pick_city_output(written, "nansha")
    zh_out_csv = _pick_city_output(written, "zhongshan")

    ns_ext = pd.read_csv(ns_out_csv)
    zh_ext = pd.read_csv(zh_out_csv)

    print("")
    print("Extended Nansha output")
    print(
        ns_ext.loc[ns_ext["coord_t"].isin([2025, 2026, 2027])]
        .head(8)
        .to_string(index=False)
    )

    print("")
    print("Extended Zhongshan output")
    print(
        zh_ext.loc[zh_ext["coord_t"].isin([2025, 2026, 2027])]
        .head(8)
        .to_string(index=False)
    )


.. rst-class:: sphx-glr-script-out

 .. code-block:: none


    Extended Nansha output
      city  sample_idx  forecast_step  coord_x  coord_y  coord_t  subsidence_q10  subsidence_q50  subsidence_q90 subsidence_unit extended extend_kind extend_method unc_growth
    Nansha           0              3 100.0000 250.0000     2025        187.5566        199.0566        210.5566              mm      NaN         NaN           NaN        NaN
       NaN           0              4 100.0000 250.0000     2026        234.9189        247.4088        259.8988              mm     True  cumulative    linear_fit       sqrt
       NaN           0              5 100.0000 250.0000     2027        286.0671        299.7695        313.4719              mm     True  cumulative    linear_fit       sqrt
    Nansha           1              3 100.5500 250.1800     2025        186.7710        198.2710        209.7710              mm      NaN         NaN           NaN        NaN
       NaN           1              4 100.5500 250.1800     2026        233.5214        246.0114        258.5013              mm     True  cumulative    linear_fit       sqrt
       NaN           1              5 100.5500 250.1800     2027        283.9308        297.6331        311.3355              mm     True  cumulative    linear_fit       sqrt
    Nansha           2              3 101.1000 250.3600     2025        187.7252        199.2252        210.7252              mm      NaN         NaN           NaN        NaN
       NaN           2              4 101.1000 250.3600     2026        234.4436        246.9335        259.4235              mm     True  cumulative    linear_fit       sqrt

    Extended Zhongshan output
         city  sample_idx  forecast_step  coord_x  coord_y  coord_t  subsidence_q10  subsidence_q50  subsidence_q90 subsidence_unit extended extend_kind extend_method unc_growth
    Zhongshan           0              3 100.0000 250.0000     2025        236.6292        248.1292        259.6292              mm      NaN         NaN           NaN        NaN
          NaN           0              4 100.0000 250.0000     2026        292.3490        304.8389        317.3289              mm     True  cumulative    linear_fit       sqrt
          NaN           0              5 100.0000 250.0000     2027        351.9020        365.6044        379.3068              mm     True  cumulative    linear_fit       sqrt
    Zhongshan           1              3 100.5500 250.1800     2025        254.5905        266.0905        277.5905              mm      NaN         NaN           NaN        NaN
          NaN           1              4 100.5500 250.1800     2026        315.9474        328.4374        340.9273              mm     True  cumulative    linear_fit       sqrt
          NaN           1              5 100.5500 250.1800     2027        381.8358        395.5382        409.2405              mm     True  cumulative    linear_fit       sqrt
    Zhongshan           2              3 101.1000 250.3600     2025        247.6953        259.1953        270.6953              mm      NaN         NaN           NaN        NaN
          NaN           2              4 101.1000 250.3600     2026        306.4038        318.8937        331.3837              mm     True  cumulative    linear_fit       sqrt


.. GENERATED FROM PYTHON SOURCE LINES 324-330

Summarize before vs after
-------------------------
A compact summary makes the extension behavior clearer.

We compute city-level mean q10/q50/q90 paths before and after the
extension.

.. GENERATED FROM PYTHON SOURCE LINES 330-354

.. code-block:: Python


    def _mean_path(df: pd.DataFrame) -> pd.DataFrame:
        return (
            df.groupby("coord_t", as_index=False)[
                ["subsidence_q10", "subsidence_q50", "subsidence_q90"]
            ]
            .mean()
            .sort_values("coord_t")
        )


    ns_before = _mean_path(ns_future_df)
    zh_before = _mean_path(zh_future_df)
    ns_after = _mean_path(ns_ext)
    zh_after = _mean_path(zh_ext)

    print("")
    print("Mean Nansha path after extension")
    print(ns_after.to_string(index=False))

    print("")
    print("Mean Zhongshan path after extension")
    print(zh_after.to_string(index=False))


.. rst-class:: sphx-glr-script-out

 .. code-block:: none


    Mean Nansha path after extension
     coord_t  subsidence_q10  subsidence_q50  subsidence_q90
        2023        154.1724        164.2724        174.3724
        2024        206.8092        217.6092        228.4092
        2025        263.6481        275.1481        286.6481
        2026        324.2992        336.7892        349.2791
        2027        388.8966        402.5990        416.3014

    Mean Zhongshan path after extension
     coord_t  subsidence_q10  subsidence_q50  subsidence_q90
        2023        192.5489        202.6489        212.7489
        2024        256.5590        267.3590        278.1590
        2025        325.3806        336.8806        348.3806
        2026        398.6091        411.0991        423.5890
        2027        476.3885        490.0909        503.7933


.. GENERATED FROM PYTHON SOURCE LINES 355-367

Build one compact visual preview
--------------------------------
This preview is not part of the production builder itself.
It is a teaching aid for the gallery page.

Left:
  Nansha before/after q50 path.

Right:
  Zhongshan before/after q50 path.

The shaded ribbons show the q10-q90 interval after extension.

.. GENERATED FROM PYTHON SOURCE LINES 367-403

.. code-block:: Python


    fig, axes = plt.subplots(
        1,
        2,
        figsize=(9.4, 3.9),
        constrained_layout=True,
    )

    for ax, city, before, after in [
        (axes[0], "Nansha", ns_before, ns_after),
        (axes[1], "Zhongshan", zh_before, zh_after),
    ]:
        ax.plot(
            before["coord_t"].to_numpy(int),
            before["subsidence_q50"].to_numpy(float),
            marker="o",
            label="Original future q50",
        )
        ax.plot(
            after["coord_t"].to_numpy(int),
            after["subsidence_q50"].to_numpy(float),
            marker="o",
            linestyle="--",
            label="Extended q50",
        )
        ax.fill_between(
            after["coord_t"].to_numpy(int),
            after["subsidence_q10"].to_numpy(float),
            after["subsidence_q90"].to_numpy(float),
            alpha=0.2,
        )
        ax.set_title(city)
        ax.set_xlabel("Year")
        ax.set_ylabel("Cumulative subsidence [mm]")
        ax.legend(fontsize=8)


.. image-sg:: /auto_examples/tables_and_summaries/images/sphx_glr_extend_forecast_001.png
   :alt: Nansha, Zhongshan
   :srcset: /auto_examples/tables_and_summaries/images/sphx_glr_extend_forecast_001.png
   :class: sphx-glr-single-img


.. GENERATED FROM PYTHON SOURCE LINES 404-420

Learn how to read this builder
------------------------------
The extension logic starts from an *existing* future forecast CSV.
It does not retrain the model.

The practical reading order is:

1. inspect the original future path up to its last available year;
2. check which extension rule was requested;
3. verify the new years added to the tail;
4. inspect how the q10-q90 interval widens after extension.

In other words:

- this is a forecast-product builder,
- not a new model inference stage.

.. GENERATED FROM PYTHON SOURCE LINES 422-435

Explicit years vs add-years
---------------------------
The script supports two extension styles.

Explicit years:
    add the exact years requested by ``--years``.

Add N years:
    if ``--years`` is omitted, append the next
    ``--add-years`` years after the existing tail.

The lesson uses explicit years because it makes the page easier
to read, but both workflows are supported by the real command.

.. GENERATED FROM PYTHON SOURCE LINES 437-454

Why subsidence-kind and out-kind matter
---------------------------------------
The command distinguishes:

- the meaning of the input series,
- and the meaning of the output series.

``subsidence-kind``
    tells the builder whether the source CSV represents
    cumulative values or annual/rate-style values.

``out-kind``
    controls whether the written extension should stay in the same
    convention, or be converted to cumulative or rate form.

This is useful because later scripts may consume different forecast
conventions.

.. GENERATED FROM PYTHON SOURCE LINES 456-469

Why uncertainty growth matters
------------------------------
The extrapolated years are less certain than the original trained
horizon, so the command exposes:

- ``hold``
- ``sqrt``
- ``linear``

uncertainty-growth modes, plus an ``unc-scale`` multiplier.

The visual preview above makes that visible through the widening
q10-q90 ribbon in 2026 and 2027.

.. GENERATED FROM PYTHON SOURCE LINES 471-490

Why this page belongs in tables_and_summaries
---------------------------------------------
This script produces a reusable forecast CSV artifact that later
builders can consume.

A useful workflow is:

1. generate the original future forecast,
2. extend the future CSV if later years are needed,
3. pass the extended CSV to hotspot or spatial-summary builders,
4. only then move to paper-ready maps or narrative tables.

That keeps:

- model inference,
- forecast extrapolation,
- and final visualization

clearly separated.

.. GENERATED FROM PYTHON SOURCE LINES 492-542

Command-line version
--------------------
The same lesson can be reproduced from the CLI.

Legacy dispatcher:

.. code-block:: bash

   python -m scripts extend-forecast \
     --ns-eval results/nansha_eval.csv \
     --ns-future results/nansha_future.csv \
     --zh-eval results/zhongshan_eval.csv \
     --zh-future results/zhongshan_future.csv \
     --subsidence-kind cumulative \
     --out-kind same \
     --method linear_fit \
     --window 3 \
     --years 2026 2027 \
     --unc-growth sqrt \
     --out future_extended

Add the next 2 years instead:

.. code-block:: bash

   python -m scripts extend-forecast \
     --ns-src results/nansha_run \
     --zh-src results/zhongshan_run \
     --split auto \
     --add-years 2 \
     --method linear_last \
     --out future_extended

Modern CLI:

.. code-block:: bash

   geoprior build extend-forecast \
     --ns-src results/nansha_run \
     --zh-src results/zhongshan_run \
     --split auto \
     --add-years 2 \
     --subsidence-kind cumulative \
     --out-kind same \
     --method linear_fit \
     --window 3 \
     --unc-growth sqrt \
     --out future_extended

The gallery page teaches the builder.
The command line reproduces it in a workflow.


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 0.378 seconds)


.. _sphx_glr_download_auto_examples_tables_and_summaries_extend_forecast.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: extend_forecast.ipynb <extend_forecast.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: extend_forecast.py <extend_forecast.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: extend_forecast.zip <extend_forecast.zip>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_