.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/evaluation/plot_forecast_comparison_overview.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_examples_evaluation_plot_forecast_comparison_overview.py>`
        to download the full example code.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_evaluation_plot_forecast_comparison_overview.py:


Learn to compare forecasts visually with ``plot_forecast_comparison``
=========================================================================

This lesson explains how to use
``geoprior.plot.evaluation.plot_forecast_comparison``
when you want to **look at forecasts directly** instead of only reading
summary metrics.

Why this function matters
-------------------------
Metric plots answer questions such as "Which horizon is worse?" or
"Which model has lower MAE?". Those views are essential, but they do
not show the *shape* of the forecast itself.

A forecast can look good numerically while still showing practical
problems such as:

- a persistent bias above or below the actual values,
- a prediction interval that is too narrow or too wide,
- specific samples that are much harder than the rest,
- or spatial forecast patterns that look implausible.

This plotting helper is therefore a **reading tool**. It helps users
inspect the forecast as a visual object before trusting a metric table.
The goal of this page is not only to call the function. It is to teach
how to decide:

- which sample trajectories deserve closer inspection,
- how to read temporal point forecasts,
- how to read median-plus-interval forecasts,
- and when a spatial view is more informative than a temporal one.

.. GENERATED FROM PYTHON SOURCE LINES 39-54

.. code-block:: Python


    import matplotlib.pyplot as plt
    import numpy as np
    import pandas as pd

    from geoprior.plot.evaluation import plot_forecast_comparison

    pd.set_option("display.max_columns", 24)
    pd.set_option("display.width", 118)
    pd.set_option(
        "display.float_format",
        lambda v: f"{v:0.4f}",
    )


.. GENERATED FROM PYTHON SOURCE LINES 55-86

What this helper expects
------------------------

``plot_forecast_comparison`` is designed around a tidy forecast table
with at least two structural columns:

- ``sample_idx``
- ``forecast_step``

Those two columns tell the function *which trajectory to draw* and
*where each point belongs on the horizon axis*.

For a simple point-forecast comparison, the key value columns are:

- ``<target>_actual``
- ``<target>_pred``

For a quantile forecast, you typically also provide:

- ``<target>_q10``
- ``<target>_q50``
- ``<target>_q90``

If you want spatial plots, the table must also contain two coordinate
columns such as ``coord_x`` and ``coord_y``.

A practical detail matters here: the current implementation is most
naturally centered on the information already present in
``forecast_df``. In other words, this lesson focuses on the path that
users can rely on immediately: one long-format table containing
sample-wise forecasts, horizon steps, and optional coordinates.

.. GENERATED FROM PYTHON SOURCE LINES 89-112

Build a realistic demo forecast table
-------------------------------------

For a gallery lesson, we want one stable table that can support both
temporal and spatial examples.

We therefore create a synthetic forecast frame with:

- 12 spatial locations,
- 3 forecast steps,
- actual values,
- point predictions,
- 10/50/90 quantile columns,
- and two coordinate columns for spatial maps.

The demo is intentionally designed so that:

- some samples are easier than others,
- later horizons drift more,
- and interval width grows with forecast step.

That makes the lesson easier to read because the plots tell a coherent
forecasting story.

.. GENERATED FROM PYTHON SOURCE LINES 112-168

.. code-block:: Python


    rng = np.random.default_rng(7)
    rows: list[dict[str, float | int | str]] = []

    n_locations = 12
    horizons = [1, 2, 3]

    x_coords = np.repeat(np.linspace(113.20, 113.95, 4), 3)
    y_coords = np.tile(np.linspace(22.10, 22.70, 3), 4)

    for sample_idx, (x_val, y_val) in enumerate(
        zip(x_coords, y_coords, strict=False)
    ):
        spatial_effect = (
            2.2 * (x_val - x_coords.mean())
            - 1.4 * (y_val - y_coords.mean())
        )
        local_difficulty = 0.35 + 0.12 * (sample_idx % 4)

        for step in horizons:
            baseline = 16.0 + 1.35 * step + spatial_effect
            actual = (
                baseline
                + 0.30 * np.sin(sample_idx / 2.0)
                + rng.normal(0.0, 0.35)
            )

            pred_bias = 0.18 * step
            pred_noise = local_difficulty * (0.80 + 0.25 * step)
            pred = actual + pred_bias + rng.normal(0.0, pred_noise)

            half_width = 0.95 + 0.55 * step
            q10 = pred - half_width
            q50 = pred
            q90 = pred + half_width

            rows.append(
                {
                    "sample_idx": sample_idx,
                    "forecast_step": step,
                    "coord_x": x_val,
                    "coord_y": y_val,
                    "subsidence_actual": actual,
                    "subsidence_pred": pred,
                    "subsidence_q10": q10,
                    "subsidence_q50": q50,
                    "subsidence_q90": q90,
                }
            )

    forecast_df = pd.DataFrame(rows)

    print("Demo forecast table")
    print(forecast_df.head(12))


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Demo forecast table
        sample_idx  forecast_step  coord_x  coord_y  subsidence_actual  subsidence_pred  subsidence_q10  subsidence_q50  \
    0            0              1 113.2000  22.1000            16.9454          17.2352         15.7352         17.2352   
    1            0              2 113.2000  22.1000            18.1991          18.1538         16.1038         18.1538   
    2            0              3 113.2000  22.1000            19.4859          19.4879         16.8879         19.4879   
    3            1              1 113.2000  22.4000            16.6899          17.5313         16.0313         17.5313   
    4            1              2 113.2000  22.4000            17.8466          17.8274         15.7774         17.8274   
    5            1              3 113.2000  22.4000            19.5403          20.3403         17.7403         20.3403   
    6            2              1 113.2000  22.7000            16.3943          15.9979         14.4979         15.9979   
    7            2              2 113.2000  22.7000            17.6972          18.5905         16.5405         18.5905   
    8            2              3 113.2000  22.7000            18.5870          18.7085         16.1085         18.7085   
    9            3              1 113.4500  22.1000            17.1288          16.3475         14.8475         16.3475   
    10           3              2 113.4500  22.1000            18.4996          18.6427         16.5927         18.6427   
    11           3              3 113.4500  22.1000            20.0506          20.8892         18.2892         20.8892   

        subsidence_q90  
    0          18.7352  
    1          20.2038  
    2          22.0879  
    3          19.0313  
    4          19.8774  
    5          22.9403  
    6          17.4979  
    7          20.6405  
    8          21.3085  
    9          17.8475  
    10         20.6927  
    11         23.4892  


.. GENERATED FROM PYTHON SOURCE LINES 169-181

Read the structure before plotting
----------------------------------

A useful habit is to verify the structure explicitly before drawing
any figure. This avoids two very common user mistakes:

1. using the wrong target prefix,
2. expecting the helper to infer sample trajectories without
   ``sample_idx``.

In this lesson the target prefix is ``subsidence``. That is why all
examples below use ``target_name='subsidence'``.

.. GENERATED FROM PYTHON SOURCE LINES 181-192

.. code-block:: Python


    print("\nColumns used in this lesson")
    print(list(forecast_df.columns))

    print("\nRows per sample")
    print(forecast_df.groupby("sample_idx").size().head())

    print("\nRows per forecast step")
    print(forecast_df.groupby("forecast_step").size())


.. rst-class:: sphx-glr-script-out

 .. code-block:: none


    Columns used in this lesson
    ['sample_idx', 'forecast_step', 'coord_x', 'coord_y', 'subsidence_actual', 'subsidence_pred', 'subsidence_q10', 'subsidence_q50', 'subsidence_q90']

    Rows per sample
    sample_idx
    0    3
    1    3
    2    3
    3    3
    4    3
    dtype: int64

    Rows per forecast step
    forecast_step
    1    12
    2    12
    3    12
    dtype: int64


.. GENERATED FROM PYTHON SOURCE LINES 193-209

Start with the most important reading: temporal trajectories
------------------------------------------------------------

The temporal mode is usually the best place to begin.

Why? Because it answers the most immediate visual question:

*For a few concrete samples, how does the predicted trajectory compare
to the actual trajectory as the forecast horizon advances?*

We begin with point forecasts only. This keeps the figure simple and
helps users learn the plotting logic before adding uncertainty bands.

By default, ``sample_ids='first_n'`` and ``num_samples=3`` already
give a useful small panel view, so this is a good first call for new
users.

.. GENERATED FROM PYTHON SOURCE LINES 209-223

.. code-block:: Python


    plot_forecast_comparison(
        forecast_df=forecast_df,
        target_name="subsidence",
        kind="temporal",
        sample_ids="first_n",
        num_samples=3,
        output_dim=1,
        max_cols=2,
        figsize_per_subplot=(6.2, 4.2),
        point_plot_kwargs={"color": "tab:blue"},
    )


.. image-sg:: /auto_examples/evaluation/images/sphx_glr_plot_forecast_comparison_overview_001.png
   :alt: Sample 0, Sample 1, Sample 2
   :srcset: /auto_examples/evaluation/images/sphx_glr_plot_forecast_comparison_overview_001.png
   :class: sphx-glr-single-img


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    [DEBUG] Selected sample_idx: [0, 1, 2]
    [INFO] Forecast visualisation complete.


.. GENERATED FROM PYTHON SOURCE LINES 224-240

How to read the temporal point-forecast view
--------------------------------------------

Each panel corresponds to one ``sample_idx``. The x-axis is
``forecast_step`` and the y-axis is the target value.

When you inspect this view, read it in this order:

1. Does the prediction follow the overall level of the actual line?
2. Is there a consistent upward or downward bias?
3. Does the mismatch get worse at later horizons?
4. Are some samples much harder than others?

That last question is especially important. A model can have a good
average metric while still failing on a small but meaningful subset of
samples. This function helps expose those cases.

.. GENERATED FROM PYTHON SOURCE LINES 243-252

Select specific difficult or interesting samples
------------------------------------------------

After a first broad look, users usually want to inspect *specific*
samples rather than only the first few rows.

Here we manually select three samples from different parts of the
spatial domain. This is a good pattern when you already know that
some sites are operationally important, unusual, or error-prone.

.. GENERATED FROM PYTHON SOURCE LINES 252-270

.. code-block:: Python


    plot_forecast_comparison(
        forecast_df=forecast_df,
        target_name="subsidence",
        kind="temporal",
        sample_ids=[1, 5, 10],
        output_dim=1,
        max_cols=3,
        figsize_per_subplot=(5.2, 4.0),
        titles=[
            "Western site",
            "Central site",
            "Eastern site",
        ],
        point_plot_kwargs={"linewidth": 2.0},
    )


.. image-sg:: /auto_examples/evaluation/images/sphx_glr_plot_forecast_comparison_overview_002.png
   :alt: Western site, Central site, Eastern site
   :srcset: /auto_examples/evaluation/images/sphx_glr_plot_forecast_comparison_overview_002.png
   :class: sphx-glr-single-img


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    [DEBUG] Selected sample_idx: [1, 5, 10]
    [INFO] Forecast visualisation complete.


.. GENERATED FROM PYTHON SOURCE LINES 271-285

Why sample-wise selection is important
--------------------------------------

A forecast table may contain hundreds or thousands of trajectories.
Looking at all of them at once is rarely useful.

The better workflow is usually:

1. start with the first few samples,
2. identify cases that deserve more attention,
3. then pass an explicit list to ``sample_ids``.

In other words, this helper is strongest when used as a targeted
inspection tool, not as an attempt to display every sample at once.

.. GENERATED FROM PYTHON SOURCE LINES 288-302

Add uncertainty with quantile bands
-----------------------------------

Point forecasts are only part of the story.

When quantile columns are available, the temporal view becomes much
more informative because it can show:

- the median forecast,
- the actual values,
- and a prediction interval using the outermost supplied quantiles.

In this demo, the interval is built from q10 and q90, and the median
line is q50.

.. GENERATED FROM PYTHON SOURCE LINES 302-317

.. code-block:: Python


    plot_forecast_comparison(
        forecast_df=forecast_df,
        target_name="subsidence",
        quantiles=[0.10, 0.50, 0.90],
        kind="temporal",
        sample_ids=[0, 4, 8],
        output_dim=1,
        max_cols=3,
        figsize_per_subplot=(5.3, 4.1),
        median_plot_kwargs={"color": "tab:green", "linewidth": 2.0},
        fill_between_kwargs={"alpha": 0.25},
    )


.. image-sg:: /auto_examples/evaluation/images/sphx_glr_plot_forecast_comparison_overview_003.png
   :alt: Sample 0, Sample 4, Sample 8
   :srcset: /auto_examples/evaluation/images/sphx_glr_plot_forecast_comparison_overview_003.png
   :class: sphx-glr-single-img


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    [DEBUG] Selected sample_idx: [0, 4, 8]
    [INFO] Forecast visualisation complete.


.. GENERATED FROM PYTHON SOURCE LINES 318-333

How to read the interval view
-----------------------------

This is one of the most useful views in the evaluation gallery.

Read it with three questions in mind:

1. Is the actual line usually inside the interval?
2. Does the median forecast stay near the actual values?
3. Does the interval widen sensibly as horizon increases?

A very narrow interval that frequently misses the actual values is not
trustworthy. But an interval that becomes excessively wide may also be
unhelpful in practice. This plot gives users the visual context behind
later coverage and WIS metrics.

.. GENERATED FROM PYTHON SOURCE LINES 336-351

Move from temporal reading to spatial reading
---------------------------------------------

Temporal plots are best when you want to inspect *sample trajectories*.
Spatial plots are better when you want to inspect *patterns across the
map* at a specific horizon.

In the current implementation, the spatial mode colors each point by
the point prediction or, when quantiles are provided, by the median
forecast column.

This answers a different question:

*At step H1, H2, or H3, what does the forecasted field look like over
space?*

.. GENERATED FROM PYTHON SOURCE LINES 351-368

.. code-block:: Python


    plot_forecast_comparison(
        forecast_df=forecast_df,
        target_name="subsidence",
        quantiles=[0.10, 0.50, 0.90],
        kind="spatial",
        horizon_steps=[1, 2, 3],
        spatial_cols=["coord_x", "coord_y"],
        output_dim=1,
        max_cols=3,
        figsize_per_subplot=(5.1, 4.2),
        cmap="viridis",
        s=85,
        alpha=0.85,
    )


.. image-sg:: /auto_examples/evaluation/images/sphx_glr_plot_forecast_comparison_overview_004.png
   :alt: Step 1, Step 2, Step 3
   :srcset: /auto_examples/evaluation/images/sphx_glr_plot_forecast_comparison_overview_004.png
   :class: sphx-glr-single-img


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    [DEBUG] Selected sample_idx: [0, 1, 2]
    [INFO] Forecast visualisation complete.


.. GENERATED FROM PYTHON SOURCE LINES 369-384

How to read the spatial view
----------------------------

Each panel now corresponds to one forecast step instead of one sample.
The point locations are fixed, and the color shows the forecast value.

This view is especially useful for checking whether:

- the predicted field remains spatially smooth or coherent,
- strong hotspots appear where you expect them,
- later horizons produce unrealistic spatial jumps,
- or the map pattern becomes too noisy.

In a real project, this is often the moment when a user notices that a
model is numerically acceptable but spatially unconvincing.

.. GENERATED FROM PYTHON SOURCE LINES 387-395

Compare a single horizon in point-forecast mode
-----------------------------------------------

We can also call the spatial view without quantiles. In that case the
color reflects ``<target>_pred`` directly.

This is the simplest spatial usage pattern and is a good default when
the workflow is deterministic rather than probabilistic.

.. GENERATED FROM PYTHON SOURCE LINES 395-410

.. code-block:: Python


    plot_forecast_comparison(
        forecast_df=forecast_df,
        target_name="subsidence",
        kind="spatial",
        horizon_steps=2,
        spatial_cols=["coord_x", "coord_y"],
        output_dim=1,
        max_cols=1,
        figsize_per_subplot=(6.0, 4.8),
        cmap="plasma",
        s=95,
    )


.. image-sg:: /auto_examples/evaluation/images/sphx_glr_plot_forecast_comparison_overview_005.png
   :alt: Step 2
   :srcset: /auto_examples/evaluation/images/sphx_glr_plot_forecast_comparison_overview_005.png
   :class: sphx-glr-single-img


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    [DEBUG] Selected sample_idx: [0, 1, 2]
    [INFO] Forecast visualisation complete.


.. GENERATED FROM PYTHON SOURCE LINES 411-428

What this function is best at
-----------------------------

``plot_forecast_comparison`` is strongest when the user wants to *see*
the forecast rather than only score it.

It is especially good for:

- a quick inspection of a few sample trajectories,
- visual verification of median-plus-interval behaviour,
- checking whether difficult samples look pathological,
- and examining horizon-specific spatial forecast maps.

In contrast, it is not the main tool for aggregated comparison across
many groups or many metrics. For those tasks, the evaluation gallery
should usually move next to helpers such as
``plot_metric_over_horizon`` or the calibration/interval plots.

.. GENERATED FROM PYTHON SOURCE LINES 431-454

A practical checklist for your own data
---------------------------------------

Before applying this helper to a real saved forecast table, check the
following:

1. Does your table contain ``sample_idx`` and ``forecast_step``?
2. Are the target columns named consistently with ``target_name``?
3. For interval mode, do your quantile columns follow the qXX naming?
4. For spatial mode, do the coordinate columns exist and have no
   unexpected missing values?
5. Are you selecting only a manageable number of samples for temporal
   inspection?

A simple adaptation pattern is:

- replace ``forecast_df`` with your saved long-format prediction table,
- set ``target_name`` to your real target prefix,
- pass explicit ``sample_ids`` for important trajectories,
- and use ``spatial_cols`` only when the coordinates are already in
  the same frame.

That workflow keeps the function honest and readable.

.. GENERATED FROM PYTHON SOURCE LINES 457-474

Final lesson takeaway
---------------------

A forecast comparison plot is not a replacement for evaluation
metrics. It is the visual companion that helps users understand *why*
a forecast is acceptable, suspicious, overconfident, or spatially
implausible.

A good workflow is therefore:

1. inspect a few temporal trajectories,
2. inspect interval behaviour if quantiles exist,
3. inspect one or more spatial horizons when coordinates are present,
4. then move to horizon-wise metrics and calibration summaries.

That combination gives a much more trustworthy evaluation story than
metrics alone.

.. GENERATED FROM PYTHON SOURCE LINES 474-476

.. code-block:: Python


    plt.close("all")


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 1.726 seconds)


.. _sphx_glr_download_auto_examples_evaluation_plot_forecast_comparison_overview.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_forecast_comparison_overview.ipynb <plot_forecast_comparison_overview.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_forecast_comparison_overview.py <plot_forecast_comparison_overview.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: plot_forecast_comparison_overview.zip <plot_forecast_comparison_overview.zip>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_