.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/figure_generation/plot_driver_response.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_examples_figure_generation_plot_driver_response.py>`
        to download the full example code.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_figure_generation_plot_driver_response.py:


Driver-response plots: learning how the response moves with the drivers
==========================================================================

This example teaches you how to read the GeoPrior
driver-response figure.

Most figures in this documentation show maps, metrics, or
physics diagnostics. This one asks a simpler but very important
question:

**How does the response variable behave as each driver changes?**

That sounds simple, but it is often one of the fastest ways to
see whether a model workflow is scientifically plausible.

Instead of looking at space first, we look at pairwise
relationships:

- one driver on the x-axis,
- one response on the y-axis,
- and one panel for each city-driver pair.

What the figure shows
---------------------
The plotting backend builds a grid with:

- one row per city,
- one column per driver,
- a hexbin density view in each panel,
- and an optional robust trend line.

This is useful because raw scatter plots often become too dense
to read. A hexbin panel shows where data are concentrated,
while the trend line gives a compact summary of the central
relationship.

Why this matters
----------------
A forecasting workflow can produce good aggregate metrics while
still hiding unrealistic driver-response behaviour.

This figure helps answer questions like:

- Does subsidence rise where drawdown deepens?
- Does the relationship look monotonic or nonlinear?
- Do the two cities show similar behaviour?
- Is one driver clearly stronger than another?

The real script supports custom driver lists, custom response
columns, city filtering, year filtering, optional subsampling,
trend overlays, and several y-axis sharing modes. This gallery
page uses a compact synthetic dataset so the lesson is fully
executable during the documentation build.

.. GENERATED FROM PYTHON SOURCE LINES 58-62

Imports
-------
We use the real plotting backend from the project script, so
this example teaches the actual function used by the CLI.

.. GENERATED FROM PYTHON SOURCE LINES 62-77

.. code-block:: Python


    from __future__ import annotations

    import tempfile
    from pathlib import Path

    import matplotlib.image as mpimg
    import matplotlib.pyplot as plt
    import numpy as np
    import pandas as pd

    from geoprior.scripts.plot_driver_response import (
        plot_driver_response,
    )


.. GENERATED FROM PYTHON SOURCE LINES 78-95

Step 1 - Build a synthetic driver-response dataset
--------------------------------------------------
The real plotting function expects a table with:

- a ``city`` column,
- one or more numeric driver columns,
- one numeric response column.

For a teaching page, we want relationships that are easy to
read:

- a groundwater-related driver with a strong positive response,
- a rainfall driver with a gentler compensating effect,
- an urban-load driver with a moderate positive effect.

We build two synthetic cities with slightly different response
shapes so the reader can compare them.

.. GENERATED FROM PYTHON SOURCE LINES 95-172

.. code-block:: Python


    rng = np.random.default_rng(7)
    n_city = 2600

    def make_city_df(
        city: str,
        *,
        z_shift: float,
        rain_shift: float,
        dens_shift: float,
        noise: float,
    ) -> pd.DataFrame:
        z_gwl = rng.normal(
            loc=8.0 + z_shift,
            scale=2.2,
            size=n_city,
        )
        rainfall = rng.normal(
            loc=1400.0 + rain_shift,
            scale=230.0,
            size=n_city,
        )
        building_density = rng.uniform(
            0.15 + dens_shift,
            0.92,
            size=n_city,
        )

        # Response: cumulative subsidence [mm]
        #
        # We make the response:
        # - increase with deeper groundwater drawdown,
        # - decrease slightly with rainfall,
        # - increase with building density,
        # - and keep some nonlinear structure.
        subsidence_cum = (
            18.0
            + 4.8 * z_gwl
            - 0.010 * rainfall
            + 21.0 * building_density
            + 2.0 * np.sin(0.9 * z_gwl)
            + rng.normal(0.0, noise, size=n_city)
        )

        return pd.DataFrame(
            {
                "city": city,
                "z_gwl": z_gwl,
                "rainfall": rainfall,
                "building_density": building_density,
                "subsidence_cum": subsidence_cum,
            }
        )


    df = pd.concat(
        [
            make_city_df(
                "Nansha",
                z_shift=0.0,
                rain_shift=0.0,
                dens_shift=0.00,
                noise=4.5,
            ),
            make_city_df(
                "Zhongshan",
                z_shift=1.2,
                rain_shift=-80.0,
                dens_shift=-0.03,
                noise=5.2,
            ),
        ],
        ignore_index=True,
    )

    print(df.head().to_string(index=False))


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

      city  z_gwl  rainfall  building_density  subsidence_cum
    Nansha 8.0027 1540.3129            0.8830         59.3572
    Nansha 8.6572 1547.0135            0.7851         59.4123
    Nansha 7.3969 1118.1415            0.6385         49.7034
    Nansha 6.0407 1326.0462            0.6842         40.8612
    Nansha 6.9997 1432.3947            0.7467         48.8158


.. GENERATED FROM PYTHON SOURCE LINES 173-182

Step 2 - Choose the teaching drivers
------------------------------------
The plotting function accepts a list of driver column names.
Here we choose three drivers that create visually distinct
relationships:

- ``z_gwl``              : strongest positive trend
- ``rainfall``           : weaker negative trend
- ``building_density``   : moderate positive trend

.. GENERATED FROM PYTHON SOURCE LINES 182-189

.. code-block:: Python


    drivers = [
        "z_gwl",
        "rainfall",
        "building_density",
    ]


.. GENERATED FROM PYTHON SOURCE LINES 190-196

Step 3 - Render the real figure
-------------------------------
The backend uses hexbin panels with logarithmic count coloring.
We also enable the robust trend line, which is computed from
binned medians. This is especially useful when the cloud is
broad and a raw linear fit would be misleading.

.. GENERATED FROM PYTHON SOURCE LINES 196-227

.. code-block:: Python


    tmp_dir = Path(
        tempfile.mkdtemp(prefix="gp_sg_driver_resp_")
    )
    out_base = str(
        tmp_dir / "driver_response_gallery"
    )

    plot_driver_response(
        df,
        cities=["Nansha", "Zhongshan"],
        drivers=drivers,
        ycol="subsidence_cum",
        sharey="row",
        out=out_base,
        gridsize=42,
        vmin=1,
        trend=True,
        trend_bins=24,
        trend_min_n=18,
        show_legend=True,
        show_labels=True,
        show_ticklabels=True,
        show_title=True,
        show_panel_titles=True,
        title=(
            "Synthetic driver-response sanity: "
            "how cumulative subsidence varies with key drivers"
        ),
    )


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    [OK] wrote /tmp/gp_sg_driver_resp_oh555370/driver_response_gallery.png/.svg


.. GENERATED FROM PYTHON SOURCE LINES 228-233

Step 4 - Display the saved figure inside the gallery page
---------------------------------------------------------
The plotting function writes PNG and SVG outputs, then closes
the figure. We reload the PNG so Sphinx-Gallery shows the real
result directly on the page.

.. GENERATED FROM PYTHON SOURCE LINES 233-240

.. code-block:: Python


    img = mpimg.imread(tmp_dir / "driver_response_gallery.png")

    fig, ax = plt.subplots(figsize=(8.6, 5.2))
    ax.imshow(img)
    ax.axis("off")


.. image-sg:: /auto_examples/figure_generation/images/sphx_glr_plot_driver_response_001.png
   :alt: plot driver response
   :srcset: /auto_examples/figure_generation/images/sphx_glr_plot_driver_response_001.png
   :class: sphx-glr-single-img


.. rst-class:: sphx-glr-script-out

 .. code-block:: none


    (np.float64(-0.5), np.float64(4237.5), np.float64(3088.5), np.float64(-0.5))


.. GENERATED FROM PYTHON SOURCE LINES 241-249

Step 5 - Inspect the relationships numerically
----------------------------------------------
A visual plot is useful, but a teaching page becomes stronger
when it also shows a small quantitative summary.

Here we compute simple rank correlations for each city-driver
pair. The exact plotting backend does not require this step,
but it helps connect visual trends to a numeric intuition.

.. GENERATED FROM PYTHON SOURCE LINES 249-269

.. code-block:: Python


    summary_rows: list[dict[str, float | str]] = []

    for city, sub in df.groupby("city", sort=True):
        for d in drivers:
            rho = sub[d].corr(
                sub["subsidence_cum"],
                method="spearman",
            )
            summary_rows.append(
                {
                    "city": city,
                    "driver": d,
                    "spearman_rho": float(rho),
                }
            )

    summary = pd.DataFrame(summary_rows)
    print(summary.to_string(index=False))


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

         city           driver  spearman_rho
       Nansha            z_gwl        0.8303
       Nansha         rainfall       -0.1449
       Nansha building_density        0.3409
    Zhongshan            z_gwl        0.7845
    Zhongshan         rainfall       -0.1903
    Zhongshan building_density        0.4021


.. GENERATED FROM PYTHON SOURCE LINES 270-292

Step 6 - Learn how to read one panel
------------------------------------
Each panel should be read in two layers.

First layer: density
~~~~~~~~~~~~~~~~~~~~
The hexagons show where observations are concentrated. Darker
or denser zones tell you where the bulk of the dataset lives.
This prevents a few isolated points from dominating your
interpretation.

Second layer: robust trend
~~~~~~~~~~~~~~~~~~~~~~~~~~
The black line is a robust trend built from binned medians.
This is useful because it captures the central directional
relationship without assuming a strict linear form.

Together, these two layers answer:

- where are most samples?
- what is the central trend?
- is the relationship straight, curved, weak, or strong?

.. GENERATED FROM PYTHON SOURCE LINES 294-320

Step 7 - Compare the drivers
----------------------------
Now let us interpret the synthetic example.

z_gwl
~~~~~
This is the strongest driver in the lesson. Its panels should
show the clearest positive rise in cumulative subsidence.

rainfall
~~~~~~~~
This relationship is weaker and slightly negative. In a real
hydrogeological setting, stronger recharge can partially reduce
the conditions associated with subsidence, though the exact
interpretation depends on data conventions and lag structure.

building_density
~~~~~~~~~~~~~~~~
This is a slower structural driver. Its response tends to be
smoother and more monotonic in the synthetic example.

Why city-to-city comparison matters
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Because the figure is arranged by city, the user can quickly
ask whether the same driver behaves similarly in both places.
That is one of the main strengths of this figure.

.. GENERATED FROM PYTHON SOURCE LINES 322-343

Step 8 - Why sharey matters
---------------------------
The real script supports:

- ``sharey="none"``
- ``sharey="row"``
- ``sharey="all"``

In this lesson we use ``row``.

That is often a good default because:

- all drivers for one city become directly comparable,
- while still allowing another city to keep its own response
  scale if needed.

``sharey="all"`` is stricter and makes cross-city comparison
more direct.

``sharey="none"`` gives each panel maximum local readability,
but weaker visual comparability.

.. GENERATED FROM PYTHON SOURCE LINES 345-360

Step 9 - Practical takeaway
---------------------------
This figure is best used as a sanity-check and interpretation
tool.

It helps you verify that the learned or observed response
behaves plausibly with respect to key drivers before you move
on to more elaborate spatial or physics interpretation.

In practice, it is especially good for:

- spotting obviously implausible relationships,
- comparing driver influence across cities,
- and communicating directional behaviour in a compact,
  visually intuitive way.

.. GENERATED FROM PYTHON SOURCE LINES 362-422

Command-line version
--------------------
The same figure can be produced from the CLI.

The real script supports:

- ``--src`` as a required CSV file or directory,
- ``--file`` when ``--src`` is a directory with one combined CSV,
- ``--ns-file`` and ``--zh-file`` for per-city files,
- city flags through the shared city selector,
- ``--year`` filtering,
- optional ``--sample-frac`` and ``--sample-n``,
- ``--drivers`` as a comma list,
- ``--response`` (alias ``--col``),
- ``--subs-kind`` to choose ``subsidence_cum`` or
  ``subsidence`` when ``--response`` is omitted,
- ``--sharey`` with ``none``, ``row``, or ``all``,
- and the plotting controls for trend, gridsize, and labels.

Legacy dispatcher:

.. code-block:: bash

   python -m scripts plot-driver-response \
     --src results/combined_driver_table.csv \
     --drivers z_gwl,rainfall,building_density \
     --response subsidence_cum \
     --sharey row \
     --gridsize 60 \
     --trend true \
     --trend-bins 30 \
     --trend-min-n 20 \
     --show-title true \
     --show-panel-titles true \
     --out figS2_driver_response

Directory-based use:

.. code-block:: bash

   python -m scripts plot-driver-response \
     --src results/driver_tables \
     --ns-file nansha_final_main_std.harmonized.csv \
     --zh-file zhongshan_final_main_std.harmonized.csv \
     --drivers z_gwl,rainfall,building_density \
     --subs-kind cum \
     --sharey all \
     --out figS2_driver_response

Modern CLI:

.. code-block:: bash

   geoprior plot driver-response \
     --src results/combined_driver_table.csv \
     --drivers z_gwl,rainfall,building_density \
     --response subsidence_cum \
     --out figS2_driver_response

The gallery page teaches the figure.
The command line reproduces it in a workflow.


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 3.355 seconds)


.. _sphx_glr_download_auto_examples_figure_generation_plot_driver_response.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_driver_response.ipynb <plot_driver_response.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_driver_response.py <plot_driver_response.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: plot_driver_response.zip <plot_driver_response.zip>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_