.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/tables_and_summaries/summarize_hotspots.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_examples_tables_and_summaries_summarize_hotspots.py>`
        to download the full example code.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_tables_and_summaries_summarize_hotspots.py:


Summarize hotspot point clouds into tidy group tables
=====================================================

This example teaches you how to use GeoPrior's
``summarize-hotspots`` utility.

Unlike the plotting scripts, this command is a table builder.
It starts from a hotspot *point cloud* CSV and converts it into
a tidy summary grouped by city, year, and hotspot kind.

Why this matters
----------------
A hotspot map or point cloud is useful for visual inspection,
but it is not yet the compact artifact that downstream analysis
usually needs.

This builder converts hotspot points into a grouped table with:

- hotspot counts,
- min/mean/max hotspot subsidence values,
- min/mean/max hotspot anomaly values,
- optional baseline summaries,
- optional threshold summaries.

That is why this lesson lives in the ``tables_and_summaries``
section.

.. GENERATED FROM PYTHON SOURCE LINES 31-36

Imports
-------
We call the real production entrypoint from the project code.
Then we read the generated CSV back in and build one compact
teaching preview.

.. GENERATED FROM PYTHON SOURCE LINES 36-51

.. code-block:: Python


    from __future__ import annotations

    import tempfile
    from pathlib import Path

    import matplotlib.pyplot as plt
    import numpy as np
    import pandas as pd

    from geoprior.scripts.summarize_hotspots import (
        summarize_hotspots,
        summarize_hotspots_main,
    )


.. GENERATED FROM PYTHON SOURCE LINES 52-71

Build a compact synthetic hotspot point cloud
---------------------------------------------
The real script expects a hotspot CSV produced by the spatial
forecast workflow, with fields such as:

- ``city``
- ``year``
- ``kind``
- ``value``
- ``metric_value``

and optionally:

- ``baseline_value``
- ``threshold``

For the lesson, we create two cities, three years, and two
hotspot kinds. That is enough to show how the builder groups the
point cloud into one summary row per (city, year, kind).

.. GENERATED FROM PYTHON SOURCE LINES 71-131

.. code-block:: Python


    rng = np.random.default_rng(11)

    rows: list[dict[str, object]] = []

    for city in ["Nansha", "Zhongshan"]:
        city_shift = 0.0 if city == "Nansha" else 6.0

        for year in [2025, 2027, 2030]:
            year_shift = {2025: 0.0, 2027: 3.5, 2030: 8.0}[year]

            for kind in ["q50", "q90"]:
                kind_shift = 0.0 if kind == "q50" else 5.0

                n_hot = 12 if kind == "q50" else 8
                if city == "Zhongshan":
                    n_hot += 3

                baseline_mean = 36.0 + city_shift
                threshold = 11.0 + 0.7 * year_shift + 0.4 * kind_shift

                for _ in range(n_hot):
                    coord_x = 100 + rng.normal(0.0, 18.0)
                    coord_y = 200 + rng.normal(0.0, 14.0)

                    metric_value = max(
                        0.1,
                        threshold
                        + rng.normal(2.8 + 0.2 * year_shift, 1.3),
                    )

                    baseline_value = max(
                        0.1,
                        baseline_mean + rng.normal(0.0, 2.2),
                    )

                    value = baseline_value + metric_value

                    rows.append(
                        {
                            "city": city,
                            "panel": "future_hotspots",
                            "kind": kind,
                            "year": year,
                            "coord_x": float(coord_x),
                            "coord_y": float(coord_y),
                            "value": float(value),
                            "hotspot_mode": "delta_abs",
                            "hotspot_quantile": 0.90,
                            "metric_value": float(metric_value),
                            "baseline_value": float(baseline_value),
                            "threshold": float(threshold),
                        }
                    )

    hotspots_df = pd.DataFrame(rows)

    print("Hotspot point-cloud preview")
    print(hotspots_df.head(10).to_string(index=False))


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Hotspot point-cloud preview
      city           panel kind  year  coord_x  coord_y   value hotspot_mode  hotspot_quantile  metric_value  baseline_value  threshold
    Nansha future_hotspots  q50  2025 100.6155 219.0365 50.2695    delta_abs            0.9000       15.3921         34.8773    11.0000
    Nansha future_hotspots  q50  2025  94.6365 192.6166 50.4173    delta_abs            0.9000       14.5406         35.8767    11.0000
    Nansha future_hotspots  q50  2025 113.4439 174.1375 51.6244    delta_abs            0.9000       15.8365         35.7878    11.0000
    Nansha future_hotspots  q50  2025 112.2468 198.0881 50.3260    delta_abs            0.9000       13.3072         37.0188    11.0000
    Nansha future_hotspots  q50  2025 114.8412 197.1646 51.1099    delta_abs            0.9000       13.6014         37.5085    11.0000
    Nansha future_hotspots  q50  2025  84.3339 178.7986 48.8382    delta_abs            0.9000       14.3135         34.5248    11.0000
    Nansha future_hotspots  q50  2025  65.4339 188.6032 46.5671    delta_abs            0.9000       13.1921         33.3750    11.0000
    Nansha future_hotspots  q50  2025  73.1357 200.5129 50.4535    delta_abs            0.9000       14.9664         35.4871    11.0000
    Nansha future_hotspots  q50  2025  86.6153 205.3899 50.0724    delta_abs            0.9000       14.7324         35.3400    11.0000
    Nansha future_hotspots  q50  2025 109.8040 214.6003 47.7412    delta_abs            0.9000       13.5310         34.2103    11.0000
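
Before handing a CSV to the builder, it can help to confirm that the
fields listed above are present. The sketch below is illustrative
only: ``check_hotspot_columns`` is a hypothetical helper, not part of
GeoPrior, and the required/optional split simply mirrors the two
lists in this section.

.. code-block:: Python

    from __future__ import annotations

    import pandas as pd

    # Column names taken from the lists above; treating the first
    # group as required is an assumption of this sketch.
    REQUIRED = ("city", "year", "kind", "value", "metric_value")
    OPTIONAL = ("baseline_value", "threshold")


    def check_hotspot_columns(df: pd.DataFrame) -> dict[str, list[str]]:
        """Report missing required columns and present optional ones."""
        missing = [c for c in REQUIRED if c not in df.columns]
        optional_present = [c for c in OPTIONAL if c in df.columns]
        return {"missing": missing, "optional_present": optional_present}


    # One row copied from the preview, without ``baseline_value``.
    demo = pd.DataFrame(
        {
            "city": ["Nansha"],
            "year": [2025],
            "kind": ["q50"],
            "value": [50.2695],
            "metric_value": [15.3921],
            "threshold": [11.0],
        }
    )
    report = check_hotspot_columns(demo)
    print(report)

An empty ``missing`` list means the frame carries every field the
lists above name as expected.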


.. GENERATED FROM PYTHON SOURCE LINES 132-136

Write the synthetic hotspot CSV
-------------------------------
The production command consumes a point-cloud CSV, so we follow
the same workflow here.

.. GENERATED FROM PYTHON SOURCE LINES 136-147

.. code-block:: Python


    tmp_dir = Path(
        tempfile.mkdtemp(prefix="gp_sg_hotspots_summary_")
    )
    hotspot_csv = tmp_dir / "fig6_hotspot_points.synthetic.csv"

    hotspots_df.to_csv(hotspot_csv, index=False)

    print("")
    print(f"Input hotspot CSV written to: {hotspot_csv}")


.. rst-class:: sphx-glr-script-out

 .. code-block:: none


    Input hotspot CSV written to: /tmp/gp_sg_hotspots_summary_ocgu5__9/fig6_hotspot_points.synthetic.csv
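
``tempfile.mkdtemp`` leaves the directory in place after the script
ends, which is convenient for inspecting the files. When cleanup
matters, for example in tests, a context manager is one alternative.
The sketch below is a generic standard-library pattern, not part of
GeoPrior; the file name and the tiny frame are placeholders.

.. code-block:: Python

    import tempfile
    from pathlib import Path

    import pandas as pd

    points = pd.DataFrame(
        {"city": ["Nansha"], "year": [2025], "kind": ["q50"]}
    )

    # The directory and everything in it are removed on exit.
    with tempfile.TemporaryDirectory(prefix="hotspots_demo_") as tmp:
        csv_path = Path(tmp) / "hotspot_points.demo.csv"
        points.to_csv(csv_path, index=False)
        existed_inside = csv_path.exists()
        round_trip = pd.read_csv(csv_path)

    # After the block, the file is gone but the frame is still in memory.
    exists_after = csv_path.exists()
    print(existed_inside, exists_after, len(round_trip))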


.. GENERATED FROM PYTHON SOURCE LINES 148-151

Run the real summarizer
-----------------------
We ask the production command to build the grouped summary CSV.
With ``--quiet false``, the command also prints the summary table
and the output path.

.. GENERATED FROM PYTHON SOURCE LINES 151-166

.. code-block:: Python


    out_csv = tmp_dir / "fig6_hotspot_summary.csv"

    summarize_hotspots_main(
        [
            "--hotspot-csv",
            str(hotspot_csv),
            "--out",
            str(out_csv),
            "--quiet",
            "false",
        ],
        prog="summarize-hotspots",
    )


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

         city  year kind  n_hotspots  value_min  value_mean  value_max  metric_min  metric_mean  metric_max  baseline_min  baseline_max  baseline_mean  threshold_min  threshold_max
       Nansha  2025  q50     12.0000    46.5671     49.4705    51.6244     11.5458      14.1823     15.8365       33.1739       37.5085        35.2882        11.0000        11.0000
       Nansha  2025  q90      8.0000    50.1120     52.9304    57.3783     13.5020      16.3759     17.6013       33.0696       39.9406        36.5545        13.0000        13.0000
       Nansha  2027  q50     12.0000    49.1654     52.7279    55.9298     15.3097      17.1285     19.5806       32.0881       38.6542        35.5994        13.4500        13.4500
       Nansha  2027  q90      8.0000    48.9804     55.0306    58.4324     16.1703      19.0887     21.7183       32.8101       40.4233        35.9420        15.4500        15.4500
       Nansha  2030  q50     12.0000    53.7489     56.8669    59.5841     19.0182      20.3903     22.2338       34.1392       38.1938        36.4766        16.6000        16.6000
       Nansha  2030  q90      8.0000    54.0185     57.7708    63.7386     21.5822      22.7645     24.0733       30.9021       39.6653        35.0063        18.6000        18.6000
    Zhongshan  2025  q50     15.0000    53.3934     57.1740    61.2156     11.7490      14.0620     15.8060       39.2505       46.4539        43.1120        11.0000        11.0000
    Zhongshan  2025  q90     11.0000    53.8707     59.0115    66.2886     12.1832      15.5184     16.4947       40.2590       49.9718        43.4931        13.0000        13.0000
    Zhongshan  2027  q50     15.0000    55.8386     58.9779    61.2734     13.3199      17.2894     19.2169       36.8076       44.2371        41.6885        13.4500        13.4500
    Zhongshan  2027  q90     11.0000    57.1924     60.3965    65.2812     17.0242      19.0553     20.8428       38.2264       46.0037        41.3412        15.4500        15.4500
    Zhongshan  2030  q50     15.0000    57.8785     62.6666    67.5113     18.5462      21.1885     23.1967       37.4222       45.5846        41.4780        16.6000        16.6000
    Zhongshan  2030  q90     11.0000    61.8799     65.4382    69.0123     21.8191      23.3426     24.6560       39.0328       44.3724        42.0956        18.6000        18.6000

    [OK] summary -> /tmp/gp_sg_hotspots_summary_ocgu5__9/fig6_hotspot_summary.csv


.. GENERATED FROM PYTHON SOURCE LINES 167-170

Read the generated summary table
--------------------------------
The output has one row per (city, year, kind) group. With two
cities, three years, and two kinds, that gives 12 rows.

.. GENERATED FROM PYTHON SOURCE LINES 170-181

.. code-block:: Python


    summary = pd.read_csv(out_csv)

    print("")
    print("Written file")
    print(" -", out_csv.name)

    print("")
    print("Grouped summary table")
    print(summary.to_string(index=False))


.. rst-class:: sphx-glr-script-out

 .. code-block:: none


    Written file
     - fig6_hotspot_summary.csv

    Grouped summary table
         city  year kind  n_hotspots  value_min  value_mean  value_max  metric_min  metric_mean  metric_max  baseline_min  baseline_max  baseline_mean  threshold_min  threshold_max
       Nansha  2025  q50     12.0000    46.5671     49.4705    51.6244     11.5458      14.1823     15.8365       33.1739       37.5085        35.2882        11.0000        11.0000
       Nansha  2025  q90      8.0000    50.1120     52.9304    57.3783     13.5020      16.3759     17.6013       33.0696       39.9406        36.5545        13.0000        13.0000
       Nansha  2027  q50     12.0000    49.1654     52.7279    55.9298     15.3097      17.1285     19.5806       32.0881       38.6542        35.5994        13.4500        13.4500
       Nansha  2027  q90      8.0000    48.9804     55.0306    58.4324     16.1703      19.0887     21.7183       32.8101       40.4233        35.9420        15.4500        15.4500
       Nansha  2030  q50     12.0000    53.7489     56.8669    59.5841     19.0182      20.3903     22.2338       34.1392       38.1938        36.4766        16.6000        16.6000
       Nansha  2030  q90      8.0000    54.0185     57.7708    63.7386     21.5822      22.7645     24.0733       30.9021       39.6653        35.0063        18.6000        18.6000
    Zhongshan  2025  q50     15.0000    53.3934     57.1740    61.2156     11.7490      14.0620     15.8060       39.2505       46.4539        43.1120        11.0000        11.0000
    Zhongshan  2025  q90     11.0000    53.8707     59.0115    66.2886     12.1832      15.5184     16.4947       40.2590       49.9718        43.4931        13.0000        13.0000
    Zhongshan  2027  q50     15.0000    55.8386     58.9779    61.2734     13.3199      17.2894     19.2169       36.8076       44.2371        41.6885        13.4500        13.4500
    Zhongshan  2027  q90     11.0000    57.1924     60.3965    65.2812     17.0242      19.0553     20.8428       38.2264       46.0037        41.3412        15.4500        15.4500
    Zhongshan  2030  q50     15.0000    57.8785     62.6666    67.5113     18.5462      21.1885     23.1967       37.4222       45.5846        41.4780        16.6000        16.6000
    Zhongshan  2030  q90     11.0000    61.8799     65.4382    69.0123     21.8191      23.3426     24.6560       39.0328       44.3724        42.0956        18.6000        18.6000


.. GENERATED FROM PYTHON SOURCE LINES 182-186

Compare with the direct in-memory API
-------------------------------------
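The script also exposes the core grouping function directly.
This is useful for tests and notebook workflows. The check below
uses ``DataFrame.equals``, which is strict: it requires identical
dtypes and exactly equal values, so it can print ``False`` for
tables that differ only by a CSV round trip.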

.. GENERATED FROM PYTHON SOURCE LINES 186-193

.. code-block:: Python


    summary_api = summarize_hotspots(hotspots_df)

    print("")
    print("Direct API result matches CSV output:")
    print(summary_api.equals(summary))


.. rst-class:: sphx-glr-script-out

 .. code-block:: none


    Direct API result matches CSV output:
    False
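
Whether a ``False`` from ``DataFrame.equals`` reflects a real
disagreement can be checked with a tolerant comparison. The sketch
below uses a small stand-in frame, not the real summary;
``frames_close`` is a hypothetical helper and the tolerance is an
arbitrary choice. The dtype change is simulated on purpose so that
the strict check fails.

.. code-block:: Python

    import io

    import pandas as pd


    def frames_close(left, right, rtol=1e-9):
        """Return True when two frames agree up to dtype and tolerance."""
        try:
            pd.testing.assert_frame_equal(
                left,
                right,
                check_dtype=False,
                check_exact=False,
                rtol=rtol,
            )
        except AssertionError:
            return False
        return True


    in_memory = pd.DataFrame(
        {
            "city": ["Nansha", "Nansha"],
            "n_hotspots": [12, 8],
            "metric_mean": [14.1823, 16.3759],
        }
    )

    # Simulate a file round trip in which the count column comes
    # back as float (it is written as 12.0 and 8.0).
    buffer = io.StringIO()
    in_memory.astype({"n_hotspots": "float64"}).to_csv(
        buffer, index=False
    )
    buffer.seek(0)
    from_csv = pd.read_csv(buffer)

    strict_match = in_memory.equals(from_csv)
    tolerant_match = frames_close(in_memory, from_csv)
    print(strict_match, tolerant_match)

The same helper can be applied to ``summary_api`` and ``summary`` to
see whether they agree up to dtype and rounding.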


.. GENERATED FROM PYTHON SOURCE LINES 194-204

Build one compact visual preview
--------------------------------
This preview is not part of the production builder itself.
It is a teaching aid for the gallery page.

Left:
  hotspot counts by city-year-kind.

Right:
  ranked anomaly means.

.. GENERATED FROM PYTHON SOURCE LINES 204-241

.. code-block:: Python


    summary["group"] = (
        summary["city"].astype(str)
        + " | "
        + summary["year"].astype(str)
        + " | "
        + summary["kind"].astype(str)
    )

    ranked = summary.sort_values(
        ["metric_mean", "n_hotspots"],
        ascending=[False, False],
    ).reset_index(drop=True)

    fig, axes = plt.subplots(
        1,
        2,
        figsize=(10.0, 4.2),
        constrained_layout=True,
    )

    # Hotspot counts
    ax = axes[0]
    ax.bar(summary["group"], summary["n_hotspots"])
    ax.set_title("Hotspot counts by group")
    ax.set_xlabel("City | Year | Kind")
    ax.set_ylabel("n_hotspots")
    ax.tick_params(axis="x", rotation=75)

    # Ranked anomaly means
    ax = axes[1]
    ax.bar(ranked["group"], ranked["metric_mean"])
    ax.set_title("Ranked anomaly means")
    ax.set_xlabel("City | Year | Kind")
    ax.set_ylabel("metric_mean [mm/yr]")
    ax.tick_params(axis="x", rotation=75)


.. image-sg:: /auto_examples/tables_and_summaries/images/sphx_glr_summarize_hotspots_001.png
   :alt: Hotspot counts by group, Ranked anomaly means
   :srcset: /auto_examples/tables_and_summaries/images/sphx_glr_summarize_hotspots_001.png
   :class: sphx-glr-single-img


.. GENERATED FROM PYTHON SOURCE LINES 242-264

Learn how to read the summary table
-----------------------------------
Each row corresponds to one (city, year, kind) combination.

The core reading order is:

1. read ``n_hotspots`` to see how large the hotspot cloud is;
2. read ``value_*`` to understand hotspot subsidence levels;
3. read ``metric_*`` to understand hotspot anomaly intensity;
4. use ``baseline_*`` and ``threshold_*`` when the source CSV
   provides those optional columns.

In other words:

- the point-cloud CSV is the geometric/point-level artifact,
- the summary CSV is the compact grouped artifact.
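
The relationship between the two artifacts can be sketched with
plain pandas. This is not GeoPrior's implementation, only a minimal
illustration of the same grouping idea on made-up points; the output
column names are chosen to mirror the summary table shown earlier.

.. code-block:: Python

    import pandas as pd

    # Five made-up hotspot points in two groups.
    points = pd.DataFrame(
        {
            "city": ["Nansha"] * 3 + ["Zhongshan"] * 2,
            "year": [2025] * 5,
            "kind": ["q50"] * 5,
            "value": [50.0, 52.0, 54.0, 57.0, 59.0],
            "metric_value": [14.0, 15.0, 16.0, 13.0, 15.0],
        }
    )

    # One output row per (city, year, kind) group.
    summary_sketch = points.groupby(
        ["city", "year", "kind"], as_index=False
    ).agg(
        n_hotspots=("value", "count"),  # non-null points per group
        value_min=("value", "min"),
        value_mean=("value", "mean"),
        value_max=("value", "max"),
        metric_min=("metric_value", "min"),
        metric_mean=("metric_value", "mean"),
        metric_max=("metric_value", "max"),
    )

    print(summary_sketch.to_string(index=False))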

.. GENERATED FROM PYTHON SOURCE LINES 266-284

What the key columns mean
-------------------------
``value_*``
    summary of the hotspot subsidence values themselves.

``metric_*``
    summary of the anomaly or delta measure used to define or
    score hotspot points.

``baseline_*``
    optional summary of the baseline values attached to the
    hotspot points.

``threshold_*``
    optional summary of the group threshold values.

The threshold is often constant within a group, but the script
still keeps both min and max for robustness. In this synthetic
example the two are equal in every group.
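
Keeping both columns makes it easy to spot groups whose points carry
more than one threshold. A minimal sketch, using two rows copied from
the summary table shown earlier plus one made-up ``Demo`` row with a
non-constant threshold:

.. code-block:: Python

    import numpy as np
    import pandas as pd

    summary_demo = pd.DataFrame(
        {
            "city": ["Nansha", "Nansha", "Demo"],
            "year": [2025, 2025, 2025],
            "kind": ["q50", "q90", "q90"],
            "threshold_min": [11.0, 13.0, 12.0],
            "threshold_max": [11.0, 13.0, 12.5],  # last row is made up
        }
    )

    # A group has a constant threshold when min and max coincide.
    constant = np.isclose(
        summary_demo["threshold_min"], summary_demo["threshold_max"]
    )
    mixed = summary_demo.loc[~constant, ["city", "year", "kind"]]

    print(mixed.to_string(index=False))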

.. GENERATED FROM PYTHON SOURCE LINES 286-305

Why this builder is useful in practice
--------------------------------------
This builder is a bridge between hotspot maps and later
reporting.

A useful workflow is:

1. generate hotspot points from a spatial forecast workflow,
2. summarize them by city, year, and kind,
3. compare counts and anomaly magnitudes across groups,
4. only then build narrative tables or paper figures.

This keeps:

- hotspot extraction,
- group-level tabulation,
- and final visualization

clearly separated.
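
Step 3 of the workflow above can be sketched with a pivot. The
values are the ``q50`` ``metric_mean`` entries from the summary table
shown earlier; the pivot layout and the ``change_2025_2030`` column
are just one possible way to compare groups, not something the
builder produces.

.. code-block:: Python

    import pandas as pd

    q50 = pd.DataFrame(
        {
            "city": ["Nansha"] * 3 + ["Zhongshan"] * 3,
            "year": [2025, 2027, 2030] * 2,
            "metric_mean": [
                14.1823, 17.1285, 20.3903,
                14.0620, 17.2894, 21.1885,
            ],
        }
    )

    # Cities as rows, years as columns.
    wide = q50.pivot(index="city", columns="year", values="metric_mean")

    # Change in the mean anomaly between the first and last year.
    wide["change_2025_2030"] = wide[2030] - wide[2025]

    print(wide.round(4).to_string())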

.. GENERATED FROM PYTHON SOURCE LINES 307-319

Why this page belongs after compute_hotspots.py
-----------------------------------------------
The previous lesson built a compact city-year hotspot table from
forecast CSVs directly.

This lesson starts later in the chain:

- it assumes hotspot *points* already exist,
- and it compresses those points into grouped summary rows.

So the two scripts are related, but they summarize different
intermediate artifacts.

.. GENERATED FROM PYTHON SOURCE LINES 321-351

Command-line version
--------------------
The same lesson can be reproduced from the CLI.

Legacy dispatcher:

.. code-block:: bash

   python -m scripts summarize-hotspots \
     --hotspot-csv results/figs/fig6_hotspot_points.csv \
     --out fig6_hotspot_summary.csv

Quiet mode:

.. code-block:: bash

   python -m scripts summarize-hotspots \
     --hotspot-csv results/figs/fig6_hotspot_points.csv \
     --out fig6_hotspot_summary.csv \
     --quiet true

Modern CLI:

.. code-block:: bash

   geoprior build hotspots-summary \
     --hotspot-csv results/figs/fig6_hotspot_points.csv \
     --out fig6_hotspot_summary.csv

The gallery page explains the builder; the command line runs the
same step inside a workflow.


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 0.311 seconds)


.. _sphx_glr_download_auto_examples_tables_and_summaries_summarize_hotspots.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: summarize_hotspots.ipynb <summarize_hotspots.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: summarize_hotspots.py <summarize_hotspots.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: summarize_hotspots.zip <summarize_hotspots.zip>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_