.. DO NOT EDIT. .. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY. .. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE: .. "auto_examples/tables_and_summaries/summarize_hotspots.py" .. LINE NUMBERS ARE GIVEN BELOW. .. only:: html .. note:: :class: sphx-glr-download-link-note :ref:`Go to the end ` to download the full example code. .. rst-class:: sphx-glr-example-title .. _sphx_glr_auto_examples_tables_and_summaries_summarize_hotspots.py: Summarize hotspot point clouds into tidy group tables ===================================================== This example teaches you how to use GeoPrior's ``summarize-hotspots`` utility. Unlike the plotting scripts, this command is a table builder. It starts from a hotspot *point cloud* CSV and converts it into a tidy summary grouped by city, year, and hotspot kind. Why this matters ---------------- A hotspot map or point cloud is useful for visual inspection, but it is not yet the compact artifact that downstream analysis usually needs. This builder converts hotspot points into a grouped table with: - hotspot counts, - min/mean/max hotspot subsidence values, - min/mean/max hotspot anomaly values, - optional baseline summaries, - optional threshold summaries. That makes it a strong lesson page for the ``tables_and_summaries`` section. .. GENERATED FROM PYTHON SOURCE LINES 31-36 Imports ------- We call the real production entrypoint from the project code. Then we read the generated CSV back in and build one compact teaching preview. .. GENERATED FROM PYTHON SOURCE LINES 36-51 .. code-block:: Python from __future__ import annotations import tempfile from pathlib import Path import matplotlib.pyplot as plt import numpy as np import pandas as pd from geoprior.scripts.summarize_hotspots import ( summarize_hotspots, summarize_hotspots_main, ) .. GENERATED FROM PYTHON SOURCE LINES 52-71 Build a compact synthetic hotspot point cloud --------------------------------------------- The real script expects a hotspot CSV produced by the spatial forecast workflow, with fields such as: - city - year - kind - value - metric_value and optionally: - baseline_value - threshold For the lesson, we create two cities, three years, and two hotspot kinds. That is enough to show how the builder groups the point cloud into one summary row per (city, year, kind). .. GENERATED FROM PYTHON SOURCE LINES 71-131 .. code-block:: Python rng = np.random.default_rng(11) rows: list[dict[str, object]] = [] for city in ["Nansha", "Zhongshan"]: city_shift = 0.0 if city == "Nansha" else 6.0 for year in [2025, 2027, 2030]: year_shift = {2025: 0.0, 2027: 3.5, 2030: 8.0}[year] for kind in ["q50", "q90"]: kind_shift = 0.0 if kind == "q50" else 5.0 n_hot = 12 if kind == "q50" else 8 if city == "Zhongshan": n_hot += 3 baseline_mean = 36.0 + city_shift threshold = 11.0 + 0.7 * year_shift + 0.4 * kind_shift for i in range(n_hot): coord_x = 100 + rng.normal(0.0, 18.0) coord_y = 200 + rng.normal(0.0, 14.0) metric_value = max( 0.1, threshold + rng.normal(2.8 + 0.2 * year_shift, 1.3), ) baseline_value = max( 0.1, baseline_mean + rng.normal(0.0, 2.2), ) value = baseline_value + metric_value rows.append( { "city": city, "panel": "future_hotspots", "kind": kind, "year": year, "coord_x": float(coord_x), "coord_y": float(coord_y), "value": float(value), "hotspot_mode": "delta_abs", "hotspot_quantile": 0.90, "metric_value": float(metric_value), "baseline_value": float(baseline_value), "threshold": float(threshold), } ) hotspots_df = pd.DataFrame(rows) print("Hotspot point-cloud preview") print(hotspots_df.head(10).to_string(index=False)) .. rst-class:: sphx-glr-script-out .. code-block:: none Hotspot point-cloud preview city panel kind year coord_x coord_y value hotspot_mode hotspot_quantile metric_value baseline_value threshold Nansha future_hotspots q50 2025 100.6155 219.0365 50.2695 delta_abs 0.9000 15.3921 34.8773 11.0000 Nansha future_hotspots q50 2025 94.6365 192.6166 50.4173 delta_abs 0.9000 14.5406 35.8767 11.0000 Nansha future_hotspots q50 2025 113.4439 174.1375 51.6244 delta_abs 0.9000 15.8365 35.7878 11.0000 Nansha future_hotspots q50 2025 112.2468 198.0881 50.3260 delta_abs 0.9000 13.3072 37.0188 11.0000 Nansha future_hotspots q50 2025 114.8412 197.1646 51.1099 delta_abs 0.9000 13.6014 37.5085 11.0000 Nansha future_hotspots q50 2025 84.3339 178.7986 48.8382 delta_abs 0.9000 14.3135 34.5248 11.0000 Nansha future_hotspots q50 2025 65.4339 188.6032 46.5671 delta_abs 0.9000 13.1921 33.3750 11.0000 Nansha future_hotspots q50 2025 73.1357 200.5129 50.4535 delta_abs 0.9000 14.9664 35.4871 11.0000 Nansha future_hotspots q50 2025 86.6153 205.3899 50.0724 delta_abs 0.9000 14.7324 35.3400 11.0000 Nansha future_hotspots q50 2025 109.8040 214.6003 47.7412 delta_abs 0.9000 13.5310 34.2103 11.0000 .. GENERATED FROM PYTHON SOURCE LINES 132-136 Write the synthetic hotspot CSV ------------------------------- The production command consumes a point-cloud CSV, so we follow the same workflow here. .. GENERATED FROM PYTHON SOURCE LINES 136-147 .. code-block:: Python tmp_dir = Path( tempfile.mkdtemp(prefix="gp_sg_hotspots_summary_") ) hotspot_csv = tmp_dir / "fig6_hotspot_points.synthetic.csv" hotspots_df.to_csv(hotspot_csv, index=False) print("") print(f"Input hotspot CSV written to: {hotspot_csv}") .. rst-class:: sphx-glr-script-out .. code-block:: none Input hotspot CSV written to: /tmp/gp_sg_hotspots_summary_ocgu5__9/fig6_hotspot_points.synthetic.csv .. GENERATED FROM PYTHON SOURCE LINES 148-151 Run the real summarizer ----------------------- We ask the production command to build the grouped summary CSV. .. GENERATED FROM PYTHON SOURCE LINES 151-166 .. code-block:: Python out_csv = tmp_dir / "fig6_hotspot_summary.csv" summarize_hotspots_main( [ "--hotspot-csv", str(hotspot_csv), "--out", str(out_csv), "--quiet", "false", ], prog="summarize-hotspots", ) .. rst-class:: sphx-glr-script-out .. code-block:: none city year kind n_hotspots value_min value_mean value_max metric_min metric_mean metric_max baseline_min baseline_max baseline_mean threshold_min threshold_max Nansha 2025 q50 12.0000 46.5671 49.4705 51.6244 11.5458 14.1823 15.8365 33.1739 37.5085 35.2882 11.0000 11.0000 Nansha 2025 q90 8.0000 50.1120 52.9304 57.3783 13.5020 16.3759 17.6013 33.0696 39.9406 36.5545 13.0000 13.0000 Nansha 2027 q50 12.0000 49.1654 52.7279 55.9298 15.3097 17.1285 19.5806 32.0881 38.6542 35.5994 13.4500 13.4500 Nansha 2027 q90 8.0000 48.9804 55.0306 58.4324 16.1703 19.0887 21.7183 32.8101 40.4233 35.9420 15.4500 15.4500 Nansha 2030 q50 12.0000 53.7489 56.8669 59.5841 19.0182 20.3903 22.2338 34.1392 38.1938 36.4766 16.6000 16.6000 Nansha 2030 q90 8.0000 54.0185 57.7708 63.7386 21.5822 22.7645 24.0733 30.9021 39.6653 35.0063 18.6000 18.6000 Zhongshan 2025 q50 15.0000 53.3934 57.1740 61.2156 11.7490 14.0620 15.8060 39.2505 46.4539 43.1120 11.0000 11.0000 Zhongshan 2025 q90 11.0000 53.8707 59.0115 66.2886 12.1832 15.5184 16.4947 40.2590 49.9718 43.4931 13.0000 13.0000 Zhongshan 2027 q50 15.0000 55.8386 58.9779 61.2734 13.3199 17.2894 19.2169 36.8076 44.2371 41.6885 13.4500 13.4500 Zhongshan 2027 q90 11.0000 57.1924 60.3965 65.2812 17.0242 19.0553 20.8428 38.2264 46.0037 41.3412 15.4500 15.4500 Zhongshan 2030 q50 15.0000 57.8785 62.6666 67.5113 18.5462 21.1885 23.1967 37.4222 45.5846 41.4780 16.6000 16.6000 Zhongshan 2030 q90 11.0000 61.8799 65.4382 69.0123 21.8191 23.3426 24.6560 39.0328 44.3724 42.0956 18.6000 18.6000 [OK] summary -> /tmp/gp_sg_hotspots_summary_ocgu5__9/fig6_hotspot_summary.csv .. GENERATED FROM PYTHON SOURCE LINES 167-170 Read the generated summary table -------------------------------- The output has one row per (city, year, kind) group. .. GENERATED FROM PYTHON SOURCE LINES 170-181 .. code-block:: Python summary = pd.read_csv(out_csv) print("") print("Written file") print(" -", out_csv.name) print("") print("Grouped summary table") print(summary.to_string(index=False)) .. rst-class:: sphx-glr-script-out .. code-block:: none Written file - fig6_hotspot_summary.csv Grouped summary table city year kind n_hotspots value_min value_mean value_max metric_min metric_mean metric_max baseline_min baseline_max baseline_mean threshold_min threshold_max Nansha 2025 q50 12.0000 46.5671 49.4705 51.6244 11.5458 14.1823 15.8365 33.1739 37.5085 35.2882 11.0000 11.0000 Nansha 2025 q90 8.0000 50.1120 52.9304 57.3783 13.5020 16.3759 17.6013 33.0696 39.9406 36.5545 13.0000 13.0000 Nansha 2027 q50 12.0000 49.1654 52.7279 55.9298 15.3097 17.1285 19.5806 32.0881 38.6542 35.5994 13.4500 13.4500 Nansha 2027 q90 8.0000 48.9804 55.0306 58.4324 16.1703 19.0887 21.7183 32.8101 40.4233 35.9420 15.4500 15.4500 Nansha 2030 q50 12.0000 53.7489 56.8669 59.5841 19.0182 20.3903 22.2338 34.1392 38.1938 36.4766 16.6000 16.6000 Nansha 2030 q90 8.0000 54.0185 57.7708 63.7386 21.5822 22.7645 24.0733 30.9021 39.6653 35.0063 18.6000 18.6000 Zhongshan 2025 q50 15.0000 53.3934 57.1740 61.2156 11.7490 14.0620 15.8060 39.2505 46.4539 43.1120 11.0000 11.0000 Zhongshan 2025 q90 11.0000 53.8707 59.0115 66.2886 12.1832 15.5184 16.4947 40.2590 49.9718 43.4931 13.0000 13.0000 Zhongshan 2027 q50 15.0000 55.8386 58.9779 61.2734 13.3199 17.2894 19.2169 36.8076 44.2371 41.6885 13.4500 13.4500 Zhongshan 2027 q90 11.0000 57.1924 60.3965 65.2812 17.0242 19.0553 20.8428 38.2264 46.0037 41.3412 15.4500 15.4500 Zhongshan 2030 q50 15.0000 57.8785 62.6666 67.5113 18.5462 21.1885 23.1967 37.4222 45.5846 41.4780 16.6000 16.6000 Zhongshan 2030 q90 11.0000 61.8799 65.4382 69.0123 21.8191 23.3426 24.6560 39.0328 44.3724 42.0956 18.6000 18.6000 .. GENERATED FROM PYTHON SOURCE LINES 182-186 Compare with the direct in-memory API ------------------------------------- The script also exposes the core grouping function directly. This is useful for tests and notebook workflows. .. GENERATED FROM PYTHON SOURCE LINES 186-193 .. code-block:: Python summary_api = summarize_hotspots(hotspots_df) print("") print("Direct API result matches CSV output:") print(summary_api.equals(summary)) .. rst-class:: sphx-glr-script-out .. code-block:: none Direct API result matches CSV output: False .. GENERATED FROM PYTHON SOURCE LINES 194-204 Build one compact visual preview -------------------------------- This preview is not part of the production builder itself. It is a teaching aid for the gallery page. Left: hotspot counts by city-year-kind. Right: ranked anomaly means. .. GENERATED FROM PYTHON SOURCE LINES 204-241 .. code-block:: Python summary["group"] = ( summary["city"].astype(str) + " | " + summary["year"].astype(str) + " | " + summary["kind"].astype(str) ) ranked = summary.sort_values( ["metric_mean", "n_hotspots"], ascending=[False, False], ).reset_index(drop=True) fig, axes = plt.subplots( 1, 2, figsize=(10.0, 4.2), constrained_layout=True, ) # Hotspot counts ax = axes[0] ax.bar(summary["group"], summary["n_hotspots"]) ax.set_title("Hotspot counts by group") ax.set_xlabel("City | Year | Kind") ax.set_ylabel("n_hotspots") ax.tick_params(axis="x", rotation=75) # Ranked anomaly means ax = axes[1] ax.bar(ranked["group"], ranked["metric_mean"]) ax.set_title("Ranked anomaly means") ax.set_xlabel("City | Year | Kind") ax.set_ylabel("metric_mean [mm/yr]") ax.tick_params(axis="x", rotation=75) .. image-sg:: /auto_examples/tables_and_summaries/images/sphx_glr_summarize_hotspots_001.png :alt: Hotspot counts by group, Ranked anomaly means :srcset: /auto_examples/tables_and_summaries/images/sphx_glr_summarize_hotspots_001.png :class: sphx-glr-single-img .. GENERATED FROM PYTHON SOURCE LINES 242-264 Learn how to read the summary table ----------------------------------- Each row corresponds to one: - city - year - kind combination. The core reading order is: 1. read ``n_hotspots`` to see how large the hotspot cloud is; 2. read ``value_*`` to understand hotspot subsidence levels; 3. read ``metric_*`` to understand hotspot anomaly intensity; 4. use ``baseline_*`` and ``threshold_*`` when the source CSV provides those optional columns. In other words: - the point-cloud CSV is the geometric/point-level artifact, - the summary CSV is the compact grouped artifact. .. GENERATED FROM PYTHON SOURCE LINES 266-284 What the key columns mean ------------------------- ``value_*`` summary of the hotspot subsidence values themselves. ``metric_*`` summary of the anomaly or delta measure used to define or score hotspot points. ``baseline_*`` optional summary of the baseline values attached to the hotspot points. ``threshold_*`` optional summary of the group threshold values. The threshold is often constant within a group, but the script still keeps both min and max for robustness. .. GENERATED FROM PYTHON SOURCE LINES 286-305 Why this builder is useful in practice -------------------------------------- This builder is a bridge between hotspot maps and later reporting. A useful workflow is: 1. generate hotspot points from a spatial forecast workflow, 2. summarize them by city, year, and kind, 3. compare counts and anomaly magnitudes across groups, 4. only then build narrative tables or paper figures. This keeps: - hotspot extraction, - group-level tabulation, - and final visualization clearly separated. .. GENERATED FROM PYTHON SOURCE LINES 307-319 Why this page belongs after compute_hotspots.py ----------------------------------------------- The previous lesson built a compact city-year hotspot table from forecast CSVs directly. This lesson starts later in the chain: - it assumes hotspot *points* already exist, - and it compresses those points into grouped summary rows. So the two scripts are related, but they summarize different intermediate artifacts. .. GENERATED FROM PYTHON SOURCE LINES 321-351 Command-line version -------------------- The same lesson can be reproduced from the CLI. Legacy dispatcher: .. code-block:: bash python -m scripts summarize-hotspots \ --hotspot-csv results/figs/fig6_hotspot_points.csv \ --out fig6_hotspot_summary.csv Quiet mode: .. code-block:: bash python -m scripts summarize-hotspots \ --hotspot-csv results/figs/fig6_hotspot_points.csv \ --out fig6_hotspot_summary.csv \ --quiet true Modern CLI: .. code-block:: bash geoprior build hotspots-summary \ --hotspot-csv results/figs/fig6_hotspot_points.csv \ --out fig6_hotspot_summary.csv The gallery page teaches the builder. The command line reproduces it in a workflow. .. rst-class:: sphx-glr-timing **Total running time of the script:** (0 minutes 0.311 seconds) .. _sphx_glr_download_auto_examples_tables_and_summaries_summarize_hotspots.py: .. only:: html .. container:: sphx-glr-footer sphx-glr-footer-example .. container:: sphx-glr-download sphx-glr-download-jupyter :download:`Download Jupyter notebook: summarize_hotspots.ipynb ` .. container:: sphx-glr-download sphx-glr-download-python :download:`Download Python source code: summarize_hotspots.py ` .. container:: sphx-glr-download sphx-glr-download-zip :download:`Download zipped: summarize_hotspots.zip ` .. only:: html .. rst-class:: sphx-glr-signature `Gallery generated by Sphinx-Gallery `_