.. DO NOT EDIT. .. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY. .. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE: .. "auto_examples/tables_and_summaries/compute_hotspots.py" .. LINE NUMBERS ARE GIVEN BELOW. .. only:: html .. note:: :class: sphx-glr-download-link-note :ref:`Go to the end ` to download the full example code. .. rst-class:: sphx-glr-example-title .. _sphx_glr_auto_examples_tables_and_summaries_compute_hotspots.py: Compute hotspot summary tables from forecast CSVs ================================================= This example teaches you how to use GeoPrior's ``compute-hotspots`` utility. Unlike the figure-generation scripts, this command is an artifact builder. It reads evaluation and future forecast CSVs, compares future annual subsidence against a baseline year, and writes one compact hotspot summary table per city and year. Why this matters ---------------- Forecast maps can look dramatic, but downstream reporting usually needs a much more compact artifact. This builder converts forecast CSVs into a tidy hotspot table with: - the number of hotspots, - hotspot subsidence ranges, - hotspot anomaly ranges relative to the baseline year, - the baseline mean, - and the percentile threshold used to define hotspots. That makes it a strong lesson page for the ``tables_and_summaries`` gallery. .. GENERATED FROM PYTHON SOURCE LINES 31-35 Imports ------- We call the real production builder, then read its outputs back in and build one compact teaching preview. .. GENERATED FROM PYTHON SOURCE LINES 35-47 .. code-block:: Python from __future__ import annotations import tempfile from pathlib import Path import matplotlib.pyplot as plt import numpy as np import pandas as pd import geoprior.scripts.compute_hotspots as hotspots_mod .. GENERATED FROM PYTHON SOURCE LINES 48-56 Compatibility shim ------------------ The current script forwards ``prog=...`` into ``parse_args()``, but the helper does not yet accept that keyword. For the lesson page, we patch the helper locally so the example stays runnable. Once the script itself is patched, this shim can be removed. .. GENERATED FROM PYTHON SOURCE LINES 56-70 .. code-block:: Python _orig_parse_args = hotspots_mod.parse_args def _parse_args_compat( argv: list[str] | None = None, *, prog: str | None = None, ): return hotspots_mod.build_parser(prog=prog).parse_args(argv) hotspots_mod.parse_args = _parse_args_compat # type: ignore[assignment] .. GENERATED FROM PYTHON SOURCE LINES 71-99 Build compact synthetic cumulative forecast archives ---------------------------------------------------- The production builder can work from: - explicit per-city eval and future CSVs, - or auto-discovered city source folders. For a self-contained lesson page, we use explicit CSVs. We generate: - Nansha - Zhongshan with: - eval years 2021 and 2022, - future years 2023, 2024, and 2025, - cumulative subsidence columns, - and a stronger hotspot trend in Zhongshan. This is useful because the production script's default mode for this builder is often: - baseline year = 2022 - subsidence kind = cumulative - hotspot quantile = q50 - hotspot threshold = 90th percentile of anomaly magnitude .. GENERATED FROM PYTHON SOURCE LINES 99-229 .. code-block:: Python rng = np.random.default_rng(7) eval_years = [2021, 2022] future_years = [2023, 2024, 2025] n_samples = 120 def _city_frames( *, city: str, level_shift: float, hotspot_boost: float, ) -> tuple[pd.DataFrame, pd.DataFrame]: rows_eval: list[dict[str, object]] = [] rows_future: list[dict[str, object]] = [] hotspot_ids = set(range(16)) for sample_idx in range(n_samples): spatial = 4.5 * np.sin(sample_idx / 11.0) local = rng.normal(0.0, 1.4) annual_2021 = 24.0 + level_shift + 0.7 * spatial + local annual_2022 = 31.0 + level_shift + 0.9 * spatial + local is_hot = sample_idx in hotspot_ids hot_term = hotspot_boost if is_hot else 0.0 annual_2023 = ( annual_2022 + 3.2 + 0.6 * hot_term + rng.normal(0.0, 1.2) ) annual_2024 = ( annual_2023 + 2.6 + 0.8 * hot_term + rng.normal(0.0, 1.2) ) annual_2025 = ( annual_2024 + 1.8 + 1.0 * hot_term + rng.normal(0.0, 1.2) ) # Build cumulative paths. q50_cum_2021 = annual_2021 q50_cum_2022 = q50_cum_2021 + annual_2022 q50_cum_2023 = q50_cum_2022 + annual_2023 q50_cum_2024 = q50_cum_2023 + annual_2024 q50_cum_2025 = q50_cum_2024 + annual_2025 # Eval actuals are slightly noisy but remain monotone. act_annual_2021 = max( 1.0, annual_2021 + rng.normal(0.0, 0.9), ) act_annual_2022 = max( 1.0, annual_2022 + rng.normal(0.0, 1.0), ) act_cum_2021 = act_annual_2021 act_cum_2022 = act_cum_2021 + act_annual_2022 def _q_band(center: float, width: float) -> tuple[float, float]: return max(0.0, center - width), center + width # Eval rows. for year, q50, actual, width in [ (2021, q50_cum_2021, act_cum_2021, 6.5), (2022, q50_cum_2022, act_cum_2022, 7.5), ]: q10, q90 = _q_band(q50, width) rows_eval.append( { "city": city, "sample_idx": sample_idx, "coord_t": year, "subsidence_actual": float(actual), "subsidence_q10": float(q10), "subsidence_q50": float(q50), "subsidence_q90": float(q90), "subsidence_unit": "mm", } ) # Future rows. for year, q50, width in [ (2023, q50_cum_2023, 9.0), (2024, q50_cum_2024, 10.5), (2025, q50_cum_2025, 12.0), ]: q10, q90 = _q_band(q50, width) rows_future.append( { "city": city, "sample_idx": sample_idx, "coord_t": year, "subsidence_q10": float(q10), "subsidence_q50": float(q50), "subsidence_q90": float(q90), "subsidence_unit": "mm", } ) return pd.DataFrame(rows_eval), pd.DataFrame(rows_future) ns_eval_df, ns_future_df = _city_frames( city="Nansha", level_shift=0.0, hotspot_boost=4.0, ) zh_eval_df, zh_future_df = _city_frames( city="Zhongshan", level_shift=5.5, hotspot_boost=7.0, ) print("Nansha eval preview") print(ns_eval_df.head(6).to_string(index=False)) print("") print("Nansha future preview") print(ns_future_df.head(6).to_string(index=False)) .. rst-class:: sphx-glr-script-out .. code-block:: none Nansha eval preview city sample_idx coord_t subsidence_actual subsidence_q10 subsidence_q50 subsidence_q90 subsidence_unit Nansha 0 2021 23.5925 17.5017 24.0017 30.5017 mm Nansha 0 2022 53.6026 47.5034 55.0034 62.5034 mm Nansha 1 2021 24.8110 17.8702 24.3702 30.8702 mm Nansha 1 2022 56.6198 48.3220 55.8220 63.3220 mm Nansha 2 2021 23.5074 18.2172 24.7172 31.2172 mm Nansha 2 2022 54.9296 49.0971 56.5971 64.0971 mm Nansha future preview city sample_idx coord_t subsidence_q10 subsidence_q50 subsidence_q90 subsidence_unit Nansha 0 2023 82.9637 91.9637 100.9637 mm Nansha 0 2024 123.8949 134.3949 144.8949 mm Nansha 0 2025 169.5575 181.5575 193.5575 mm Nansha 1 2023 85.4822 94.4822 103.4822 mm Nansha 1 2024 127.8517 138.3517 148.8517 mm Nansha 1 2025 175.2766 187.2766 199.2766 mm .. GENERATED FROM PYTHON SOURCE LINES 230-234 Write the lesson inputs ----------------------- The production builder consumes CSV files, so we follow the real workflow exactly. .. GENERATED FROM PYTHON SOURCE LINES 234-259 .. code-block:: Python tmp_dir = Path( tempfile.mkdtemp(prefix="gp_sg_hotspots_") ) ns_eval_csv = tmp_dir / "nansha_eval.csv" ns_future_csv = tmp_dir / "nansha_future.csv" zh_eval_csv = tmp_dir / "zhongshan_eval.csv" zh_future_csv = tmp_dir / "zhongshan_future.csv" ns_eval_df.to_csv(ns_eval_csv, index=False) ns_future_df.to_csv(ns_future_csv, index=False) zh_eval_df.to_csv(zh_eval_csv, index=False) zh_future_df.to_csv(zh_future_csv, index=False) print("") print("Input files") for p in [ ns_eval_csv, ns_future_csv, zh_eval_csv, zh_future_csv, ]: print(" -", p.name) .. rst-class:: sphx-glr-script-out .. code-block:: none Input files - nansha_eval.csv - nansha_future.csv - zhongshan_eval.csv - zhongshan_future.csv .. GENERATED FROM PYTHON SOURCE LINES 260-271 Run the real hotspot builder ---------------------------- We ask the production command to: - use 2022 as the baseline year, - interpret the inputs as cumulative subsidence, - use actual values for the baseline when available, - summarize 2023, 2024, and 2025, - and export both CSV and LaTeX. The result is one compact city × year table. .. GENERATED FROM PYTHON SOURCE LINES 271-306 .. code-block:: Python out_stem = tmp_dir / "hotspots_gallery" hotspots_mod.compute_hotspots_main( [ "--ns-eval", str(ns_eval_csv), "--ns-future", str(ns_future_csv), "--zh-eval", str(zh_eval_csv), "--zh-future", str(zh_future_csv), "--baseline-year", "2022", "--percentile", "90", "--subsidence-kind", "cumulative", "--baseline-source", "actual", "--quantile", "q50", "--years", "2023", "2024", "2025", "--format", "both", "--out", str(out_stem), ], prog="compute-hotspots", ) .. GENERATED FROM PYTHON SOURCE LINES 307-315 Inspect the produced artifacts ------------------------------ The builder writes: - one CSV summary table, - one LaTeX sidewaystable. That is the core artifact family for this command. .. GENERATED FROM PYTHON SOURCE LINES 315-323 .. code-block:: Python written = sorted(tmp_dir.glob("hotspots_gallery*")) print("") print("Written files") for p in written: print(" -", p.name) .. rst-class:: sphx-glr-script-out .. code-block:: none Written files - hotspots_gallery.csv - hotspots_gallery.tex .. GENERATED FROM PYTHON SOURCE LINES 324-327 Read the generated hotspot table -------------------------------- This is the main output users will usually inspect first. .. GENERATED FROM PYTHON SOURCE LINES 327-342 .. code-block:: Python tab_csv = tmp_dir / "hotspots_gallery.csv" tab_tex = tmp_dir / "hotspots_gallery.tex" tab = pd.read_csv(tab_csv) print("") print("Hotspot summary table") print(tab.to_string(index=False)) print("") print("LaTeX preview") for line in tab_tex.read_text(encoding="utf-8").splitlines()[:12]: print(line) .. rst-class:: sphx-glr-script-out .. code-block:: none Hotspot summary table City Year Hotspots_n s_hot_min s_hot_mean s_hot_max d_hot_mean d_hot_max mean_2022 T_0p9 Nansha 2023 12 31.9408 39.0185 44.0507 7.1779 10.3769 31.3303 6.1956 Nansha 2024 12 42.1282 45.3584 47.5097 12.3683 14.1385 31.3303 10.4846 Nansha 2025 12 47.1625 50.8033 53.7864 17.8132 19.7875 31.3303 15.3173 Zhongshan 2023 12 40.1022 45.1890 50.6990 8.8040 12.0300 36.9405 7.6105 Zhongshan 2024 12 50.9522 54.6610 60.0870 15.7146 18.1200 36.9405 13.4832 Zhongshan 2025 12 60.0815 63.8033 69.7340 24.6216 27.7670 36.9405 21.5802 LaTeX preview \begin{sidewaystable}[p] \centering \caption{Characteristics of forecast hotspot clusters. Hotspots are defined as locations where the absolute change in annual subsidence relative to 2022 exceeds the 90th percentile threshold $T_{0.9}$ within each city and year.} \label{tab:hotspots} \begin{tabular}{llrccccccc} \toprule & & & \multicolumn{3}{c}{Hotspot subsidence $s_{\mathrm{hot}}$} & \multicolumn{2}{c}{Hotspot anomaly $|\Delta s_{\mathrm{hot}}|$ vs.\ 2022} & Mean 2022 subsidence & $T_{0.9}$ \\ City & Year & Hotspots & \multicolumn{3}{c}{(mm yr$^{-1}$)} & \multicolumn{2}{c}{(mm yr$^{-1}$)} & (mm yr$^{-1}$) & (mm yr$^{-1}$) \\ \cmidrule(lr){4-6} \cmidrule(lr){7-8} & & $n$ & min & mean & max & mean & max & & \\ \midrule Nansha & 2023 & 12 & 31.9 & 39.0 & 44.1 & 7.2 & 10.4 & 31.3 & 6.2 \\ .. GENERATED FROM PYTHON SOURCE LINES 343-358 Show how cumulative inputs become annual series ----------------------------------------------- This builder supports both: - rate / increment data, where annual values are already present, - cumulative data, where annual values must be reconstructed. For cumulative inputs, the script uses: - baseline annual 2022 = cumulative_2022 - cumulative_2021 - annual 2023 = future_2023 - cumulative_2022 - annual 2024 = future_2024 - future_2023 - annual 2025 = future_2025 - future_2024 Below is a concrete one-sample demonstration for Nansha. .. GENERATED FROM PYTHON SOURCE LINES 358-400 .. code-block:: Python demo_sid = 0 ev0 = ns_eval_df.loc[ ns_eval_df["sample_idx"] == demo_sid ].sort_values("coord_t") fu0 = ns_future_df.loc[ ns_future_df["sample_idx"] == demo_sid ].sort_values("coord_t") cum_2021 = float( ev0.loc[ev0["coord_t"] == 2021, "subsidence_actual"].iloc[0] ) cum_2022 = float( ev0.loc[ev0["coord_t"] == 2022, "subsidence_actual"].iloc[0] ) fut_2023 = float( fu0.loc[fu0["coord_t"] == 2023, "subsidence_q50"].iloc[0] ) fut_2024 = float( fu0.loc[fu0["coord_t"] == 2024, "subsidence_q50"].iloc[0] ) fut_2025 = float( fu0.loc[fu0["coord_t"] == 2025, "subsidence_q50"].iloc[0] ) annual_demo = pd.DataFrame( { "year": [2022, 2023, 2024, 2025], "annual_value_mm": [ cum_2022 - cum_2021, fut_2023 - cum_2022, fut_2024 - fut_2023, fut_2025 - fut_2024, ], } ) print("") print("Annual conversion demo for one sample") print(annual_demo.to_string(index=False)) .. rst-class:: sphx-glr-script-out .. code-block:: none Annual conversion demo for one sample year annual_value_mm 2022 30.0101 2023 38.3611 2024 42.4313 2025 47.1625 .. GENERATED FROM PYTHON SOURCE LINES 401-411 Build one compact visual preview -------------------------------- This preview is not part of the production builder itself. It is a teaching aid for the gallery page. Left: hotspot counts by year and city. Right: the threshold T_0p9 by year and city. .. GENERATED FROM PYTHON SOURCE LINES 411-465 .. code-block:: Python fig, axes = plt.subplots( 1, 2, figsize=(9.2, 3.9), constrained_layout=True, ) # Hotspot counts by year and city ax = axes[0] cities = ["Nansha", "Zhongshan"] years = sorted(tab["Year"].unique()) x = np.arange(len(years)) w = 0.35 for i, city in enumerate(cities): vals = [] for year in years: sub = tab.loc[ (tab["City"] == city) & (tab["Year"] == year), "Hotspots_n", ] vals.append(float(sub.iloc[0]) if not sub.empty else np.nan) ax.bar( x + (i - 0.5) * w, vals, width=w, label=city, ) ax.set_title("Hotspot counts") ax.set_xlabel("Year") ax.set_ylabel("Hotspots_n") ax.set_xticks(x) ax.set_xticklabels([str(y) for y in years]) ax.legend(fontsize=8) # Threshold progression ax = axes[1] for city in cities: sub = tab.loc[tab["City"] == city].sort_values("Year") ax.plot( sub["Year"].to_numpy(int), sub["T_0p9"].to_numpy(float), marker="o", label=city, ) ax.set_title(r"Threshold progression: $T_{0.9}$") ax.set_xlabel("Year") ax.set_ylabel("Threshold [mm/yr]") ax.legend(fontsize=8) .. image-sg:: /auto_examples/tables_and_summaries/images/sphx_glr_compute_hotspots_001.png :alt: Hotspot counts, Threshold progression: $T_{0.9}$ :srcset: /auto_examples/tables_and_summaries/images/sphx_glr_compute_hotspots_001.png :class: sphx-glr-single-img .. rst-class:: sphx-glr-script-out .. code-block:: none .. GENERATED FROM PYTHON SOURCE LINES 466-491 Learn how to read the table --------------------------- Each row is one: - city - year combination. The most important columns are: - ``Hotspots_n``: how many locations exceeded the anomaly threshold; - ``s_hot_*``: the subsidence range among hotspot points; - ``d_hot_*``: the anomaly range among hotspot points; - ``mean_2022``: the city-wide baseline annual mean for 2022; - ``T_0p9``: the percentile threshold used to define the hotspots. So the table is not a geometry layer. It is the compact numerical summary that later reporting pages or figures can build on. .. GENERATED FROM PYTHON SOURCE LINES 493-508 What the hotspot rule means --------------------------- For each city and year, the builder computes: - the annual baseline series for 2022, - the annual forecast series for the requested year, - the absolute anomaly: :math:`\lvert future\_annual - baseline\_{2022,\mathrm{annual}} \rvert` Then it marks hotspots where that anomaly exceeds the chosen percentile threshold. With the default percentile of 90, this means the most anomalous ~10 percent of valid locations become hotspots for that city-year. .. GENERATED FROM PYTHON SOURCE LINES 510-526 Why cumulative vs rate mode matters ----------------------------------- This script supports two data conventions. ``rate`` or ``increment`` the CSV values are already annual values. ``cumulative`` the CSV values are cumulative totals, so the script first converts them back to annual increments before any hotspot calculation. That makes the builder useful for both: - already-annual forecast archives, - and future forecast exports stored in cumulative form. .. GENERATED FROM PYTHON SOURCE LINES 528-542 Why the baseline source matters ------------------------------- The baseline year can come from: - ``actual``: use observed 2022 subsidence when available; - ``q50``: use the median forecast from the eval CSV. In practice: - ``actual`` is better when the eval archive includes observations; - ``q50`` is a useful fallback when the builder is used on a purely predictive archive. .. GENERATED FROM PYTHON SOURCE LINES 544-563 Why this command belongs in tables_and_summaries ------------------------------------------------ This builder is a bridge between raw forecast CSVs and later presentation layers. A useful workflow is: 1. generate or calibrate the forecast CSVs, 2. build the hotspot summary table, 3. inspect counts and thresholds by city and year, 4. only then move on to hotspot maps or narrative reporting. That separation keeps: - forecast generation, - summary tabulation, - and visualization clearly distinct. .. GENERATED FROM PYTHON SOURCE LINES 565-613 Command-line version -------------------- The same lesson can be reproduced from the CLI. Legacy dispatcher: .. code-block:: bash python -m scripts compute-hotspots \ --ns-eval results/nansha_eval.csv \ --ns-future results/nansha_future.csv \ --zh-eval results/zhongshan_eval.csv \ --zh-future results/zhongshan_future.csv \ --baseline-year 2022 \ --subsidence-kind cumulative \ --quantile q50 \ --years 2023 2024 2025 \ --format both \ --out tab_hotspots Modern CLI: .. code-block:: bash geoprior build hotspots \ --ns-eval results/nansha_eval.csv \ --ns-future results/nansha_future.csv \ --zh-eval results/zhongshan_eval.csv \ --zh-future results/zhongshan_future.csv \ --baseline-year 2022 \ --subsidence-kind cumulative \ --quantile q50 \ --years 2023 2024 2025 \ --format both \ --out tab_hotspots Auto-discovery by city source folder: .. code-block:: bash geoprior build hotspots \ --ns-src results/nansha_run \ --zh-src results/zhongshan_run \ --baseline-year 2022 \ --n-years 2 \ --format csv The gallery page teaches the builder. The command line reproduces it in a workflow. .. rst-class:: sphx-glr-timing **Total running time of the script:** (0 minutes 0.311 seconds) .. _sphx_glr_download_auto_examples_tables_and_summaries_compute_hotspots.py: .. only:: html .. container:: sphx-glr-footer sphx-glr-footer-example .. container:: sphx-glr-download sphx-glr-download-jupyter :download:`Download Jupyter notebook: compute_hotspots.ipynb ` .. container:: sphx-glr-download sphx-glr-download-python :download:`Download Python source code: compute_hotspots.py ` .. container:: sphx-glr-download sphx-glr-download-zip :download:`Download zipped: compute_hotspots.zip ` .. only:: html .. rst-class:: sphx-glr-signature `Gallery generated by Sphinx-Gallery `_