.. DO NOT EDIT. .. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY. .. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE: .. "auto_examples/tables_and_summaries/build_extract_zones.py" .. LINE NUMBERS ARE GIVEN BELOW. .. only:: html .. note:: :class: sphx-glr-download-link-note :ref:`Go to the end ` to download the full example code. .. rst-class:: sphx-glr-example-title .. _sphx_glr_auto_examples_tables_and_summaries_build_extract_zones.py: Extract threshold-based spatial zones with ``extract-zones`` ============================================================ This lesson teaches how to build compact zone tables from a spatial field using GeoPrior's ``extract-zones`` workflow. Unlike ROI extraction, which cuts out one rectangular window, this builder keeps only the points whose target value satisfies a threshold rule. That makes it a natural tool for: - hotspot extraction, - low-score or vulnerable-zone screening, - transition-band inspection, - and small zone tables that can be passed to later summaries or plots. Why this matters ---------------- Threshold-based extraction is one of the simplest and most practical ways to turn a continuous spatial score into a reusable table of points. In many applied workflows, the first useful question is not: "What does the full field look like?" It is: "Which points are above a critical level?" or "Which points fall in the lower-risk tail?" That is the role of ``extract-zones``. What this lesson teaches ------------------------ We will: 1. build a realistic synthetic spatial table, 2. save it as two separate input files, 3. extract three different zone types, 4. compare their thresholds and sizes, 5. build one compact visual preview, 6. end with direct command-line examples. .. GENERATED FROM PYTHON SOURCE LINES 49-53 Imports ------- We use the real zone-extraction helper from GeoPrior and the shared synthetic spatial helpers already used in the other gallery lessons. .. GENERATED FROM PYTHON SOURCE LINES 53-66 .. code-block:: Python from __future__ import annotations import tempfile from pathlib import Path import matplotlib.pyplot as plt import numpy as np import pandas as pd from geoprior.utils.spatial_utils import extract_zones_from from geoprior.scripts import utils as script_utils .. GENERATED FROM PYTHON SOURCE LINES 67-91 Step 1 - Build one reusable synthetic city support -------------------------------------------------- Instead of generating plain random coordinates, we reuse the shared support helper. That gives us a more city-like footprint and keeps the lesson visually consistent with the other build examples. Key parameters -------------- city: Only a label attached to the synthetic support. center_x, center_y: Center of the projected coordinate system. span_x, span_y: Half-width and half-height of the support extent. nx, ny: Mesh density before masking. jitter_x, jitter_y: Small perturbations so the support is not a perfect mesh. footprint: Shape of the retained support. keep_frac: Fraction of masked points to keep. seed: Reproducibility seed. .. GENERATED FROM PYTHON SOURCE LINES 91-109 .. code-block:: Python support = script_utils.make_spatial_support( script_utils.SpatialSupportSpec( city="ZonesDemo", center_x=5_000.0, center_y=3_100.0, span_x=4_800.0, span_y=3_400.0, nx=35, ny=27, jitter_x=30.0, jitter_y=24.0, footprint="nansha_like", keep_frac=0.92, seed=303, ) ) .. GENERATED FROM PYTHON SOURCE LINES 110-120 Step 2 - Turn the support into a synthetic spatial score table -------------------------------------------------------------- ``extract_zones_from`` works on one score variable ``z`` and optional coordinate variables ``x`` and ``y``. To make the lesson realistic, we build one continuous score field that looks like an interpretable urban response surface. The score can be read as a generic hazard, severity, or hotspot index. The exact semantics are not important here. The important point is that it is continuous, spatially structured, and suitable for thresholding. .. GENERATED FROM PYTHON SOURCE LINES 120-167 .. code-block:: Python rng = np.random.default_rng(303) base_field = script_utils.make_spatial_field( support, amplitude=2.45, drift_x=1.10, drift_y=0.35, phase=0.50, local_weight=0.18, ) spread_field = script_utils.make_spatial_scale( support, base=0.25, x_weight=0.10, hotspot_weight=0.07, ) city_df = support.to_frame().copy() city_df["district"] = np.where( city_df["x_norm"] > 0.62, "East corridor", np.where(city_df["y_norm"] > 0.58, "North belt", "Urban core"), ) city_df["susceptibility_score"] = ( 6.2 + 1.85 * base_field + 1.10 * spread_field + rng.normal(0.0, 0.18, len(city_df)) ) city_df["rainfall_mm"] = ( 1180 + 95 * city_df["y_norm"] + rng.normal(0.0, 10.0, len(city_df)) ) city_df["groundwater_depth_m"] = ( 8.4 + 0.9 * city_df["x_norm"] + 0.5 * city_df["y_norm"] + rng.normal(0.0, 0.08, len(city_df)) ) print("Synthetic table shape:", city_df.shape) print("") print(city_df.head(10).to_string(index=False)) .. rst-class:: sphx-glr-script-out .. code-block:: none Synthetic table shape: (414, 10) sample_idx coord_x coord_y x_norm y_norm city district susceptibility_score rainfall_mm groundwater_depth_m 0 3654.7573 725.7191 0.3621 0.1582 ZonesDemo Urban core 7.8543 1196.5087 8.8016 1 3879.0054 716.6930 0.3851 0.1569 ZonesDemo Urban core 7.8991 1203.0117 8.8067 2 4127.1475 743.3290 0.4106 0.1608 ZonesDemo Urban core 7.8215 1196.6609 8.9188 3 4530.8041 743.3618 0.4521 0.1608 ZonesDemo Urban core 8.0307 1166.7556 8.7492 4 4741.9471 747.9166 0.4737 0.1615 ZonesDemo Urban core 7.8903 1208.5079 9.0510 5 4987.8599 751.7753 0.4990 0.1620 ZonesDemo Urban core 8.3513 1191.2383 8.8633 6 5290.1053 760.6047 0.5300 0.1633 ZonesDemo Urban core 7.9618 1193.3879 9.1051 7 5554.3981 728.7704 0.5572 0.1587 ZonesDemo Urban core 8.0325 1199.9865 9.0134 8 5886.9462 751.2118 0.5913 0.1619 ZonesDemo Urban core 8.1350 1184.2525 8.9630 9 6363.0598 730.2099 0.6402 0.1589 ZonesDemo East corridor 8.5844 1211.7271 9.0817 .. GENERATED FROM PYTHON SOURCE LINES 168-173 Step 3 - Save the table as two input files ------------------------------------------ The public build layer accepts one or many input tables. To make that visible in the lesson, we split the synthetic city into two files and conceptually rebuild the same combined table again. .. GENERATED FROM PYTHON SOURCE LINES 173-197 .. code-block:: Python tmp_dir = Path( tempfile.mkdtemp(prefix="gp_sg_extract_zones_") ) west_csv = tmp_dir / "zones_demo_west.csv" east_csv = tmp_dir / "zones_demo_east.csv" x_mid = float(city_df["coord_x"].median()) city_df.loc[city_df["coord_x"] <= x_mid].to_csv( west_csv, index=False, ) city_df.loc[city_df["coord_x"] > x_mid].to_csv( east_csv, index=False, ) print("") print("Input files") print(" -", west_csv.name) print(" -", east_csv.name) .. rst-class:: sphx-glr-script-out .. code-block:: none Input files - zones_demo_west.csv - zones_demo_east.csv .. GENERATED FROM PYTHON SOURCE LINES 198-214 Step 4 - Define three thresholding stories ------------------------------------------ We extract three different kinds of zones from the same score field. This makes the lesson more useful than a single one-off threshold. 1. ``low_auto`` lower-tail automatic thresholding using a percentile. 2. ``hotspot`` points above an explicit upper threshold. 3. ``transition`` points inside an intermediate score band. The explicit thresholds are derived from quantiles so the lesson stays stable even if the synthetic field is adjusted later. .. GENERATED FROM PYTHON SOURCE LINES 214-233 .. code-block:: Python z = city_df["susceptibility_score"].to_numpy(float) auto_percentile = 15 auto_threshold = float(np.percentile(z, auto_percentile)) hotspot_threshold = float(np.percentile(z, 82)) transition_band = ( float(np.percentile(z, 46)), float(np.percentile(z, 64)), ) print("") print("Threshold design") print(f" - auto lower threshold (p{auto_percentile}) = {auto_threshold:.3f}") print(f" - hotspot threshold = {hotspot_threshold:.3f}") print( " - transition band = " f"({transition_band[0]:.3f}, {transition_band[1]:.3f})" ) .. rst-class:: sphx-glr-script-out .. code-block:: none Threshold design - auto lower threshold (p15) = 8.080 - hotspot threshold = 10.528 - transition band = (8.865, 9.539) .. GENERATED FROM PYTHON SOURCE LINES 234-241 Step 5 - Extract the three zone tables -------------------------------------- We call the real extraction helper directly. The public CLI command is a thin convenience layer over the same threshold logic. ``extract_zones_from`` returns only the filtered coordinate and score columns, which is exactly what we want for compact zone exports. .. GENERATED FROM PYTHON SOURCE LINES 241-315 .. code-block:: Python def _zone_table( data: pd.DataFrame, *, threshold, condition: str, percentile: int | float = 10, use_negative_criteria: bool = True, ) -> pd.DataFrame: out = extract_zones_from( z="susceptibility_score", x="coord_x", y="coord_y", data=data, threshold=threshold, condition=condition, use_negative_criteria=use_negative_criteria, percentile=percentile, view=False, ) out = out.copy() out.columns = ["coord_x", "coord_y", "susceptibility_score"] return out low_auto = _zone_table( city_df, threshold="auto", condition="auto", percentile=auto_percentile, use_negative_criteria=True, ) hotspot = _zone_table( city_df, threshold=hotspot_threshold, condition="above", use_negative_criteria=False, ) transition = _zone_table( city_df, threshold=transition_band, condition="between", use_negative_criteria=False, ) low_csv = tmp_dir / "zones_demo_low_auto.csv" hotspot_csv = tmp_dir / "zones_demo_hotspot.csv" transition_csv = tmp_dir / "zones_demo_transition.csv" low_auto.to_csv(low_csv, index=False) hotspot.to_csv(hotspot_csv, index=False) transition.to_csv(transition_csv, index=False) print("") print("Written files") print(" -", low_csv.name) print(" -", hotspot_csv.name) print(" -", transition_csv.name) print("") print("Low-score auto zone") print(low_auto.head(8).to_string(index=False)) print("") print("Hotspot zone") print(hotspot.head(8).to_string(index=False)) print("") print("Transition band") print(transition.head(8).to_string(index=False)) .. rst-class:: sphx-glr-script-out .. code-block:: none Written files - zones_demo_low_auto.csv - zones_demo_hotspot.csv - zones_demo_transition.csv Low-score auto zone coord_x coord_y susceptibility_score 3654.7573 725.7191 7.8543 3879.0054 716.6930 7.8991 4127.1475 743.3290 7.8215 4530.8041 743.3618 8.0307 4741.9471 747.9166 7.8903 5290.1053 760.6047 7.9618 5554.3981 728.7704 8.0325 2735.5954 1017.2204 7.8516 Hotspot zone coord_x coord_y susceptibility_score 5902.7448 1802.0862 11.1294 6135.7027 1827.0407 11.6021 6410.7167 1787.5672 11.4881 6684.9167 1765.4587 11.1619 6992.9990 1793.7800 11.4905 7333.2870 1788.1407 10.5601 5313.0400 2046.3373 10.6809 5624.3402 2059.1097 11.0479 Transition band coord_x coord_y susceptibility_score 6143.2959 980.3361 9.0711 6669.7483 1010.8933 9.0629 6911.9625 1018.0694 9.4021 7317.7457 1030.8998 8.8854 5362.6798 1274.7406 9.1439 6144.5651 1263.2123 9.3788 6678.2034 1285.9056 9.3800 6943.2837 1250.0980 9.2733 .. GENERATED FROM PYTHON SOURCE LINES 316-320 Step 6 - Summarize the three outputs ------------------------------------ A small summary table makes it easy to compare how the different threshold rules behave. .. GENERATED FROM PYTHON SOURCE LINES 320-365 .. code-block:: Python def summarize_zone( df: pd.DataFrame, *, label: str, threshold_text: str, ) -> dict[str, object]: return { "zone": label, "threshold_rule": threshold_text, "rows": int(len(df)), "score_min": float(df["susceptibility_score"].min()), "score_mean": float(df["susceptibility_score"].mean()), "score_max": float(df["susceptibility_score"].max()), } summary = pd.DataFrame( [ summarize_zone( low_auto, label="low_auto", threshold_text=f"auto below p{auto_percentile}", ), summarize_zone( hotspot, label="hotspot", threshold_text=f"> {hotspot_threshold:.2f}", ), summarize_zone( transition, label="transition", threshold_text=( f"between {transition_band[0]:.2f} and " f"{transition_band[1]:.2f}" ), ), ] ) print("") print("Zone summary") print(summary.to_string(index=False)) .. rst-class:: sphx-glr-script-out .. code-block:: none Zone summary zone threshold_rule rows score_min score_mean score_max low_auto auto below p15 62 6.7745 7.6908 8.0779 hotspot > 10.53 75 10.5367 11.6053 13.4455 transition between 8.87 and 9.54 75 8.8652 9.1966 9.5345 .. GENERATED FROM PYTHON SOURCE LINES 366-376 Step 7 - Build one compact visual preview ----------------------------------------- Left: full score field with the low-score zone overlaid. Middle: full score field with hotspot and transition points overlaid. Right: score histogram with the three threshold rules annotated. .. GENERATED FROM PYTHON SOURCE LINES 376-481 .. code-block:: Python fig, axes = plt.subplots( 1, 3, figsize=(15.0, 4.9), constrained_layout=True, ) # Panel 1 - lower-tail zone. ax = axes[0] sc = ax.scatter( city_df["coord_x"], city_df["coord_y"], c=city_df["susceptibility_score"], s=14, cmap="viridis", alpha=0.86, ) ax.scatter( low_auto["coord_x"], low_auto["coord_y"], s=32, facecolors="none", edgecolors="white", linewidths=1.2, label="low-score zone", ) ax.set_title("Automatic lower-tail zone") ax.set_xlabel("coord_x") ax.set_ylabel("coord_y") ax.legend(frameon=False, fontsize=8) ax.grid(True, linestyle=":", alpha=0.35) ax.set_aspect("equal", adjustable="box") # Panel 2 - hotspot and transition zones. ax = axes[1] ax.scatter( city_df["coord_x"], city_df["coord_y"], c=city_df["susceptibility_score"], s=12, cmap="viridis", alpha=0.20, ) ax.scatter( transition["coord_x"], transition["coord_y"], s=20, color="tab:orange", alpha=0.72, label="transition band", ) ax.scatter( hotspot["coord_x"], hotspot["coord_y"], s=28, color="tab:red", alpha=0.88, label="hotspots", ) ax.set_title("Intermediate band and hotspots") ax.set_xlabel("coord_x") ax.set_ylabel("coord_y") ax.legend(frameon=False, fontsize=8) ax.grid(True, linestyle=":", alpha=0.35) ax.set_aspect("equal", adjustable="box") # Panel 3 - score histogram and thresholds. ax = axes[2] ax.hist( city_df["susceptibility_score"], bins=24, alpha=0.75, edgecolor="white", ) ax.axvline( auto_threshold, linestyle="--", linewidth=1.6, label=f"auto p{auto_percentile}", ) ax.axvline( hotspot_threshold, linestyle="-.", linewidth=1.6, label="hotspot threshold", ) ax.axvspan( transition_band[0], transition_band[1], alpha=0.20, label="transition band", ) ax.set_title("Score distribution and thresholds") ax.set_xlabel("susceptibility_score") ax.set_ylabel("count") ax.legend(frameon=False, fontsize=8) ax.grid(True, axis="y", linestyle=":", alpha=0.35) # Add one shared colorbar for the spatial panels. cbar = fig.colorbar(sc, ax=axes[:2], fraction=0.035, pad=0.02) cbar.set_label("susceptibility_score") plt.show() .. image-sg:: /auto_examples/tables_and_summaries/images/sphx_glr_build_extract_zones_001.png :alt: Automatic lower-tail zone, Intermediate band and hotspots, Score distribution and thresholds :srcset: /auto_examples/tables_and_summaries/images/sphx_glr_build_extract_zones_001.png :class: sphx-glr-single-img .. GENERATED FROM PYTHON SOURCE LINES 482-494 How to read this output ----------------------- A useful reading order is: 1. inspect the continuous score field, 2. check which threshold rule was applied, 3. compare the selected point counts, 4. then compare the score ranges in the exported tables. The key idea is that the builder is not estimating a new field. It is simply turning one existing continuous score into one reusable point table based on a threshold criterion. .. GENERATED FROM PYTHON SOURCE LINES 496-507 Why this builder is useful in practice -------------------------------------- ``extract-zones`` is a strong support-layer builder when you need one threshold-defined subset rather than a full-city table. Typical uses include: - extracting hotspot candidates, - isolating the lower-risk or lower-response tail, - creating intermediate bands for sensitivity checks, - or preparing compact point tables for later reports. .. GENERATED FROM PYTHON SOURCE LINES 509-577 Command-line usage ------------------ The examples below show the same workflow as direct terminal commands. Explicit hotspot extraction ~~~~~~~~~~~~~~~~~~~~~~~~~~~ .. code-block:: bash geoprior-build extract-zones \ zones_demo_west.csv zones_demo_east.csv \ --x-col coord_x \ --y-col coord_y \ --z-col susceptibility_score \ --threshold 8.43 \ --condition above \ --output zones_demo_hotspot.csv The same command through the root dispatcher ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ .. code-block:: bash geoprior build extract-zones \ zones_demo_west.csv zones_demo_east.csv \ --x-col coord_x \ --y-col coord_y \ --z-col susceptibility_score \ --threshold 8.43 \ --condition above \ --output zones_demo_hotspot.csv Intermediate transition band ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ .. code-block:: bash geoprior-build extract-zones \ zones_demo_west.csv zones_demo_east.csv \ --x-col coord_x \ --y-col coord_y \ --z-col susceptibility_score \ --threshold 6.45 7.28 \ --condition between \ --output zones_demo_transition.csv Lower-tail automatic extraction ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The underlying helper also supports percentile-driven automatic thresholds. When your local wrapper exposes the same options, a typical command looks like: .. code-block:: bash geoprior-build extract-zones \ zones_demo_west.csv zones_demo_east.csv \ --x-col coord_x \ --y-col coord_y \ --z-col susceptibility_score \ --threshold auto \ --percentile 15 \ --condition auto \ --output zones_demo_low_auto.csv Because the shared reader accepts one or many input tables, the same command family can also be used with CSV, TSV, Parquet, Excel, JSON, Feather, or Pickle inputs. .. rst-class:: sphx-glr-timing **Total running time of the script:** (0 minutes 0.526 seconds) .. _sphx_glr_download_auto_examples_tables_and_summaries_build_extract_zones.py: .. only:: html .. container:: sphx-glr-footer sphx-glr-footer-example .. container:: sphx-glr-download sphx-glr-download-jupyter :download:`Download Jupyter notebook: build_extract_zones.ipynb ` .. container:: sphx-glr-download sphx-glr-download-python :download:`Download Python source code: build_extract_zones.py ` .. container:: sphx-glr-download sphx-glr-download-zip :download:`Download zipped: build_extract_zones.zip ` .. only:: html .. rst-class:: sphx-glr-signature `Gallery generated by Sphinx-Gallery `_