.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/diagnostics/plot_r2_overview.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_examples_diagnostics_plot_r2_overview.py>`
        to download the full example code.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_diagnostics_plot_r2_overview.py:


Understand regression agreement with ``plot_r2``
================================================

This lesson explains how to use
:func:`geoprior.plot.r2.plot_r2` to compare one reference
target series against several competing prediction series.

Why this plot matters
---------------------

A single scalar R² value in a table is useful, but it hides the
shape of the agreement between predictions and observations.
Two models can have similar R² values while behaving quite
differently:

- one may follow the 1:1 line closely with a small spread,
- one may systematically under-predict large values,
- one may have a few extreme outliers that dominate the score.

The ``plot_r2`` helper turns that abstract score into a visual
comparison. It is especially useful when you want to compare
several models or configurations against the same truth array.

What this function is designed to do
------------------------------------

``plot_r2`` takes:

- one ``y_true`` array,
- one or more ``y_pred`` arrays,
- and then draws one subplot per prediction.

Each subplot contains:

1. a scatter cloud of actual versus predicted values,
2. a perfect-fit diagonal,
3. an R² annotation,
4. and optional extra metrics such as RMSE or MAE.

This makes the helper a natural *model-comparison* diagnostic
when all candidates share the same target vector.

.. GENERATED FROM PYTHON SOURCE LINES 46-55

.. code-block:: Python


    from __future__ import annotations

    import matplotlib.pyplot as plt
    import numpy as np

    from geoprior.plot.r2 import plot_r2


.. GENERATED FROM PYTHON SOURCE LINES 56-68

Build one truth series and three different prediction behaviors
---------------------------------------------------------------

A good teaching example should not use three models that all
look the same. Here we intentionally create:

- a strong model,
- a noisier model,
- and a biased model.

That way the R² annotations and the scatter geometry tell
different stories.

.. GENERATED FROM PYTHON SOURCE LINES 68-80

.. code-block:: Python


    rng = np.random.default_rng(42)
    n = 120

    x = np.linspace(0.0, 1.0, n)
    y_true = 15.0 + 55.0 * x + 6.0 * np.sin(4.0 * np.pi * x)

    y_pred_strong = y_true + rng.normal(0.0, 2.2, n)
    y_pred_noisy = y_true + rng.normal(0.0, 5.0, n)
    y_pred_biased = 0.88 * y_true + 4.5 + rng.normal(0.0, 3.2, n)


.. GENERATED FROM PYTHON SOURCE LINES 81-97

Start with the simplest reading pattern
---------------------------------------

The most natural first use of ``plot_r2`` is to compare several
prediction vectors against the same reference truth.

A strong reading habit is:

1. look at how tightly the points cluster around the diagonal,
2. compare the annotated R² values,
3. check whether the scatter widens for large values,
4. look for consistent upward or downward bias.

In practice, the best subplot is not only the one with the
highest R². It is also the one whose scatter pattern looks
believable and balanced around the perfect-fit line.

.. GENERATED FROM PYTHON SOURCE LINES 97-115

.. code-block:: Python


    fig = plot_r2(
        y_true,
        y_pred_strong,
        y_pred_noisy,
        y_pred_biased,
        titles=["Strong model", "Noisy model", "Biased model"],
        xlabel="Observed subsidence",
        ylabel="Predicted subsidence",
        scatter_colors=["#1f77b4", "#ff7f0e", "#d62728"],
        line_colors=["#2f2f2f", "#2f2f2f", "#2f2f2f"],
        line_styles=["--", "--", "--"],
        annotate=True,
        show_grid=True,
        max_cols=2,
    )


.. image-sg:: /auto_examples/diagnostics/images/sphx_glr_plot_r2_overview_001.png
   :alt: Strong model, Noisy model, Biased model
   :srcset: /auto_examples/diagnostics/images/sphx_glr_plot_r2_overview_001.png
   :class: sphx-glr-single-img


.. GENERATED FROM PYTHON SOURCE LINES 116-135

How to interpret the three panels
---------------------------------

In this example, the difference is not only the R² annotation.
The geometry matters too:

**Strong model**
    The point cloud stays fairly close to the diagonal. That
    means the model is not only scoring well, but also
    preserving the amplitude of the target values.

**Noisy model**
    The cloud is wider around the diagonal. That typically means
    the model preserves the general trend but has weaker
    precision.

**Biased model**
    The cloud may still align roughly with the diagonal
    direction, but it is shifted. This is a useful reminder that
    R² alone does not tell the whole story about calibration or
    systematic bias.

.. GENERATED FROM PYTHON SOURCE LINES 138-151

Add complementary scalar metrics on each subplot
------------------------------------------------

``plot_r2`` also accepts ``other_metrics``. This is helpful
when the user wants each subplot to carry more than one signal.

A practical pattern is to show:

- R² for explained variance,
- RMSE for larger errors,
- MAE for average absolute deviation.

This makes each panel a compact diagnostic card.

.. GENERATED FROM PYTHON SOURCE LINES 151-170

.. code-block:: Python


    fig = plot_r2(
        y_true,
        y_pred_strong,
        y_pred_noisy,
        y_pred_biased,
        titles=["Strong model", "Noisy model", "Biased model"],
        xlabel="Observed subsidence",
        ylabel="Predicted subsidence",
        scatter_colors=["#2ca02c", "#9467bd", "#8c564b"],
        line_colors=["#4d4d4d", "#4d4d4d", "#4d4d4d"],
        line_styles=[":", ":", ":"],
        other_metrics=["rmse", "mae"],
        annotate=True,
        show_grid=True,
        max_cols=3,
    )


.. image-sg:: /auto_examples/diagnostics/images/sphx_glr_plot_r2_overview_002.png
   :alt: Strong model, Noisy model, Biased model
   :srcset: /auto_examples/diagnostics/images/sphx_glr_plot_r2_overview_002.png
   :class: sphx-glr-single-img


.. GENERATED FROM PYTHON SOURCE LINES 171-187

Why this matters for model selection
------------------------------------

In many workflows, a ranking table tells you *which* model
scored highest, but this plot helps explain *why*.

For example:

- a model with slightly worse R² but tighter structure and
  fewer obvious outliers may still be the safer choice,
- a model with a strong R² but clear bias may require
  recalibration,
- and a model with weak R² and a broad cloud is usually not
  ready for reporting.

This is why ``plot_r2`` belongs in diagnostics rather than in a
pure metrics table.

.. GENERATED FROM PYTHON SOURCE LINES 190-199

Demonstrate NaN handling implicitly
-----------------------------------

The implementation removes NaNs jointly from ``y_true`` and all
prediction arrays before plotting. This is useful in real
workflows, because one missing value should not force the
entire comparison to fail as long as enough valid rows remain.

Here we insert a few NaNs to mimic imperfect evaluation exports.

.. GENERATED FROM PYTHON SOURCE LINES 199-226

.. code-block:: Python


    y_true_nan = y_true.copy()
    y_pred_strong_nan = y_pred_strong.copy()
    y_pred_noisy_nan = y_pred_noisy.copy()

    y_true_nan[[8, 33]] = np.nan
    y_pred_strong_nan[[8, 61]] = np.nan
    y_pred_noisy_nan[[33, 61]] = np.nan

    fig = plot_r2(
        y_true_nan,
        y_pred_strong_nan,
        y_pred_noisy_nan,
        titles=[
            "Strong model with NaNs removed",
            "Noisy model with NaNs removed",
        ],
        xlabel="Observed subsidence",
        ylabel="Predicted subsidence",
        scatter_colors=["#17becf", "#bcbd22"],
        line_colors=["#3b3b3b", "#3b3b3b"],
        line_styles=["--", "--"],
        other_metrics=["rmse"],
        annotate=True,
        show_grid=True,
        max_cols=2,
    )


.. image-sg:: /auto_examples/diagnostics/images/sphx_glr_plot_r2_overview_003.png
   :alt: Strong model with NaNs removed, Noisy model with NaNs removed
   :srcset: /auto_examples/diagnostics/images/sphx_glr_plot_r2_overview_003.png
   :class: sphx-glr-single-img


.. GENERATED FROM PYTHON SOURCE LINES 227-265

How to use this function on your own data
-----------------------------------------

``plot_r2`` is the right choice when you have **one truth
vector** and several competing predictions for that same target.

A typical workflow looks like this:

1. extract one observed vector,
2. extract several model prediction vectors aligned to the same
   rows,
3. call ``plot_r2(y_true, pred_a, pred_b, pred_c, ...)``.

For example:

.. code-block:: python

   y_true = df_eval["subsidence_actual"].to_numpy()
   pred_xtft = df_eval["subsidence_xtft_q50"].to_numpy()
   pred_pinn = df_eval["subsidence_pinn_q50"].to_numpy()
   pred_tft = df_eval["subsidence_tft_q50"].to_numpy()

   plot_r2(
       y_true,
       pred_xtft,
       pred_pinn,
       pred_tft,
       titles=["XTFT", "GeoPriorSubsNet", "TFT"],
       other_metrics=["rmse", "mae"],
       max_cols=2,
   )

A good rule is to use ``plot_r2`` when the comparison question
is:

*"How do several prediction series compare against the same
truth?"*

If instead you want to compare several independent
``(y_true, y_pred)`` pairs, the better helper is
:func:`geoprior.plot.r2.plot_r2_in`.

.. GENERATED FROM PYTHON SOURCE LINES 268-281

A compact reading checklist
---------------------------

When reading a ``plot_r2`` figure, ask:

- Which panel has the tightest cloud around the diagonal?
- Which panel shows obvious bias or amplitude compression?
- Does the R² ranking agree with the visual impression?
- Do RMSE and MAE support the same conclusion?
- Are outliers rare, or are they driving the score?

That sequence turns the plot from a decorative scatter figure
into a real diagnostic tool.

.. GENERATED FROM PYTHON SOURCE LINES 281-283

.. code-block:: Python


    plt.show()


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 1.050 seconds)


.. _sphx_glr_download_auto_examples_diagnostics_plot_r2_overview.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_r2_overview.ipynb <plot_r2_overview.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_r2_overview.py <plot_r2_overview.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: plot_r2_overview.zip <plot_r2_overview.zip>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_