.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/evaluation/plot_radar_scores_overview.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_examples_evaluation_plot_radar_scores_overview.py>`
        to download the full example code.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_evaluation_plot_radar_scores_overview.py:


Compare compact score profiles with ``plot_radar_scores``
=========================================================

This lesson explains how to use
``geoprior.plot.evaluation.plot_radar_scores`` when you
want to summarize several scores as one compact profile.

Why this function matters
-------------------------
Many evaluation plots in this gallery answer one question
at a time:

- how error changes by horizon,
- how wide intervals are,
- how well quantiles are calibrated,
- how stable trajectories remain.

That is exactly the right way to diagnose a forecast.
But once those individual checks are available, users
usually want one more view:

*Can I look at several summary scores together and read
an overall performance shape?*

That is where ``plot_radar_scores`` helps.

This helper is useful when you already have a **small set
of interpretable summary values** and you want to compare
profiles quickly. It can also compute those values from
``y_true`` and ``y_pred`` directly when you give it one or
more metric functions.

This page is written as a **teaching guide**, not only as
an API example. We will show both supported workflows:

1. passing precomputed values directly,
2. letting the helper compute metric scores from
   ``y_true`` and ``y_pred``.

We will also finish with practical advice on when radar
plots are useful and when a bar chart is the safer choice.

.. GENERATED FROM PYTHON SOURCE LINES 49-59

.. code-block:: Python


    from __future__ import annotations

    import matplotlib.pyplot as plt
    import numpy as np
    from sklearn.metrics import mean_absolute_error, mean_squared_error

    from geoprior.plot.evaluation import plot_radar_scores


.. GENERATED FROM PYTHON SOURCE LINES 60-91

What this function expects
--------------------------

``plot_radar_scores`` supports two main workflows.

**Workflow 1: direct values**
  Pass ``data_values`` as:

  - a dict such as ``{"MAE": 3.4, "RMSE": 5.1, ...}``,
  - a list/array of values plus ``category_names``.

**Workflow 2: compute scores from data**
  Pass:

  - ``y_true`` and ``y_pred``,
  - ``metric_functions`` as one callable or a list of
    callables,
  - and optional ``category_names``.

A few practical rules matter when teaching this helper:

- each radar axis must correspond to one scalar value,
- the current helper draws **one radar line per call**,
- if you want to compare several models, the cleanest
  pattern is usually one subplot per model,
- and radar plots become much easier to read when all
  models share the same category order.

The implementation also warns that radar plots are not
ideal for fewer than three categories. That is good
advice in practice too: with one or two axes there is no
polygon to read.
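
The input rules above can be checked before plotting. The
sketch below is a hypothetical pre-flight helper,
``as_score_dict``, which is not part of ``geoprior``. It
pairs a list of values with ``category_names``, rejects
non-scalar entries, and warns for fewer than three
categories, mirroring the advice above. The ``MAE`` and
``RMSE`` numbers reuse the dict example above; the other
two values are made up for illustration.

.. code-block:: Python

    import warnings
    from numbers import Real


    def as_score_dict(values, category_names):
        """Pair scalar scores with axis names (hypothetical helper)."""
        if len(values) != len(category_names):
            raise ValueError("values and category_names must have equal length")
        scores = {}
        for name, value in zip(category_names, values):
            # Each radar axis needs exactly one real scalar.
            if isinstance(value, bool) or not isinstance(value, Real):
                raise TypeError(f"{name!r} is not a scalar: {value!r}")
            scores[name] = float(value)
        if len(scores) < 3:
            warnings.warn("fewer than three categories: no radar shape to read")
        return scores


    scores = as_score_dict(
        [3.4, 5.1, 0.2, 0.93],
        ["MAE", "RMSE", "Abs Bias", "Corr"],
    )
    print(scores)

A dict built this way could then be passed as
``data_values``.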

.. GENERATED FROM PYTHON SOURCE LINES 94-110

Build realistic regression-style data
-------------------------------------

For the lesson, we create one observed series and three
prediction variants:

- a stronger model,
- a moderate model,
- and a weaker model.

We will use them twice:

1. first to compute scores from ``y_true`` and
   ``y_pred``,
2. then to build a manual score table for direct-value
   radar plots.

.. GENERATED FROM PYTHON SOURCE LINES 110-144

.. code-block:: Python


    rng = np.random.default_rng(27)

    n_samples = 240
    x = np.linspace(0.0, 6.0 * np.pi, n_samples)
    y_true = (
        20.0
        + 3.0 * np.sin(x / 2.5)
        + 1.3 * np.cos(x / 5.0)
        + rng.normal(scale=0.8, size=n_samples)
    )

    y_pred_strong = (
        20.0
        + 2.95 * np.sin(x / 2.55)
        + 1.2 * np.cos(x / 5.1)
        + rng.normal(scale=0.55, size=n_samples)
    )

    y_pred_mid = (
        20.2
        + 2.75 * np.sin(x / 2.7)
        + 1.0 * np.cos(x / 4.8)
        + rng.normal(scale=0.95, size=n_samples)
    )

    y_pred_weak = (
        20.7
        + 2.35 * np.sin(x / 3.0)
        + 0.7 * np.cos(x / 4.2)
        + rng.normal(scale=1.35, size=n_samples)
    )


.. GENERATED FROM PYTHON SOURCE LINES 145-160

Start with metric-driven radar plots
------------------------------------

The most educational starting point is to let the helper
compute scores for us from the raw arrays.

We use four metric functions, one imported from
scikit-learn and three defined here:

- MAE (``mean_absolute_error`` from scikit-learn),
- RMSE,
- absolute bias,
- correlation.

Using four axes is a good teaching choice because the
shape becomes easy to read without clutter.

.. GENERATED FROM PYTHON SOURCE LINES 160-243

.. code-block:: Python


    def rmse_metric(yt, yp):
        # Root mean squared error, in the units of the target.
        return float(np.sqrt(mean_squared_error(yt, yp)))


    rmse_metric.__name__ = "RMSE"


    def bias_abs_metric(yt, yp):
        # Magnitude of the mean signed error; the sign is dropped.
        return float(np.abs(np.mean(yp - yt)))


    bias_abs_metric.__name__ = "Abs Bias"


    def corr_metric(yt, yp):
        # Pearson correlation between observations and predictions.
        return float(np.corrcoef(yt, yp)[0, 1])


    corr_metric.__name__ = "Corr"


    metric_functions = [
        mean_absolute_error,
        rmse_metric,
        bias_abs_metric,
        corr_metric,
    ]

    # Axis labels, listed in the same order as metric_functions.
    category_names = ["MAE", "RMSE", "Abs Bias", "Corr"]

    fig, axes = plt.subplots(
        1,
        3,
        figsize=(15.0, 5.2),
        subplot_kw={"polar": True},
        constrained_layout=True,
    )

    plot_radar_scores(
        y_true=y_true,
        y_pred=y_pred_strong,
        metric_functions=metric_functions,
        category_names=category_names,
        title="Strong model: raw metric scales",
        line_color="#2A9D8F",
        fill_alpha=0.18,
        marker="o",
        annotation_format="{:.3f}",
        legend_label="Strong",
        ax=axes[0],
    )

    plot_radar_scores(
        y_true=y_true,
        y_pred=y_pred_mid,
        metric_functions=metric_functions,
        category_names=category_names,
        title="Moderate model: raw metric scales",
        line_color="#E9C46A",
        fill_alpha=0.18,
        marker="s",
        annotation_format="{:.3f}",
        legend_label="Moderate",
        ax=axes[1],
    )

    plot_radar_scores(
        y_true=y_true,
        y_pred=y_pred_weak,
        metric_functions=metric_functions,
        category_names=category_names,
        title="Weak model: raw metric scales",
        line_color="#E76F51",
        fill_alpha=0.18,
        marker="^",
        annotation_format="{:.3f}",
        legend_label="Weak",
        ax=axes[2],
    )


.. image-sg:: /auto_examples/evaluation/images/sphx_glr_plot_radar_scores_overview_001.png
   :alt: Strong model: raw metric scales, Moderate model: raw metric scales, Weak model: raw metric scales
   :srcset: /auto_examples/evaluation/images/sphx_glr_plot_radar_scores_overview_001.png
   :class: sphx-glr-single-img


.. rst-class:: sphx-glr-script-out

 .. code-block:: none


    <PolarAxes: title={'center': 'Weak model: raw metric scales'}>


.. GENERATED FROM PYTHON SOURCE LINES 244-272

How to read the raw-scale version
---------------------------------

One reading rule comes first:

**not all radar axes point in the same "better" direction.**

In this example:

- MAE, RMSE, and Abs Bias are better when smaller,
- correlation is better when larger.

That means the polygon is a quick profile, not a single
"bigger is always better" score.

This is why raw-scale radar plots are useful mainly for
compact expert reading, not for casual ranking.

Still, they are informative:

- the stronger model keeps the smallest values on the
  three error-like axes,
- while keeping a high correlation,
- the weaker model extends further out on MAE, RMSE,
  and bias,
- and its correlation drops.

That already tells a clean comparative story.
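
One way to keep the direction of each axis explicit is to
store it next to the score. The sketch below is a minimal
illustration using NumPy only, on small made-up arrays
rather than the data of this lesson. The ``DIRECTION``
table is a hypothetical name and not part of
``geoprior``.

.. code-block:: Python

    import numpy as np

    # Small illustrative arrays, not the data used in this lesson.
    y_true = np.array([20.0, 22.0, 21.0, 19.0, 18.0])
    y_pred = np.array([20.5, 21.5, 21.5, 19.5, 18.5])

    scores = {
        "MAE": float(np.mean(np.abs(y_pred - y_true))),
        "RMSE": float(np.sqrt(np.mean((y_pred - y_true) ** 2))),
        "Abs Bias": float(np.abs(np.mean(y_pred - y_true))),
        "Corr": float(np.corrcoef(y_true, y_pred)[0, 1]),
    }

    # Hypothetical lookup: which direction counts as "better" per axis.
    DIRECTION = {
        "MAE": "lower is better",
        "RMSE": "lower is better",
        "Abs Bias": "lower is better",
        "Corr": "higher is better",
    }

    for name, value in scores.items():
        print(f"{name:>8s}: {value:0.4f}  ({DIRECTION[name]})")

Printing the direction beside each value makes it harder
to misread a large polygon as a uniformly good result.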

.. GENERATED FROM PYTHON SOURCE LINES 275-287

Normalize the values to compare shapes more easily
--------------------------------------------------

The helper also supports ``normalize_values=True``.

This does **not** mean the metrics become statistically
comparable in a deep sense. It only rescales the chosen
values to a common ``[0, 1]`` range within each radar so
the shape is easier to inspect visually.

This is helpful when some axes are naturally much larger
in scale than others.

.. GENERATED FROM PYTHON SOURCE LINES 287-342

.. code-block:: Python


    fig, axes = plt.subplots(
        1,
        3,
        figsize=(15.0, 5.2),
        subplot_kw={"polar": True},
        constrained_layout=True,
    )

    plot_radar_scores(
        y_true=y_true,
        y_pred=y_pred_strong,
        metric_functions=metric_functions,
        category_names=category_names,
        normalize_values=True,
        title="Strong model: normalized shape",
        line_color="#1D3557",
        fill_alpha=0.16,
        marker="o",
        r_ticks_count=4,
        legend_label="Strong",
        ax=axes[0],
    )

    plot_radar_scores(
        y_true=y_true,
        y_pred=y_pred_mid,
        metric_functions=metric_functions,
        category_names=category_names,
        normalize_values=True,
        title="Moderate model: normalized shape",
        line_color="#6D597A",
        fill_alpha=0.16,
        marker="D",
        r_ticks_count=4,
        legend_label="Moderate",
        ax=axes[1],
    )

    plot_radar_scores(
        y_true=y_true,
        y_pred=y_pred_weak,
        metric_functions=metric_functions,
        category_names=category_names,
        normalize_values=True,
        title="Weak model: normalized shape",
        line_color="#C1121F",
        fill_alpha=0.16,
        marker="^",
        r_ticks_count=4,
        legend_label="Weak",
        ax=axes[2],
    )


.. image-sg:: /auto_examples/evaluation/images/sphx_glr_plot_radar_scores_overview_002.png
   :alt: Strong model: normalized shape, Moderate model: normalized shape, Weak model: normalized shape
   :srcset: /auto_examples/evaluation/images/sphx_glr_plot_radar_scores_overview_002.png
   :class: sphx-glr-single-img


.. rst-class:: sphx-glr-script-out

 .. code-block:: none


    <PolarAxes: title={'center': 'Weak model: normalized shape'}>


.. GENERATED FROM PYTHON SOURCE LINES 343-358

Why normalization can help
--------------------------

Normalization is often useful when one axis dominates the
visual range.

In forecast evaluation, RMSE is never smaller than MAE
and, depending on the units of the target, can be much
larger than bounded quantities such as correlation, so a
raw radar may visually over-emphasize that axis.

The normalized view is therefore helpful for seeing the
**balance of strengths and weaknesses** inside one model.

But it should not replace the raw-scale reading. The
normalized plot is about shape, not about original units.
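
To make the "shape, not units" point concrete, the sketch
below applies plain min-max scaling to the strong-model
scores printed further down this page. This is an
assumption for illustration only: the exact formula
behind ``normalize_values=True`` is not shown here and may
differ, and ``min_max_scale`` is a hypothetical helper.

.. code-block:: Python

    import numpy as np


    def min_max_scale(values):
        """Rescale values to [0, 1] within one profile (illustrative)."""
        arr = np.asarray(values, dtype=float)
        span = arr.max() - arr.min()
        if span == 0.0:
            # All scores identical: there is no shape, so return zeros.
            return np.zeros_like(arr)
        return (arr - arr.min()) / span


    names = ["MAE", "RMSE", "Abs Bias", "Corr"]
    # Strong-model values as printed on this page.
    raw = [0.8133, 1.0493, 0.0320, 0.9285]

    for name, r, s in zip(names, raw, min_max_scale(raw)):
        print(f"{name:>8s}: raw={r:0.4f}  scaled={s:0.3f}")

Under this scaling the smallest score always maps to 0 and
the largest to 1, so the original units can no longer be
read from the plot.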

.. GENERATED FROM PYTHON SOURCE LINES 361-370

Use direct values when scores are already summarized
----------------------------------------------------

In many real workflows you already have a compact score
table from earlier analysis. In that case, it is simpler
to pass the values directly instead of recomputing them
from ``y_true`` and ``y_pred``.

That is exactly what ``data_values`` is for.

.. GENERATED FROM PYTHON SOURCE LINES 370-424

.. code-block:: Python


    score_dict_strong = {
        "MAE": float(mean_absolute_error(y_true, y_pred_strong)),
        "RMSE": float(rmse_metric(y_true, y_pred_strong)),
        "Abs Bias": float(bias_abs_metric(y_true, y_pred_strong)),
        "Corr": float(corr_metric(y_true, y_pred_strong)),
        "Skill": 0.92,
    }

    score_dict_mid = {
        "MAE": float(mean_absolute_error(y_true, y_pred_mid)),
        "RMSE": float(rmse_metric(y_true, y_pred_mid)),
        "Abs Bias": float(bias_abs_metric(y_true, y_pred_mid)),
        "Corr": float(corr_metric(y_true, y_pred_mid)),
        "Skill": 0.84,
    }

    print("Direct-value summary for the stronger model")
    for k, v in score_dict_strong.items():
        print(f"{k:>8s}: {v:0.4f}")

    fig, axes = plt.subplots(
        1,
        2,
        figsize=(10.8, 5.0),
        subplot_kw={"polar": True},
        constrained_layout=True,
    )

    plot_radar_scores(
        data_values=score_dict_strong,
        normalize_values=False,
        title="Direct-value radar: stronger summary",
        line_color="#118AB2",
        fill_alpha=0.20,
        marker="o",
        annotation_format="{:.3f}",
        legend_label="Strong summary",
        ax=axes[0],
    )

    plot_radar_scores(
        data_values=score_dict_mid,
        normalize_values=False,
        title="Direct-value radar: moderate summary",
        line_color="#EF476F",
        fill_alpha=0.20,
        marker="s",
        annotation_format="{:.3f}",
        legend_label="Moderate summary",
        ax=axes[1],
    )


.. image-sg:: /auto_examples/evaluation/images/sphx_glr_plot_radar_scores_overview_003.png
   :alt: Direct-value radar: stronger summary, Direct-value radar: moderate summary
   :srcset: /auto_examples/evaluation/images/sphx_glr_plot_radar_scores_overview_003.png
   :class: sphx-glr-single-img


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Direct-value summary for the stronger model
         MAE: 0.8133
        RMSE: 1.0493
    Abs Bias: 0.0320
        Corr: 0.9285
       Skill: 0.9200

    <PolarAxes: title={'center': 'Direct-value radar: moderate summary'}>


.. GENERATED FROM PYTHON SOURCE LINES 425-440

When direct values are the best choice
--------------------------------------

Passing ``data_values`` directly is usually the best
pattern when:

- the scores were computed elsewhere,
- different categories are not all derived from the same
  raw arrays,
- or you want to include a manually curated score such as
  an operational utility index, a domain-specific skill
  number, or a calibrated acceptance score.

In other words, ``plot_radar_scores`` can work as a pure
visualization helper once the score table already exists.

.. GENERATED FROM PYTHON SOURCE LINES 443-462

A practical model-comparison reading pattern
--------------------------------------------

Because the current helper draws one radar line per call,
the cleanest comparison pattern is often:

1. create one polar subplot per model,
2. keep the same category order,
3. keep the same radial limits when possible,
4. use distinct colors,
5. and read the radars next to a bar chart or horizon
   plot rather than in isolation.

This is especially important in evaluation work. Radar
plots are compact and attractive, but they are not always
the clearest quantitative comparison tool.

They work best when the user already understands the
underlying metrics.
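
Steps 2 and 3 of the list above can be prepared before
any plotting call. The sketch below is a hypothetical
helper, ``align_profiles``, which is not part of
``geoprior``. It reorders each model's score dict to one
shared category order and returns a common radial upper
limit. The strong-model numbers come from this page; the
moderate-model numbers are made up for illustration.

.. code-block:: Python

    def align_profiles(profiles, category_order):
        """Reorder score dicts to one order and find a shared maximum."""
        aligned = {}
        for model, scores in profiles.items():
            missing = [c for c in category_order if c not in scores]
            if missing:
                raise KeyError(f"{model!r} is missing categories: {missing}")
            aligned[model] = {c: float(scores[c]) for c in category_order}
        # One shared upper limit keeps radii comparable across subplots.
        r_max = max(v for s in aligned.values() for v in s.values())
        return aligned, r_max


    profiles = {
        "Strong": {
            "MAE": 0.8133, "RMSE": 1.0493, "Abs Bias": 0.0320, "Corr": 0.9285,
        },
        # Illustrative values, deliberately in a different key order.
        "Moderate": {
            "Corr": 0.88, "RMSE": 1.40, "MAE": 1.10, "Abs Bias": 0.21,
        },
    }

    aligned, r_max = align_profiles(
        profiles, ["MAE", "RMSE", "Abs Bias", "Corr"]
    )
    print(list(aligned["Moderate"]), r_max)

On Matplotlib polar axes, a shared limit such as ``r_max``
can be applied with ``ax.set_ylim(0, r_max)``, provided
the plotting helper does not reset the limits afterwards.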

.. GENERATED FROM PYTHON SOURCE LINES 465-494

How to adapt this helper to your own data
-----------------------------------------

In a real workflow, you will usually be in one of these
situations.

**Situation 1: you already have arrays**

- collect ``y_true`` and ``y_pred``,
- choose a small list of metric functions,
- pass them in ``metric_functions``,
- and keep ``category_names`` in the same order.

**Situation 2: you already have a score table**

- create a dict of values with the keys in the order you
  want (plain dicts preserve insertion order in Python
  3.7+),
- or pass a list/array plus ``category_names``,
- then use ``data_values=...`` directly.

**Situation 3: you want to compare models**

- call the helper once per model,
- create one polar subplot per model,
- and use different colors rather than relying on
  Matplotlib defaults.

In all three cases, keep the number of categories small.
Four to six axes is usually a comfortable range for a
readable radar plot.
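
For Situation 1, one way to keep ``category_names``
aligned with ``metric_functions`` is to derive the names
from the functions themselves, since this lesson already
sets ``__name__`` on its custom metrics. The sketch below
uses NumPy only; the ``named`` decorator is a hypothetical
convenience, not a ``geoprior`` feature.

.. code-block:: Python

    import numpy as np


    def named(label):
        """Attach a display label to a metric function via __name__."""
        def decorator(func):
            func.__name__ = label
            return func
        return decorator


    @named("MAE")
    def mae_metric(yt, yp):
        return float(np.mean(np.abs(np.asarray(yp) - np.asarray(yt))))


    @named("RMSE")
    def rmse_metric(yt, yp):
        return float(np.sqrt(np.mean((np.asarray(yp) - np.asarray(yt)) ** 2)))


    @named("Corr")
    def corr_metric(yt, yp):
        return float(np.corrcoef(yt, yp)[0, 1])


    metric_functions = [mae_metric, rmse_metric, corr_metric]
    # Names are read from the functions, so the lists cannot drift apart.
    category_names = [f.__name__ for f in metric_functions]
    print(category_names)

Adding or reordering a metric then only requires editing
``metric_functions``.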

.. GENERATED FROM PYTHON SOURCE LINES 497-516

Final teaching summary
----------------------

``plot_radar_scores`` is best understood as a compact
**profile view**.

It is not the first plot to draw when diagnosing a model.
Instead, it becomes useful after you already know which
metrics matter and want to summarize them visually.

A good practice is therefore:

1. diagnose with clearer plots first,
2. compute or collect a small set of meaningful scores,
3. then use this radar helper to compare the summary
   profile.

That way the radar plot becomes a helpful synthesis,
not a decorative replacement for real diagnosis.


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 1.312 seconds)


.. _sphx_glr_download_auto_examples_evaluation_plot_radar_scores_overview.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_radar_scores_overview.ipynb <plot_radar_scores_overview.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_radar_scores_overview.py <plot_radar_scores_overview.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: plot_radar_scores_overview.zip <plot_radar_scores_overview.zip>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_