Stable before segmentation compared with the new Very stable badge on Foresportia data
📌 In short

Foresportia is adding a new stability level: Very stable. It is not a promise, and it is not a complex business rule added after the fact. It simply highlights matches where the 1X2 distribution produced by the program is especially concentrated, with a minimal rule: entropy_bits ≤ 1.15.

📊 Key audit numbers

On the analysed snapshot of 14,650 finished matches, the Very stable segment contains 759 matches and reaches 88.4% observed 1X2 success. Over the last 100 Very stable matches, the observed score is 88 / 100.

Why add a new badge if Stable already existed?

The Stable badge was already a strong signal. Historically, it grouped matches where several indicators aligned: dominant probability, margin, entropy, confidence, league profile and context. But recent performance analysis revealed a subtler issue: some favourites were correctly identified, sometimes with high probabilities, while still remaining meaningfully exposed to the draw.

That is not a contradiction. In football, a favourite can be the best statistical pick even when the match is far from straightforward. The model may say “home win is the most likely outcome” while still assigning a substantial share of probability mass to the draw. The Very stable badge is designed to make this nuance more visible.

The important point

Foresportia is not adding a long list of manual rules by league, team or context. The new badge relies on a property of the distribution itself: concentration. In practice, it lets the program “speak” when its own probabilistic signal is especially clear.

A 70% favourite is still exposed: that is normal in football

A win probability of 70% looks very high. And it is. In a calibrated football model, a 70% favourite is already a strong signal. But it should not be read as certainty.

The nuance is essential: Very stable does not mean risk-free. Football is a low-scoring sport, and the draw is a structural outcome. Even when a favourite dominates the distribution, part of the residual risk can still sit on the draw.

| 1X2 distribution | Favourite | Draw | Other outcome | Entropy (bits) | Correct reading |
|---|---|---|---|---|---|
| Strong favourite, draw still visible | 70% | 25% | 5% | 1.076 | Already very concentrated for football, but not risk-free: the draw carries most of the remaining uncertainty. |
| Strong favourite, risk more spread out | 70% | 15% | 15% | 1.181 | Same p_max, but a less concentrated distribution: the risk is split between two alternatives. |
| Readable favourite, more diffuse | 62% | 27% | 11% | 1.288 | The favourite exists, but the distribution is too open to be read as Very stable. |

This table shows the core nuance: low entropy does not remove risk. It means the program sees a distribution that is already highly concentrated compared with what is usually observed in football. A match at 70% / 25% / 5% can therefore belong to the Very stable core while still carrying a real draw risk.
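The entropy column can be checked by hand. The Python sketch below recomputes it from the three rows of the table. The name entropy_bits is borrowed from the rule quoted in this article, but the implementation is an assumption for illustration, not Foresportia's actual code.

```python
import math

def entropy_bits(probs):
    """Shannon entropy in bits; outcomes with zero probability contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# (favourite, draw, other outcome) for the three rows of the table
rows = {
    "Strong favourite, draw still visible": (0.70, 0.25, 0.05),
    "Strong favourite, risk more spread out": (0.70, 0.15, 0.15),
    "Readable favourite, more diffuse": (0.62, 0.27, 0.11),
}

for label, probs in rows.items():
    print(f"{label}: {entropy_bits(probs):.3f} bits")
# Strong favourite, draw still visible: 1.076 bits
# Strong favourite, risk more spread out: 1.181 bits
# Readable favourite, more diffuse: 1.288 bits
```

The first two rows share the same p_max of 70%, yet only the first one falls under 1.15 bits.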

Absolute risk vs relative concentration

The absolute risk may remain meaningful: a 25% draw probability is still large on a single match. But the relative concentration of the distribution can be strong: the favourite clearly dominates the alternatives. Very stable does not mean “the draw is impossible”. It means: “relative to typical football distributions, the program sees a particularly concentrated signal”.

Where entropy comes from, and why it matters in football

The word entropy comes from thermodynamics, where it describes dispersion, disorder or the number of possible states of a system. In 1948, Claude Shannon reformulated the idea in information theory: entropy became a measure of uncertainty in a probability distribution.

The concept is used far beyond football: data compression, telecommunications, decision trees, probabilistic classification, uncertainty detection in machine learning, risk analysis and model calibration. The principle is always similar: the more a distribution is spread across several plausible states, the higher its entropy.

$H(p) = -\sum_{i} p_i \log_2 p_i$
In Foresportia, the $p_i$ are the probabilities of the three 1X2 outcomes: home, draw, away.

Using a base-2 logarithm expresses entropy in bits. For a 1X2 match, the theoretical maximum is:

$H_{\max} = \log_2 3 \approx 1.585$ bits
This occurs when the three outcomes are exactly balanced: 1/3 each (about 33% / 33% / 33%).
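As a worked check of the two ends of the scale, using the usual convention that an outcome with zero probability contributes nothing to the sum:

```latex
\begin{aligned}
H\!\left(\tfrac{1}{3}, \tfrac{1}{3}, \tfrac{1}{3}\right)
  &= -3 \cdot \tfrac{1}{3} \log_2 \tfrac{1}{3} = \log_2 3 \approx 1.585 \text{ bits} \\
H(1, 0, 0) &= -1 \cdot \log_2 1 = 0 \text{ bits}
\end{aligned}
```

Every 1X2 distribution falls between these two extremes. A value of 1.15 bits sits at roughly 73% of the theoretical maximum (1.15 / 1.585 ≈ 0.73).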

But the key point is this: real football does not freely explore the full theoretical scale. In a calibrated model, extremely low entropy is rare, because zero risk almost never exists. Goals are scarce, the draw is structural, and even a strong favourite retains residual failure probability.

| Observed marker in Foresportia data | Value (bits) | Interpretation |
|---|---|---|
| Minimum observed entropy | 0.326 | Even the most concentrated matches retain non-zero uncertainty. |
| 1st percentile | 0.857 | Extremely concentrated distributions are marginal. |
| 5th percentile | 1.142 | An entropy near 1.15 already belongs to the low end of observed football distributions. |
| Median | 1.536 | Most matches are much more diffuse than the Very stable threshold. |
| Share of matches with entropy ≤ 1.00 | 2.4% (share, not bits) | Going below 1 bit is rare: it would be too strict for a regular product badge. |

This is why a threshold such as 1.15 makes sense for football. On paper, one might think the threshold should be much lower to call a match “very stable”. In practice, on calibrated 1X2 football probabilities, going below 1 bit becomes almost too selective.
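How selective a threshold is can be measured directly as the share of matches at or below it. The sketch below is a minimal illustration: share_at_or_below is a hypothetical helper and the ten sample entropies are made up purely to show the call. Only the last line uses figures from this article (759 Very stable matches out of 14,650).

```python
def share_at_or_below(entropies, threshold):
    """Fraction of matches whose entropy is at or below the threshold."""
    if not entropies:
        raise ValueError("need at least one entropy value")
    return sum(1 for h in entropies if h <= threshold) / len(entropies)

# Made-up entropies for ten matches, for illustration only
sample = [0.92, 1.10, 1.31, 1.42, 1.48, 1.52, 1.54, 1.55, 1.57, 1.58]

for threshold in (1.00, 1.15):
    print(f"<= {threshold:.2f}: {share_at_or_below(sample, threshold):.0%}")

# Article figures: share of the snapshot that is Very stable
print(f"{759 / 14650:.1%}")  # 5.2%
```

A share of about 5.2% is in line with a 5th percentile of 1.142 bits, just under the 1.15 threshold.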

A deliberately minimal rule: let the distribution speak

The most interesting part of this change is its simplicity. The Very stable badge does not rely on a pile of specific rules. It does not say: “this league behaves like this”, “this team behaves like that”, or “this context must be manually corrected”.

Very stable ⇔ entropy_bits ≤ 1.15

This matters for credibility. The badge is not a way to artificially improve results after the fact. It exposes a property that is already present in the program output. If the favourite is high but the draw remains massive, entropy sees it. If the distribution is truly concentrated, the program can signal it.
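The rule is short enough to fit in a few lines. The sketch below is an assumption about how such a check could look, not Foresportia's implementation: is_very_stable and the constant name are hypothetical, and the input validation is an addition for safety.

```python
import math

VERY_STABLE_MAX_ENTROPY_BITS = 1.15  # threshold stated in the article

def entropy_bits(probs):
    """Shannon entropy in bits; outcomes with zero probability contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def is_very_stable(home, draw, away, threshold=VERY_STABLE_MAX_ENTROPY_BITS):
    """Apply the single rule: entropy of the 1X2 distribution <= threshold."""
    probs = (home, draw, away)
    if any(p < 0 for p in probs) or not math.isclose(sum(probs), 1.0, abs_tol=1e-6):
        raise ValueError("probabilities must be non-negative and sum to 1")
    return entropy_bits(probs) <= threshold

print(is_very_stable(0.70, 0.25, 0.05))  # True  (about 1.076 bits)
print(is_very_stable(0.70, 0.15, 0.15))  # False (about 1.181 bits)
print(is_very_stable(0.62, 0.27, 0.11))  # False (about 1.288 bits)
```

The function never looks at which outcome is the favourite, the league or the teams. It only sees the shape of the distribution, which is the point of the badge.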

Few business rules

No manual rule by league, club or competition to create the Very stable badge.

Model-native signal

The badge uses the actual shape of the 1X2 distribution computed by Foresportia.

More transparent reading

Users can see when the model considers a match genuinely concentrated.

What Foresportia data shows

The value of the Very stable badge is not only theoretical. It comes from an audit of Foresportia’s own historical data. The analysed file contains 14,650 finished matches, played from 2023-09-19 to 2026-05-11.

  • 759 Very stable matches isolated
  • 88.4% observed global success
  • 88 / 100 over the last 100 Very stable matches
  • 1.15 entropy threshold used

Figure 1: Global success rate, Stable before segmentation compared with Very stable. Very stable isolates a smaller but historically more concentrated segment.

| Segment | Volume | Wins | Observed rate |
|---|---|---|---|
| Stable before segmentation | 1,372 | 1,170 | 85.3% |
| Very stable (entropy_bits ≤ 1.15) | 759 | 671 | 88.4% |

The global gain may look moderate at first: 85.3% for Stable before segmentation versus 88.4% for Very stable. But this global view hides two important points. First, the Very stable segment is not anecdotal: it contains 759 matches. Second, the difference becomes much clearer in recent windows, exactly where draw traps and end-of-season dynamics mattered most.
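The observed rates are plain ratios of wins to volume. A minimal Python sketch of that arithmetic, using the counts reported for the two segments (the names segments and observed_rate are illustrative):

```python
# (volume, wins) as reported in the audit
segments = {
    "Stable before segmentation": (1372, 1170),
    "Very stable": (759, 671),
}

def observed_rate(wins, volume):
    """Share of matches where the top pick was correct."""
    return wins / volume

for name, (volume, wins) in segments.items():
    print(f"{name}: {wins}/{volume} = {observed_rate(wins, volume):.1%}")
# Stable before segmentation: 1170/1372 = 85.3%
# Very stable: 671/759 = 88.4%

gap_points = 100 * (observed_rate(671, 759) - observed_rate(1170, 1372))
print(f"gap: {gap_points:.1f} points")  # gap: 3.1 points
```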

Why the last 100 matches matter

On the last 100 matches of each segment, Stable before segmentation falls to 74 / 100, while Very stable remains at 88 / 100. This is not a guarantee for future matches, but it is a strong signal: in a noisier recent period, entropy better isolated genuinely concentrated distributions.

Figure 2: Performance by time window for Stable and Very stable. The time-window comparison shows why Very stable adds a finer reading in recent periods.

| Window | Stable before segmentation | Very stable |
|---|---|---|
| 7 days | 15/20 (75.0%) | 10/12 (83.3%) |
| 15 days | 28/46 (60.9%) | 19/25 (76.0%) |
| 30 days | 63/87 (72.4%) | 39/47 (83.0%) |
| 90 days | 207/251 (82.5%) | 92/104 (88.5%) |
| Last 100 | 74/100 (74.0%) | 88/100 (88.0%) |
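A "last N" score and a rolling-window rate can both be derived from a chronological list of outcomes. The sketch below is one way to compute them. The function names and the six sample results are made up for illustration, and nothing here reflects how Foresportia stores its history.

```python
def last_n_score(results, n):
    """Return (wins, matches) over the most recent n results (oldest first)."""
    window = results[-n:]
    if not window:
        raise ValueError("no results available")
    return sum(window), len(window)

def rolling_rates(results, n):
    """Success rate of every full window of n consecutive matches."""
    return [sum(results[i - n:i]) / n for i in range(n, len(results) + 1)]

# Made-up outcomes, oldest first: True means the top pick was correct
sample_results = [True, True, False, True, True, True]

wins, matches = last_n_score(sample_results, 3)
print(f"last 3: {wins}/{matches}")      # last 3: 3/3
print(rolling_rates(sample_results, 3))  # four windows of three matches
```

With n = 100 and the real segment history, last_n_score gives the "Last 100" row, and rolling_rates gives a rolling 100 curve.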

Recent dynamics: why this badge became useful

The Stable badge remains strong on the full history. But in some recent windows, especially near the end of seasons, favourites can become more exposed: draws, rotation, fatigue, asymmetric incentives or opponents playing first not to lose.

Figure 3: Rolling 100 success rates for Stable, Correct and Very stable. On rolling 100 segments, Very stable is more resilient when broader buckets become noisier.

What the audit really showed

Draws were not invisible to the program. They were often already present in the distribution. The issue was the top-pick reading: a favourite can remain the best choice while still carrying a draw probability high enough to make the match fragile. Very stable separates those two situations.

Very stable mostly refines the existing top bucket

A useful control is to look at where Very stable matches came from. If the new badge were massively promoting formerly risky matches, it would be suspicious. That is not what the data shows.

Figure 4: Origin of Very stable matches. Very stable mainly refines the existing Stable bucket, without massively promoting risky matches.

| Origin before applying the rule | Volume |
|---|---|
| From Stable | 721 |
| From Correct | 37 |
| From Risk | 1 |
| Total Very stable | 759 |
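Expressed as shares of the Very stable segment, the same counts make the control easier to read. A small Python sketch of the arithmetic (the variable names are illustrative):

```python
# Badge held before the entropy rule was applied, from the origin counts
origins = {"Stable": 721, "Correct": 37, "Risk": 1}

total = sum(origins.values())
shares = {badge: count / total for badge, count in origins.items()}

print(total)  # 759
for badge, share in shares.items():
    print(f"from {badge}: {share:.1%}")
# from Stable: 95.0%
# from Correct: 4.9%
# from Risk: 0.1%
```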

Why use the 1.15 threshold?

We also compared several entropy thresholds. If the threshold is too strict, the badge becomes almost invisible. If it is too wide, it loses its purpose and becomes a weak equivalent of Stable.

The 1.15 threshold is demanding, but realistic for football. It does not try to select risk-free matches, because that category almost never exists in a calibrated model. It looks for the best compromise between strong concentration, usable volume and observed robustness.

Why not choose 1.00?

A threshold of 1 bit would be stricter, but matches at or below it are too rare (2.4% of the snapshot) to serve as a regular product level. In football, even strong favourites often keep 10–25% draw or failure probability. The 1.15 threshold therefore isolates a genuinely low-entropy area without making Very stable almost invisible.

Figure 5: Performance comparison across entropy thresholds 1.15, 1.20 and 1.25. As the threshold widens, volume increases but the signal becomes less pure, especially in recent windows.

| Entropy threshold | Volume | Wins | Global rate | Last 100 |
|---|---|---|---|---|
| ≤ 1.15 | 759 | 671 | 88.4% | 88/100 (88.0%) |
| ≤ 1.20 | 977 | 855 | 87.5% | 84/100 (84.0%) |
| ≤ 1.25 | 1,255 | 1,081 | 86.1% | 76/100 (76.0%) |
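Because each wider threshold contains the previous one, the cumulative rows can be split into bands to see how the matches added by each step performed. The sketch below does that subtraction. It assumes the three rows were computed on the same snapshot, so that the sets are nested, and marginal_bands is a hypothetical helper name.

```python
# (threshold, volume, wins) from the threshold comparison
cumulative = [(1.15, 759, 671), (1.20, 977, 855), (1.25, 1255, 1081)]

def marginal_bands(rows):
    """Split cumulative rows into the matches added by each wider threshold."""
    bands = []
    lower, prev_volume, prev_wins = 0.0, 0, 0
    for upper, volume, wins in rows:
        band_volume = volume - prev_volume
        band_wins = wins - prev_wins
        bands.append((lower, upper, band_volume, band_wins, band_wins / band_volume))
        lower, prev_volume, prev_wins = upper, volume, wins
    return bands

for lower, upper, volume, wins, rate in marginal_bands(cumulative):
    print(f"({lower:.2f}, {upper:.2f}]: {wins}/{volume} = {rate:.1%}")
# (0.00, 1.15]: 671/759 = 88.4%
# (1.15, 1.20]: 184/218 = 84.4%
# (1.20, 1.25]: 226/278 = 81.3%
```

Read this way, the matches added between 1.15 and 1.25 succeed less often than the core segment, which is why the global rate drifts down as the threshold widens.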

How to read the badges now

The new badge moves the product away from an overly binary reading and toward a clearer confidence scale.

| Badge | Purpose / reading |
|---|---|
| Correct | Indicative target around 50–70%. The pick is interesting, but meaningful uncertainty remains. |
| Stable | Indicative target around 70–80%. The pick is more robust, with a more favourable probabilistic structure. |
| Very stable | No fixed success-rate promise. This badge highlights matches where the 1X2 distribution is the most concentrated according to the program. |
| Risk | The probability dispersion or the match context makes the reading more fragile. |

Further reading: how entropy helps assess prediction stability.

What Very stable does not mean

Very stable does not mean the result is certain. It also does not mean all traps are removed. Football remains a low-scoring sport exposed to red cards, penalties, injuries, late goals, rotations and tactical scenarios.

  • It is not a guarantee of the final result.
  • It is not betting advice.
  • It is not a promise that the observed rate will mechanically repeat tomorrow.
  • It is a statistical concentration signal measured on Foresportia data.

Conclusion: fewer rules, more signal

The Very stable badge is important because it does not add an opaque layer to the model. It reduces the question to a simple one: is the distribution produced by the program genuinely concentrated?

That is what makes the approach interesting. The program no longer only says “I have a favourite”. It can now signal: “for this match, the full shape of my distribution is clear enough to deserve a higher reading”.


FAQ: understanding the Very stable badge

Does Very stable mean the prediction is certain?

No. It means the 1X2 distribution is especially concentrated according to the program. It remains a probability, not a certainty.

Why not use only the maximum probability?

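Because p_max does not show how the remaining probability mass is distributed. A 70% favourite with 25% on the draw and 5% on the other outcome (1.076 bits) is not the same as a 70% favourite with 15% on each alternative (1.181 bits): same p_max, different concentration.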

Does entropy directly predict draws?

No. It measures dispersion. It mainly helps avoid over-promoting favourites where the draw remains too present.