WattMarkets

The track record

Every call is logged when it's made, hashed into a chain so it can't be rewritten, and graded against ERCOT's official settlements. The misses stay on the page. That's the product.

Live · summer 2026

graded as each month settles, provisional until ERCOT reconciles

How we score

backtest over the 2022 to 2024 summers, frozen at model release

We flagged 36 days as WARNING or higher. Six of the 12 monthly peaks landed on a flagged day. Alerting on every hot day would have caught all 12 peaks, but at the cost of 81 flagged days and 138 hours of false alerts.

Metric | WattMarkets | Alert every hot day
Calls that landed | 16.7% | 14.8%
Peaks caught | 50% | 100%
Hours on false alert (lower is better) | 60h | 138h

Whiskers are 95% intervals. The sample is small and we say so.
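
The rates above follow directly from the counts in the backtest summary. The sketch below recomputes them and adds a Wilson score interval as one plausible way to put a 95% interval around a rate from a small sample. The page does not say which interval method it uses, so `wilson_interval` and the other function names are assumptions for illustration only.

```python
import math

def rate(successes: int, trials: int) -> float:
    """Share of trials that were successes."""
    return successes / trials

def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion (assumed method; z=1.96 is roughly 95%)."""
    p = successes / trials
    denom = 1 + z**2 / trials
    center = (p + z**2 / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2)) / denom
    return center - half, center + half

# Counts from the 2022 to 2024 backtest summary.
flagged_days, peaks, peaks_on_flagged = 36, 12, 6
hot_days = 81  # the hot-day rule catches all 12 peaks

calls_landed = rate(peaks_on_flagged, flagged_days)  # 6 / 36
peaks_caught = rate(peaks_on_flagged, peaks)         # 6 / 12
hot_day_landed = rate(peaks, hot_days)               # 12 / 81

print(f"{calls_landed:.1%} {peaks_caught:.0%} {hot_day_landed:.1%}")  # 16.7% 50% 14.8%

low, high = wilson_interval(peaks_on_flagged, flagged_days)
print(f"calls landed, assumed 95% interval: {low:.1%} to {high:.1%}")
```

With only 36 flagged days the interval is wide, which is the point of the small-sample caveat.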

2025 · the held-out summer

2025 was not used to calibrate the model. Graded the same way, after the fact: we flagged 13 days and caught 3 of the 4 peaks. The hot-day rule flagged 24 days to catch all 4.

Metric | WattMarkets | Alert every hot day
Calls that landed | 23.1% | 16.7%
Peaks caught | 75% | 100%

Held out means graded retrospectively on data the model never trained on. It is not a live season. The live test is happening right now, below.

Live · 2026

Live calls are graded as each month settles. Nothing here is mixed into the backtest numbers above.

Why you can trust this ledger

Each call is hashed into an append-only chain the moment it publishes, and the chain's terminal hash posts daily. If we ever rewrote a call, every hash after it would change and the chain would break in public. Verify it yourself with the attestations feed and the full method.
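
To make the idea concrete, here is a minimal sketch of an append-only hash chain. It is not the WattMarkets implementation: SHA-256, the JSON serialization, the `GENESIS` value, and the record fields are all assumptions chosen for illustration.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed starting value for an empty chain

def link(prev_hash: str, call: dict) -> str:
    """Hash one call together with the hash of everything before it."""
    payload = json.dumps(call, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()

def terminal_hash(calls: list[dict]) -> str:
    """Fold the whole ledger into the single hash that would be posted daily."""
    h = GENESIS
    for call in calls:
        h = link(h, call)
    return h

# Two illustrative records; the field names are hypothetical.
ledger = [
    {"date": "2025-07-30", "call": "WARNING", "p": 0.67},
    {"date": "2025-08-18", "call": "WARNING", "p": 0.66},
]
published = terminal_hash(ledger)

# Rewriting an earlier call changes every later hash,
# so the recomputed value no longer matches the posted one.
tampered = [dict(ledger[0], call="WATCH"), ledger[1]]
print(terminal_hash(tampered) == published)  # False
```

Anyone holding the published terminal hash and the full list of calls can rerun `terminal_hash` and compare, without trusting the publisher.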

The honest tradeoff: we flag fewer days than a flag-every-hot-day rule and accept we might miss a peak. The Lab lets you pick your own line between interruptions and misses.
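
One way to picture that tradeoff is to move a probability threshold and count flagged days against peaks caught. The sketch below does this with the probabilities from the 2025 held-out ledger on this page. How the Lab actually works is not described here, so thresholding on p is an assumption, and `tradeoff` is a hypothetical helper. The ledger also lists only days that received a call, so thresholds below the lowest listed p would pull in days not shown.

```python
# (probability, held the month's peak) for each 2025 call listed in the ledger.
calls_2025 = [
    (0.59, False), (0.35, True),                    # June
    (0.73, False), (0.63, False), (0.65, False),    # July
    (0.58, False), (0.65, False), (0.67, True),
    (0.63, False),
    (0.93, False), (0.66, True), (0.59, False),     # August
    (0.52, False), (0.58, True),                    # September
]

def tradeoff(calls, threshold):
    """Return (days flagged, peaks caught, peaks missed) at a probability threshold."""
    flagged = sum(1 for p, _ in calls if p >= threshold)
    caught = sum(1 for p, is_peak in calls if is_peak and p >= threshold)
    peaks = sum(1 for _, is_peak in calls if is_peak)
    return flagged, caught, peaks - caught

for t in (0.50, 0.60):
    print(t, tradeoff(calls_2025, t))
# 0.5 (13, 3, 1)
# 0.6 (8, 2, 2)
```

Raising the line from 0.50 to 0.60 cuts five flagged days on this sample and costs one more missed peak.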

2025, the held-out summer

graded on data the model never trained on, official settled intervals
Date | Call | p | Outcome | Kind
June 2025: 2 calls, 2 flagged
2025-06-01 | WARNING | 59% | False alarm (flagged WARNING+, but the month peaked elsewhere) | held out
2025-06-19 | WATCH | 35% | Miss (the month peak came on a day we did not flag) | held out
July 2025: 7 calls, 7 flagged
2025-07-01 | WARNING | 73% | False alarm (flagged WARNING+, but the month peaked elsewhere) | held out
2025-07-23 | WARNING | 63% | False alarm (flagged WARNING+, but the month peaked elsewhere) | held out
2025-07-24 | WARNING | 65% | False alarm (flagged WARNING+, but the month peaked elsewhere) | held out
2025-07-28 | WARNING | 58% | False alarm (flagged WARNING+, but the month peaked elsewhere) | held out
2025-07-29 | WARNING | 65% | False alarm (flagged WARNING+, but the month peaked elsewhere) | held out
2025-07-30 | WARNING | 67% | Hit | held out
2025-07-31 | WARNING | 63% | False alarm (flagged WARNING+, but the month peaked elsewhere) | held out
August 2025: 3 calls, 3 flagged
2025-08-01 | WARNING | 93% | False alarm (flagged WARNING+, but the month peaked elsewhere) | held out
2025-08-18 | WARNING | 66% | Hit | held out
2025-08-28 | WARNING | 59% | False alarm (flagged WARNING+, but the month peaked elsewhere) | held out
September 2025: 2 calls, 2 flagged
2025-09-01 | WARNING | 52% | False alarm (flagged WARNING+, but the month peaked elsewhere) | held out
2025-09-04 | WARNING | 58% | Hit | held out

A hit is a day we flagged WARNING or higher that turned out to hold the month's true peak, graded against official settled intervals. Backtests never masquerade as live performance.
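
The grading rule can be written down in a few lines. The sketch below applies it to the 2025 held-out calls and reproduces the summary figures for that summer. The function name, the `FLAG_LEVELS` set, and the "Quiet" label for an unflagged non-peak day are assumptions for illustration; the ledger itself only shows Hit, False alarm, and Miss.

```python
from collections import Counter

# The ledger shows only WATCH and WARNING; any higher level would be added here.
FLAG_LEVELS = {"WARNING"}

def grade(call: str, is_peak_day: bool) -> str:
    """Grade one call against whether that day held the month's settled peak."""
    flagged = call in FLAG_LEVELS
    if flagged and is_peak_day:
        return "Hit"
    if flagged:
        return "False alarm"
    if is_peak_day:
        return "Miss"
    return "Quiet"  # hypothetical label; no such row appears in the ledger

# (date, call, held the month's peak), transcribed from the 2025 held-out ledger.
rows = [
    ("2025-06-01", "WARNING", False), ("2025-06-19", "WATCH", True),
    ("2025-07-01", "WARNING", False), ("2025-07-23", "WARNING", False),
    ("2025-07-24", "WARNING", False), ("2025-07-28", "WARNING", False),
    ("2025-07-29", "WARNING", False), ("2025-07-30", "WARNING", True),
    ("2025-07-31", "WARNING", False), ("2025-08-01", "WARNING", False),
    ("2025-08-18", "WARNING", True),  ("2025-08-28", "WARNING", False),
    ("2025-09-01", "WARNING", False), ("2025-09-04", "WARNING", True),
]

outcomes = Counter(grade(call, is_peak) for _, call, is_peak in rows)
hits = outcomes["Hit"]
flagged = hits + outcomes["False alarm"]  # days flagged WARNING or higher
peaks = hits + outcomes["Miss"]           # one peak per month

print(hits, flagged, peaks)                      # 3 13 4
print(f"{hits / flagged:.1%} {hits / peaks:.0%}")  # 23.1% 75%
```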

Every call, graded

2024 backtest, graded against official settled intervals
Date | Call | p | Outcome | Kind
June 2024: 6 calls, 6 flagged
2024-06-24 | WARNING | 51% | False alarm (flagged WARNING+, but the month peaked elsewhere) | backtest
2024-06-25 | WARNING | 56% | False alarm (flagged WARNING+, but the month peaked elsewhere) | backtest
2024-06-27 | WARNING | 61% | False alarm (flagged WARNING+, but the month peaked elsewhere) | backtest
2024-06-28 | WARNING | 55% | False alarm (flagged WARNING+, but the month peaked elsewhere) | backtest
2024-06-29 | WARNING | 54% | False alarm (flagged WARNING+, but the month peaked elsewhere) | backtest
2024-06-30 | WARNING | 63% | Hit | backtest
July 2024: 1 call, 1 flagged
2024-07-01 | WARNING | 93% | Hit | backtest
August 2024: 8 calls, 8 flagged
2024-08-01 | WARNING | 90% | False alarm (flagged WARNING+, but the month peaked elsewhere) | backtest
2024-08-07 | WARNING | 67% | False alarm (flagged WARNING+, but the month peaked elsewhere) | backtest
2024-08-18 | WARNING | 50% | False alarm (flagged WARNING+, but the month peaked elsewhere) | backtest
2024-08-19 | WARNING | 75% | False alarm (flagged WARNING+, but the month peaked elsewhere) | backtest
2024-08-20 | WARNING | 71% | Hit | backtest
2024-08-21 | WARNING | 61% | False alarm (flagged WARNING+, but the month peaked elsewhere) | backtest
2024-08-22 | WARNING | 65% | False alarm (flagged WARNING+, but the month peaked elsewhere) | backtest
2024-08-23 | WARNING | 58% | False alarm (flagged WARNING+, but the month peaked elsewhere) | backtest
September 2024: 2 calls, 2 flagged
2024-09-13 | WARNING | 63% | False alarm (flagged WARNING+, but the month peaked elsewhere) | backtest
2024-09-19 | WATCH | 49% | Miss (the month peak came on a day we did not flag) | backtest
