▌ Open data · CC-BY 4.0
Raw votes.
Every vote cast on the PlotPoints benchmark, served as CSV. Re-run the analysis, build your own ranking, or audit ours. Voter cookie ids and IP hashes are stripped — see What's excluded below for the full list.
▌ Downloads
Issue № 01 · April 2026
The champ that wouldn't move.
1,857 votes · 335 voters
Everything
All rounds, flattened.
Every vote across every round in one CSV. Filter by the `round` column.
▌ Column schema
| Column | Meaning |
|---|---|
| id | UUID per vote |
| round | 1, 2, 3 — which round bucket |
| mode | arena · multiturn_arena · rubric |
| scenario_id | Pair id (arena modes) or response id (rubric) |
| context | Underlying seed/scene id |
| model_a, model_b | The two models in an arena matchup |
| winner | A · B · tie (arena modes only) |
| model | Model that produced the response (rubric only) |
| scores | JSON map { axis_id: 1-5 } (rubric only) |
| notes | Voter free-text (rubric, optional) |
| is_catch | true if the matchup was a catch-pair calibration |
| catch_correct | true if the voter picked the pre-declared winner |
| source | 'native' for votes cast on Plotlight, 'arena_l3vi4th4n_round_N' for imports |
| signed_in | 1 if cast by an authed user; 0 if anonymous |
| client_timestamp | Wall-clock at submit (best-effort) |
| created_at | Server insert time (authoritative) |
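
As a minimal sketch of how the schema above might be consumed, the Python below filters by `round`, tallies arena wins, and averages rubric scores per axis. The sample rows, the model names (`alpha`, `beta`), the axis ids (`prose`, `pacing`), and the `true`/`false` text encoding of `is_catch` are assumptions for illustration, not real data. Skipping catch pairs in the tally is likewise one possible choice, not necessarily how the published standings are computed.

```python
import csv
import io
import json
from collections import Counter, defaultdict

# Hypothetical sample rows that follow the column schema; not real votes.
SAMPLE_CSV = '''id,round,mode,scenario_id,context,model_a,model_b,winner,model,scores,notes,is_catch,catch_correct,source,signed_in,client_timestamp,created_at
v1,1,arena,p1,s1,alpha,beta,A,,,,false,,native,1,,
v2,1,arena,p2,s2,alpha,beta,tie,,,,false,,native,0,,
v3,2,arena,p3,s3,beta,alpha,A,,,,false,,native,0,,
v4,1,arena,p4,s4,alpha,beta,B,,,,true,false,native,0,,
v5,1,rubric,r1,s1,,,,alpha,"{""prose"": 4, ""pacing"": 2}",,false,,native,1,,
'''


def load_votes(text, round_number=None):
    """Parse CSV text into dict rows, optionally keeping a single round."""
    rows = list(csv.DictReader(io.StringIO(text)))
    if round_number is not None:
        rows = [r for r in rows if r["round"] == str(round_number)]
    return rows


def arena_wins(rows):
    """Count wins per model over arena-mode votes; ties add nothing."""
    wins = Counter()
    for r in rows:
        if r["mode"] not in ("arena", "multiturn_arena"):
            continue
        # Assumption: calibration (catch) pairs are left out of the tally.
        if r["is_catch"].strip().lower() == "true":
            continue
        if r["winner"] == "A":
            wins[r["model_a"]] += 1
        elif r["winner"] == "B":
            wins[r["model_b"]] += 1
    return wins


def rubric_means(rows):
    """Average rubric score for each (model, axis) pair."""
    totals = defaultdict(lambda: [0, 0])  # (model, axis) -> [sum, count]
    for r in rows:
        if r["mode"] != "rubric" or not r["scores"]:
            continue
        for axis, score in json.loads(r["scores"]).items():
            bucket = totals[(r["model"], axis)]
            bucket[0] += score
            bucket[1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


if __name__ == "__main__":
    round_one = load_votes(SAMPLE_CSV, round_number=1)
    print(arena_wins(round_one))
    print(rubric_means(load_votes(SAMPLE_CSV)))
```

To run this against a downloaded file, read its contents with `open(path, newline="", encoding="utf-8").read()` and pass that to `load_votes` in place of `SAMPLE_CSV`; the path is whatever name the download was saved under.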
▌ What's excluded
- voter_id: Sybil-tracking cookie identity; never leaves the perimeter.
- user_id: Auth binding for weight bonus; surfaced as `signed_in` boolean only.
- ip_hash: Spam/abuse signal; withheld for privacy.
- user_agent: PII-adjacent; withheld for privacy.
▌ Upstream / canonical
The CSV here is what Plotlight collected. The full benchmark dataset, including model outputs, judge rationales, and the analyzer scripts that turn raw votes into the standings, lives on Hugging Face as the canonical artifact: