PlotPoints · Raw votes
▌ Open data · CC-BY 4.0

Raw votes.

Every vote cast on the PlotPoints benchmark, served as CSV. Re-run the analysis, build your own ranking, or audit ours. Voter cookie ids and IP hashes are stripped — see What's excluded below for the full list.

▌ Downloads
Issue № 02 · 2026
Round 02 is in flight.
507 votes · 140 voters
Download CSV · Read the issue
Issue № 01 · April 2026
The champ that wouldn't move.
1,857 votes · 335 voters
Download CSV · Read the issue
Everything
All rounds, flattened.
Every vote across every round in one CSV. Filter by the round column.
Download CSV
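
Filtering the combined file by round can be sketched as below. This is a minimal sketch, assuming the CSV has a header row using the column names from the schema and that `round` holds a plain integer. The inline sample rows and the `filter_round` helper are hypothetical and stand in for the real download.

```python
import csv
import io

# Hypothetical sample standing in for the downloaded all-rounds CSV.
# Only a few of the documented columns are included here.
SAMPLE = """id,round,mode,winner
a1,1,arena,A
a2,2,arena,B
a3,2,rubric,
"""


def filter_round(csv_text, round_number):
    """Return the rows whose round column equals round_number."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # CSV values arrive as strings, so compare against the string form.
    return [row for row in reader if row["round"].strip() == str(round_number)]


round_two = filter_round(SAMPLE, 2)
print(len(round_two))
```

For the real file, the same function could be fed the text of the downloaded CSV instead of `SAMPLE`.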
▌ Column schema
Column              Meaning
id                  UUID per vote
round               1, 2, 3: which round bucket
mode                arena · multiturn_arena · rubric
scenario_id         Pair id (arena modes) or response id (rubric)
context             Underlying seed/scene id
model_a, model_b    The two models in an arena matchup
winner              A · B · tie (arena modes only)
model               Model that produced the response (rubric only)
scores              JSON map { axis_id: 1-5 } (rubric only)
notes               Voter free-text (rubric, optional)
is_catch            true if the matchup was a catch-pair calibration
catch_correct       true if the voter picked the pre-declared winner
source              'native' for Plotlight, 'arena_l3vi4th4n_round_N' for imports
signed_in           1 if cast by an authed user; 0 if anonymous
client_timestamp    Wall-clock at submit (best-effort)
created_at          Server insert time (authoritative)
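
Because arena and rubric votes share one file but fill different columns, a reader has to branch on `mode`. The sketch below tallies arena wins and averages rubric scores per axis. It is not the analyzer used for the published standings. It assumes booleans are serialized as the strings `true`/`false`, that `winner` uses the literal values `A`, `B`, and `tie`, and that `scores` is a JSON object stored in a quoted CSV field. Skipping catch-pair matchups is also an assumption made here. The model names (`alpha`, `beta`) and axis ids (`voice`, `pacing`) are invented for illustration.

```python
import csv
import io
import json
from collections import Counter, defaultdict

# Hypothetical rows; model names and axis ids are invented for illustration.
SAMPLE = '''id,mode,model_a,model_b,winner,model,scores,is_catch
v1,arena,alpha,beta,A,,,false
v2,arena,alpha,beta,tie,,,false
v3,multiturn_arena,beta,alpha,A,,,false
v4,arena,alpha,beta,B,,,true
v5,rubric,,,,alpha,"{""voice"": 4, ""pacing"": 2}",false
v6,rubric,,,,alpha,"{""voice"": 2}",false
'''


def summarize(csv_text):
    """Return (arena win counts per model, mean rubric score per (model, axis))."""
    wins = Counter()                 # model -> arena wins (ties are not counted)
    axis_scores = defaultdict(list)  # (model, axis) -> list of 1-5 scores
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["mode"] in ("arena", "multiturn_arena"):
            # Assumption: calibration matchups are left out of the tally.
            if row["is_catch"].strip().lower() == "true":
                continue
            if row["winner"] == "A":
                wins[row["model_a"]] += 1
            elif row["winner"] == "B":
                wins[row["model_b"]] += 1
        elif row["mode"] == "rubric" and row["scores"]:
            # The scores column holds a JSON map of axis id to score.
            for axis, score in json.loads(row["scores"]).items():
                axis_scores[(row["model"], axis)].append(score)
    means = {key: sum(vals) / len(vals) for key, vals in axis_scores.items()}
    return wins, means


wins, means = summarize(SAMPLE)
print(dict(wins))
print(means)
```

Raw win counts are only a starting point. A real ranking would likely weight votes and account for which models were paired against each other.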
▌ What's excluded
  • voter_id: Sybil-tracking cookie identity; never leaves the perimeter.
  • user_id: Auth binding for weight bonus; surfaced as `signed_in` boolean only.
  • ip_hash: Spam/abuse signal; withheld for privacy.
  • user_agent: PII-adjacent; withheld for privacy.
▌ Upstream / canonical

The CSV here is what Plotlight collected. The full benchmark dataset — including model outputs, judge rationales, and the analyzer scripts that turn raw votes into the standings — lives on HuggingFace as the canonical artifact:

HuggingFace dataset → · GitHub: rp-benchmark · Methodology