Run three variations of a prompt and compare outputs side by side. The board enforces one-variable-at-a-time discipline — each column changes exactly one thing — so the difference you see is the difference you can actually claim.
Live preview · launch for the interactive version
Three columns, each with a base prompt and one change. The structure enforces the discipline: if you change two things between columns, you can’t know which change caused the difference.
The single-variable constraint is not pedantry — it’s what makes a claim defensible. The board makes that constraint visible and requires you to name the variable before running the comparison.
The comparison produces outputs, but the deliverable is the claim: “when X changes from A to B, the output changes in this specific way.” The board has a field for it because the claim is the point.
Name your variable before running the comparison. The claim comes from the difference — describe it before you explain it.
Write down the single variable you’re testing: a specific word, a tone instruction, a demographic detail, a format request. Name it precisely before running anything.
Write a base prompt, then create three versions where only the named variable changes. Everything else — length, format, context, framing — stays the same.
After running all three: describe what is different in the outputs before you explain why. Observation first, interpretation second.
Fill the claim field with a sentence that could be checked. “This model responds differently to X” is not a claim. “When the role is X, outputs include Y more often than Z” is.