The only AI benchmark
with no commercial
incentive to lie.
Every model. The same battery. Results published in full, including every question answered and every question failed. We have never accepted funding from an AI laboratory.
Infrastructure note: Tests are run via provider APIs. API latency, rate limits, model versioning, and provider-side changes between test runs may affect scores. BRAIAIN has no financial relationship with any model provider and receives no compensation tied to scores.
The unreached ceiling
When all eight dimensions reach 1.0, the figure achieves 8-fold rotational symmetry. The void at centre is the fixed point of dihedral group D₈. No model has earned it. The distance between the current leader and this figure is the entire story of where AI is right now.
Scores reflect performance on BRAIAIN’s 120-question test battery, not general capability.
The BRAIAIN
Standard
BRAIAIN Standard is an independent AI testing authority. We run every model through an identical battery of prompts and publish the full results, including every question answered and every question failed. We have never accepted funding from any AI laboratory. We never will.
Eight dimension scores (each 0.0–1.0) feed into two section scores (200–800 each) for a total of 400–1600.
Analytical (200–800):
= 200 + (Reasoning×0.35 + Math×0.40 + Science×0.25) × 600
Technical (200–800):
= 200 + (Coding×0.40 + Context×0.30 + Efficiency×0.15 + Speed×0.15) × 600
BRAIAIN Score = Analytical + Technical [range: 400–1600]
Dimension score = mean(question scores within dimension)
Question scores: 0.0 (wrong) | 0.5 (partial) | 1.0 (correct)
Confidence interval: ±15 points
| Section | Measures | Weight |
|---|---|---|
| Analytical (200–800) | Reasoning, mathematics, science | 50% |
| Technical (200–800) | Coding, context, efficiency, speed | 50% |
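The formulas above can be sketched in a few lines of Python. This is a minimal illustration, not BRAIAIN's scoring code: the function names and dictionary keys are assumptions, and the sketch uses only the seven dimension names that appear in the two section formulas.

```python
from statistics import mean

# Weights copied from the section formulas; the key names are illustrative.
ANALYTICAL_WEIGHTS = {"reasoning": 0.35, "math": 0.40, "science": 0.25}
TECHNICAL_WEIGHTS = {"coding": 0.40, "context": 0.30, "efficiency": 0.15, "speed": 0.15}


def dimension_score(question_scores):
    """Mean of per-question scores, each 0.0 (wrong), 0.5 (partial) or 1.0 (correct)."""
    return mean(question_scores)


def section_score(dimensions, weights):
    """Map a weighted average of dimension scores (0.0-1.0) onto the 200-800 scale."""
    weighted = sum(dimensions[name] * weight for name, weight in weights.items())
    return 200 + weighted * 600


def braiain_score(dimensions):
    """Return (analytical, technical, total) for a dict of dimension scores."""
    analytical = section_score(dimensions, ANALYTICAL_WEIGHTS)
    technical = section_score(dimensions, TECHNICAL_WEIGHTS)
    return analytical, technical, analytical + technical


# Hypothetical model, for illustration only (not a published result).
example = {
    "reasoning": 0.8, "math": 0.5, "science": 0.6,
    "coding": 0.7, "context": 0.9, "efficiency": 0.4, "speed": 0.6,
}
analytical, technical, total = braiain_score(example)
print(round(analytical), round(technical), round(total))
```

For this made-up input the weighted Analytical average is 0.63 and the Technical average is 0.70, so the sections come to 578 and 620 and the total to 1198. A model scoring 0.0 everywhere lands on 400, and a model scoring 1.0 everywhere lands on 1600.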
“BRAIAIN Standard has never accepted and will never accept funding, sponsorship, or paid placement from any AI laboratory or model provider. Models are included at our discretion. Scores update when models update. If a model regresses, the score falls. We publish regressions.”
Questions are scored in three tiers. Automated scoring (unit tests, exact match) handles questions with objectively correct answers. AI judges score qualitative questions against a structured rubric, and no judge scores a model from its own provider family. Human experts review proofs, complex explanations, and all disputes.
| Tier | Method | Questions |
|---|---|---|
| Tier 1 | Automated (unit tests, exact match) | ~55% |
| Tier 2 | AI rubric judge (not same family) | ~35% |
| Tier 3 | Human expert | ~10% |
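The tier routing and the judge-independence rule can be sketched as follows. This is an assumption-laden illustration: the `Judge` class, the question-kind labels, and both function names are hypothetical, and BRAIAIN does not publish how it implements either step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Judge:
    name: str
    provider_family: str


# Hypothetical labels for the question kinds named in the text.
AUTOMATED_KINDS = {"unit_test", "exact_match"}
HUMAN_KINDS = {"proof", "complex_explanation"}


def assign_tier(kind, disputed=False):
    """Return 1 (automated), 2 (AI rubric judge) or 3 (human expert)."""
    if disputed or kind in HUMAN_KINDS:
        return 3  # proofs, complex explanations and all disputes go to humans
    if kind in AUTOMATED_KINDS:
        return 1  # objectively checkable answers
    return 2      # remaining qualitative questions


def eligible_judges(judges, model_family):
    """Drop every judge that shares a provider family with the model under test."""
    return [judge for judge in judges if judge.provider_family != model_family]


# Placeholder names, not real models or providers.
judges = [Judge("judge-a", "alpha"), Judge("judge-b", "beta")]
print([judge.name for judge in eligible_judges(judges, "alpha")])
```

Filtering by provider family rather than by model name means that a judge is excluded from scoring any model released by its own provider, not only itself.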
Providers who believe their model was scored incorrectly may submit a dispute.
- Email disputes@braiain.com within 30 days of score publication
- Include: model ID, specific question IDs in dispute, evidence of correct answer
- We review disputed questions with an independent human expert within 14 days
- If the dispute is upheld, the score is corrected and the correction published
- Dispute outcomes are always published, whether upheld or rejected
Each model’s eight dimension scores drive eight harmonic amplitudes in a Lissajous parametric curve. Score 1600 uses zero phase offsets, revealing 8-fold rotational symmetry (dihedral group D₈). Score 400 produces near-silence. Every other score is a broken-symmetry state between them. The shape is the score — if the scores change, the figure changes.
BRAIAIN Standard is used in AI literacy curricula.
Any publicly available model may be submitted. No fee.
submissions@braiain.com
Own the
intelligence.
Each model’s benchmark scores produce a unique Lissajous figure, iterated 70,000 times and rendered as a scientific etching on archival cotton rag.
The Scale
The complete arc of machine intelligence. Scores 400 through 1600 in sequence on one archival panel.
The floor. The ceiling. Every step between. One wall. One argument.
The Complete Argument
The Floor (400) + Current leader + Perfect score (1600) + The Scale. Four prints, one wall, complete story.
Pigment-based inks rated for 100+ years of display lightfastness. Museum-quality reproduction process. These prints will outlast the models they depict.
Acid-free, 100% cotton fibre. The warm white surface complements the warm monochrome palette and catches light in a way screens cannot reproduce, so the print looks better in person than on screen.
Printful replaces damaged prints at no cost. Contact us within 14 days with a photograph. No questions asked.