BRAIAIN Standard
The independent benchmark that scores every AI model on a 400–1600 scale. Full test results published.

The only AI benchmark
with no commercial
incentive
to lie.

Every model. The same battery. Results published in full, including every question answered and every question failed. We have never accepted funding from an AI laboratory.

Rankings
Scores reflect performance on a specific test battery. They should not be interpreted as general intelligence rankings or as evidence that one model is universally superior to another. Different tasks, prompts, and real-world conditions may produce different relative outcomes. BRAIAIN Standard measures performance across 8 defined dimensions using 120 questions per cycle—a meaningful but intentionally limited sample. Full test methodology, every question asked, and every answer received are published for inspection.

Infrastructure note: Tests are run via provider APIs. API latency, rate limits, model versioning, and provider-side changes between test runs may affect scores. BRAIAIN has no financial relationship with any model provider and receives no compensation tied to scores.

The unreached ceiling

When all eight dimensions reach 1.0, the figure achieves 8-fold rotational symmetry. The void at centre is the fixed point of dihedral group D₈. No model has earned it. The distance between the current leader and this figure is the entire story of where AI is right now.

BRAIAIN score
Analytical
Technical
Dimension profile

Scores reflect performance on BRAIAIN’s 120-question test battery, not general capability.

Full model page ↗
What these results show: Every question asked, the model’s answer, and how it scored. BRAIAIN Standard measures cognitive capability across 8 dimensions. A high score does not indicate safety, alignment, fairness, or fitness for any specific use case. These are capability results only.

The BRAIAIN
Standard

Capability benchmark — scope statement
BRAIAIN Standard measures eight dimensions of cognitive capability. A high BRAIAIN score does not indicate safety, alignment, fairness, or fitness for any specific use case. These results measure what models can do, not whether they should be deployed, trusted, or preferred for your application.
What this is

BRAIAIN Standard is an independent AI testing authority. We run every model through an identical battery of prompts and publish the full results publicly — including every question answered and every question failed. We have never accepted funding from any AI laboratory. We never will.

Scoring formula

Eight dimension scores (each 0.0–1.0) feed into two section scores (200–800 each) for a total of 400–1600.

Analytical (200–800):
  = 200 + (Reasoning×0.35 + Math×0.40 + Science×0.25) × 600

Technical (200–800):
  = 200 + (Coding×0.40 + Context×0.30 + Efficiency×0.15 + Speed×0.15) × 600

BRAIAIN Score = Analytical + Technical  [range: 400–1600]

Dimension score = mean(question scores within dimension)
Question scores: 0.0 (wrong) | 0.5 (partial) | 1.0 (correct)
Confidence interval: ±15 points

Section               Measures                             Weight
Analytical (200–800)  Reasoning, mathematics, science      50%
Technical (200–800)   Coding, context, efficiency, speed   50%
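
The formula above can be sketched in a few lines of Python. This is an illustrative sketch, not BRAIAIN's own code: the function and dictionary names are hypothetical, and it uses only the seven dimensions that the printed formula names (Reasoning, Math, Science, Coding, Context, Efficiency, Speed). The weights and the 200 + weighted mean × 600 scaling are taken directly from the formula.

```python
# Weights copied from the published formula; all names are illustrative.
ANALYTICAL_WEIGHTS = {"reasoning": 0.35, "math": 0.40, "science": 0.25}
TECHNICAL_WEIGHTS = {"coding": 0.40, "context": 0.30,
                     "efficiency": 0.15, "speed": 0.15}

ALLOWED_QUESTION_SCORES = {0.0, 0.5, 1.0}  # wrong | partial | correct


def dimension_score(question_scores):
    """Mean of the question scores within one dimension (0.0 to 1.0)."""
    if not question_scores:
        raise ValueError("a dimension needs at least one question score")
    for score in question_scores:
        if score not in ALLOWED_QUESTION_SCORES:
            raise ValueError(f"unexpected question score: {score}")
    return sum(question_scores) / len(question_scores)


def section_score(dimensions, weights):
    """200 + (weighted sum of dimension scores) * 600, i.e. 200 to 800."""
    weighted = sum(dimensions[name] * w for name, w in weights.items())
    return 200 + weighted * 600


def braiain_score(dimensions):
    """Return (analytical, technical, total) for a dict of dimension scores."""
    analytical = section_score(dimensions, ANALYTICAL_WEIGHTS)
    technical = section_score(dimensions, TECHNICAL_WEIGHTS)
    return analytical, technical, analytical + technical


if __name__ == "__main__":
    # Made-up dimension scores, for illustration only.
    example = {"reasoning": 0.8, "math": 0.5, "science": 0.6,
               "coding": 0.7, "context": 0.9, "efficiency": 0.5, "speed": 0.4}
    a, t, total = braiain_score(example)
    print(round(a), round(t), round(total))  # 578 611 1189
```

Because each section's weights sum to 1.0, a model scoring 0.0 everywhere lands on 400 and a model scoring 1.0 everywhere lands on 1600, matching the stated range.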
Independence policy

“BRAIAIN Standard has never accepted and will never accept funding, sponsorship, or paid placement from any AI laboratory or model provider. Models are included at our discretion. Scores update when models update. If a model regresses, the score falls. We publish regressions.”

How judging works

Questions are scored in three tiers. Automated scoring (unit tests, exact match) handles questions with objectively correct answers. AI judges score qualitative questions against a structured rubric, with the rule that no model is judged by a model from its own provider family. Human expert review covers proofs, complex explanations, and all disputes.

Tier    Method                                Questions
Tier 1  Automated (unit tests, exact match)   ~55%
Tier 2  AI rubric judge (not same family)     ~35%
Tier 3  Human expert                          ~10%
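
The routing described above might look something like the following sketch. The question-kind labels, the function names, and the example model and family names are all hypothetical; only the three tiers and the "no judge from the same provider family" rule come from the text.

```python
from enum import Enum


class Tier(Enum):
    AUTOMATED = 1   # unit tests, exact match
    AI_JUDGE = 2    # structured rubric, judged by another provider family
    HUMAN = 3       # proofs, complex explanations, all disputes


# Hypothetical labels for how a question might be tagged.
AUTOMATED_KINDS = {"unit_test", "exact_match"}
HUMAN_KINDS = {"proof", "complex_explanation"}


def assign_tier(kind, disputed=False):
    """Pick a scoring tier; disputes always go to a human expert."""
    if disputed or kind in HUMAN_KINDS:
        return Tier.HUMAN
    if kind in AUTOMATED_KINDS:
        return Tier.AUTOMATED
    return Tier.AI_JUDGE


def eligible_judges(model_family, judge_families):
    """Judges whose provider family differs from the tested model's family.

    judge_families maps a judge model ID to its provider family.
    """
    return sorted(judge for judge, family in judge_families.items()
                  if family != model_family)


if __name__ == "__main__":
    judges = {"judge-a": "family-x", "judge-b": "family-y", "judge-c": "family-x"}
    print(assign_tier("exact_match"))             # Tier.AUTOMATED
    print(assign_tier("essay"))                   # Tier.AI_JUDGE
    print(assign_tier("essay", disputed=True))    # Tier.HUMAN
    print(eligible_judges("family-x", judges))    # ['judge-b']
```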
Dispute process

Providers who believe their model was scored incorrectly may submit a dispute.

  1. Email disputes@braiain.com within 30 days of score publication
  2. Include: model ID, specific question IDs in dispute, evidence of correct answer
  3. We review disputed questions with an independent human expert within 14 days
  4. If the dispute is upheld, the score is corrected and the correction published
  5. Dispute outcomes are always published, whether upheld or rejected
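
The two time limits in the steps above reduce to simple date arithmetic. The sketch below is illustrative only and rests on two assumptions the text does not state: that the 30-day window includes day 30 itself, and that the 14-day review period is counted from the day the dispute is submitted. The function names are hypothetical.

```python
from datetime import date, timedelta

SUBMISSION_WINDOW = timedelta(days=30)  # from score publication
REVIEW_WINDOW = timedelta(days=14)      # assumed to start at submission


def dispute_deadline(published):
    """Last day on which a dispute can be submitted (assumed inclusive)."""
    return published + SUBMISSION_WINDOW


def is_within_window(published, submitted):
    """True if the dispute arrives on or before the deadline."""
    return published <= submitted <= dispute_deadline(published)


def review_due(submitted):
    """Date by which the human expert review should be complete."""
    return submitted + REVIEW_WINDOW


if __name__ == "__main__":
    published = date(2025, 1, 1)
    submitted = date(2025, 1, 10)
    print(dispute_deadline(published))             # 2025-01-31
    print(is_within_window(published, submitted))  # True
    print(review_due(submitted))                   # 2025-01-24
```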
The DNA fingerprint

Each model’s eight dimension scores drive eight harmonic amplitudes in a Lissajous parametric curve. Score 1600 uses zero phase offsets, revealing 8-fold rotational symmetry (dihedral group D₈). Score 400 produces near-silence. Every other score is a broken-symmetry state between them. The shape is the score — if the scores change, the figure changes.

For educators

BRAIAIN Standard is used in AI literacy curricula.

Classroom resources, lesson plans, and bulk print licensing → educators.html
Submit a model

Any publicly available model may be submitted. No fee.

submissions@braiain.com

Own the
intelligence.

These figures exist only as long as these scores hold. When a model is retested and its score changes, the figure changes.

Each model’s benchmark scores produce a unique Lissajous figure — iterated 70,000 times, rendered as a scientific etching on archival cotton rag. Looks better in person than on screen.

Featured — Series print

The Scale

The complete arc of machine intelligence. Scores 400 through 1600 in sequence on one archival panel.

The floor. The ceiling. Every step between. One wall. One argument.

$149
24×36 in  ·  archival giclée  ·  signed

The Complete Argument

The Floor (400) + Current leader + Perfect score (1600) + The Scale. Four prints, one wall, complete story.

$199
Individual model prints
Archival giclée

Pigment-based inks rated for 100+ years of display lightfastness. Museum-quality reproduction process. These prints will outlast the models they depict.

Cotton rag paper

Acid-free, 100% cotton fibre. Warm white surface that complements the warm monochrome palette. The print looks better in person than on screen — the warm palette on cotton catches light in a way screens cannot reproduce.

Damaged in transit?

Printful replaces damaged prints at no cost. Contact us within 14 days with a photograph. No questions asked.