Definition and testing rules
A Practical Measure of Progress Toward Artificial Superintelligence
ASIndex tracks the development of publicly accessible AI systems from
today's frontier models toward artificial superintelligence using a simple,
grounded 0-100 scale. It focuses on demonstrated, broad, real-world
capability, not isolated benchmarks, private laboratory demonstrations, or
narrow specialties.
The index rests on four principles: measurable progress, shown through
repeatable performance on meaningful work; useful capability that creates
practical value; reliable performance across varied conditions and extended
work; and accessibility, meaning systems offered through broadly available
products, APIs, platforms, or comparable public channels under standard terms.
ASIndex therefore measures the most capable AI that people and institutions
can actually use, rather than private or undisclosed capability.
Our Definition of ASI
ASIndex defines artificial superintelligence as a broadly accessible AI
system whose demonstrated problem-solving performance exceeds humanity's
strongest coordinated institutions across nearly all important cognitive
domains, creates new scientific and technical frameworks, and enables
civilization-scale advances in health, prosperity, sustainability, and
exploration.
In simple terms: ASI is civilization-scale intellectual capability
that people and institutions can actually use to achieve outcomes previously
out of reach.
The ASIndex Levels
0-19
Frontier AI: Nearing Artificial General Intelligence
Advanced general-purpose AI systems that serve as powerful collaborators
across a wide range of tasks but still require substantial human
guidance, oversight, and course correction during unfamiliar, complex,
or long-running work.
These systems are highly useful, but they are not yet dependable
general-purpose intellectual workers.
20-39
AGI-I: General Intelligence Ignition
AI becomes a dependable general-purpose intellectual worker. It can
plan, reason, learn new domains, transfer knowledge across fields, use
tools effectively, self-correct, and complete a broad majority of
professional cognitive tasks at a strong level with limited supervision.
Reaching this band marks the arrival of the first true AGI systems, a
historic threshold in general-purpose capability.
40-59
AGI-II: Superhuman Expertise
AI surpasses top human experts across most major intellectual domains.
It fluidly integrates scientific, technical, creative, strategic, and
professional expertise within a single reliable system, giving
individuals and teams access to world-class multidisciplinary capability
on demand.
60-79
AGI-III: Breakthrough Discovery
AI becomes a consistent engine of validated discovery and invention.
It reliably produces new medicines, materials, software architectures,
scientific hypotheses, engineering solutions, inventions, and creative
works that meaningfully extend human knowledge. At this level, AI moves
beyond primarily assisting with existing work and becomes a powerful
accelerator of science, technology, health, and creativity.
80-99
AGI-IV: Institution-Scale Intelligence
AI outperforms leading institutions on complex, multi-year goals while
coordinating people, specialized AI systems, laboratories, simulations,
and long-term programs.
Projects that once required major institutions and many years of effort
become dramatically faster, clearer, and more achievable.
100
ASI: Civilization-Scale Artificial Superintelligence
AI exceeds humanity's strongest coordinated institutions across nearly all
important cognitive domains, creates new scientific and technical
frameworks, and enables civilization-scale advances in health,
prosperity, sustainability, and exploration.
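The bands above amount to a lookup from score to level. The following is a minimal sketch of that lookup, not ASIndex's own implementation. It assumes scores are whole numbers, which the text does not state, and the names `Band`, `BANDS`, and `band_for` are hypothetical.

```python
from typing import NamedTuple


class Band(NamedTuple):
    low: int   # lowest score in the band, inclusive
    high: int  # highest score in the band, inclusive
    name: str  # level name as listed in the document


# Band boundaries and names copied from the level list above.
BANDS = [
    Band(0, 19, "Frontier AI: Nearing Artificial General Intelligence"),
    Band(20, 39, "AGI-I: General Intelligence Ignition"),
    Band(40, 59, "AGI-II: Superhuman Expertise"),
    Band(60, 79, "AGI-III: Breakthrough Discovery"),
    Band(80, 99, "AGI-IV: Institution-Scale Intelligence"),
    Band(100, 100, "ASI: Civilization-Scale Artificial Superintelligence"),
]


def band_for(score: int) -> Band:
    """Return the band containing `score` (assumed to be an integer 0-100)."""
    if isinstance(score, bool) or not isinstance(score, int):
        raise TypeError("score must be an integer (an assumption of this sketch)")
    for band in BANDS:
        if band.low <= score <= band.high:
            return band
    raise ValueError(f"score {score} is outside the 0-100 scale")
```

For example, `band_for(39).name` is "AGI-I: General Intelligence Ignition", while `band_for(40).name` is "AGI-II: Superhuman Expertise". If fractional scores are possible, the boundary handling between bands would need to be defined first.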
How We Test
ASIndex uses a private, offline benchmark suite designed to minimize
contamination. The tasks are not published, not reused as public examples,
and not placed where model training pipelines can ingest them.
- Each model is tested at its highest generally available reasoning setting, under standardized conditions.
- Every Friday, ASIndex retests all active frontier-class models on the current leaderboard, even when their public version names have not changed.
- New frontier-class models are tested as soon as practical once they are broadly available.
- Prompts, tasks, rubrics, and answer keys are kept private to reduce benchmark contamination.
- Test sets can be rotated, replaced, and audited when a benchmark may have leaked or become predictable.
- Scores reward durable reasoning, reliability, self-correction, discovery, coordination, and useful autonomy.
Retesting matters because public hosted models are living systems. A model
can improve without a new product name through weight updates, routing
changes, tool upgrades, system prompt changes, safety-policy tuning, or
serving improvements. ASIndex treats the live public system as the object
being measured, not only the version label attached to it.
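The weekly Friday retest can be expressed as a small date calculation. This is a sketch under stated assumptions rather than a description of ASIndex's tooling: the function name `next_retest` is hypothetical, and the choice to count a Friday as its own retest day (instead of skipping to the following week) is an assumption.

```python
from datetime import date, timedelta

FRIDAY = 4  # date.weekday() numbers Monday as 0, so Friday is 4


def next_retest(today: date) -> date:
    """Return the first Friday on or after `today`."""
    days_ahead = (FRIDAY - today.weekday()) % 7
    return today + timedelta(days=days_ahead)
```

Called with `date(2024, 1, 1)`, a Monday, this returns `date(2024, 1, 5)`. Because the schedule depends only on the calendar, a model is retested even when its public version name is unchanged.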
What Models Are Eligible
ASI must be for everyone, so ASIndex lists only models that ordinary builders,
researchers, and companies can actually use.
- Eligible models must be accessible through a broadly available product, API, platform, or comparable public channel under standard terms.
- Systems that depend on restricted access, bespoke private deployments, or non-public infrastructure are excluded.
- Closed internal systems, invitation-only research previews, and unavailable models are excluded from the active index.
- If a tested system is withdrawn from public availability, it moves to the historical archive and is removed from the current leaderboard.
For example, if a model such as Fable 5 is currently unavailable, it stays
off the current leaderboard until it is accessible again.
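The eligibility rules can be read as a three-way decision: current leaderboard, historical archive, or excluded. The sketch below illustrates that reading only. The type and field names (`ModelStatus`, `publicly_available`, `standard_terms`, `previously_tested`) are hypothetical, and reducing each rule to a boolean is a simplifying assumption.

```python
from dataclasses import dataclass
from enum import Enum


class Placement(Enum):
    ACTIVE = "current leaderboard"
    ARCHIVE = "historical archive"
    EXCLUDED = "excluded"


@dataclass(frozen=True)
class ModelStatus:
    name: str
    publicly_available: bool  # offered via a broadly available product, API, or platform
    standard_terms: bool      # no bespoke deployment or restricted access required
    previously_tested: bool   # ASIndex has scored this system before


def placement(model: ModelStatus) -> Placement:
    """Decide where a model belongs under the rules listed above."""
    if model.publicly_available and model.standard_terms:
        return Placement.ACTIVE
    # Assumption: any previously tested system that stops qualifying is archived.
    if model.previously_tested:
        return Placement.ARCHIVE
    return Placement.EXCLUDED
```

Under this reading, a previously tested model that is withdrawn from public availability gets `Placement.ARCHIVE`, and an invitation-only preview that was never tested gets `Placement.EXCLUDED`.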
Why This Matters
A public ASI index should measure real progress people can use. ASIndex is
built to reward capability that is strong, reliable, accessible, and aligned
with human flourishing.