
1. NeoMundi AI Observatory

A public program for the continuous measurement of generative systems.

NeoMundi Research regularly publishes new observation cohorts of generative AI services.

Each cohort systematically measures five dimensions:

Stability: real-time measurement of how stable generation is;
Validity: evaluation of alignment with verifiable references or benchmarks;
Information Density: measurement of the amount of useful information produced;
Risk: identification of drift signals, instability, or problematic responses;
Cost Efficiency: analysis of the relationship between performance, cost, and operational efficiency.
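
As an illustration, one measured service in a cohort could be pictured as a record with one field per dimension. This is only a sketch: the class name, field names, and sample values below are assumptions made for the example, not the Observatory's published schema or data.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class CohortObservation:
    """Hypothetical record for one measured service in one cohort."""
    service: str
    stability: float
    validity: float
    information_density: float
    risk: float
    cost_efficiency: float

    def dimensions(self) -> dict[str, float]:
        """Return the five dimensions separately, with no aggregate score."""
        values = asdict(self)
        values.pop("service")
        return values


# Placeholder values, invented purely to show the shape of the record.
obs = CohortObservation("example-service", 0.9, 0.4, 0.7, 0.2, 0.6)
print(obs.dimensions())
```

Keeping the dimensions as separate fields, rather than collapsing them into one number, mirrors the stated aim of making behavior observable along each axis.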

The objective is not only to compare models, but to make their behavior observable, measurable, and verifiable over time.

The results are accompanied by data, documented methodology, and verification links to promote auditability, comparison, and public debate.

2. Latest Available Publication

Find the latest published cohort, previous cartographies, and methodological reviews of the NeoMundi Observatory.

Access the AI Observatory

3. Editorial Program

NeoMundi Research publishes a new observation cohort every two weeks.

Each publication includes a global cartography of the main measured signals and may be supplemented with targeted analyses: comparisons between cohorts, specific segments, benchmarks, or notable individual cases.

4. Method, Data, Auditability

A public, documented, and verifiable method.

NeoMundi Research progressively publishes the elements necessary to examine its work:

The methodology;
Public data in CSV and JSON formats;
Successive versions of the protocols;
Interpretation limits;
Integrity manifests and checksums;
Source code when it can be published without exposing the proprietary components of the instrument.
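
For readers who want to check downloaded files against an integrity manifest, the following is a minimal sketch of how such a check might look. It assumes a manifest in the common `sha256sum` layout (one hex digest followed by a file path per line) and SHA-256 as the hash; the actual manifest format and algorithm used by NeoMundi Research may differ, and the function names here are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large data files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_manifest(manifest: Path) -> dict[str, bool]:
    """Check each listed file against its expected digest.

    Assumed line format: '<hex digest>  <path relative to the manifest>'.
    Returns a mapping of file path to True (match) or False (mismatch or missing).
    """
    results: dict[str, bool] = {}
    for line in manifest.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        expected, name = line.split(maxsplit=1)
        name = name.strip().lstrip("*")  # sha256sum marks binary mode with '*'
        target = manifest.parent / name
        results[name] = target.is_file() and sha256_of(target) == expected.lower()
    return results
```

A call such as `verify_manifest(Path("MANIFEST.sha256"))`, where the file name is a placeholder, would report which files still match their published digests.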

The method has also undergone an independent methodological review covering the May 2026 comparative cohort.

This review examines what the real-time signal can and cannot detect, documents the current limitations of the approach, and presents the improvements now being tested.

→ Read the Methodology
→ Explore the Data on GitHub
→ Read the Independent Methodological Review

5. Statement of Principle

Measuring does not mean certifying truth.

NeoMundi does not claim that a real-time stability signal guarantees the factual accuracy of a response.

An AI can produce a correct answer while displaying unstable generation.

It can also produce a fluent and stable answer while being factually incorrect.

Stable does not mean true.
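
The independence of the two signals can be made concrete with a small sketch that treats stability and correctness as separate inputs and enumerates all four combinations. The names `Reading` and `classify` are hypothetical, and reducing each signal to a boolean is a simplification for illustration, not the Observatory's measurement method.

```python
from enum import Enum


class Reading(Enum):
    """The four possible combinations of the two independent signals."""
    STABLE_CORRECT = "stable and correct"
    STABLE_INCORRECT = "stable but incorrect"
    UNSTABLE_CORRECT = "unstable but correct"
    UNSTABLE_INCORRECT = "unstable and incorrect"


def classify(is_stable: bool, is_correct: bool) -> Reading:
    """Combine the two signals without letting one stand in for the other."""
    if is_stable:
        return Reading.STABLE_CORRECT if is_correct else Reading.STABLE_INCORRECT
    return Reading.UNSTABLE_CORRECT if is_correct else Reading.UNSTABLE_INCORRECT


# Enumerate every combination: all four cases are reachable.
for stable in (True, False):
    for correct in (True, False):
        print(classify(stable, correct).value)
```

Because every quadrant is reachable, knowing only the stability signal leaves the correctness of a response undetermined.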

This is why the Observatory distinguishes five complementary dimensions:

Stability;
Validity;
Information Density;
Risk;
Cost Efficiency.

The objective is not to reduce an AI system to a single score, but to make its behavior observable, interpretable, and verifiable over time.