Learning Outcomes Measurement: How OpenAI Tracks AI Impact

OpenAI

4 March 2026

OpenAI’s Learning Outcomes Measurement Suite is a framework for running longitudinal studies that measure how AI use affects learning over time. Built with the University of Tartu and Stanford’s SCALE Initiative, it combines student outcomes with de-identified interaction data so educators can evaluate whether AI improves understanding, not just short-term performance.

AI is moving into classrooms faster than education systems can evaluate it. Most studies still focus on short-term performance — test scores after a single intervention — which doesn’t tell you whether students actually learned something they can retain and apply later.

That gap is exactly what OpenAI’s Learning Outcomes Measurement Suite (LOMS) is designed to address: a structured way to measure learning outcomes over time, across different settings, with research partners and standardised methods.

In this post we’ll unpack what LOMS is, how it works, and what it means for schools, colleges, and education leaders who need evidence before scaling AI.

What is the Learning Outcomes Measurement Suite?

LOMS is a measurement framework for longitudinal research — a repeatable approach to evaluating whether AI tools contribute to sustained learning gains.

Rather than relying on a single assessment, the suite is designed to support studies that track outcomes across months, compare across contexts (schools, regions, curricula), and separate “better answers” from “better understanding”.

Why measuring learning outcomes is harder than it sounds

Educational impact isn’t one metric.

A student might:

  • complete work faster but retain less,

  • appear to perform better while relying heavily on AI,

  • improve motivation and persistence even if scores don’t move immediately.

If you only measure end-of-term grades, you miss the learning story.

LOMS aims to create a consistent way to capture a fuller picture — including durable learning outcomes — so decision-makers can compare like-for-like across trials.

How LOMS works

LOMS brings together three elements:

  1. Outcome measures (partner-provided ground truth)

These are the things schools already care about: exam scores, teacher observations, classroom work, attendance, or other local measures that indicate learning progress.

  2. De-identified AI interaction data

This captures how students engage with AI tools (prompts, patterns of use, iteration behaviour) in a way that protects identity.

  3. Longitudinal study design

The point is repetition: measure, observe, measure again — so you can see whether changes persist.
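Taken together as data, these three elements imply a simple record shape. Below is a minimal sketch in Python of what such a record might look like; the class and field names are illustrative assumptions for this post, not LOMS's actual schema.

```python
from dataclasses import dataclass, field

@dataclass
class OutcomeMeasure:
    """Partner-provided ground truth, e.g. an exam score or teacher rubric rating."""
    source: str        # e.g. "exam", "teacher_observation", "attendance"
    value: float       # a normalised 0-1 value or a raw mark, per local convention
    wave: int          # which measurement round this belongs to

@dataclass
class InteractionSummary:
    """De-identified usage features; no prompt text, no real identity."""
    sessions: int
    avg_iterations: float  # how often a student refined their prompts

@dataclass
class StudyRecord:
    """One participant tracked across repeated measurement waves."""
    pseudonym: str     # irreversible pseudonymous ID, never a name
    outcomes: list[OutcomeMeasure] = field(default_factory=list)
    usage: list[InteractionSummary] = field(default_factory=list)
```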

Put together, the suite is meant to help institutions answer the hard questions:

  • Are learning gains real — and do they last?

  • Who benefits most (and who is left behind)?

  • Which kinds of AI use correlate with deeper learning vs dependency?

Practical steps: how a school could use LOMS-style measurement

Even if the suite isn’t “plug-and-play” for every organisation yet, the approach maps to a sensible rollout plan.

Step 1: Pick a narrow, high-value use case

Start with one curriculum area or cohort where AI is already being used (e.g., revision support for 16–18 learners, writing feedback, maths practice).

Step 2: Establish a baseline

Before measuring AI impact, agree what “progress” means:

  • a baseline assessment or prior attainment data

  • teacher judgement rubrics

  • a retention check (for example, revisiting concepts after a time gap)
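To make the retention check concrete, here is a tiny sketch comparing a delayed re-test against each learner's baseline. The 80% threshold and the sample numbers are arbitrary assumptions for illustration.

```python
def retention_ratio(baseline: float, delayed: float) -> float:
    """Share of baseline performance retained after the time gap."""
    if baseline <= 0:
        raise ValueError("baseline score must be positive")
    return delayed / baseline

# Hypothetical cohort: (pseudonym, baseline score, score after a six-week gap)
cohort = [("s01", 72, 68), ("s02", 55, 57), ("s03", 80, 61)]

for pid, base, later in cohort:
    ratio = retention_ratio(base, later)
    flag = "review" if ratio < 0.8 else "ok"  # 0.8 is an arbitrary pilot threshold
    print(f"{pid}: retained {ratio:.0%} ({flag})")
```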

Step 3: Decide what evidence counts as learning

A healthy measurement mix includes:

  • performance (scores)

  • process (how students reached the answer)

  • persistence (did they keep trying?)

  • transfer (can they use the concept in a new situation?)
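One way to keep those four evidence types honest is to report them separately rather than averaging them into a single grade, so a weak dimension cannot hide behind a strong one. A minimal sketch, with made-up values:

```python
# Illustrative evidence record for one learner; all values are assumptions.
evidence = {
    "performance": 0.74,  # normalised assessment score
    "process": 0.60,      # rubric rating of how the answer was reached
    "persistence": 0.85,  # share of tasks reattempted after an error
    "transfer": 0.55,     # success rate on novel-context tasks
}

# Surface the weakest dimension instead of averaging it away.
weakest = min(evidence, key=evidence.get)
print(f"Weakest evidence type: {weakest} ({evidence[weakest]:.0%})")
```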

Step 4: Set governance and privacy expectations

Make sure you have:

  • transparent student/parent communications

  • de-identification and data minimisation

  • clear role accountability for reporting and oversight
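For the de-identification and minimisation points above, a common pattern (not necessarily what LOMS itself does) is to replace student IDs with a keyed hash, so records can be linked across waves without storing identities, and to drop every field the study design doesn't need:

```python
import hashlib
import hmac

# The key stays with the school and is never shipped alongside the data.
SECRET_KEY = b"example-key-rotate-and-store-securely"  # placeholder only

def pseudonymise(student_id: str) -> str:
    """Stable pseudonym: the same input yields the same token across waves."""
    return hmac.new(SECRET_KEY, student_id.encode(), hashlib.sha256).hexdigest()[:16]

# Data minimisation: an allow-list of the fields the study actually needs.
ALLOWED_FIELDS = {"pseudonym", "cohort", "wave", "outcome_value"}

def minimise(record: dict) -> dict:
    return {k: v for k, v in record.items() if k in ALLOWED_FIELDS}

record = {"pseudonym": pseudonymise("student-42"), "cohort": "Y12",
          "wave": 2, "outcome_value": 0.71, "home_address": "should never leave"}
print(minimise(record))  # home_address is dropped before anything leaves the school
```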

Step 5: Report outcomes on a cadence that supports decision-making

Monthly reporting works well for pilots:

  • early indicators (engagement, completion)

  • learning checks (retention/transfer)

  • qualitative feedback from teachers and students
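A monthly report doesn't need heavy tooling; even a short script that rolls pilot metrics up by month keeps the cadence sustainable. The metric names and figures below are invented for illustration:

```python
from collections import defaultdict
from statistics import mean

# Hypothetical pilot log rows: (month, metric, value)
rows = [
    ("2026-01", "completion", 0.81), ("2026-01", "retention", 0.74),
    ("2026-02", "completion", 0.86), ("2026-02", "retention", 0.77),
]

by_month = defaultdict(lambda: defaultdict(list))
for month, metric, value in rows:
    by_month[month][metric].append(value)

for month in sorted(by_month):
    summary = ", ".join(f"{m}: {mean(v):.0%}" for m, v in sorted(by_month[month].items()))
    print(f"{month}  {summary}")
```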

What this means for education leaders

LOMS is a signal that the conversation is shifting.

Instead of “AI increases productivity”, education leaders need to ask: does AI improve learning outcomes in a way that is equitable and durable?

If measurement becomes standardised, it also changes procurement. Vendors will be expected to show evidence that AI supports learning — not just usage metrics.

Summary

OpenAI’s Learning Outcomes Measurement Suite is designed to support longitudinal measurement of how AI affects learning across diverse educational contexts. The core value is a more consistent way to evaluate whether AI improves durable learning outcomes — not just short-term performance.

Next steps

If you’re exploring AI in education, Generation Digital can help you:

  • design pilots with measurable learning outcomes

  • build governance and reporting that protects trust

  • integrate AI tools with your existing workflows and platforms

FAQs

Q1: What is the Learning Outcomes Measurement Suite?

It’s a framework created by OpenAI and research partners to support longitudinal studies measuring how AI affects learning outcomes over time, across different educational settings.

Q2: What does the suite measure?

It combines outcome measures (like assessments and observations) with de-identified interaction data to evaluate whether AI supports durable learning — such as retention, transfer, and persistence — not just short-term performance.

Q3: Can it be used in all educational settings?

It’s designed to work across diverse contexts, but institutions should adapt measures to local curricula, age groups, and governance requirements.

Q4: Does this replace teacher judgement?

No. Teacher input and local outcome measures are core to the approach. The suite is meant to strengthen evaluation, not automate educational decisions.

Q5: What safeguards should schools put in place?

Clear consent and transparency practices, data minimisation, de-identification, human oversight of reporting, and explicit boundaries around how AI outputs are used.
