Learning Outcomes Measurement: How OpenAI Tracks AI Impact
OpenAI
4 March 2026

OpenAI’s Learning Outcomes Measurement Suite is a framework for running longitudinal studies that measure how AI use affects learning over time. Built with the University of Tartu and Stanford’s SCALE Initiative, it combines student outcomes with de-identified interaction data so educators can evaluate whether AI improves understanding, not just short-term performance.
AI is moving into classrooms faster than education systems can evaluate it. Most studies still focus on short-term performance — test scores after a single intervention — which doesn’t tell you whether students actually learned something they can retain and apply later.
That gap is exactly what OpenAI’s Learning Outcomes Measurement Suite (LOMS) is designed to address: a structured way to measure learning outcomes over time, across different settings, with research partners and standardised methods.
In this post we’ll unpack what LOMS is, how it works, and what it means for schools, colleges, and education leaders who need evidence before scaling AI.
What is the Learning Outcomes Measurement Suite?
LOMS is a measurement framework for longitudinal research — a repeatable approach to evaluating whether AI tools contribute to sustained learning gains.
Rather than relying on a single assessment, the suite is designed to support studies that track outcomes across months, compare across contexts (schools, regions, curricula), and separate “better answers” from “better understanding”.
Why measuring learning outcomes is harder than it sounds
Educational impact isn’t one metric.
A student might:
complete work faster but retain less,
appear to perform better while relying heavily on AI,
improve motivation and persistence even if scores don’t move immediately.
If you only measure end-of-term grades, you miss the learning story.
LOMS aims to create a consistent way to capture a fuller picture — including durable learning outcomes — so decision-makers can compare like-for-like across trials.
How LOMS works
LOMS brings together three elements:
Outcome measures (partner-provided ground truth)
These are the things schools already care about: exam scores, teacher observations, classroom work, attendance, or other local measures that indicate learning progress.
De-identified AI interaction data
This captures how students engage with AI tools (prompts, patterns of use, iteration behaviour) in a way that protects identity.
Longitudinal study design
The point is repetition: measure, observe, measure again — so you can see whether changes persist.
Put together, the suite is meant to help institutions answer the hard questions:
Are learning gains real — and do they last?
Who benefits most (and who is left behind)?
Which kinds of AI use correlate with deeper learning vs dependency?
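To make the pairing of these elements concrete, here is a minimal sketch of joining de-identified interaction summaries with partner-provided outcome measures. All field names (pseudonyms, `sessions`, `pre_score`, `post_score`) are illustrative assumptions, not part of any actual LOMS schema.

```python
# Sketch: pairing de-identified AI usage data with outcome measures.
# Keys are pseudonyms; field names are hypothetical.

interactions = {
    "a1f3": {"sessions": 12, "avg_turns": 4.5},
    "b7c2": {"sessions": 3,  "avg_turns": 1.2},
}

outcomes = {  # partner-provided ground truth
    "a1f3": {"pre_score": 55, "post_score": 71},
    "b7c2": {"pre_score": 60, "post_score": 62},
}

def merge_records(interactions, outcomes):
    """Pair usage summaries with outcome gains for each pseudonym."""
    merged = []
    for pid in interactions.keys() & outcomes.keys():
        gain = outcomes[pid]["post_score"] - outcomes[pid]["pre_score"]
        merged.append({"id": pid, **interactions[pid], "gain": gain})
    return sorted(merged, key=lambda r: r["id"])

records = merge_records(interactions, outcomes)
```

The key design point is that the join happens on a pseudonym, never on a real identity, so the analysis dataset stays de-identified end to end.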
Practical steps: how a school could use LOMS-style measurement
Even if the suite isn’t “plug-and-play” for every organisation yet, the approach maps to a sensible rollout plan.
Step 1: Pick a narrow, high-value use case
Start with one curriculum area or cohort where AI is already being used (e.g., revision support for 16–18 learners, writing feedback, maths practice).
Step 2: Establish a baseline
Before measuring AI impact, agree what “progress” means:
a baseline assessment or prior attainment data
teacher judgement rubrics
a retention check (for example, revisiting concepts after a time gap)
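The retention check above can be sketched as a simple comparison of the same cohort's scores before and after a time gap. The scores and thresholds here are illustrative assumptions, not prescribed values.

```python
# Sketch of a baseline + retention check on simple percentage scores.
# The cohort data is hypothetical.

def retention_delta(baseline_scores, retention_scores):
    """Mean change between a baseline assessment and a later
    retention check on the same concepts, same cohort."""
    assert len(baseline_scores) == len(retention_scores)
    diffs = [r - b for b, r in zip(baseline_scores, retention_scores)]
    return sum(diffs) / len(diffs)

baseline = [62, 70, 55, 80]
retention = [60, 74, 58, 79]   # same students, after a time gap
delta = retention_delta(baseline, retention)
```

A positive delta suggests concepts were retained (or consolidated); a sharply negative one suggests the original performance leaned on short-term support.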
Step 3: Decide what evidence counts as learning
A healthy measurement mix includes:
performance (scores)
process (how students reached the answer)
persistence (did they keep trying?)
transfer (can they use the concept in a new situation?)
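One way to keep these four evidence dimensions together per student task is a small record type. The structure and the equal weighting are illustrative assumptions, not a LOMS specification.

```python
from dataclasses import dataclass

# Sketch: one record per student task, covering the four
# evidence dimensions. Weights are a hypothetical default.

@dataclass
class LearningEvidence:
    performance: float   # score, normalised 0-1
    process: float       # rubric rating of how the answer was reached
    persistence: float   # fraction of attempts seen through
    transfer: float      # score on a novel-context task

    def composite(self, weights=(0.25, 0.25, 0.25, 0.25)):
        parts = (self.performance, self.process,
                 self.persistence, self.transfer)
        return sum(w * p for w, p in zip(weights, parts))

ev = LearningEvidence(performance=0.8, process=0.6,
                      persistence=1.0, transfer=0.4)
```

Keeping the dimensions separate (rather than collapsing them into one grade up front) is what lets you later ask whether, say, performance rose while transfer fell.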
Step 4: Set governance and privacy expectations
Make sure you have:
transparent student/parent communications
de-identification and data minimisation
clear role accountability for reporting and oversight
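De-identification can be as simple as a salted one-way hash, sketched below. The salt name and scheme are illustrative assumptions; a real deployment needs a reviewed pseudonymisation process and the salt must never ship with the analysis dataset.

```python
import hashlib

# Sketch of de-identification via a salted one-way hash.
# SALT is hypothetical and must be held by the school, never
# included alongside the de-identified data.

SALT = b"school-held-secret"

def pseudonymise(student_id: str) -> str:
    """Stable pseudonym that cannot be reversed without the salt."""
    return hashlib.sha256(SALT + student_id.encode()).hexdigest()[:8]

p1 = pseudonymise("student-001")
p2 = pseudonymise("student-001")
```

The same student always maps to the same pseudonym (so longitudinal tracking still works), but nobody holding only the dataset can recover identities.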
Step 5: Report outcomes on a cadence that supports decision-making
Monthly reporting works well for pilots:
early indicators (engagement, completion)
learning checks (retention/transfer)
qualitative feedback from teachers and students
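A monthly rollup of those indicators can be sketched as below. The metric names and event fields are illustrative assumptions, not a prescribed reporting format.

```python
# Sketch: monthly pilot report rollup from per-student events.
# Field names are hypothetical.

monthly_events = [
    {"student": "a1f3", "completed": True,  "retention_ok": True},
    {"student": "b7c2", "completed": True,  "retention_ok": False},
    {"student": "c9d4", "completed": False, "retention_ok": False},
]

def pilot_report(events):
    """Aggregate early indicators and learning checks for one month."""
    n = len(events)
    return {
        "cohort_size": n,
        "completion_rate": sum(e["completed"] for e in events) / n,
        "retention_rate": sum(e["retention_ok"] for e in events) / n,
    }

report = pilot_report(monthly_events)
```

Qualitative teacher and student feedback would sit alongside this table in the monthly report rather than inside it.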
What this means for education leaders
LOMS is a signal that the conversation is shifting.
Instead of asking "does AI increase productivity?", education leaders need to ask: does AI improve learning outcomes in a way that is equitable and durable?
If measurement becomes standardised, it also changes procurement. Vendors will be expected to show evidence that AI supports learning — not just usage metrics.
Summary
OpenAI’s Learning Outcomes Measurement Suite is designed to support longitudinal measurement of how AI affects learning across diverse educational contexts. The core value is a more consistent way to evaluate whether AI improves durable learning outcomes — not just short-term performance.
Next steps
If you’re exploring AI in education, Generation Digital can help you:
design pilots with measurable learning outcomes
build governance and reporting that protects trust
integrate AI tools with your existing workflows and platforms
FAQs
Q1: What is the Learning Outcomes Measurement Suite?
It’s a framework created by OpenAI and research partners to support longitudinal studies measuring how AI affects learning outcomes over time, across different educational settings.
Q2: What does the suite measure?
It combines outcome measures (like assessments and observations) with de-identified interaction data to evaluate whether AI supports durable learning — such as retention, transfer, and persistence — not just short-term performance.
Q3: Can it be used in all educational settings?
It’s designed to work across diverse contexts, but institutions should adapt measures to local curricula, age groups, and governance requirements.
Q4: Does this replace teacher judgement?
No. Teacher input and local outcome measures are core to the approach. The suite is meant to strengthen evaluation, not automate educational decisions.
Q5: What safeguards should schools put in place?
Clear consent and transparency practices, data minimisation, de-identification, human oversight of reporting, and explicit boundaries around how AI outputs are used.
Generation Digital

UK Office
Generation Digital Ltd
33 Queen St,
London
EC4R 1AP
United Kingdom

Canada Office
Generation Digital Americas Inc
181 Bay St., Suite 1800
Toronto, ON, M5J 2T9
Canada

US Office
Generation Digital Americas Inc
77 Sands St,
Brooklyn, NY 11201,
United States

EU Office
Generation Digital Software
Elgee Building
Dundalk
A91 X2R3
Ireland

Middle East Office
6994 Alsharq 3890,
An Narjis,
Riyadh 13343,
Saudi Arabia

Company number: 256 9431 77 | Copyright 2026 | Terms and Conditions | Privacy Policy