GPT-5.1-Codex-Max: Long-Horizon Coding with Compaction

OpenAI

Dec 3, 2025

The problem your team keeps hitting

For too long, AI has excelled at short, single-step tasks but struggled with multi-day software engineering projects that require deep context and sustained focus. Are you constantly restarting your coding agent because it lost the thread in a complex refactor?

Meet the model built for long-horizon coding

OpenAI’s new GPT-5.1-Codex-Max is specifically designed to overcome this limitation. This is a specialised agentic coding model built for long-running, project-scale work. Its foundational innovation is compaction.

How compaction sustains context

Compaction is a capability the model was trained with natively: it prunes its history while coherently preserving the most critical context across multiple context windows, effectively enabling it to work over millions of tokens. This allows the model to:

  • Sustain complex, iterative workflows like multi-file refactors and prolonged debugging.

  • Work autonomously for periods exceeding a day.
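OpenAI has not published compaction's internals, but as a rough mental model it behaves like this minimal sketch: when the transcript nears the window budget, older turns are collapsed into one compact entry while the most recent turns are kept verbatim. All names here are illustrative, not OpenAI's actual implementation.

```python
# Hypothetical sketch of context compaction. A real system would ask the
# model to summarise the old turns; here we simply truncate them.

def estimate_tokens(text: str) -> int:
    # Crude estimate: roughly 4 characters per token.
    return max(1, len(text) // 4)

def compact(history: list[dict], budget: int, keep_recent: int = 4) -> list[dict]:
    """Collapse old turns into one summary entry when over the token budget."""
    total = sum(estimate_tokens(turn["content"]) for turn in history)
    if total <= budget or len(history) <= keep_recent:
        return history
    old, recent = history[:-keep_recent], history[-keep_recent:]
    summary = " | ".join(turn["content"][:40] for turn in old)
    return [{"role": "system", "content": f"[compacted] {summary}"}] + recent

history = [{"role": "user", "content": f"step {i}: " + "x" * 200} for i in range(10)]
compacted = compact(history, budget=100)
print(len(compacted))  # 5: one summary entry plus the 4 most recent turns
```

The point of the sketch is the invariant, not the mechanics: recent, actionable state survives verbatim while distant history shrinks, so the session can keep going past any single window.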

What this unlocks for engineering productivity

This capability transforms your engineering productivity by removing friction and maximising efficiency. Instead of manual context management or constantly fixing missteps, you get reliable, high-quality implementations with significant performance gains:

  • Faster, cheaper reasoning: Codex-Max uses approximately 30% fewer “thinking tokens” for similar reasoning effort compared to its predecessor, leading to cost and speed improvements.

  • Project-level coherence: It maintains a project-level perspective, eliminating the need to manually supply context across iterations.

  • Proven productivity uplift: Organisations adopting Codex have seen their engineers ship roughly 70% more pull requests.

This model helps your team achieve clarity from chaos by confidently delegating complex, long-horizon coding tasks.

How to put it into practice

To ensure your development programme benefits from this advanced agent:

  • Use stepwise instructions: Break down large coding jobs into a clear sequence of subtasks (e.g., “1) run tests 2) fix top 3 failing tests 3) summarise changes”).

  • Choose the right tool: Use Codex-Max for multi-file refactors and complex agentic workflows, reserving standard models for quick edits.

  • Secure implementation: Remember that enabling network access introduces prompt injection risks. Ensure agents run in secure, sandboxed environments by default.
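As an illustration of the stepwise pattern, a driver can feed the agent one subtask at a time and carry each result forward, rather than issuing a single mega-prompt. The `run_agent` function below is a hypothetical stand-in for whatever actually invokes the model:

```python
# Illustrative stepwise driver: each subtask is sent as its own numbered
# instruction, and the previous result is carried forward as context.

def run_agent(instruction: str, context: str = "") -> str:
    # Placeholder: in practice this would call your coding agent.
    return f"done: {instruction}"

def run_stepwise(subtasks: list[str]) -> list[str]:
    results, context = [], ""
    for i, task in enumerate(subtasks, 1):
        out = run_agent(f"{i}) {task}", context)
        results.append(out)
        context = out  # carry the latest result into the next step
    return results

steps = ["run tests", "fix top 3 failing tests", "summarise changes"]
for line in run_stepwise(steps):
    print(line)
```

Explicit subtask boundaries also give you natural checkpoints for human review between steps, which pairs well with the sandboxing guidance above.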

Contact us to discuss integrating long-horizon agents into your software development life cycle.

FAQ

1) What exactly is “compaction”—is it just summarisation?
No. It’s a capability the model was trained with natively: it works across multiple context windows, pruning and preserving critical state so a session can span millions of tokens coherently, beyond a single window’s size.

2) How long can Codex-Max run by itself?
OpenAI reports internal runs exceeding 24 hours on long-horizon tasks. You should still gate merges with human review.

3) Is it cheaper or faster than older Codex models?
At the same “reasoning effort”, Codex-Max uses ~30% fewer thinking tokens, often improving speed and cost for comparable outcomes.

4) What are the security defaults?
Codex runs in a sandbox, with network access disabled unless you turn it on. Enabling web/search increases prompt-injection risk; use allow-lists and scanning.
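The allow-list approach mentioned above can be as simple as checking each outbound request’s host before it leaves the sandbox. A minimal sketch (the domains below are examples, not Codex defaults):

```python
from urllib.parse import urlparse

# Example outbound-request guard: only hosts on an explicit allow-list
# (or their subdomains) may be fetched by the agent.
ALLOWED_HOSTS = {"pypi.org", "files.pythonhosted.org", "github.com"}

def is_allowed(url: str) -> bool:
    host = urlparse(url).hostname or ""
    return host in ALLOWED_HOSTS or any(host.endswith("." + h) for h in ALLOWED_HOSTS)

print(is_allowed("https://pypi.org/simple/requests/"))  # True
print(is_allowed("https://evil.example.com/payload"))   # False
```

An allow-list narrows the blast radius of a prompt injection but does not eliminate it, so pair it with content scanning and human review of anything the agent proposes to merge.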

5) Where can I use it?
In Codex via ChatGPT plans (Plus/Pro/Business/Edu/Enterprise) across CLI, IDE extension, cloud, and code review; API availability is planned.

Ready to get the support your organisation needs to successfully use AI?

Miro Solutions Partner
Asana Platinum Solutions Partner
Notion Platinum Solutions Partner
Glean Certified Partner

Generation
Digital

UK Office
33 Queen St,
London
EC4R 1AP
United Kingdom

Canada Office
1 University Ave,
Toronto,
ON M5J 1T1,
Canada

NAMER Office
77 Sands St,
Brooklyn,
NY 11201,
United States

EMEA Office
Charlemont St, Saint Kevin's, Dublin,
D02 VN88,
Ireland

Middle East Office
6994 Alsharq 3890,
An Narjis,
Riyadh 13343,
Saudi Arabia

UK Fast Growth Index UBS Logo
Financial Times FT 1000 Logo
Febe Growth 100 Logo

Company No: 256 9431 77 | Copyright 2026 | Terms and Conditions | Privacy Policy