GPT-5.1-Codex-Max: Long-Horizon Coding with Compaction
OpenAI
Dec 3, 2025
The problem your team keeps hitting
AI models have long excelled at short, single-step tasks but struggled with multi-day software engineering projects that require deep context and sustained focus. Are you constantly restarting your coding agent because it loses the thread partway through a complex refactor?
Meet the model built for long-horizon coding
OpenAI’s new GPT-5.1-Codex-Max is designed specifically to overcome this limitation: a specialised agentic coding model built for long-running, project-scale work. Its foundational innovation is compaction.
How compaction sustains context
Compaction is a capability the model is natively trained with: it prunes its session history while coherently preserving the most critical context across multiple context windows, effectively enabling it to work over millions of tokens. This allows the model to:
Sustain complex, iterative workflows like multi-file refactors and prolonged debugging.
Work autonomously for periods exceeding a day.
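OpenAI has not published the internals of compaction, but the idea above can be illustrated with a minimal sketch: when a session's history nears its context budget, older entries are collapsed into a compact summary while the most recent entries are kept verbatim, so the working set stays bounded no matter how long the session runs. The `Session` class, the 4-characters-per-token estimate, and the thresholds are all hypothetical, chosen only to make the pruning behaviour visible.

```python
# Conceptual sketch of compaction (illustrative only; the real mechanism
# is trained into the model, not implemented as an external loop).
from dataclasses import dataclass, field


@dataclass
class Session:
    limit_tokens: int                       # hypothetical context-window budget
    history: list = field(default_factory=list)

    def tokens(self, text: str) -> int:
        # Crude token estimate for illustration (~4 characters per token).
        return max(1, len(text) // 4)

    def total(self) -> int:
        return sum(self.tokens(m) for m in self.history)

    def add(self, message: str, keep_recent: int = 3) -> None:
        self.history.append(message)
        if self.total() > self.limit_tokens:
            self.compact(keep_recent)

    def compact(self, keep_recent: int) -> None:
        # Prune everything but the last few messages into one summary entry,
        # standing in for "preserve the most critical context".
        old = self.history[:-keep_recent]
        recent = self.history[-keep_recent:]
        summary = f"[compacted {len(old)} earlier steps]"
        self.history = [summary] + recent


session = Session(limit_tokens=40)
for step in range(10):
    session.add(f"step {step}: edited file_{step}.py and reran the tests")

print(len(session.history))   # history stays small despite many steps
```

The key property is that the loop never fails when the budget is exceeded; it degrades gracefully by trading old detail for a summary, which is what lets a session span far more tokens than any single window holds.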
What this unlocks for engineering productivity
This capability transforms your engineering productivity by removing friction and maximising efficiency. Instead of manual context management or constantly fixing missteps, you get reliable, high-quality implementations with significant performance gains:
Faster, cheaper reasoning: Codex-Max uses approximately 30% fewer “thinking tokens” for similar reasoning effort compared to its predecessor, leading to cost and speed improvements.
Project-level coherence: It maintains a project-level perspective, eliminating the need to manually supply context across iterations.
Proven productivity uplift: Organisations adopting Codex have seen their engineers ship roughly 70% more pull requests.
This model helps your team achieve clarity from chaos by confidently delegating complex, long-horizon coding tasks.
How to put it into practice
To ensure your development programme benefits from this advanced agent:
Use stepwise instructions: Break down large coding jobs into a clear sequence of subtasks (e.g., “1) run tests 2) fix top 3 failing tests 3) summarise changes”).
Choose the right tool: Use Codex-Max for multi-file refactors and complex agentic workflows, reserving standard models for quick edits.
Secure implementation: Remember that enabling network access introduces prompt injection risks. Ensure agents run in secure, sandboxed environments by default.
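The checklist above can be sketched as a small harness: an ordered subtask list handed to an agent as one clear prompt, with network access denied by default and gated behind an explicit allow-list when it must be enabled. The `build_prompt` and `network_permitted` helpers and the host list are hypothetical stand-ins, not a real Codex interface.

```python
# Hypothetical harness illustrating stepwise instructions and
# default-deny network access. Names here are illustrative only.

SUBTASKS = [
    "1) run the test suite and report failures",
    "2) fix the top 3 failing tests",
    "3) summarise the changes in a short note",
]

ALLOWED_HOSTS = {"pypi.org", "github.com"}   # example allow-list


def build_prompt(subtasks: list[str]) -> str:
    # One clear, ordered instruction block rather than a vague mega-task.
    return "Complete these steps in order:\n" + "\n".join(subtasks)


def network_permitted(host: str, network_enabled: bool) -> bool:
    # Default-deny: access requires both an explicit opt-in and an
    # allow-listed host, limiting the blast radius of prompt injection.
    return network_enabled and host in ALLOWED_HOSTS


prompt = build_prompt(SUBTASKS)
print(prompt)
print(network_permitted("evil.example", network_enabled=True))   # False
print(network_permitted("pypi.org", network_enabled=False))      # False
```

The design choice worth noting is that both conditions must hold before any network call: flipping the global switch on is not enough by itself, which mirrors the guidance to keep agents sandboxed by default.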
Contact us to discuss integrating long-horizon agents into your software development life cycle.
FAQ
1) What exactly is “compaction”—is it just summarisation?
No. It is a capability the model is natively trained with, not post-hoc summarisation: the model works across multiple context windows, pruning its history while preserving critical state, so a session can span millions of tokens coherently, beyond any single window’s size.
2) How long can Codex-Max run by itself?
OpenAI reports internal runs exceeding 24 hours on long-horizon tasks. You should still gate merges with human review.
3) Is it cheaper or faster than older Codex models?
At the same “reasoning effort”, Codex-Max uses ~30% fewer thinking tokens, often improving speed and cost for comparable outcomes.
4) What are the security defaults?
Codex runs in a sandbox, with network access disabled unless you turn it on. Enabling web/search increases prompt-injection risk; use allow-lists and scanning.
5) Where can I use it?
In Codex via ChatGPT plans (Plus/Pro/Business/Edu/Enterprise) across CLI, IDE extension, cloud, and code review; API availability is planned.