OpenAI Responses API: Secure Agents with Hosted Containers
OpenAI Responses API: Secure Agents with Hosted Containers
OpenAI
13 feb 2026

¿No sabes por dónde empezar con la IA?Evalúa preparación, riesgos y prioridades en menos de una hora.
¿No sabes por dónde empezar con la IA?Evalúa preparación, riesgos y prioridades en menos de una hora.
➔ Descarga nuestro paquete gratuito de preparación para IA
OpenAI’s Responses API can run secure, scalable agents by pairing tool orchestration with a hosted computer environment. With the shell tool and OpenAI-hosted containers, agents can execute commands, manage files, and maintain state across multi-step workflows. This improves reliability for long-running tasks while reducing security risk through controlled execution.
Agentic workflows become far more useful when they can do things: run scripts, transform files, install dependencies, validate outputs, and generate artefacts. But as soon as an agent can execute code, organisations face a familiar question: how do we get the benefit without creating a security problem?
OpenAI’s latest Responses API enhancements introduce a practical answer: a hosted “computer environment” that combines tool orchestration, a shell tool, and OpenAI-hosted containers so agents can execute tasks with stronger isolation, better state handling, and improved file management.
What’s new in the Responses API runtime
OpenAI positions this as an agent runtime built from several building blocks:
Responses API orchestration: the service manages the agent loop, tool calls, and multi-turn continuation.
Shell tool: the agent can propose shell commands to run.
Hosted containers: the shell runs inside an OpenAI-hosted container environment, creating isolation and a predictable execution surface.
Streaming output: command output is streamed back so the agent can react in near real time.
Long-running support: server-side “compaction” keeps agent runs going without context exploding.
Together, these features enable agents that can work for longer, handle files more reliably, and execute deterministic steps instead of guessing.
How it works (conceptually)
The flow is straightforward:
You send a request to the Responses API.
The model decides whether to answer directly or use tools.
If it chooses the shell tool, it returns one or more commands.
The Responses API runs those commands inside the hosted container and streams the output.
The model sees the output and either:
runs follow-up commands,
calls other tools,
or produces the final response.
This continues until the model stops requesting tools.
Why hosted containers matter for security
Hosted containers create a separation between agent execution and your core systems. That’s valuable because:
the runtime can be constrained (resources, execution context)
tool access can be controlled and audited
you can keep high-impact actions behind explicit, gated tools
This doesn’t eliminate risk, but it reduces blast radius and makes agent execution more governable.
File handling and state: what “stateful agents” really means
In practice, “state” shows up in two places:
Conversation state: the Responses API can carry structured state across turns.
Execution state: the hosted container can persist files and runtime context during the agent run, so the agent can create outputs (reports, transformed datasets, logs) and continue work without rebuilding everything each step.
For longer workflows, compaction helps reduce token growth while maintaining the essentials.
Practical workflows you can build
The new runtime is most valuable for tasks that benefit from deterministic execution.
1) Data transformation and validation
parse and clean CSV exports
validate ranges and completeness
generate summary tables and charts
2) Report and artefact generation
run scripts that produce outputs
generate markdown reports
package files for downstream workflows
3) Debugging and incident support
reproduce issues in a controlled environment
analyse logs and produce summaries
draft runbooks or post-incident write-ups
4) CI-style checks on agent outputs
verify calculations
check generated artefacts
run lightweight tests before humans approve actions
A sensible enterprise rollout pattern
If you want to deploy this safely, treat it like any other production capability.
Start read-only
Use the runtime for analysis and artefact generation first.Define tool boundaries
Separate low-risk tools from high-impact tools. Put approvals on writes.Instrument everything
Log tool calls, command outputs, and exceptions.Adopt approvals for risky actions
“Compute then propose” is the safe default.Build an evaluation harness
Test prompt injection scenarios and tool misuse attempts on your real workflows.
Where Generation Digital can help
Generation Digital helps teams move from “agent demos” to governed systems.
We can support:
selecting high-value workflows that benefit from hosted execution
designing safe tool patterns (allow-lists, approvals, identity boundaries)
evaluation and monitoring so agents scale responsibly
integrating outputs into your workflow stack
Related links
Explore Asana integration: /asana/
Discover Miro’s capabilities: /miro/
Learn about Notion features: /notion/
Glean insights: /glean/
Summary
OpenAI’s Responses API now supports a more complete agent runtime: hosted containers plus a shell tool for controlled execution, improved state and file handling, and mechanisms to keep long-running workflows stable. Used well, it enables agents that are more reliable and easier to govern.
Next steps: If you’re planning a secure agent pilot or want to scale one, speak with Generation Digital: https://www.gend.co/contact
FAQs
Q1: What is the primary benefit of the new Responses API?
It enables a stronger agent runtime: tool orchestration plus hosted execution so agents can run commands, handle files, and continue multi-step workflows more reliably.
Q2: How does the API ensure security?
By running shell execution inside hosted containers and encouraging controlled tool access, you can isolate execution and gate high-impact actions behind approvals and policy.
Q3: Can the Responses API handle large-scale operations?
Yes. The architecture supports parallel execution across container sessions and long-running workflows via state management and compaction.
Q4: Do I have to use hosted execution?
No. OpenAI supports both hosted shell containers and a local shell runtime you execute yourself, depending on how much control you need.
Q5: What’s the safest way to start?
Start with deterministic tasks (data transforms, validation, reporting) and keep anything that changes production systems behind explicit tools with human approval.
OpenAI’s Responses API can run secure, scalable agents by pairing tool orchestration with a hosted computer environment. With the shell tool and OpenAI-hosted containers, agents can execute commands, manage files, and maintain state across multi-step workflows. This improves reliability for long-running tasks while reducing security risk through controlled execution.
Agentic workflows become far more useful when they can do things: run scripts, transform files, install dependencies, validate outputs, and generate artefacts. But as soon as an agent can execute code, organisations face a familiar question: how do we get the benefit without creating a security problem?
OpenAI’s latest Responses API enhancements introduce a practical answer: a hosted “computer environment” that combines tool orchestration, a shell tool, and OpenAI-hosted containers so agents can execute tasks with stronger isolation, better state handling, and improved file management.
What’s new in the Responses API runtime
OpenAI positions this as an agent runtime built from several building blocks:
Responses API orchestration: the service manages the agent loop, tool calls, and multi-turn continuation.
Shell tool: the agent can propose shell commands to run.
Hosted containers: the shell runs inside an OpenAI-hosted container environment, creating isolation and a predictable execution surface.
Streaming output: command output is streamed back so the agent can react in near real time.
Long-running support: server-side “compaction” keeps agent runs going without context exploding.
Together, these features enable agents that can work for longer, handle files more reliably, and execute deterministic steps instead of guessing.
How it works (conceptually)
The flow is straightforward:
You send a request to the Responses API.
The model decides whether to answer directly or use tools.
If it chooses the shell tool, it returns one or more commands.
The Responses API runs those commands inside the hosted container and streams the output.
The model sees the output and either:
runs follow-up commands,
calls other tools,
or produces the final response.
This continues until the model stops requesting tools.
Why hosted containers matter for security
Hosted containers create a separation between agent execution and your core systems. That’s valuable because:
the runtime can be constrained (resources, execution context)
tool access can be controlled and audited
you can keep high-impact actions behind explicit, gated tools
This doesn’t eliminate risk, but it reduces blast radius and makes agent execution more governable.
File handling and state: what “stateful agents” really means
In practice, “state” shows up in two places:
Conversation state: the Responses API can carry structured state across turns.
Execution state: the hosted container can persist files and runtime context during the agent run, so the agent can create outputs (reports, transformed datasets, logs) and continue work without rebuilding everything each step.
For longer workflows, compaction helps reduce token growth while maintaining the essentials.
Practical workflows you can build
The new runtime is most valuable for tasks that benefit from deterministic execution.
1) Data transformation and validation
parse and clean CSV exports
validate ranges and completeness
generate summary tables and charts
2) Report and artefact generation
run scripts that produce outputs
generate markdown reports
package files for downstream workflows
3) Debugging and incident support
reproduce issues in a controlled environment
analyse logs and produce summaries
draft runbooks or post-incident write-ups
4) CI-style checks on agent outputs
verify calculations
check generated artefacts
run lightweight tests before humans approve actions
A sensible enterprise rollout pattern
If you want to deploy this safely, treat it like any other production capability.
Start read-only
Use the runtime for analysis and artefact generation first.Define tool boundaries
Separate low-risk tools from high-impact tools. Put approvals on writes.Instrument everything
Log tool calls, command outputs, and exceptions.Adopt approvals for risky actions
“Compute then propose” is the safe default.Build an evaluation harness
Test prompt injection scenarios and tool misuse attempts on your real workflows.
Where Generation Digital can help
Generation Digital helps teams move from “agent demos” to governed systems.
We can support:
selecting high-value workflows that benefit from hosted execution
designing safe tool patterns (allow-lists, approvals, identity boundaries)
evaluation and monitoring so agents scale responsibly
integrating outputs into your workflow stack
Related links
Explore Asana integration: /asana/
Discover Miro’s capabilities: /miro/
Learn about Notion features: /notion/
Glean insights: /glean/
Summary
OpenAI’s Responses API now supports a more complete agent runtime: hosted containers plus a shell tool for controlled execution, improved state and file handling, and mechanisms to keep long-running workflows stable. Used well, it enables agents that are more reliable and easier to govern.
Next steps: If you’re planning a secure agent pilot or want to scale one, speak with Generation Digital: https://www.gend.co/contact
FAQs
Q1: What is the primary benefit of the new Responses API?
It enables a stronger agent runtime: tool orchestration plus hosted execution so agents can run commands, handle files, and continue multi-step workflows more reliably.
Q2: How does the API ensure security?
By running shell execution inside hosted containers and encouraging controlled tool access, you can isolate execution and gate high-impact actions behind approvals and policy.
Q3: Can the Responses API handle large-scale operations?
Yes. The architecture supports parallel execution across container sessions and long-running workflows via state management and compaction.
Q4: Do I have to use hosted execution?
No. OpenAI supports both hosted shell containers and a local shell runtime you execute yourself, depending on how much control you need.
Q5: What’s the safest way to start?
Start with deterministic tasks (data transforms, validation, reporting) and keep anything that changes production systems behind explicit tools with human approval.
Recibe noticias y consejos sobre IA cada semana en tu bandeja de entrada
Al suscribirte, das tu consentimiento para que Generation Digital almacene y procese tus datos de acuerdo con nuestra política de privacidad. Puedes leer la política completa en gend.co/privacy.
Generación
Digital

Oficina en Reino Unido
Generation Digital Ltd
33 Queen St,
Londres
EC4R 1AP
Reino Unido
Oficina en Canadá
Generation Digital Americas Inc
181 Bay St., Suite 1800
Toronto, ON, M5J 2T9
Canadá
Oficina en EE. UU.
Generation Digital Américas Inc
77 Sands St,
Brooklyn, NY 11201,
Estados Unidos
Oficina de la UE
Software Generación Digital
Edificio Elgee
Dundalk
A91 X2R3
Irlanda
Oficina en Medio Oriente
6994 Alsharq 3890,
An Narjis,
Riad 13343,
Arabia Saudita
Número de la empresa: 256 9431 77 | Derechos de autor 2026 | Términos y Condiciones | Política de Privacidad
Generación
Digital

Oficina en Reino Unido
Generation Digital Ltd
33 Queen St,
Londres
EC4R 1AP
Reino Unido
Oficina en Canadá
Generation Digital Americas Inc
181 Bay St., Suite 1800
Toronto, ON, M5J 2T9
Canadá
Oficina en EE. UU.
Generation Digital Américas Inc
77 Sands St,
Brooklyn, NY 11201,
Estados Unidos
Oficina de la UE
Software Generación Digital
Edificio Elgee
Dundalk
A91 X2R3
Irlanda
Oficina en Medio Oriente
6994 Alsharq 3890,
An Narjis,
Riad 13343,
Arabia Saudita








