OpenAI Safety Bug Bounty: Report AI Risks & Earn Rewards

OpenAI

Free AI at Work Playbook for managers using ChatGPT, Claude and Gemini.

➔ Download the Playbook

OpenAI’s Safety Bug Bounty is a public programme that rewards researchers for reporting AI abuse and safety risks across OpenAI products — including agentic prompt injection, data exfiltration scenarios and account integrity bypasses. It complements OpenAI’s Security Bug Bounty by covering impactful safety issues even when they aren’t classic security vulnerabilities.

As AI systems become more capable — and more connected to tools, browsers and third‑party services — the security challenge changes. It’s no longer only about traditional vulnerabilities. It’s also about misuse scenarios and safety failures that can lead to tangible harm.

OpenAI has now launched a new public Safety Bug Bounty programme to help identify and mitigate these AI‑specific risks earlier, by partnering with the global research community.

What is OpenAI’s Safety Bug Bounty?

OpenAI’s Safety Bug Bounty is a programme focused on AI abuse and safety risks across OpenAI products. It’s designed to complement OpenAI’s existing Security Bug Bounty by accepting reports that involve meaningful harm or abuse potential — even if the issue doesn’t neatly fit the definition of a conventional security vulnerability.

OpenAI states that submissions are triaged by Safety and Security Bug Bounty teams and may be rerouted between programmes depending on scope and ownership.

What’s in scope (and what OpenAI is looking for)

OpenAI’s announcement highlights several AI‑specific categories the programme focuses on.

1) Agentic risks (including MCP)

OpenAI explicitly calls out scenarios where attacker-controlled text can hijack a victim’s agent (for example via third‑party prompt injection) and trick it into taking harmful action or leaking sensitive information. OpenAI notes reports should be reproducible at least 50% of the time.

OpenAI also mentions agentic behaviours that:

  • perform disallowed actions on OpenAI’s website at scale

  • or perform other potentially harmful actions with plausible and material impact

2) OpenAI proprietary information

The programme includes model behaviours that expose OpenAI proprietary information, including proprietary information related to reasoning.

3) Account and platform integrity

OpenAI lists issues involving account and platform integrity signals, such as bypassing anti‑automation controls, manipulating account trust signals, or evading restrictions/suspensions/bans.

(Where the issue is more clearly a conventional security problem — for example accessing features or data beyond authorised permissions — OpenAI directs researchers to the Security Bug Bounty instead.)

What’s out of scope (important)

OpenAI explicitly notes that “jailbreaks” are out of scope for this Safety Bug Bounty — particularly where there is no clear, demonstrable safety impact.

OpenAI also states that general content-policy bypasses without meaningful safety or abuse impact (for example, getting rude language or easily searchable information) are out of scope.

How rewards work

OpenAI’s official announcement confirms that valid reports can be rewarded, with additional case‑by‑case consideration for issues outside the listed categories when there’s a direct path to user harm and discrete remediation.

Public reporting on the programme indicates tiered payouts by severity and that critical findings can be rewarded up to $100,000, but exact amounts depend on the Bugcrowd programme rules and OpenAI’s triage outcome.

How to participate (responsibly)

OpenAI instructs researchers to apply and submit reports through its Bugcrowd Safety Bug Bounty portal.

To increase the chances of a useful and eligible report:

  • describe the issue as a safety/abuse scenario (not just “unexpected output”)

  • include clear reproduction steps and evidence

  • explain why it’s material (what harm could plausibly happen)

  • suggest a discrete mitigation (what change would reduce the risk)

If you are reporting conventional security vulnerabilities (for example, authorisation bugs), use the Security Bug Bounty route.

Why this matters for the wider AI ecosystem

Safety bug bounties are still relatively new compared to traditional cybersecurity programmes. OpenAI’s move is notable because it treats prompt injection, agentic misuse and data exfiltration scenarios as first‑class concerns — not edge cases.

For organisations deploying AI agents, this is a signal that governance needs to cover more than model outputs. It must cover:

  • tool access and permissions

  • third‑party content ingestion

  • guardrails for actions (not just text)

  • monitoring and incident response

Internal link suggestion: When you publish, link to Generation Digital content on practical AI adoption and secure workflows (especially where tools connect across systems).

Summary

OpenAI’s Safety Bug Bounty is a public programme that rewards researchers for finding real, reproducible safety and abuse risks — with a major focus on agentic prompt injection, data exfiltration scenarios, and account/platform integrity.

If you research AI security, it’s a concrete opportunity to contribute to safer products — and to be rewarded for high‑quality, responsible disclosures.

Next steps

  • Read the official scope and rules on OpenAI’s Safety Bug Bounty portal.

  • If you find a safety issue, write an evidence‑based report with clear impact and mitigation.

  • If you’re a business deploying agents, audit your own “prompt injection → action” pathways and tighten permissions.

Need help securing AI workflows in the real world? Generation Digital helps teams implement AI and automation with practical governance — so tools stay helpful without creating new risk.

FAQs

1) What is OpenAI’s Safety Bug Bounty?
It’s a public programme that rewards researchers for reporting AI abuse and safety risks across OpenAI products, including agentic prompt injection, data exfiltration scenarios and account integrity bypasses.

2) What’s the difference between the Safety Bug Bounty and the Security Bug Bounty?
The Safety Bug Bounty covers AI‑specific safety and misuse scenarios that may not be conventional security vulnerabilities. The Security Bug Bounty focuses on traditional security issues like authorisation or infrastructure vulnerabilities.

3) Are jailbreaks in scope?
OpenAI states that jailbreaks are out of scope for this programme, especially when they don’t produce a demonstrable safety or abuse impact.

4) What makes a report eligible for a reward?
Reports should show a reproducible issue with plausible and material harm, include evidence, and propose actionable mitigation steps. Submissions may be rerouted between programmes depending on ownership.

5) How do I submit a Safety Bug Bounty report?
OpenAI directs researchers to submit through the Bugcrowd Safety Bug Bounty portal, following the published rules and responsible disclosure guidelines.

Get weekly AI news and advice delivered to your inbox

By subscribing you consent to Generation Digital storing and processing your details in line with our privacy policy. You can read the full policy at gend.co/privacy.

Generation
Digital

UK Office

Generation Digital Ltd
33 Queen St,
London
EC4R 1AP
United Kingdom

Canada Office

Generation Digital Americas Inc
181 Bay St., Suite 1800
Toronto, ON, M5J 2T9
Canada

USA Office

Generation Digital Americas Inc
77 Sands St,
Brooklyn, NY 11201,
United States

EU Office

Generation Digital Software
Elgee Building
Dundalk
A91 X2R3
Ireland

Middle East Office

6994 Alsharq 3890,
An Narjis,
Riyadh 13343,
Saudi Arabia

UK Fast Growth Index UBS Logo
Financial Times FT 1000 Logo
Febe Growth 100 Logo (Background Removed)

Company No: 256 9431 77 | Copyright 2026 | Terms and Conditions | Privacy Policy