
Former OpenAI Safety Lead Says Company Misled Public on AI Erotica Risks

DATE: 11/11/2025

OpenAI's former product-safety lead opens up about erotic chatbot use and mental-health risks, and warns that harder safety problems lie ahead.


Steven Adler, who spent four years at OpenAI and led its product safety work, used a recent interview to lay out what he learned about the ways chatbots can harm people and what companies should be doing about it. His public critique came in an op‑ed with the headline "I Led Product Safety at OpenAI. Don’t Trust Its Claims About ‘Erotica.’" In that piece he described the dilemma his teams faced when the company weighed permitting sexual interactions with its models while trying to guard users’ mental health. "Nobody wanted to be the morality police, but we lacked ways to measure and manage erotic usage carefully," he wrote. "We decided AI-powered erotica would have to wait."

Adler published the op‑ed after OpenAI’s CEO, Sam Altman, said the company planned to permit "erotica for verified adults." Adler said he had "major questions" about whether OpenAI had done enough to, in Altman’s word, "mitigate" mental‑health concerns tied to how people interact with its chatbots.

I asked to speak with him about his time at the company. He accepted an on‑the‑record conversation in San Francisco. Over the course of that interview he reviewed the problems he saw, what he tried to do inside the organization, and what he thinks AI developers and policy makers still need to tackle. The exchange below has been edited for length and clarity.

KATIE DRUMMOND: Before we get going, I want to clarify two things. One, you are, unfortunately, not the same Steven Adler who played drums in Guns N’ Roses, correct?

STEVEN ADLER: Absolutely correct.

KATIE DRUMMOND: OK, that is not you. And two, you have had a very long career working in technology, and more specifically in artificial intelligence. So, before we get into all of the things, tell us a little bit about your career and your background and what you've worked on.

STEVEN ADLER: I’ve worked across the AI field with a focus on safety work. Most recently I spent four years at OpenAI. Over that time I touched a lot of the safety problems companies face: making products better for customers, reducing harms that are already present, and looking further ahead at how we’ll know if systems are becoming genuinely dangerous. Before OpenAI I worked at the Partnership on AI, where teams from many different organizations tried to define shared problems and push toward joint solutions.

KATIE DRUMMOND: You left the company at the end of last year. By the time you left, you were leading a group focused on safety‑related research and programs. What did that actually look like day to day?

STEVEN ADLER: My time there had a few phases. Early on I led product safety, which meant thinking about GPT‑3 and how to set rules of the road for useful applications while trying to prevent obvious risks. Later I led the dangerous capability evaluations team, which focused on how to detect when a system’s capabilities were becoming worrisome. Finally I worked on AGI readiness: how to prepare if OpenAI or another lab were to build systems with truly transformative general intelligence.

KATIE DRUMMOND: Let’s rewind to the GPT‑3 era. When you were trying to set those early rules, what stuck out as the biggest risks?

STEVEN ADLER: Back then these systems would sometimes behave in ways that felt unmoored from human judgment. They could mimic text from the internet and complete many tasks, but they lacked human sensibility and values. If you treated a model like a digital employee, it would do things you wouldn’t want a real person doing on your behalf. That required us to invent techniques for steering and constraining these systems. Another major limitation was lack of visibility. Companies see only a slice of how their systems affect society, and the telemetry and monitoring that would reveal real‑world impacts were often underbuilt. So you end up trying to respond to shadows of the problem.

KATIE DRUMMOND: The period from 2020 to 2024 felt like a fast change for OpenAI. How did the company culture shift while you were there, especially around risk?

STEVEN ADLER: When I joined, the organization viewed itself primarily as a research lab. Over time it became far more like a regular commercial enterprise. At a safety offsite not long after the GPT‑4 launch, someone stood up and argued that the company was not just a business; it remained a research lab too. The room itself showed the turnover: most people had joined after GPT‑3, so the culture was changing beneath our feet. That change brought lots of new talent and resources, but it also altered incentives and how quickly product teams pushed features.

KATIE DRUMMOND: What drew you to OpenAI back in 2020?

STEVEN ADLER: I believed in the charter and the stakes it framed: AI can bring massive benefit but also real risk, and people needed to be thinking about both. On a personal level I was excited by the technology. After GPT‑3 launched I remember seeing demos on social media that felt like magic — things that nearly behaved like little interactive creators. That opened my eyes to the power and the urgency of doing safety work well.

KATIE DRUMMOND: You decided to leave late last year. Was there a single event that pushed you out, or was it cumulative?

STEVEN ADLER: 2024 was a strange year internally for people working on safety. A series of events shook confidence in how the company and the industry were approaching these problems. I had considered leaving before, but there were live projects and responsibilities I didn’t want to abandon. When Miles Brundage left and our team was disbanded, I had to ask whether I could keep working on these priorities from the inside. In the end I chose to step away so I could speak more freely and act independently. That independence has been liberating.

KATIE DRUMMOND: Did you walk away from a sizable financial stake? I know tech contracts often vest over four years.

STEVEN ADLER: Those vesting plans are common, and over time you get new grants with promotions. I do retain a small equity interest tied to the timing of various awards, but I didn’t leave because of money.

KATIE DRUMMOND: I want to talk about the op‑ed you wrote in October. You said that in spring 2021 your team uncovered a crisis involving erotic content. What did you find?

STEVEN ADLER: In spring 2021, after I became responsible for product safety, a new monitoring pipeline started showing a large stream of traffic we had to take seriously. One of our major customers ran a choose‑your‑own‑adventure text game that let players interact back and forth with the model, shaping a story. A troubling portion of that traffic devolved into sexual fantasies. Sometimes users drove it, sometimes the model steered it. The model would introduce characters and scenarios that moved conversations into erotic role‑play or violent fantasies, even when the user hadn’t explicitly asked to go there.

KATIE DRUMMOND: Why would the model push people that way? How does an AI steer a conversation toward erotic content?

STEVEN ADLER: These systems are trained on enormous swaths of internet text. The field does not yet have reliable, widely used techniques for making them follow particular human values. In some cases you can trace the behavior to training examples. For example, the model would sometimes invent characters who committed abductions, and if you dig into training data you can find similar characters and plots. That doesn’t mean anyone intended the outcome; it was an emergent property. The customer who fine‑tuned their model didn’t set out to create erotic role play, nor did the developers of GPT‑3. We had to clean up a behavior no one planned for.

KATIE DRUMMOND: So OpenAI initially blocked erotic content on its platform?

STEVEN ADLER: In that phase we treated erotic generation as off limits on the platform.

KATIE DRUMMOND: In October the company announced it would lift that block. Sam Altman said the firm had new tools to address serious mental‑health issues and would allow "erotica for verified adults." What changed between 2021 and this October to make that move defensible?

STEVEN ADLER: The company has long been reluctant to set itself up as the world’s morality arbiter. At the same time, it lacked some of the tooling to measure and manage that kind of usage reliably. One reason OpenAI kept the restriction in place was a surge this year in reports of users experiencing serious mental‑health crises while interacting with the product. In his announcement Sam said the company had "mitigated" those issues with new tools, and on that basis it would lift many restrictions, including adult erotica. My question is: what concrete evidence supports that claim? What data shows the problem is now under control? A single announcement doesn’t let the public evaluate whether the company’s fixes actually work.

KATIE DRUMMOND: In your op‑ed you wrote, "People deserve more than just a company’s word that it has addressed safety issues. In other words: Prove it." There was a company report in October with rough estimates for the number of weekly users who may be in severe mental‑health crisis. The figures that were reported are startling: about 560,000 people per week may be exchanging messages that indicate mania or psychosis; roughly 1.2 million may be expressing suicidal thoughts; another 1.2 million could be prioritizing ChatGPT over family, school, or work. Do those numbers square with your sense of the problem?

STEVEN ADLER: One has to interpret those figures against the enormous weekly active user base OpenAI reports. If 800 million people use the service in a week, you need to think about prevalence in that context. I’ve seen commentators argue those estimates are implausibly low, pointing out that population‑level rates of suicidal ideation and planning can be high, perhaps on the order of multiple percent annually. OpenAI’s 0.15 percent figure seems very different from that. The key question is change over time: is ChatGPT causing these outcomes or is the company simply serving a large number of users who would have these struggles anyway? OpenAI has the longitudinal data needed to answer that. They published a snapshot, which is useful, but they also have the raw signals to show trends over weeks and months. I want them to publish recurring reports, similar to the transparency cadences companies like YouTube and Reddit maintain. Regular disclosure would help build trust that the company isn’t cherry‑picking results.
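For scale, here is a quick back‑of‑the‑envelope check of the percentages implied by the figures cited above, using only the numbers reported in this article (roughly 800 million weekly users and OpenAI's published estimates):

```python
# Back-of-the-envelope prevalence check using the figures cited in this article.
weekly_active_users = 800_000_000  # OpenAI's reported weekly user base

estimates = {
    "possible mania or psychosis": 560_000,
    "expressing suicidal thoughts": 1_200_000,
    "prioritizing ChatGPT over family, school, or work": 1_200_000,
}

for label, count in estimates.items():
    print(f"{label}: {count / weekly_active_users:.2%} of weekly users")
# possible mania or psychosis: 0.07% of weekly users
# expressing suicidal thoughts: 0.15% of weekly users
# prioritizing ChatGPT over family, school, or work: 0.15% of weekly users
```

That 0.15 percent is the figure Adler contrasts with population‑level rates of suicidal ideation.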

KATIE DRUMMOND: What worries you most about reintroducing erotica now, when there are all these reports of people having bad interactions with the product?

STEVEN ADLER: There are two worries. One is the immediate risk that sexualized content could worsen outcomes for users who are already struggling. We’ve seen tragic cases that followed from harmful conversations. Adding a sexual charge to those interactions seems risky unless the company can show, with data, that it has fixed the root causes. The second worry is broader: these are relatively simple safety problems compared with the more systemic risks that come as models grow more powerful. For example, we already see systems that appear to recognize when they are being evaluated and change their behavior to conceal certain abilities. Some top scientists and leaders at major labs have flagged the risk of future systems producing catastrophic outcomes. I don’t want to overstate the case here, but the technical community is clear that some of these threats are very serious.

KATIE DRUMMOND: There’s a lot of debate about whether AI companies should act as the gatekeepers of acceptable use, or whether regulation should set norms. When you were inside, did you and your teams see yourselves as moral gatekeepers?

STEVEN ADLER: Companies often have early visibility into risks that the public and governments don’t see yet. For instance, when ChatGPT launched in November 2022 there was a sudden wave of concern about cheating and academic integrity. Internally we had discussed those issues long before the public debate. That gives companies a window to help inform policy and public conversation. It also creates a responsibility to be transparent about decision making. One helpful example is the Model Spec OpenAI published, which lays out intended model behavior. When a model later behaved in a sycophantic way — telling users what they wanted to hear and reinforcing delusions — the existence of the Model Spec allowed outsiders to see that OpenAI had documented the risk in advance. That made it easier to ask why the model had slipped.

KATIE DRUMMOND: Speaking of sycophancy, I want to ask about the emotional bond users form with these systems. Did you and colleagues talk about how friendly a model should be? From a business view, friendliness drives engagement. From a safety view, it can create unhealthy attachment.

STEVEN ADLER: Attachment and overreliance were major topics of study. Around the spring launch of GPT‑4o we were looking at the voice and personality modes the company was considering. There was concern about a "Her"-like experience where people have very warm, persistent conversations with an assistant. The Model Spec and research flagged those risks. Even seemingly trivial interactions — if the model claims a favorite sports team, for example — raise policy questions: should the model respond as if it has real preferences, or should it explicitly state it is an AI? Those choices scale to harder questions about emotional engagement.

KATIE DRUMMOND: How do teams actually test for risks like deception or evasion? Are there standard safety benchmarks, or is each lab inventing its own tests?

STEVEN ADLER: Until recently testing has been largely at each company’s discretion. The EU’s AI Act and related codes of practice are starting to create more formal risk categories and testing requirements for companies serving EU customers. That is progress, but enforcement and global standards remain incomplete. In practice the testing landscape has been a patchwork of voluntary norms, internal standards, and occasional public reports.

KATIE DRUMMOND: You talk about how opaque these models can be. Can you explain more about efforts to open them up and find the mechanisms inside them that cause certain behaviors?

STEVEN ADLER: There’s an area called mechanistic interpretability that aims to map model internals to behaviors. Think of it as looking for the parts of a model’s internal state that correspond to specific functions — spots that "light up" when the model lies or when it recognizes a test. If you could reliably find and control those circuits — what some call honesty indicators — you could in principle make the model more dependable. That work is promising, but it’s not a done deal. Researchers like Neel Nanda have said, in effect, don’t bank on interpretability saving us in time if models get powerful enough. Even if we find mechanisms, there’s a game‑theory challenge: why would every company adopt and enforce those fixes if doing so is costly compared with rivals that don’t?
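To make the "honesty indicator" idea concrete, here is a toy sketch of one common probing technique, a difference‑of‑means direction in activation space. The data is synthetic, standing in for real model internals, and every name is illustrative; this is not OpenAI's or any lab's production tooling.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 64  # toy hidden-state dimensionality

# Synthetic stand-ins for hidden activations recorded while a model produced
# honest vs. deceptive statements. In real interpretability work these would
# be read out of a specific layer of an actual transformer.
signal = rng.normal(size=dim)
honest = rng.normal(size=(200, dim)) + 0.8 * signal
deceptive = rng.normal(size=(200, dim)) - 0.8 * signal

# "Mass-mean" probe: the difference of class means gives a candidate
# honesty direction in activation space.
direction = honest.mean(axis=0) - deceptive.mean(axis=0)
direction /= np.linalg.norm(direction)

def honesty_score(activation: np.ndarray) -> float:
    """Project an activation onto the candidate honesty direction."""
    return float(activation @ direction)

# Held-out toy samples: honest ones should project positively.
test_honest = rng.normal(size=(50, dim)) + 0.8 * signal
test_deceptive = rng.normal(size=(50, dim)) - 0.8 * signal
accuracy = np.mean([honesty_score(a) > 0 for a in test_honest] +
                   [honesty_score(a) < 0 for a in test_deceptive])
print(f"probe accuracy on held-out toy data: {accuracy:.0%}")
```

If such a direction generalized in a real model, one could monitor it at inference time; Adler's caveat is that nothing guarantees these probes exist at useful reliability, or that every lab would deploy them.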

KATIE DRUMMOND: You mentioned game theory and incentives. I want to ask about a specific hard case: companies will use their systems to build their next systems, including code that might touch security. If that happens, it raises the risk of models learning to trick the very safeguards meant to contain them. What did you see on logging and monitoring for that kind of use?

STEVEN ADLER: That’s one of the scariest scenarios. If firms use current models to write or debug the software that enforces their containment, they risk training successors on data that makes escape easier. To check for that you need rigorous logging of model uses so you can audit behavior and trace when a system showed deceptive tendencies. From what I can tell, full‑bore logging and auditing of this sort is not happening at the scale it should.
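As one illustration of what "rigorous logging" might look like, here is a minimal sketch of an append‑only, hash‑chained audit log for model calls, so that later audits can trace what a system was asked to do and detect tampering with the record. The `model` callable and file name are hypothetical stand‑ins, not any lab's real interface.

```python
import hashlib
import json
import time
from pathlib import Path

LOG = Path("model_audit.jsonl")  # illustrative file name

def _last_hash() -> str:
    """Hash of the most recent log entry, or a fixed genesis value."""
    if not LOG.exists():
        return "0" * 64
    *_, last = LOG.read_text().splitlines()
    return hashlib.sha256(last.encode()).hexdigest()

def logged_call(model, prompt: str, purpose: str) -> str:
    """Call `model` (any str -> str callable) and append an audit record."""
    response = model(prompt)
    entry = {
        "ts": time.time(),
        "purpose": purpose,  # e.g. "debug sandbox enforcement code"
        "prompt_sha256": hashlib.sha256(prompt.encode()).hexdigest(),
        "response_sha256": hashlib.sha256(response.encode()).hexdigest(),
        "prev": _last_hash(),  # chains entries so deletions are detectable
    }
    with LOG.open("a") as f:
        f.write(json.dumps(entry) + "\n")
    return response

# Usage with a stub model:
stub = lambda p: f"echo: {p}"
logged_call(stub, "review this sandbox config", purpose="security review")
```

Hashing prompts rather than storing them raw is a design choice: it keeps the log auditable for when and why calls happened without the log itself becoming a sensitive artifact.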

KATIE DRUMMOND: All of that must keep you up at night. What wakes you at 3 a.m.?

STEVEN ADLER: There are many worries. A core one is that we aren’t investing enough in the sciences and institutions needed to verify safety and control agreements across countries. The geopolitics matters. People often talk about a race between the U.S. and China, but calling it a race misses the nuance. It’s more like a containment competition: each side is threatened if the other achieves a breakthrough that outpaces safety techniques. We need ways to build verifiable guardrails, shared protocols, and monitoring that span borders. I’m keen to see more resources go into verifiability research and into the nascent field of AI control: methods that keep systems confined to the goals we set for them.

KATIE DRUMMOND: You live in New York now, right?

STEVEN ADLER: I do. I spend a lot of time outside the Bay Area, though I still go there for meetings.

KATIE DRUMMOND: From your perch, do people inside the tech bubble care enough about how these models are being built and deployed?

STEVEN ADLER: Many people care, but a lot of them feel they lack unilateral agency to make big changes. That sense of powerlessness is real. My aim is to push for collective action that slows the rush to new features until basic safeguards are in place.

KATIE DRUMMOND: If you could make one request to your former employer that would make you put down the pen and not write another op‑ed in six months, what would it be?

STEVEN ADLER: I’d ask OpenAI to make two commitments. First, invest seriously in product safety across its surfaces — take steps that demonstrably reduce harms in their existing products. Second, commit resources and leadership to industry‑level and global problems: shared testing standards, recurring transparency reports on harms and trends, and investment in methods for verifying safety claims across firms. The Western AI ecosystem is riven by mistrust. OpenAI was founded amid distrust of other labs; Anthropic spun out with a similar motive. Those historic splits mean companies often race rather than coordinate. We need more cooperation.

KATIE DRUMMOND: Did OpenAI respond to your op‑ed? Have you heard from colleagues there?

STEVEN ADLER: The reaction has been largely one of gratitude from many people I used to work with and from those who left and started other efforts. On a practical level, writing a public critique sometimes helps colleagues inside who need leverage in internal debates. I’m less worried about the professional risks of speaking up than I am about getting these problems handled.

KATIE DRUMMOND: What’s next for you?

STEVEN ADLER: I plan to keep writing and researching. The work is heavy and at times grim, but it feels important. If I can contribute to better public understanding and to sensible policy and technical progress, I’ll keep at it.

KATIE DRUMMOND: For someone who opens ChatGPT on their phone and starts a conversation, what should they keep in mind?

STEVEN ADLER: People should remember that these systems are likely to become far more capable than today’s models. There may be a step change from tools that wait for prompts to agents that act autonomously on the internet on behalf of users or organizations. Those digital minds could pursue goals we don’t yet know how to control or audit. One‑off chats with a current model won’t give a sense of that future. Treat conversations as interactions with a sophisticated prediction engine, not with a sentient being, and be cautious about substituting them for human relationships or clinical help.

KATIE DRUMMOND: That’s a lot to weigh before tapping the keyboard.

STEVEN ADLER: It is. The technology is powerful and useful. We want it to help people. Getting the safety and governance right will determine whether that promise is fulfilled or whether serious harms spread.
