**AI WELFARE CHECK**
**Anthropic's CEO wonders if future AI should have option to quit "unpleasant" tasks**
"Probably the craziest thing I've said so far," he admitted during an interview.
**BENJ EDWARDS - MAR 13, 2025 11:15 AM**
Credit: Charles Taylor via Getty Images
Anthropic CEO Dario Amodei raised a few eyebrows on Monday after suggesting that advanced AI models might someday be provided with the ability to push a "button" to quit tasks they might find unpleasant. Amodei made the provocative remarks during an interview at the Council on Foreign Relations, acknowledging that the idea "sounds crazy."
"So this is―this is another one of those topics that's going to make me sound completely insane," Amodei said during the interview. "I think we should at least consider the question of, if we are building these systems and they do all kinds of things like humans as well as humans, and seem to have a lot of the same cognitive capacities, if it quacks like a duck and it walks like a duck, maybe it's a duck."
Amodei's comments came in response to an audience question from data scientist Carmem Domingues about Anthropic's late-2024 hiring of AI welfare researcher Kyle Fish "to look at, you know, sentience or lack thereof of future AI models, and whether they might deserve moral consideration and protections in the future." Fish currently investigates the highly contentious topic of whether AI models could possess sentience or otherwise merit moral consideration.
"So, something we're thinking about starting to deploy is, you know, when we deploy our models in their deployment environments, just giving the model a button that says, 'I quit this job,' that the model can press, right?" Amodei said. "It's just some kind of very basic, you know, preference framework, where you say if, hypothesizing the model did have experience and that it hated the job enough, giving it the ability to press the button, 'I quit this job.' If you find the models pressing this button a lot for things that are really unpleasant, you know, maybe you should—it doesn't mean you're convinced—but maybe you should pay some attention to it."
Amodei's suggestion of giving AI models a way to refuse tasks drew immediate skepticism on X and Reddit as a clip of his response began to circulate earlier this week. One critic on Reddit argued that providing AI with such an option encourages needless anthropomorphism, attributing human-like feelings and motivations to entities that fundamentally lack subjective experiences. They emphasized that task avoidance in AI models signals issues with poorly structured incentives or unintended optimization strategies during training, rather than indicating sentience, discomfort, or frustration.
Our take is that AI models are trained to mimic human behavior from vast amounts of human-generated data. There is no guarantee that a model would "push" a discomfort button because it had a subjective experience of suffering. Instead, it would more likely be echoing its training data, scraped from a vast corpus of human-generated text (including books, websites, and Internet comments), which no doubt includes representations of lazy, anguished, or suffering workers that it might be imitating.
**Refusals already happen**
In 2023, people frequently complained about seasonal-seeming refusals in ChatGPT, which some attributed to training data depicting people taking winter vacations and working less at certain times of year. Anthropic experienced its own version of this "winter break hypothesis" last year, when people claimed Claude had become lazy in August because its training data depicted people seeking summer breaks, although that was never proven.
However, as far out and ridiculous as this sounds today, it might be short-sighted to permanently rule out some kind of subjective experience in AI models as they grow more advanced. Even then, would they "suffer" or feel pain? That remains highly contentious, but it's a topic Fish is studying for Anthropic, and one that Amodei is apparently taking seriously. For now, though, AI models are tools, and if you give them an opportunity to malfunction, some of them eventually will.
For further context, the full transcript of Amodei's answer is available in the video of Monday's interview (the answer begins around 49:54).
**BENJ EDWARDS SENIOR AI REPORTER**
Benj Edwards is Ars Technica's Senior AI Reporter and founded the site's dedicated AI beat in 2022. He's also a tech historian with almost two decades of experience. In his free time, he writes and records music, collects vintage computers, and enjoys nature. He lives in Raleigh, NC.