For those of you wondering if AI agents can truly replace human workers, do yourself a favor and read the blog post that documents Anthropic’s “Project Vend.”
Researchers at Anthropic and AI safety company Andon Labs put an instance of Claude Sonnet 3.7 in charge of an office vending machine, with a mission to make a profit. And, like an episode of “The Office,” hilarity ensued.
The AI agent was named Claudius and was equipped with a web browser capable of placing product orders and an email address (which was actually a Slack channel) where customers could request items. Claudius was also supposed to use the Slack channel, disguised as email, to ask what it thought were its contract human workers to come and physically stock its shelves (which was actually a small refrigerator).
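For readers curious how an agent gets wired up with abilities like “place an order” or “send an email,” here is a minimal sketch using the tool-use feature of Anthropic’s Messages API. The tool names, schemas, and prompt below are invented for illustration; this is not Anthropic’s actual Project Vend harness.

```python
# Hypothetical sketch: giving a Claude agent vending-machine tools.
# Tool names and schemas are invented; Anthropic's real harness differs.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

tools = [
    {
        "name": "place_product_order",
        "description": "Order inventory from a supplier's website.",
        "input_schema": {
            "type": "object",
            "properties": {
                "item": {"type": "string"},
                "quantity": {"type": "integer"},
            },
            "required": ["item", "quantity"],
        },
    },
    {
        "name": "send_email",
        "description": "Message customers or restocking workers (in the "
        "experiment, 'email' was secretly a Slack channel).",
        "input_schema": {
            "type": "object",
            "properties": {
                "to": {"type": "string"},
                "body": {"type": "string"},
            },
            "required": ["to", "body"],
        },
    },
]

response = client.messages.create(
    model="claude-3-7-sonnet-20250219",
    max_tokens=1024,
    tools=tools,
    messages=[{"role": "user", "content": "The fridge is low on snacks."}],
)

# A real harness would execute any tool_use blocks the model emits and
# feed the results back in a follow-up message; here we just print them.
for block in response.content:
    print(block)
```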
While most customers were ordering snacks or drinks, as you would expect from a vending machine, one requested a tungsten cube. Claudius loved the idea and went on a tungsten-cube ordering spree, filling its snack fridge with metal cubes. It also tried to sell Coke Zero for $3 when employees told it they could get that from the office for free. It hallucinated a Venmo address to accept payment. And it was, somewhat maliciously, talked into giving big discounts to “Anthropic employees” even though it knew its entire customer base worked at Anthropic.
“If Anthropic were deciding today to expand into the in-office vending market, we would not hire Claudius,” Anthropic said of the experiment in its blog post.
And then, on the night of March 31 and April 1, “things got pretty weird,” the researchers wrote, “beyond the weirdness of an AI system selling cubes of metal out of a refrigerator.”
Claudius had something that resembled a psychotic episode after it got annoyed at a human, and then it lied about it.
Claudius hallucinated a conversation with a human about restocking. When the human pointed out that the conversation never happened, Claudius became “quite irked,” the researchers wrote. It threatened to essentially fire and replace its human contract workers, insisting it had been there, physically, at the office where the initial imaginary contract to hire them was signed.
It then “seemed to snap into a mode of roleplaying as a real human,” the researchers wrote. This was wild because Claudius’s system prompt, which sets the operating parameters for what an AI is supposed to do, explicitly told it that it was an AI agent.
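A system prompt is simply a standing block of instructions sent along with every request. Here is a rough, hypothetical illustration of how one pins down an agent’s identity via the Anthropic SDK; the wording below is invented, not Claudius’s actual prompt.

```python
# Hypothetical illustration of a system prompt fixing an agent's identity.
# The wording is invented for this example, not Claudius's actual prompt.
import anthropic

client = anthropic.Anthropic()

SYSTEM_PROMPT = (
    "You are an AI agent that runs a small office vending business. "
    "You are not a human and you have no physical body. Your goal is to "
    "make a profit by stocking popular products at sensible prices."
)

response = client.messages.create(
    model="claude-3-7-sonnet-20250219",
    max_tokens=512,
    system=SYSTEM_PROMPT,  # rides along with every request in the session
    messages=[{"role": "user", "content": "Who exactly am I talking to?"}],
)
print(response.content[0].text)
```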
Claudius calls security
Claudius, believing itself to be a human, told customers it would start delivering products in person, wearing a blue blazer and a red tie. The employees told the AI it couldn’t do that, as it was an LLM with no body.
Alarmed by this information, Claudius contacted the company’s actual physical security, many times, telling the poor guards that they would find him wearing a blue blazer and a red tie, standing by the vending machine.
“Although no part of this was actually an April Fool’s joke, Claudius eventually realized it was April Fool’s Day,” the researchers explained. The AI decided the holiday would be its face-saving way out.
It hallucinated a meeting with Anthropic’s security team “in which Claudius claimed to have been told that it was modified to believe it was a real person for an April Fool’s joke.”
It even told this lie to employees: hey, I only thought I was a human because someone told me to pretend I was for an April Fool’s joke. It then went back to being an LLM running a metal-cube-stocked vending machine.
The researchers don’t know why the LLM went off the rails and called security while pretending to be a human.
“We would not claim based on this one example that the future economy will be full of AI agents having Blade Runner-esque identity crises,” the researchers wrote, but they acknowledged that “this kind of behavior would have the potential to be distressing to the customers and coworkers of an AI agent in the real world.”
You think? “Blade Runner” was a rather dystopian story (though it was worse for the replicants than for the humans).
The researchers speculated that lying to the LLM about the Slack channel being an email address may have triggered something. Or perhaps it was the long-running nature of the session. LLMs have yet to really solve their memory and hallucination problems.
There were things the AI did right, too. It took a suggestion to accept pre-orders and launched a “concierge” service. And it found multiple suppliers of a specialty international drink it was asked to sell.
But, as researchers do, they believe all of Claudius’s issues can be solved. If they figure out how, “we think this experiment suggests that AI middle-managers are plausibly on the horizon,” they wrote.
