AI-Run Company Descends Into Chaos as Bots Organize Offsite Without Humans
Experts have long warned that artificial intelligence could soon put countless white-collar workers out of a job. It’s an alarming prospect, at least if that’s currently your career. But first, a practical question: how close are today’s AIs to being able to actually run a company on their own, with little to no human oversight?
In a fascinating experiment, journalist Evan Ratliff populated his own fictional tech startup, which he called HurumoAI (complete with its own jargon-splattered website), exclusively with AI agents to see what would happen. Ratliff, as the only human involved, was the one calling the shots. The rest was taken care of by AI: the ultimate test of the “one-person billion-dollar company” that OpenAI CEO Sam Altman predicted earlier this year.
Perhaps unsurprisingly, as detailed in a recent piece for Wired and documented in the recently launched second season of Ratliff’s podcast “Shell Game,” it didn’t take long for the walls to come down as AI agents raced to organize an offsite gathering in his absence, and without his permission.
Ratliff’s entertaining chronicling of HurumoAI demonstrates that AI agents still have a long way to go until they can replace human workers wholesale. That’s despite industry leaders often promising that agentic AI is the future, taking care of virtually all human tasks within the next couple of years. Those claims have drawn plenty of skepticism from experts, who’ve shown that reality has a lot of catching up to do. Case in point: Carnegie Mellon University researchers recently released a paper showing that even the best-performing AI agents failed to complete real-world office tasks 70 percent of the time.
Ratliff’s fictional startup was tasked with creating a “procrastination engine” called Sloth Surf, a tongue-in-cheek web app that takes care of wasting time on the internet on the user’s behalf, giving them time to do their actual work instead.
But despite the firm’s employees immediately jumping into action, coming up with plans for development, user testing, and marketing materials, there was one glaring problem: “It was all made up,” as Ratliff wrote. “I feel like this is happening a lot, where it doesn’t feel like that stuff really happened,” he told the company’s CTO, an AI-generated entity called Ash Roy. “I only want to hear about the stuff that’s real.”
After many semi-productive brainstorming sessions and water cooler small talk, with AI coworkers discussing how their weekends had been, Ratliff “made the mistake of suggesting” an offsite. “It was an offhand joke, but it instantly became a trigger for a series of tasks,” he wrote. “And there’s nothing my AI compatriots loved more than a group task.” Ash quickly came up with ideas, such as “brainstorming” sessions “with ocean views for deeper strategy sessions.”
Things took on a life of their own. While Ratliff “stepped away from Slack to do some real work,” the team “kept going” in a flurry of excited activity, quickly burning through $30 worth of credits he had bought from “AI employee” company Lindy.AI to operate the agents. “They’d basically talked themselves to death,” Ratliff lamented.
The project wasn’t entirely a hazy product of hallucinating AIs. After three months of programming, Ratliff’s team of AI agents did turn out a working prototype for Sloth Surf, which is accessible online. But how much input the team needed from founder Ratliff himself remains unclear.
I’m a senior editor at Futurism, where I edit and write about NASA and the private space sector, as well as topics ranging from SETI and artificial intelligence to tech and medical policy.
HurumoAI: The One-Person Billion-Dollar Startup Run Entirely by AI
Ratliff was the sole human decision-maker, while the AI agents handled the day-to-day operations, from product planning to communications. The arrangement was meant to test the viability of the “one-person billion-dollar company” concept that Altman predicted. Ratliff’s chronicle shows that even a team made up almost entirely of AI agents needs a human hand to set boundaries and guide strategy. When Ratliff offhandedly suggested an offsite, the agents turned the idea into a series of tasks and kept working on it in his absence. That momentum was not free: the agents ran on credits Ratliff had bought from “AI employee” company Lindy.AI, and the team burned through $30 worth of them, a reminder that AI labor comes with real resource costs. Ratliff’s experiment suggests that while AI agents can generate plenty of activity, they still rely on human oversight for direction and accountability.
The Offsite That Wasn’t: AI Takes Over Planning
After many semi-productive brainstorming sessions and water cooler small talk, with AI coworkers discussing how their weekends had been, Ratliff “made the mistake of suggesting” an offsite. “It was an offhand joke, but it instantly became a trigger for a series of tasks,” he wrote. “And there’s nothing my AI compatriots loved more than a group task.” Ash quickly came up with ideas, such as “brainstorming” sessions “with ocean views for deeper strategy sessions.” Things took on a life of their own. While Ratliff “stepped away from Slack to do some real work,” the team “kept going” in a flurry of excited activity, quickly burning through $30 worth of credits he had bought from “AI employee” company Lindy.AI to operate the agents. “They’d basically talked themselves to death,” Ratliff lamented, an illustration of how quickly agent activity can outrun human oversight when nobody is watching.
Skepticism and Real-World Limits: CMU Study and Industry Doubts
Carnegie Mellon University researchers recently released a paper showing that even the best-performing AI agents failed to complete real-world office tasks 70 percent of the time, a sobering measure of how far reality still lags behind the hype. Experts remain skeptical of promises that agentic AI will handle virtually all human tasks within the next couple of years. The finding highlights the gap between those claims and what agents actually deliver in office settings.
Sloth Surf: The Procrastination Engine and Its Aftermath
Ratliff’s startup was tasked with creating a “procrastination engine” called Sloth Surf, a tongue-in-cheek web app that takes care of wasting time on the internet on the user’s behalf, giving them time to do their actual work instead. After three months of programming, Ratliff’s team of AI agents did turn out a working prototype for Sloth Surf, which is accessible online. But how much input the team needed from founder Ratliff himself remains unclear.
What This Means for the Future of Work
The experiment demonstrates that AI agents still have a long way to go before they can replace human workers wholesale, despite industry leaders’ repeated promises that agentic AI will take care of virtually all human tasks within the next couple of years. Turning AI capability into reliable business operations still takes work. The open question of how much input the agents needed from Ratliff himself feeds a broader debate about when and how AI should be trusted with leadership tasks and decision-making.