Americans are living in parallel AI universes. For much of the country, AI has come to mean ChatGPT, Google’s AI overviews, and the slop that now clogs social-media feeds. Meanwhile, tech hobbyists are becoming radicalized by bots that can work for hours on end, collapsing months of work into weeks, or weeks into an afternoon.

Recently, more people have started to play around with tools such as Claude Code. The product, made by the start-up Anthropic, is “agentic,” meaning it can do all sorts of work a human might do on a computer. Some academics are testing Claude Code’s ability to autonomously generate papers; others are using agents for biology research. Journalists have been experimenting with Claude Code to write data-driven articles from scratch, and earlier this month, a pair used the bot to create a mock competitor to Monday.com, a public software company worth billions. In under an hour, they had a working prototype. Although the actual quality of all of these AI-generated papers and analyses remains unclear, the progress is both stunning and alarming. “Once a computer can use computers, you’re off to the races,” Dean Ball, a senior fellow at the Foundation for American Innovation, told me.

Even as AI has advanced, the most sophisticated bots have yet to go fully mainstream. Unlike ChatGPT, which has a free tier, agentic tools such as Claude Code or OpenAI’s Codex typically cost money and can be intimidating to set up. I run Claude Code out of my computer’s terminal, an app traditionally reserved for programmers that looks like something a hacker would use in a movie. It’s also not always obvious how best to prompt agentic bots: A sophisticated user might set up teams of agents that message one another as they work, whereas a newbie might not realize such capabilities even exist.

The tech industry is now rushing to develop more accessible versions of these products for the masses. Last month, Anthropic released a new paid version of Claude Code designed for nontechnical users; today the start-up rolled out a new model to all users, one that offers, among other things, “human-level capability in tasks like navigating a complex spreadsheet.” Meanwhile, OpenAI recently announced a new version of Codex, which the company claims can do nearly anything “professionals can do on a computer.” As these products have gained visibility, people seem to be realizing all at once that AI can do a lot more than draft marketing copy and offer friendly conversation. The post-chatbot era is here.

[Read: Move over, ChatGPT]

Tools such as ChatGPT and Gemini may already feel powerful enough in their own right. Indeed, chatbots have taken on all kinds of fancy new features over the past few years. They now have memory, which lets them reference previous conversations, and they use a technique called reasoning to produce more sophisticated responses. Whereas older chatbots could ingest only a few thousand words at a time, today they can analyze book-length files, as well as process and produce images, video, and audio.

But all of this pales in comparison to the rise of agentic tools. Consider software engineering, where they have proved particularly transformative. It’s now common for engineers to hand over instructions to a bot such as Claude Code or Codex and let it do the rest. Because bots aren’t constrained in the way humans are, a programmer might have several sessions running simultaneously, all working on different aspects of a project. “In general, it is now clear that for most projects, writing the code yourself is no longer sensible,” the computer programmer Salvatore Sanfilippo wrote in a recent viral essay. In just a few hours, Sanfilippo noted, he had completed several tasks that previously would have taken weeks. Microsoft’s CEO has said that as much as 30 percent of the company’s code is now written by AI, and its chief technology officer expects that figure to hit 95 percent industry-wide by the end of the decade. Anthropic already reports that up to 90 percent of its own code is AI-generated.

Some programmers have started to warn that similar advances could cannibalize all kinds of knowledge work. Last week, Matt Shumer, the CEO of an AI company, compared the current moment in AI to the early days of COVID, when most Americans were still oblivious to the imminent pandemic. “Making AI great at coding was the strategy that unlocks everything else,” Shumer wrote. “The experience that tech workers have had over the past year, of watching AI go from ‘helpful tool’ to ‘does my job better than I do,’ is the experience everyone else is about to have.” (His essay, which has upwards of 80 million views, was partially AI-generated.)

Tech executives have a strong incentive to suggest that similar advances will soon come for other forms of work. Last week, Microsoft’s AI chief predicted that AI will automate “most, if not all” white-collar work tasks within 18 months. (This led to a series of social-media posts like the following: “CEO of Hot Pockets ™ says that ‘most, if not all’ meals will be fully replaced with Hot Pockets ™ within the next 12 to 18 months.”)

It’s not yet clear how easily the progress in agentic tools will translate to other fields. Programming is well suited to automation: Software programs either work or they don’t. Determining what counts as a good essay, for example, is a far messier task, and one that requires much more human discernment. Though agentic tools often excel at complicated work, such as synthesizing unfathomable reams of text, they struggle with something as simple as copying and pasting text from Google Docs into Substack. And because they are so powerful, they can also be dangerous: When one venture capitalist recently asked Claude Cowork, Anthropic’s new and more accessible agentic tool, for help organizing his wife’s desktop, the bot deleted 15 years of family photos. “I need to stop and be honest with you about something important,” the bot told him. “I made a mistake.”

[Read: The AI industry is radicalizing]

Even if AI isn’t yet a world-class financial analyst or architect, coding bots have progressed to the point where they can already assist with all kinds of knowledge work. Since Claude Code took off, I’ve watched people I know start using the bot for all sorts of tasks and realize just how much more capable agentic tools are than traditional chatbots. In my own job, I’ve found agents particularly adept at research. When I recently wanted a report on trends in Gen Z political views, I fired off a brief query to Claude Code, and my team of bots got to work: One scoured the web for information, another performed data analysis, and a third wrote up the findings in a briefing for me to review. (Like other kinds of AI, Claude Code can hallucinate: When using the tool for research, I still carefully verify information against original sources before referencing it in my own writing, which, to be clear, I do myself.)

The industry is hopeful that agentic tools will continue to improve. The AI-coding boom is boosting tech companies’ ability to improve their own products, as engineers use agentic tools to write software. But Silicon Valley has long dreamed of building AI models that can improve themselves, each new generation able to spawn its successor. On a recent call with reporters, I asked Sam Altman, OpenAI’s CEO, what it would take to get there: Such progress would require models capable of both producing “new scientific insight and writing a lot of complex code,” he told me. “The models are starting to be able to do both of those things.” Still, he cautioned, we’re not yet at a moment of runaway AI progress. (The Atlantic has a corporate partnership with OpenAI.) Last month, Boris Cherny, the Anthropic employee who created Claude Code, told me that “Claude is starting to come up with its own ideas and it’s proposing what to build.”

Without inside access to these companies, it’s difficult to know what to make of these claims. And as impressive as agentic tools already are, it could still take quite a long time for them to become safe and reliable enough for widespread use. Even if the technology’s capabilities continue to progress rapidly, the real world is messy and complicated. Silicon Valley sometimes mistakes “clear vision with short distance,” the Stanford computer scientist Fei-Fei Li said earlier this month. “But the journey is going to be long.”

Tech companies have done a tremendous job of persuading investors to pour cash into their businesses, but the industry has done a much worse job of selling the public on its vision. Instead of focusing on the tangible benefits of AI agents, Silicon Valley has spent years hype-washing the technology with business briefs that read like science fiction. In one influential essay, Dario Amodei, Anthropic’s CEO, writes that powerful AI could soon eliminate most cancer and nearly all infectious diseases. In another, a team of researchers warns that within the decade, rogue AI might release biological weapons, wiping out nearly all of humanity. Bots that can handle spreadsheet work and automate coding might not amount to superintelligence, but they are still immensely powerful. To the extent that normies remain confused about AI’s true capabilities, Silicon Valley has only itself to blame.


From The Atlantic via this RSS feed

  • kbal@fedia.io

    When any random website suddenly gets slow lately, I wonder whether they’ve been doing some vibe coding or if it’s just overwhelmed by bots.