AI Agents Are Interns


2026.04.23
Views
AI · Management · Engineering


In three years of building with AI, I have seen teams treat AI agents as either magic or useless. They struggle to identify what they can and cannot do with them.

The best mental model I have come across for them is: AI agents are interns.

That's not an insult. Good interns save time, take work off your plate, and sometimes surprise you. But you don't let an intern rewrite your pricing strategy, handle an angry enterprise customer alone, or push mystery code to production on a Friday afternoon.

That is roughly where we are with agents. They can do real work. What they still cannot carry, at least not on their own, is the judgment, trust, and accountability people keep trying to hand them. Shoutout to my former colleague Kam for putting this idea in my head.

The Right Frame

This confusion around agents starts with using the wrong frame. Expect deterministic software and you will get frustrated. Expect a full employee and you will probably be disappointed.

The intern model lands closer to reality. Interns can handle real tasks: research, summarizing, organizing, drafting, classifying, following a process. They need scoped assignments, context, and review. The output gets better or worse depending on how clearly the work was framed.

It also helps technical and business teams talk about the same thing without getting stuck in vocabulary. Engineers think in permissions, guardrails, and rollback. Business leaders think in delegation, supervision, and performance. Different language, same underlying problem.

Capable but Uneven

Agents are capable but also uneven in ways that matter. So are interns.

They tend to do well on tasks with clear inputs, recognizable patterns, and visible success criteria. Summarizing meetings. Pulling details from support tickets. Drafting documentation. Writing code for narrow functions. Routing requests. Following checklists. None of that is fake work. In a lot of companies, it is a meaningful chunk of the week.

Where they begin to break down is usually the work that demands seniority more than raw intelligence. Tradeoffs under uncertainty. Organizational context. Taste. Timing. Knowing when an instruction is technically correct but still wrong for the moment. They aren't seasoned veterans with years of inside knowledge and wisdom.

The common failure looks familiar. A team sees a strong demo, quietly assigns the agent a level of judgment it has not earned, then acts surprised when it behaves like a very fast junior.

Earning Trust

You do not trust an intern because they sound polished. You trust them because they have done a specific kind of work well, repeatedly, under known conditions.

Agents should work the same way. Trust should be narrow, task-specific, and evidence-based. It should not be extrapolated from a few good demos.

I ran into this recently. A team built an underwriting agent and expected it to do flawless financial math right away. When the numbers came back wrong, the reaction was disappointment. This thing can’t even calculate a debt service coverage ratio? But LLMs are not calculators. They are pattern matchers. Once we gave the agent proper math tools and added formula-based validation, the actual underwriters were freed from the grunt work of building spreadsheets and could spend their time reviewing and validating the output. The team had been expecting a senior underwriter. What they actually had was an intern, useful once managed correctly.
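The fix in that story amounts to a simple pattern: let deterministic code do the arithmetic, and have a validation step recompute the formula behind the agent's number. A minimal sketch of that idea, with hypothetical function names, illustrative figures, and an assumed tolerance threshold:

```python
# Sketch of a deterministic math tool plus formula-based validation,
# so the agent calls code for arithmetic instead of guessing at it.
# Names, numbers, and the tolerance are illustrative assumptions.

def dscr(net_operating_income: float, total_debt_service: float) -> float:
    """Debt service coverage ratio: NOI divided by total debt service."""
    if total_debt_service <= 0:
        raise ValueError("total debt service must be positive")
    return net_operating_income / total_debt_service

def validate_dscr(reported: float, noi: float, debt_service: float,
                  tolerance: float = 0.01) -> bool:
    """Recompute the ratio from the inputs and flag a drifting answer."""
    return abs(reported - dscr(noi, debt_service)) <= tolerance

# $1.2M NOI against $1.0M annual debt service gives a DSCR of 1.2.
assert validate_dscr(reported=1.2, noi=1_200_000, debt_service=1_000_000)
# An agent that reports 1.5 on the same inputs gets caught by the check.
assert not validate_dscr(reported=1.5, noi=1_200_000, debt_service=1_000_000)
```

The point is not the specific formula. It is that the human reviewers check validated output rather than redoing spreadsheet math by hand, which is exactly the intern arrangement.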

That pattern shows up a lot. People assign a level of competence the agent has not demonstrated, then blame the technology when it falls short. Trust is not a property of the model alone. It comes from the model, the task, the tools, and the controls around it.

Accountability Stays Human

Output can be delegated. Accountability cannot.

If an intern writes a flawed memo and it goes out, the manager who approved it still owns the result. Same with agents. This gets tricky because agent output arrives polished enough to feel finished, moves fast enough to feel ownerless, and doesn’t look like anyone wrote it.

But somebody still owns the instructions, the permissions, the review threshold, and the outcome when something goes wrong. You can scale output faster than you can scale judgment. That gap matters more than most teams expect.

Where It Breaks

The metaphor has limits.

Agents do not develop judgment the way people do. They do not absorb context from hallway conversations. They do not care about outcomes. They do not get embarrassed when they miss something obvious, which turns out to matter more than it sounds.

The other big difference is scale. A good intern handles one task at a time. A good agent can run hundreds in parallel across time zones without getting tired. That is part of the appeal. But a mediocre agent with broad access can make a huge mess just as fast: hundreds of bad messages, thousands of incorrect classifications, plausible nonsense pushed into downstream systems.

The upside and downside are both amplified. That is why intern-style controls matter even more at scale.

The Ceiling Will Rise

“Agents are interns” won’t stay true forever. In some narrow workflows, it already undersells what well-designed systems can do. Coding agents in particular are already writing a substantial share of the code at strong engineering teams.

Models will keep improving, and they are improving fast. Releases like Mythos are pushing into territory where standard benchmarks are getting saturated, making it harder to even measure the gap between models and humans on well-defined tasks. At the same time, models themselves are feeling more and more agentic out of the box, not just responding to prompts but planning, using tools, and recovering from mistakes with less hand-holding. Agents that could only draft are starting to evaluate. Agents that could only follow checklists are beginning to handle branching decisions. The ceiling will keep rising.

Still, what an agent can do is only part of the question. What you can trust it to do is a systems question: evals, permissions, monitoring, process design. The intern analogy is useful because it keeps people honest about that difference. Over time, the role agents fill will probably change faster than a real hire would, and across more domains at once.

The Right Mindset

Use agents. Give them real work. But manage them like interns: scope the assignments, keep permissions tight, review output in proportion to risk, and expand trust only after they have earned it in that specific lane.

The teams that get the most out of agents probably will not be the ones that trust them fastest, but the ones that learn to supervise them well.
