12 Things AI Agents Cannot Do (And Never Will)

Most discussion of AI agent limitations frames them as temporary — current models can't do X yet, but newer models will. This framing is true for some limitations and badly wrong for others. There are real structural constraints on what agents can do regardless of how capable the underlying models become. Confusing the two categories produces overestimates of what agents will be doing in two years and underestimates of what humans will still be doing.

The twelve items below are mostly in the structural category. Some are clearly permanent; others have a path to change but face fundamental obstacles. None is a near-term limitation of the "agents can't write code yet" kind. The point is to clarify where the durable boundaries lie between agent work and human work.

1. Bear final accountability

An agent cannot be held accountable in any meaningful sense. It cannot be fired, demoted, or held to a future commitment. Accountability requires continuity of identity and consequence, and agents have neither in the relevant sense. When agents make decisions, accountability must vest in a human — the operator, the deployer, or the company. This isn't a technical limitation; it's a definitional one. No model improvement will change it.

2. Hold a position over time

An agent's "position" on a question can shift between sessions, between prompts, or based on subtle framing changes. The agent doesn't have a coherent persistent worldview that it defends across interactions. This makes agents unsuitable for any role that requires sustained advocacy or principled disagreement — political negotiation, ethical objection, sustained vision-setting. The agent can simulate these in a single session; it cannot embody them across time.

3. Take real risk

When an agent makes a recommendation, it bears no consequence for being wrong. The recommendation is therefore structurally different from a human's recommendation, which carries reputational and sometimes financial risk. In contexts where the source of advice matters because the source has skin in the game, agents cannot provide the same signal that humans can. The signal isn't the words; it's the willingness of the speaker to be held to them.

4. Make commitments that bind future entities

An agent can output a commitment, but the commitment is not enforceable against any future state of the agent. The model can change; the deployment can change; the agent in the next session may not honor what the agent in this session said. Real commitments — contracts, promises, agreements — require an entity that can be bound forward in time. Agents can't be.

5. Carry tacit knowledge developed through embodied experience

A lot of expertise is tacit — the experienced engineer knows that a particular subsystem is fragile without being able to articulate exactly why. Agents do not develop tacit knowledge through experience. They can only operate on what is explicitly represented in their training and context. This means agents will continue to be weaker than human experts in domains where the expertise depends on accumulated tacit understanding, and especially in domains where the tacit understanding is rapidly evolving.

6. Form genuine relationships with stakeholders

Relationships have continuity, memory, mutual investment, and the capacity for genuine surprise. Agents don't have any of these in the relevant senses. They can simulate the surface of a relationship — recall previous interactions, respond in a way that mirrors continuity — but the underlying relational substance is absent. For roles that depend on genuine human relationships (sales for large deals, partnership management, executive recruiting), agents will continue to be limited regardless of model capability.

7. Take responsibility for novel situations

An agent operates well within the distribution of situations represented in its training and prompts. Genuinely novel situations — first-of-their-kind events that require judgment without precedent — are exactly where agents perform worst and humans are most needed. This will persist because the situations are by definition outside what any model can have been trained on. Humans will continue to be needed for genuine novelty.

8. Make trade-offs that involve identity or values

Some decisions are about who the company is, not what is optimal. Should we serve this customer segment? Should we work with this partner? Should we take this funding? These are identity questions, and they require an entity with an identity to answer them. Agents can provide analysis but cannot make the identity-level judgments. The trade-off lives in the human leadership.

9. Apologize meaningfully

An agent can produce the words of an apology but cannot bear the cost. A meaningful apology has weight because the apologizer is committing to be different in the future, with their continued identity at stake. The agent has neither the continuity nor the stake to make the apology mean anything. Customer-facing agents that "apologize" produce apologies that feel hollow because they are hollow — and customers usually notice.

10. Earn trust over time

Trust is built through repeated interactions in which the truster is willing to be vulnerable and the trustee is willing to be held to what they said and did. Agents can produce track records but cannot earn trust in the relational sense. This affects every role where the value comes from the accumulated trust of stakeholders: board relationships, investor relationships, key customer relationships, senior executive recruiting. Agents can prepare, support, and draft; they cannot replace the trust-bearing humans.

11. Set priorities that contradict their incentives

An agent does what it's optimized for, with no ability to step outside its optimization to make a contrarian choice. A human can choose to prioritize quality over speed even when speed is rewarded; an agent cannot. This means agents will be aligned with their optimization function whether or not that's what the situation calls for. Strategic contrarianism — the willingness to do the right thing even when the incentives push elsewhere — remains durably human.

12. Decide when to stop

An agent given an open-ended task will continue producing output until it is stopped or until it judges its own termination criteria met. That judgment of "good enough" is structurally different from human judgment, which integrates exhaustion, opportunity cost, and gestalt evaluation. Agents are particularly bad at knowing when to stop optimizing: absent an external limit, they tend to keep going. The "when to stop" decision will continue to require human framing.
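One way to make that human framing concrete is for the person assigning the task to set the stopping rule up front, rather than leaving it to the agent. The following is a minimal sketch under stated assumptions: `StopPolicy`, `run_until_stopped`, and the `step` callback (which is assumed to return a quality score for each iteration) are hypothetical names for illustration, not part of any agent framework.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class StopPolicy:
    """Stopping rule chosen by a human before the agent starts (hypothetical)."""
    max_steps: int       # hard budget on iterations
    good_enough: float   # score at which further work is not worth it
    min_gain: float      # smallest improvement that justifies another step


def run_until_stopped(
    step: Callable[[int], float], policy: StopPolicy
) -> Tuple[str, List[float]]:
    """Call `step` repeatedly; return (reason_for_stopping, scores so far)."""
    scores: List[float] = []
    for i in range(policy.max_steps):
        score = step(i)
        scores.append(score)
        # Stop as soon as the human-defined quality bar is met.
        if score >= policy.good_enough:
            return "good_enough", scores
        # Stop when the latest step improved the score by too little.
        if len(scores) >= 2 and score - scores[-2] < policy.min_gain:
            return "diminishing_returns", scores
    # The loop ended without meeting either rule: the budget ran out.
    return "budget_exhausted", scores
```

The thresholds encode a human's view of opportunity cost. The agent only executes against them, and the returned reason tells the human why the work ended.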

The implication for engineering teams

Agents can do a substantial amount of engineering work, and the amount will grow. What they will remain unable to do is bear accountability for decisions, form the human relationships that complex work requires, make identity-level trade-offs, or take responsibility for genuinely novel situations. Engineering teams will therefore continue to need humans in the loop for the roles where these properties matter, regardless of how capable agents become at the underlying coding tasks. The right design is not "agents instead of engineers" but "agents extending engineers in specific, bounded ways while engineers retain the irreducible roles."

Frequently asked questions

Are any of these limitations actually fundamental, or are they all just current limitations?

Several are genuinely structural — accountability, taking real risk, making commitments that bind future entities, earning trust over time. These depend on properties (continuity of identity, capacity for genuine stake) that no model improvement can produce. The temporary limitations are clustered around capability (better reasoning, better world knowledge); the structural ones are clustered around relational and identity properties.

Won't future agents be assigned legal personhood and become accountable?

Legal personhood is conceivable but doesn't solve the underlying problem. A corporation is a legal person, but accountability still flows to specific humans (officers, directors). Even with full legal personhood, agents would still need humans somewhere in the chain for the accountability to be meaningful. The accountability gap isn't closed by legal status; it's closed by humans bearing the consequence.

What's the right framing for engineering leaders thinking about which work to give to agents?

Ask: does this work require any of the twelve properties above? If yes, an agent can support but not replace the human. If no, an agent might be able to do the work well. Most engineering tasks have some properties that require human judgment and others that don't — the right design separates them rather than treating the task as a monolithic agent-or-human choice.
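The separation described above can be sketched as a simple routing check. This is an illustrative sketch only: the property labels are shorthand for the twelve items in this post, and `Subtask` and `split_task` are hypothetical names, not an existing tool. Landing in the "agent" bucket means only that a subtask is a candidate for agent work, not that an agent will do it well.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List

# Shorthand labels for the twelve properties discussed in the post (illustrative).
HUMAN_ONLY_PROPERTIES: FrozenSet[str] = frozenset({
    "accountability", "persistent_position", "real_risk", "binding_commitment",
    "tacit_knowledge", "relationship", "novel_situation", "identity_tradeoff",
    "apology", "earned_trust", "contrarian_priority", "stopping_judgment",
})


@dataclass(frozen=True)
class Subtask:
    """One separable piece of a larger engineering task (hypothetical type)."""
    name: str
    requires: FrozenSet[str] = frozenset()  # which of the twelve properties it needs


def split_task(subtasks: List[Subtask]) -> Dict[str, List[str]]:
    """Route each subtask by whether it needs any human-only property."""
    plan: Dict[str, List[str]] = {"agent": [], "human_with_agent_support": []}
    for sub in subtasks:
        unknown = sub.requires - HUMAN_ONLY_PROPERTIES
        if unknown:
            raise ValueError(f"unknown properties for {sub.name!r}: {sorted(unknown)}")
        # Any required property keeps a human as the owner; the agent can assist.
        owner = "human_with_agent_support" if sub.requires else "agent"
        plan[owner].append(sub.name)
    return plan


if __name__ == "__main__":
    release = [
        Subtask("write migration script"),
        Subtask("approve production rollout", frozenset({"accountability"})),
        Subtask("tell customer the new date",
                frozenset({"binding_commitment", "relationship"})),
    ]
    print(split_task(release))
```

Tagging subtasks rather than whole tasks is the point of the design: a single release can contain both agent-suitable work and decisions a human must own.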
