AI Commoditizes Features, But Judgment Compounds

The Core Argument

Real professional engineers will still be needed for excellence. Not because AI can’t write code, but because excellence is qualitatively different from functional. The gap between the two requires judgment, taste, and accountability that no model improvement can close.

AI raises the floor dramatically. It does not move the ceiling.

[Figure: before and after AI. Before: excellence (judgment, taste, accountability) sits above functional (code that runs). After: excellence is unchanged (still judgment); the functional floor rises to AI-assisted code that runs, passes tests, and looks competent.]

The Dignity Problem (Most Underappreciated)

The threat to engineers isn’t only economic. When expert judgment stops being consulted, something human is lost: the social recognition that hard-won expertise deserves. A doctor whose clinical judgment is displaced by AI loses the acknowledgment that fifteen years of intuition was worth something. The income is secondary.

This is a different kind of harm than job displacement and deserves its own conversation.

“Judgment + AI > Professionals” Is Weaponized Dunning-Kruger

Non-engineers with AI tools get just enough coherent output to look competent. This is more dangerous than obvious incompetence. It fools everyone longer, including the person doing it.

The professional sees second-order failures coming. The amateur with AI doesn’t know there is a second order.

The Tradesman Analogy: Where It Holds and Where It Breaks

What actually protects trades: physical reality checks (the fridge either gets cold or it doesn’t), special tool access, licensing for hazardous materials, liability, code inspections.

The problem with software: its output is pure information. A non-plumber’s solder joint actually leaks. A non-engineer’s AI-generated app can genuinely run. There is no physical reality to resist.

What the equivalent protection might look like in software:

  • Complex systems integration (AI-assisted apps work at 1,000 users; fail catastrophically at 100,000)
  • Security and adversarial reasoning (not “write code that works” but “what does a motivated attacker do with this?”)
  • Production debugging of distributed systems (race conditions, cascading failures)
  • Professional accountability: someone has to sign off, and their reputation is on the line

The protection isn’t tool access. It’s complexity that bites back. Systems that reward deep knowledge by visibly punishing its absence.

Excellence vs. Functional: A Qualitative Distinction

A better model raises the bar for “good enough.” It doesn’t touch excellence. Excellence requires:

  • Knowing when requirements are wrong. AI executes on requirements. It doesn’t challenge them.
  • Taste about what not to build. Knowing which feature request is a symptom of a deeper problem.
  • Conceptual integrity over time. Holding the whole system in your head and noticing when new pieces violate its coherence.
  • The 3am judgment call. Partial information, pressure, three competing hypotheses. You pick which risk to take and you own it.
  • Scar tissue. AI has no memory of the deployment that took down prod because of a race condition that looked fine in testing. The professional does.
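To make the “race condition that looked fine in testing” concrete, here is a minimal sketch in Python. Everything in it is hypothetical (the `Inventory` class, `reserve_unsafe`, `reserve_safe`, and the helper functions are invented for illustration), and the `threading.Barrier` exists only to force the bad interleaving on demand. In production the same interleaving would show up by timing, rarely and under load.

```python
import threading
import time


class Inventory:
    """Hypothetical stock counter, used only to illustrate a check-then-act race."""

    def __init__(self, stock):
        self.stock = stock
        self._lock = threading.Lock()

    def reserve_unsafe(self, pause=None):
        # Check and act are separate steps; another thread can slip in between.
        if self.stock > 0:
            if pause:
                pause()  # stands in for real-world latency between check and act
            self.stock -= 1
            return True
        return False

    def reserve_safe(self, pause=None):
        # Holding the lock makes check-and-act one indivisible step.
        with self._lock:
            if self.stock > 0:
                if pause:
                    pause()
                self.stock -= 1
                return True
            return False


def run_two(reserve, pause):
    """Call reserve(pause) from two threads and collect what each returned."""
    results = []

    def worker():
        results.append(reserve(pause))

    threads = [threading.Thread(target=worker) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


def sequential_sales():
    """The kind of test that passes: one caller at a time."""
    inv = Inventory(stock=1)
    return [inv.reserve_unsafe(), inv.reserve_unsafe()]


def concurrent_sales_unsafe():
    """Force both threads past the check before either one acts."""
    inv = Inventory(stock=1)
    barrier = threading.Barrier(2)
    return run_two(inv.reserve_unsafe, barrier.wait)


def concurrent_sales_safe():
    """Same workload, but with the lock guarding check-and-act."""
    inv = Inventory(stock=1)
    return run_two(inv.reserve_safe, lambda: time.sleep(0.01))


if __name__ == "__main__":
    print("sequential:", sequential_sales())
    print("concurrent, unsafe:", concurrent_sales_unsafe())
    print("concurrent, safe:", sorted(concurrent_sales_safe()))
```

The sequential check sells the single item once and refuses the second request, so a one-caller-at-a-time test passes. With two callers past the check at the same moment, the unsafe version sells the one item twice. The code runs, the test is green, and the bug only exists for someone who knows to look for it.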

The CEO Incentive Problem

CEOs optimize for what they can measure: headcount costs, features shipped, time to market. AI makes the visible part of engineering look cheap.

What isn’t on any dashboard: the judgment call that prevented a disaster nobody even knows about. An architectural decision from three years ago that’s the reason the system still works at scale. Taste. The product feels like one thing instead of six things glued together, and nobody can point to a line item that made it so.

The failure mode of cutting engineering judgment is lagged. It takes 12-18 months to show up, and by then it’s attributed to market conditions or leadership changes, not the original decision.

This is structurally identical to cutting a cybersecurity budget. You don’t see the attacks that didn’t happen.

[Figure: Decision: cut engineering judgment → Invisible: coherence degrades, nobody notices → (12-18 months) → Failure: system breaks at scale (outage, data loss, breach) → Blamed on: “market conditions,” “leadership changes.”]

What could change the incentive structure:

  • Liability: if organizations faced professional accountability for AI-generated software failures the way they face accountability for accounting fraud, you’d need engineers the way you need licensed accountants
  • Attributed failures: high-profile disasters clearly traced to “they fired engineers and replaced them with AI”
  • Credentialing and certification in safety-critical domains (medical, aviation, finance)

Arguments don’t change incentive structures. Consequences do.

The Pricing Gatekeeping Layer

Model providers have inserted themselves as a new kind of gatekeeper. The non-engineer thinks they’ve broken free from professional gatekeeping. They haven’t. They’ve changed gatekeepers to ones who charge by the token, can change pricing overnight, can deprecate models, and have no professional obligation to anyone.

The sustainability tension:

  • Major labs (Anthropic, OpenAI) are burning enormous capital; pricing reflects “keep the lights on while we figure this out”
  • Inference costs have dropped ~100x in under three years; open source is closing the capability gap
  • If prices collapse and open source wins → AI becomes nearly free, strengthening the “judgment + AI beats professionals” narrative
  • If prices stay high or providers fail → workflows built on top become hostages
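A rough way to see the exposure is to write the bill down as arithmetic. The sketch below is illustrative only: the function name `monthly_cost`, the workload, and every price in it are hypothetical numbers chosen for easy math, not any provider’s actual rates.

```python
def monthly_cost(requests_per_day, tokens_per_request, usd_per_million_tokens, days=30):
    """Monthly spend in USD for a token-priced workflow (all inputs hypothetical)."""
    total_tokens = requests_per_day * tokens_per_request * days
    return total_tokens / 1_000_000 * usd_per_million_tokens


# Hypothetical workload: 2,000 requests per day at 5,000 tokens each.
today = monthly_cost(2_000, 5_000, usd_per_million_tokens=10.0)
after_price_hike = monthly_cost(2_000, 5_000, usd_per_million_tokens=30.0)
after_price_collapse = monthly_cost(2_000, 5_000, usd_per_million_tokens=0.1)

if __name__ == "__main__":
    print(f"today:          ${today:,.2f}")
    print(f"after a hike:   ${after_price_hike:,.2f}")
    print(f"after collapse: ${after_price_collapse:,.2f}")
```

The workflow is identical in all three cases. The only variable that moved is the one the builder doesn’t control.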

The accountability that can’t be priced by the token: when the AI-assisted decision fails, someone professional has to answer for it.

How to Actually Help

Start with the conversation you’re already in. Make the specific argument, not the vague one. Not “we need engineers” but “here’s the failure that happened because judgment was absent.” Concrete and attributed. Name what the human contributed when AI does something well. Normalize crediting judgment, not just output.

In mentorship, the skills most worth transferring aren’t coding skills. They’re judgment skills: how to challenge a requirement, how to read a system for hidden brittleness, how to develop taste. These were assumed to come with seniority and were never made explicit. Make them explicit now.

At the organizational level, push to measure what actually matters: system health, architectural coherence, incident prevention. If these aren’t measured, they’ll keep being cut because they’re invisible.

The counter-narrative that’s true and needs more voices: AI commoditizes features, but judgment compounds. People who understand this need to say it plainly, with examples, to audiences who influence the decisions.

The structural work (liability frameworks, credentialing, regulation in safety-critical domains) is slow and requires collective action. But the individual can support and accelerate those conversations.