The AI Is the Same for Everyone. The Judgment Is Not.

Written by

Olena Tkhorovska

on June 24, 2026

Andrew and Olena in the Vancouver office

Why access to AI no longer separates good work from generic work, and what does.

The point a senior engineer and a Wharton professor reached separately

On a recent weekly team call, our tech advisor, Yurii Rashkovskii, made a point that stayed with us: when everyone has AI, the only thing left to compete on is the people using it.

During that same stretch of weeks, Wharton professor Ethan Mollick made nearly the same argument on Simon Sinek's podcast [1]. His version: when everyone's output is uniformly high in quality and nothing varies, no one has a competitive edge, and the people who stand out are the ones who introduce variation through their own judgment and taste.

One is a practitioner who spends his days inside production code. The other studies how thousands of people actually use these tools. They arrived at the same conclusion from opposite directions, which is usually a sign the conclusion is worth taking seriously.

The argument is simple. When the tool is universal, the tool stops being the advantage. What remains is the person holding it.

When the tool is universal, the tool stops being the variable

Two things are true about AI output at the same time, and both matter here.

The first is that the baseline is genuinely good. OpenAI's GDPval study measured frontier models against industry professionals with an average of fourteen years of experience, on real deliverables across forty-four occupations [2]. The finding was that the best models are approaching expert quality on a large share of those tasks. This is not a tool that produces obvious junk anymore.

The second is that good output, produced everywhere, is not an advantage to anyone. A study by Doshi and Hauser found that generative AI improves the creativity of each individual writer, but reduces the collective diversity of what a group of writers produces [3]. Everyone gets better, and everyone gets more similar. The model is trained to produce what is statistically most likely, so it pulls toward the center of everything it has seen. Left alone, it produces competent, forgettable work, and it produces roughly the same competent, forgettable work for everyone.

The GDPval researchers were careful about where the value actually sits. Their result was not that models replace experts. It was that models paired with human oversight can do the work faster and cheaper than experts working unaided. The oversight is not a footnote in that sentence. It is the part that makes the rest of it true.

One useful way to picture this comes from a recent conversation between Dan Shipper and Every's Kieran Klaassen [4]. They describe an AI workflow as a sandwich. The agent is the filling that does the work in the middle. The human is the bread on both sides, responsible for framing the problem at the start and reviewing the output at the end. The middle has largely been solved. The two ends are where the judgment lives, and the two ends are what this post is about.

Human-judgment AI sandwich diagram

Experience beats "AI native"

There is a common assumption that younger people, raised on new technology, will naturally be better at AI. The evidence does not support it.

The clearest data comes from the Harvard and Boston Consulting Group study that introduced the idea of a jagged technological frontier, formally published in the journal Organization Science in March 2026 [5]. Working with seven hundred and fifty-eight consultants, the researchers found that AI improved performance on tasks inside its capability, but on tasks that fell outside it, consultants using AI performed nineteen percentage points worse than those working without it. The tool did not just fail to help. It actively pulled people below their own baseline, and most of them could not tell it was happening. The researchers called this mis-calibrated trust, an over-reliance on AI in exactly the places it was weakest.

A companion paper from the same research program is even more direct [6]. Its title is a finding in itself: do not expect junior professionals to teach senior ones how to use generative AI. When junior consultants were asked what they would advise senior colleagues, their recommendations were well-intentioned and, in important ways, wrong. They had adopted the tool, but they could not yet judge it.

This is the part that matters for anyone building software. Using AI and evaluating AI are different skills. A junior person can produce a beautiful report and have no reliable way to know whether it is correct. An experienced person can read the same output and recognize, often immediately, not only that something is wrong but why it is wrong and what was missing from the instruction. The value is not in how fast the model generates. It is in how fast a skilled person can tell whether the result is right. That speed of evaluation comes from experience, and experience is the one input AI cannot hand you.

This is the same pattern we described in "Why AI Coding Tools Pay Off for Senior Engineers," and the same reason we argued in "Small Audience, Big Standards" that AI lowers the cost of building without lowering the standard the work has to meet.

When everyone can build, taste decides what gets built

Evaluation is the first half. The second half comes before any code is written. It is the decision about what is worth building at all.

Mollick frames this through the rise of the director. As production becomes cheap and anyone can generate a competent result, what people increasingly value is a point of view. You know roughly what you are getting from a particular director because of their taste, and that taste becomes the reason to choose them. The same logic is spreading into ordinary work. When the act of making is no longer the hard part, choosing what to make becomes the scarce skill.

Senior engineering leaders are saying this plainly. Cloudflare's chief technology officer, Dane Knecht, put it in one line: building has become easy, and "knowing what to build, and what not to, is the hard part" [7]. A video production studio, BearJam, made the same observation from inside a creative field, arguing that the gap between good and bad work is no longer access to AI but the quality of the decisions, and that what matters is having someone with taste decide what to point the tool at [8].

There is a mechanism underneath this, not just a sentiment. Because the model pulls toward the statistical center, the distinctive direction has to come from the person. Taste is precisely the thing that pulls work away from the generic middle and toward something specific and worth remembering. An AI can generate a hundred plausible options. It cannot tell you which one is right for this audience, this market, and this moment, because it has no stake in whether the choice lands.

This is the work we described in "From Startup Idea to Software Architecture." AI coding tools generate code. They do not decide what the product should be, where the boundaries of the first version belong, or which trade-offs serve the business. Those are judgment calls, and they determine far more about the outcome than the speed of the typing.

The honest counterargument

Not everyone agrees that taste is safe from automation, and the disagreement deserves a fair hearing.

Matt Shumer, co-founder of OthersideAI, wrote that a recent model felt, for the first time, like it had something resembling judgment and taste [9]. His follow-on point was sharper: if taste can be learned from examples, then an AI can learn it too, and there is no reason to treat it as permanently human. This is not an unreasonable position. Models have already absorbed skills that people once assumed were beyond them.

The answer is not that AI can never approximate taste. It is that taste is inseparable from accountability. When a person decides to cut a feature, ship a design, or tell a client that an idea will not serve their users, they are making a bet and putting their judgment behind it. They carry the consequence if they are wrong. A model produces an output and carries nothing. In any work where the decision has real stakes for a real business, the person who is answerable for the choice is doing something the model is not, regardless of how well the model imitates the surface of it.

What this means for how you build

If the differentiator is the person, then how a team is composed is not a detail. It is the strategy.

The conclusion we keep reaching, and that the research keeps supporting, is that a small team of senior people is the structural answer to both halves of this. The judgment that catches AI's mistakes and the taste that decides what is worth building both come from experience, and a senior team has that experience already, rather than needing to grow it under a deadline. We made the fuller economic case for this in "Fewer People, Better Results."

This is also why we describe our work as going deep before building. The building has become the fast part. The thinking that decides whether the building is worth doing has not, and it is where the quality of the result is now mostly determined.

AI has made capability common. It has not made judgment common. The teams and the people who will stand out are the ones who treat the tool as the easy part, and put their effort into the two things it cannot do for them: knowing whether the work is right, and knowing whether it was worth doing at all.

References

[1] Ethan Mollick, interview on "A Bit of Optimism" with Simon Sinek, 2026.
[2] Patwardhan, T., et al. GDPval: Evaluating AI Model Performance on Real-World Economically Valuable Tasks. OpenAI, 2025. https://arxiv.org/abs/2510.04374
[3] Doshi, A., and Hauser, O. Generative AI Enhances Individual Creativity but Reduces the Collective Diversity of Novel Content. Science Advances, 2024. https://www.science.org/doi/10.1126/sciadv.adn5290
[4] Dan Shipper, "The AI Sandwich: Where Humans Excel in an AI World," AI & I podcast (Every), with Kieran Klaassen, April 22, 2026. https://every.to/podcast/transcript-the-ai-sandwich-where-humans-excel-in-an-ai-world
[5] Dell'Acqua, F., McFowland III, E., Mollick, E., et al. Navigating the Jagged Technological Frontier: Field Experimental Evidence of the Effects of Artificial Intelligence on Knowledge Worker Productivity and Quality. Organization Science, March 2026. https://pubsonline.informs.org/doi/10.1287/orsc.2025.21838
[6] Kellogg, K., Lifshitz-Assaf, H., Randazzo, S., Mollick, E., Dell'Acqua, F., et al. Don't Expect Juniors to Teach Senior Professionals to Use Generative AI. Harvard Business School Working Paper 24-074, 2024.
[7] Dane Knecht, Chief Technology Officer, Cloudflare, on X, reported by Fortune, February 2026. https://fortune.com/2026/02/27/openai-sam-altman-taste-get-jobseekers-hired-ai-jobpocalypse/
[8] BearJam (James Hilditch), reported by Business Cheshire, June 2026. https://www.businesscheshire.co.uk/2026/06/10/ai-isnt-the-differentiator-anymore-creative-taste-is/
[9] Matt Shumer, co-founder of OthersideAI, on X, reported by Fortune, February 2026. https://fortune.com/2026/02/27/openai-sam-altman-taste-get-jobseekers-hired-ai-jobpocalypse/

Olena Tkhorovska

CEO + Co-Founder