Generative AI Hits the Wall No Executive Wants to See

Valeria Cruz · May 2, 2026 · 8 min

There is a bet repeated in almost every boardroom that has spent the past two years talking about artificial intelligence: that the technology will allow any professional to do the work of any other, with enough quality to justify a reorganization of talent. It is a bet that looks good on paper. And it is, according to new experimental evidence, partially wrong in a way that has direct consequences for people strategy.

A field experiment conducted at IG, a UK-based fintech firm, with analysis by researchers from Harvard Business School, Stanford University, and the Stanford Digital Economy Lab, put precisely that hypothesis to the test. The results reveal a pattern that leaders who assume total fungibility of their workforce cannot afford to ignore.

The Experiment That Exposed the Invisible Gap

The design was deliberately simple. Three groups of employees received the same task: first, conceptualize an article for the company's website (structure, keywords, central points); then, execute it, producing the full article. The groups represented different degrees of knowledge distance: web analysts accustomed to creating that content, marketing specialists working in adjacent functions who do not write articles, and technology specialists (data scientists and software developers) with no connection to content creation. Some participants had access to IG's generative AI tools; others did not.

The results in the conceptualization phase were decisive. Without AI, web analysts clearly outperformed the other two groups. With AI, all three groups produced conceptualizations that were statistically indistinguishable. The tool acted as a perfect equalizer for abstract and structured work, the kind that follows a reasonable template that even a non-expert can evaluate. Up to that point, the promise of the technology was fulfilled.

In the execution phase, the story changed. Marketing specialists equipped with AI were able to produce articles comparable in quality to those of the web analysts. Technology specialists, with access to exactly the same tools, could not. Post-experiment interviews exposed the mechanism: technology professionals lacked the mental model to judge the quality of the generated output. A data scientist removed calls to action because he considered them unnecessary. Another shortened articles below the optimal threshold for SEO because he preferred brevity. One admitted, with uncommon honesty: "I added things at random to make it look more like marketing." It was not a lack of technical capability. It was domain distance.

The authors named this phenomenon the "generative AI wall effect": the threshold beyond which the tool can no longer close the gap between the expert and the non-expert, regardless of how sophisticated it may be.

What the Wall Reveals About How We Manage Knowledge

The most uncomfortable finding does not lie in the experimental data. It lies in the conclusion that flows from it for organizational architecture: for years, many companies have confused technical skill with domain knowledge. And generative AI was helping them sustain that confusion.

The technology specialists in the experiment did not fail because they did not know how to use the tools. They failed because they lacked the criteria to evaluate whether the output was good. The difference between someone who uses AI to create marketing content effectively and someone who cannot does not lie in the interface or in the prompt. It lies in knowing what an article that converts looks like, why a "sales tone" has value, what length best responds to search algorithms. That knowledge is not transferred in an AI training sprint.

What the experiment documents, in organizational terms, is that generative AI operates effectively on tasks that follow a logic of structured abstraction: outlining, classifying, organizing, generating options within a framework. In those tasks, the user's input can be minimal because the tool has enough structure to function. High-quality execution, on the other hand, requires what the researchers call tacit knowledge: the micro-judgments that a professional makes automatically about tone, emphasis, audience, and strategic intent, and which are impossible to delegate to a tool if the operator does not possess them internally.

This has a direct implication for the way executive teams are thinking about the return on their AI investments. If a company deploys sophisticated tools expecting its technical or administrative workforce to absorb work that previously belonged to marketing, communications, or design specialists, the likely result is not efficiency, but degraded output that no one in the chain has the knowledge to detect. The cost does not appear in an immediate productivity metric. It appears six months later, when content quality has declined, SEO has deteriorated, and no one can point to exactly where the problem occurred.

The Talent Mistake That Efficiency Conceals

There is an underlying organizational dynamic that the research does not name explicitly but that the experiment illustrates with precision: the tendency of leaders to design talent strategies from the logic of cost reduction, rather than from the logic of knowledge distance.

When a company decides that, with AI, it can reassign a software developer to produce marketing content, that decision generally does not pass through an analysis of how much domain knowledge separates the two functions. It passes through a spreadsheet showing available hours and a budget waiting to be optimized. The problem is not the financial logic; the problem is that the financial logic is operating on assumptions of fungibility that the IG experiment has just falsified.

The authors of the study propose a distinction that proves useful for executive teams: AI can facilitate mobility between adjacent functions, where a shared knowledge base exists, but not between distant ones. A marketing coordinator who migrates toward content creation has the conceptual scaffolding to evaluate the generated output and refine it. A software developer making the same move does not, and the available tools do not transfer that scaffolding to them. That difference should be the axis of any redeployment decision, weighed before the gap becomes a visible problem.

The second implication, less obvious, concerns where companies invest their training budgets. The dominant tendency has been to train teams in the use of AI tools: how to structure prompts, how to iterate, how to integrate outputs into workflows. That is necessary but insufficient. The study suggests that the real bottleneck is not technical competence with the tool, but the domain knowledge that makes it possible to judge whether the output is good. Investing in the former without investing in the latter is building speed without direction.

The study also opens a more structural reading: to the extent that AI democratizes conceptualization and ideation, the weight of value shifts toward high-quality execution. And that execution will continue to be a function of accumulated knowledge, not of the sophistication of the interface. Leaders who understand this sooner will reorganize their talent investments accordingly. Those who do not will continue measuring the impact of AI in adoption metrics while the real output quietly deteriorates.

The maturity of an executive team is measured, among other things, by its capacity to build organizations where knowledge flows deliberately and where no critical result depends on a single person — or a single tool — to sustain it. That requires mapping honestly what each function knows, how far it is from the others, and how prepared it is to collaborate with systems that amplify what already exists, but cannot create from scratch what is not there. The organizations that manage to build that map and act on it will not need any particular executive to hold them together. They will have already built the system that allows them to scale on their own.
