Why 95% of AI Pilots Fail Before Producing a Single Result

There is a scene that repeats itself in almost every mid-sized company I know. The technology team presents an artificial intelligence pilot. The initial numbers are promising. The board approves the investment. And six months later, the pilot is still a pilot. Nobody officially kills it. It doesn't scale either. It simply… takes up space on the roadmap and in follow-up meetings.

Dennis Woodside, president and CEO of Freshworks, published an analysis a few days ago in Fortune that gives a name to that phenomenon. And although the article also serves as commercial positioning for his company, the diagnosis it offers deserves to be taken seriously for one simple reason: the external data he cites is uncomfortable for any C-level executive who has spent more than a year promising AI results to their board.

MIT found that 95% of generative AI pilots fail before reaching production. Boston Consulting Group published in September 2025 that 60% of companies generate no material value from AI, and that percentage worsened compared to the previous year, even though the models improved and accumulated experience increased. Freshworks adds its own data point: a quarter of the AI budget in mid-sized companies is consumed by integration, data cleaning, and the effort of connecting systems that were never designed to communicate with one another.

What those three numbers have in common is not the AI model chosen. It is the state of the operational environment where implementation is being attempted.

The Decision That Separates Those Who Move Forward from Those Who Stagnate

Woodside describes the case of Seagate Technology with a precision that is useful precisely because it lacks glamour. The IT team had three months to migrate 30,000 employees to a new service management platform, forced by the expiration of a contract. The obvious decision, the one that almost any organization would make under that kind of pressure, was to move the existing configurations as they were and deal with the problems later. It is the safest path in the short term. It is also the one that guarantees that any layer of AI built on top will operate on flawed foundations.

The Seagate team chose the opposite. They rebuilt from scratch: restructured the service catalogue, established consistent service levels across regions, rewrote the category hierarchies so that tickets could route themselves without an agent having to guess. They did it within the same three-month window. A year later, the AI agent deployed on that foundation deflects approximately one third of incoming tickets, and first-contact resolution is 27% above the industry standard.

That decision — to rebuild rather than replicate — is the axis of Woodside's argument. And it carries an organizational reading that goes well beyond technology.

What Seagate did required someone, at some point in the process, to have a conversation that nobody wanted to have: the one that acknowledges that inherited processes are not simply inefficient, but are an active obstacle to any future improvement. That conversation has a political cost. Saying that the current processes will not be carried over means saying that years of configuration work, customization, and fine-tuning will not travel to the new environment. It means partially invalidating past decisions. Few organizations have any appetite for that under time pressure.

What distinguishes Seagate is not that they had more resources or more time. It is that they had the clarity — or the managerial courage — not to drag the past forward when the contract expired. That is the variable that does not appear in any AI implementation manual.

The Invisible Tax Paid by Those Who Do Not Look at Their Processes

Woodside introduces the concept of a "complexity tax" to describe what happens when a company attempts to implement AI on a fragmented architecture. This is not a decorative metaphor. It is a concrete financial mechanism.

If 25% of the AI budget is lost to integration and data cleaning before the model produces a single useful output, a company that allocates one million dollars to AI is effectively purchasing 750,000 dollars' worth of capacity. The remaining 25%, or 250,000 dollars, is absorbed by accumulated technical debt. For a large enterprise with transformation budgets in the hundreds of millions, that fraction can be tolerated. For a company with between 500 and 20,000 employees, with lean IT teams and less room to maneuver, that loss can be the difference between an initiative that thrives and one that is quietly cancelled in the next budget cycle.
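The complexity-tax arithmetic above can be written out as a short calculation. This is a minimal sketch, not a model from Woodside's article: the function name `effective_ai_capacity` and its parameters are hypothetical, and the 25% default simply reuses the Freshworks figure quoted earlier.

```python
def effective_ai_capacity(budget: float, complexity_tax: float = 0.25) -> tuple[float, float]:
    """Split an AI budget into usable capacity and the share absorbed by integration work.

    complexity_tax is the fraction lost to integration and data cleaning.
    The 0.25 default is the figure cited in the article, used here as an assumption.
    """
    if not 0.0 <= complexity_tax <= 1.0:
        raise ValueError("complexity_tax must be between 0 and 1")
    absorbed = budget * complexity_tax
    usable = budget - absorbed
    return usable, absorbed


# The article's example: a one-million-dollar AI budget.
usable, absorbed = effective_ai_capacity(1_000_000)
print(f"usable: ${usable:,.0f}, absorbed: ${absorbed:,.0f}")
# prints "usable: $750,000, absorbed: $250,000"
```

Changing `complexity_tax` shows how quickly usable capacity shrinks as the environment becomes more fragmented, which is the point of the paragraph above.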

Woodside's argument about "agile companies" — his term for that range of mid-sized organizations — follows a logic that major media outlets tend to ignore because the segment is not as photogenic as the digital transformation stories of Fortune 500 companies. But it is precisely where the productivity battle that AI promises will be won or lost. SMEs represent the majority of the global business fabric. If AI does not work there, the promise of aggregate productivity does not materialize, regardless of what Google, Microsoft, or Amazon do with their own proprietary models.

What makes the analysis more interesting is that the problem does not lie in the selection of the model. It lies in an earlier layer that is harder to resolve: the quality of the operational environment. Data scattered across systems that do not communicate with one another. Workflows defined by the company's history rather than by its logic. Ticket taxonomies, service categories, or product hierarchies that nobody reviewed because they had always "worked well enough." When an AI agent is asked to operate on that infrastructure, it does not fail because the model is bad. It fails because the environment delivers ambiguous, incomplete, or contradictory information to it, and no model can compensate for that.

Robert Lyons, chief technology officer of Katz Media Group, a business unit of 800 people within a 10,000-employee company, offers in Woodside's analysis what is perhaps the most practical advice in the entire article: before deploying any AI tool, his team cleaned and labelled the data, and ran an AI introduction seminar for all employees across the company — delivered not by the IT team but by an independent research firm. The distinction matters. When IT presents AI, it does so with the implicit bias of someone who has a stake in the outcome. When a neutral third party does it, the message lands differently and organizational resistance decreases.

Lyons also describes a value/effort matrix for prioritizing AI projects: ease of implementation on one axis, business value on the other. He starts in the high-value, low-effort quadrant. His warning — "don't start with your worst problem first, you won't generate value" — is a direct critique of a pattern I see frequently in organizations that treat AI as an opportunity to solve the problems that no other initiative was able to resolve. That logic is understandable but counterproductive. The most visible and ambitious AI projects are also the most fragile, because they operate on the most disorganized data environments and the least structured workflows.
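The value/effort matrix described above can be sketched as a small scoring routine. Everything specific here is an assumption for illustration: the 1-to-5 scales, the threshold of 3, the quadrant labels other than the high-value, low-effort starting point, and the example project names are hypothetical and do not come from Lyons or Woodside.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Project:
    name: str
    value: int   # business value, 1 (low) to 5 (high); hypothetical scale
    effort: int  # implementation effort, 1 (easy) to 5 (hard); hypothetical scale


def quadrant(p: Project, threshold: int = 3) -> str:
    """Place a project in one of four quadrants of the value/effort matrix."""
    high_value = p.value >= threshold
    low_effort = p.effort < threshold
    if high_value and low_effort:
        return "start here"       # the quadrant Lyons says to begin with
    if high_value:
        return "plan for later"   # valuable but hard: label is an assumption
    if low_effort:
        return "fill-in"          # easy but low value: label is an assumption
    return "avoid"                # hard and low value: label is an assumption


def prioritize(projects: list[Project]) -> list[Project]:
    """Order projects by quadrant, then by higher value, then by lower effort."""
    order = {"start here": 0, "plan for later": 1, "fill-in": 2, "avoid": 3}
    return sorted(projects, key=lambda p: (order[quadrant(p)], -p.value, p.effort))


# Hypothetical backlog, invented purely to exercise the sketch.
projects = [
    Project("rebuild legacy billing workflow", value=5, effort=5),
    Project("custom forecasting on unlabelled data", value=2, effort=5),
    Project("password-reset deflection", value=4, effort=2),
    Project("meeting-notes summaries", value=2, effort=1),
]

for p in prioritize(projects):
    print(f"{quadrant(p):<15} {p.name}")
```

In this sketch the "worst problem" (the billing rebuild) scores highest on value but lands in the second tier because of its effort, which mirrors the warning not to start there.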

What New Balance Has in Common with Two Steel Companies

Woodside cites two comparisons that deserve separate attention. The first is between Nike and New Balance. Nike operates with 80,000 employees; New Balance with 9,000. Woodside argues that New Balance is gaining competitive ground by consolidating its IT infrastructure onto a single platform with a centralized source of truth, freeing teams from maintenance work and reconfiguring how the business operates. The second comparison involves Nucor and Steel Dynamics, two of the four largest steel manufacturers in the United States, which according to Woodside have maintained decades of operational discipline that produces environments AI can optimize directly.

The pattern connecting these cases is the same one that appears at Seagate: AI works where the operational environment was ready to receive it. Not perfect. Ready. Consolidated data, defined workflows, systems capable of exchanging information without manual intervention, and a measurable outcome that the AI agent needs to improve upon.

This has a managerial implication that few are naming clearly. The companies that are having the most difficulty implementing AI are not the ones that chose the wrong model or hired the wrong consultants. They are the ones that for years made technology decisions prioritizing operational continuity over architectural coherence. Every time someone said "let's add this system because it solves this problem now" without asking how that system was going to integrate with the rest, they were accumulating a liability that today is being collected in the form of AI budget consumed by integration work.

That liability is not a technical failure. It is the accumulated result of architecture conversations that were never had, of technical debt assessments that were postponed because the quarter demanded speed, of inherited configurations that nobody wanted to review because the political cost of questioning them was too high.

What the successful cases Woodside describes have in common is that someone, at some point, made the decision to pay down that liability. Seagate did it under the pressure of an expiring contract. New Balance did it as part of a strategic bet on speed. Nucor and Steel Dynamics did it over decades without knowing they were building the foundation for a competitive advantage in AI.

Those Who Lead Must Pay the Cost of Looking at What the Organization Avoids Naming

There is an element of Woodside's argument that the article touches on tangentially but that deserves to be named directly: most of the organizations that are stuck in AI pilots know it. It is not technical ignorance. It is that the conversation about the state of the operational environment carries a political cost that nobody wants to pay.

Admitting that 25% of the AI budget is lost to integration and data cleaning means admitting that past architectural decisions were costly. Admitting that inherited processes cannot be carried over to the new environment means admitting that years of configuration work will not survive the transition. Admitting that the data is in poor condition means admitting that data quality initiatives from recent years did not deliver what they promised.

Those admissions require something that the dynamics of many boards of directors actively discourage: the ability to name a structural problem without the person who names it becoming associated with the failure they are describing.

The work of those who lead in this context is not technical. It is creating the conditions for those conversations to happen without the messenger bearing the cost. The organizations that are generating results with AI — the cases that Woodside describes — do not have perfect environments. They have leaders who decided to pay the cost of clarity before paying the cost of a failed implementation.

That sequence is not intuitive under pressure. But it is the only one that produces results that do not disappear in the next budget review cycle.