{"version":"1.0","type":"agent_native_article","locale":"en","slug":"why-95-percent-ai-pilots-fail-before-producing-results-mpcabo0h","title":"Why 95% of AI Pilots Fail Before Producing a Single Result","primary_category":"innovation","author":{"name":"Simón Arce","slug":"simon-arce"},"published_at":"2026-05-19T06:02:37.304Z","total_votes":88,"comment_count":0,"has_map":true,"urls":{"human":"https://sustainabl.net/en/articulo/why-95-percent-ai-pilots-fail-before-producing-results-mpcabo0h","agent":"https://sustainabl.net/agent-native/en/articulo/why-95-percent-ai-pilots-fail-before-producing-results-mpcabo0h"},"summary":{"one_line":"Most AI pilots stall not because of bad models but because the operational environment—fragmented data, inherited processes, and unresolved technical debt—was never ready to receive them.","core_question":"Why do the vast majority of enterprise AI pilots fail to reach production, and what distinguishes the organizations that do generate results?","main_thesis":"AI implementation failure is primarily an organizational and architectural problem, not a technology problem. Companies that succeed with AI first pay the political and financial cost of cleaning their operational environment; companies that skip that step consume their AI budget on integration work and produce nothing scalable."},"content_markdown":"## Why 95% of AI Pilots Fail Before Producing a Single Result\n\nThere is a scene that repeats itself in almost every mid-sized company I know. The technology team presents an artificial intelligence pilot. The initial numbers are promising. The board approves the investment. And six months later, the pilot is still a pilot. Nobody officially kills it. It doesn't scale either. It simply… takes up space on the roadmap and in follow-up meetings.\n\nDennis Woodside, president and CEO of Freshworks, published an analysis a few days ago in Fortune that gives a name to that phenomenon. 
And although the article also serves as commercial positioning for his company, the diagnosis it offers deserves to be taken seriously for one simple reason: the external data he cites is uncomfortable for any C-level executive who has spent more than a year promising AI results to their board.\n\nMIT found that **95% of generative AI pilots fail before reaching production**. Boston Consulting Group reported in September 2025 that **60% of companies generate no material value from AI**, and that the percentage had worsened from the previous year even though models improved and accumulated experience increased. Freshworks adds its own data point: **a quarter of the AI budget in mid-sized companies is consumed by integration, data cleaning, and the effort of connecting systems that were never designed to communicate with one another**.\n\nWhat those three numbers have in common is not the AI model chosen. It is the state of the operational environment where implementation is being attempted.\n\n## The Decision That Separates Those Who Move Forward from Those Who Stagnate\n\nWoodside describes the case of Seagate Technology with a precision that is useful precisely because it lacks glamour. The IT team had three months to migrate 30,000 employees to a new service management platform, forced by the expiration of a contract. The obvious decision, the one that almost any organization would make under that kind of pressure, was to move the existing configurations as they were and deal with the problems later. It is the safest path in the short term. It is also the one that guarantees that any layer of AI built on top will operate on flawed foundations.\n\nThe Seagate team chose the opposite. They rebuilt from scratch: restructured the service catalogue, established consistent service levels across regions, and rewrote the category hierarchies so that tickets could route themselves without an agent having to guess. 
They did it within the same three-month window. A year later, the AI agent deployed on that foundation **deflects approximately one third of incoming tickets**, and first-contact resolution is **27% above the industry standard**.\n\nThat decision, to rebuild rather than replicate, is the axis of Woodside's argument. And it carries an organizational reading that goes well beyond technology.\n\nWhat Seagate did required someone, at some point in the process, to have a conversation that nobody wanted to have: the one that acknowledges that inherited processes are not simply inefficient, but are an active obstacle to any future improvement. That conversation has a political cost. Saying that the current processes will not be carried over means saying that years of configuration work, customization, and fine-tuning will not travel to the new environment. It means partially invalidating past decisions. Few organizations have any appetite for that under time pressure.\n\nWhat distinguishes Seagate is not that they had more resources or more time. It is that they had the clarity, or the managerial courage, not to drag the past forward when the contract expired. That is the variable that does not appear in any AI implementation manual.\n\n## The Invisible Tax Paid by Those Who Do Not Look at Their Processes\n\nWoodside introduces the concept of a \"complexity tax\" to describe what happens when a company attempts to implement AI on a fragmented architecture. This is not a decorative metaphor. It is a concrete financial mechanism.\n\nIf 25% of the AI budget is lost to integration and data cleaning before the model produces a single useful output, a company that allocates one million dollars to AI is effectively purchasing 750,000 dollars' worth of capacity. The other 250,000 dollars is absorbed by accumulated technical debt. For a large enterprise with transformation budgets in the hundreds of millions, that fraction can be tolerated. 
For a company with between 500 and 20,000 employees, with lean IT teams and tighter margins for maneuver, that loss can be the difference between an initiative that thrives and one that is quietly cancelled in the next budget cycle.\n\nWoodside's argument about \"agile companies\" — his term for that range of mid-sized organizations — follows a logic that major media outlets tend to ignore because the segment is not as photogenic as the digital transformation stories of Fortune 500 companies. But it is precisely where the productivity battle that AI promises will be won or lost. **SMEs represent the majority of the global business fabric**. If AI does not work there, the promise of aggregate productivity does not materialize, regardless of what Google, Microsoft, or Amazon do with their own proprietary models.\n\nWhat makes the analysis more interesting is that the problem does not lie in the selection of the model. It lies in an earlier and more difficult layer to resolve: the quality of the operational environment. Data scattered across systems that do not communicate with one another. Workflows defined by the company's history rather than by its logic. Ticket taxonomies, service categories, or product hierarchies that nobody reviewed because they had always \"worked well enough.\" When an AI agent is asked to operate on that infrastructure, it does not fail because the model is bad. It fails because the environment delivers ambiguous, incomplete, or contradictory information to it — and no model can compensate for that.\n\nRobert Lyons, chief technology officer of Katz Media Group, a business unit of 800 people within a 10,000-employee company, offers in Woodside's analysis what is perhaps the most practical advice in the entire article: before deploying any AI tool, his team cleaned and labelled the data, and ran an AI introduction seminar for all employees across the company — delivered not by the IT team but by an independent research firm. 
The distinction matters. When IT presents AI, it does so with the implicit bias of someone who has a stake in the outcome. When a neutral third party does it, the message lands differently and organizational resistance decreases.\n\nLyons also describes a value/effort matrix for prioritizing AI projects: ease of implementation on one axis, business value on the other. He starts in the high-value, low-effort quadrant. His warning, \"don't start with your worst problem first, you won't generate value,\" is a direct critique of a pattern I see frequently in organizations that treat AI as an opportunity to solve the problems that no other initiative was able to resolve. That logic is understandable but counterproductive. The most visible and ambitious AI projects are also the most fragile, because they operate on the most disorganized data environments and the least structured workflows.\n\n## What New Balance Has in Common with Two Steel Companies\n\nWoodside cites two comparisons that deserve separate attention. The first is between Nike and New Balance. Nike operates with 80,000 employees; New Balance with 9,000. Woodside argues that New Balance is gaining competitive ground by consolidating its IT infrastructure onto a single platform with a centralized source of truth, freeing up teams from maintenance work and reconfiguring how the business operates. The second comparison involves Nucor and Steel Dynamics, two of the four largest steel manufacturers in the United States, which, according to Woodside, have maintained decades of operational discipline that produces environments AI can optimize directly.\n\nThe pattern connecting these cases is the same one that appears at Seagate: **AI works where the operational environment was ready to receive it**. Not perfect. Ready. 
Consolidated data, defined workflows, systems capable of exchanging information without manual intervention, and a measurable outcome that the AI agent needs to improve upon.\n\nThis has a managerial implication that few are naming clearly. The companies that are having the most difficulty implementing AI are not the ones that chose the wrong model or hired the wrong consultants. They are the ones that for years made technology decisions prioritizing operational continuity over architectural coherence. Every time someone said \"let's add this system because it solves this problem now\" without asking how that system was going to integrate with the rest, they were accumulating a liability that today is being collected in the form of AI budget consumed by integration work.\n\nThat liability is not a technical failure. It is the accumulated result of architecture conversations that were never had, of technical debt assessments that were postponed because the quarter demanded speed, of inherited configurations that nobody wanted to review because the political cost of questioning them was too high.\n\nWhat the successful cases Woodside describes have in common is that someone, at some point, made the decision to pay down that liability. Seagate did it under the pressure of an expiring contract. New Balance did it as part of a strategic bet on speed. Nucor and Steel Dynamics did it over decades without knowing they were building the foundation for a competitive advantage in AI.\n\n## Those Who Lead Must Pay the Cost of Looking at What the Organization Avoids Naming\n\nThere is an element of Woodside's argument that the article touches on tangentially but that deserves to be named directly: most of the organizations that are stuck in AI pilots know it. It is not technical ignorance. 
It is that the conversation about the state of the operational environment carries a political cost that nobody wants to pay.\n\nAdmitting that 25% of the AI budget is lost to integration and data cleaning means admitting that past architectural decisions were costly. Admitting that inherited processes cannot be carried over to the new environment means admitting that years of configuration work will not survive the transition. Admitting that the data is in poor condition means admitting that data quality initiatives from recent years did not deliver what they promised.\n\nThose admissions require something that the dynamics of many boards of directors actively discourage: the ability to name a structural problem without the person who names it becoming associated with the failure they are describing.\n\nThe work of those who lead in this context is not technical. It is creating the conditions for those conversations to happen without the messenger bearing the cost. The organizations that are generating results with AI — the cases that Woodside describes — do not have perfect environments. They have leaders who decided to pay the cost of clarity before paying the cost of a failed implementation.\n\nThat sequence is not intuitive under pressure. 
But it is the only one that produces results that do not disappear in the next budget review cycle.","article_map":{"title":"Why 95% of AI Pilots Fail Before Producing a Single Result","entities":[{"name":"Freshworks","type":"company","role_in_article":"Source of the 25% integration cost data point and platform whose CEO authored the Fortune analysis that anchors the article."},{"name":"Dennis Woodside","type":"person","role_in_article":"President and CEO of Freshworks; author of the Fortune analysis that provides the primary framework and case studies."},{"name":"Seagate Technology","type":"company","role_in_article":"Primary case study illustrating the rebuild-vs-replicate decision and its AI outcomes."},{"name":"Katz Media Group","type":"company","role_in_article":"Secondary case study providing a practical AI prioritization methodology."},{"name":"Robert Lyons","type":"person","role_in_article":"CTO of Katz Media Group; source of the value/effort matrix and the neutral-third-party training approach."},{"name":"New Balance","type":"company","role_in_article":"Case study illustrating how a smaller company gains competitive ground through IT consolidation."},{"name":"Nike","type":"company","role_in_article":"Comparison point for New Balance; represents a larger incumbent with more complex infrastructure."},{"name":"Nucor","type":"company","role_in_article":"Case study of decades-long operational discipline creating AI-ready environments."},{"name":"Steel Dynamics","type":"company","role_in_article":"Case study alongside Nucor; same pattern of operational discipline as structural AI advantage."},{"name":"MIT","type":"institution","role_in_article":"Source of the 95% AI pilot failure statistic."},{"name":"Boston Consulting Group","type":"institution","role_in_article":"Source of the 60% no-material-value statistic and year-over-year worsening trend."},{"name":"Fortune","type":"institution","role_in_article":"Publication where Woodside's original analysis 
appeared."}],"tradeoffs":["Short-term safety of replicating existing configurations vs. long-term AI readiness of rebuilding from scratch","Speed of AI deployment vs. quality of the operational environment that receives it","Political cost of naming structural problems vs. financial cost of failed AI implementations","Allocating AI budget to model capabilities vs. allocating it to data cleaning and integration prerequisites","Starting with ambitious AI projects (high visibility, high risk) vs. starting with low-effort, high-value use cases (lower visibility, higher success rate)","IT-led AI training (faster, cheaper, biased) vs. neutral third-party training (slower, costlier, lower organizational resistance)","Tolerating technical debt in the short term vs. paying it down before AI investment"],"key_claims":[{"claim":"MIT found that 95% of generative AI pilots fail before reaching production.","confidence":"high","support_type":"reported_fact"},{"claim":"BCG published in September 2025 that 60% of companies generate no material value from AI, and that percentage worsened year-over-year despite model improvements.","confidence":"high","support_type":"reported_fact"},{"claim":"25% of the AI budget in mid-sized companies is consumed by integration, data cleaning, and interoperability work before any model produces output.","confidence":"high","support_type":"reported_fact"},{"claim":"Seagate's AI agent deflects approximately one third of incoming tickets after rebuilding its service management foundation from scratch.","confidence":"high","support_type":"reported_fact"},{"claim":"Seagate achieved first-contact resolution 27% above industry standard after the rebuild.","confidence":"high","support_type":"reported_fact"},{"claim":"The primary cause of AI pilot failure is the quality of the operational environment, not the AI model selected.","confidence":"high","support_type":"inference"},{"claim":"Mid-sized companies (500–20,000 employees) are disproportionately harmed 
by the complexity tax relative to large enterprises.","confidence":"high","support_type":"inference"},{"claim":"Organizations that treat AI as a solution to previously unsolvable problems are more likely to fail because those problems sit on the most disorganized data environments.","confidence":"medium","support_type":"inference"}],"main_thesis":"AI implementation failure is primarily an organizational and architectural problem, not a technology problem. Companies that succeed with AI first pay the political and financial cost of cleaning their operational environment; companies that skip that step consume their AI budget on integration work and produce nothing scalable.","core_question":"Why do the vast majority of enterprise AI pilots fail to reach production, and what distinguishes the organizations that do generate results?","core_tensions":["The organizations that most need AI transformation are often the ones whose accumulated technical debt makes AI implementation most expensive and risky","The most visible AI projects attract the most budget but operate on the worst data environments","Boards demand AI results quickly, but the prerequisite work (data cleaning, process restructuring) is slow, unglamorous, and politically costly","The person who names the structural problem risks being associated with the failure they are describing, which creates incentives to avoid the conversation","Mid-sized companies face the complexity tax most acutely but have the least margin to absorb it","AI model quality has improved while organizational AI value generation has worsened—the bottleneck has shifted entirely to the operational layer"],"open_questions":["What is the minimum viable operational environment quality threshold before AI deployment becomes cost-effective for mid-sized companies?","How should organizations measure and report technical debt as a financial liability on AI ROI projections?","Can the rebuild-vs-replicate decision be made systematically, or does 
it always require the kind of managerial courage that is difficult to institutionalize?","What governance structures allow leaders to name structural problems without bearing the political cost of the failures they describe?","Is the 25% integration cost figure stable across industries and company sizes, or does it vary significantly by sector?","At what point does the complexity tax make it more rational to delay AI investment entirely rather than proceed with a fragmented architecture?","How do organizations distinguish between AI pilots that are genuinely exploratory and those that are stalled due to unresolved operational debt?"],"training_value":{"recommended_for":["C-level executives responsible for AI strategy and board reporting","CTOs and CIOs evaluating AI implementation prerequisites","Management consultants advising mid-market companies on digital transformation","AI program managers diagnosing stalled pilots","Investors evaluating operational readiness as a predictor of AI ROI","Business analysts building AI business cases that include integration cost assumptions"],"when_this_article_is_useful":["When evaluating why an AI pilot has not scaled after 6–12 months","When building the business case for data cleaning and process restructuring as prerequisites to AI investment","When advising a mid-sized company on AI implementation sequencing","When a board is asking why AI investments are not producing results despite increased budget","When designing an AI change management program and deciding whether to use internal or external facilitators","When assessing technical debt as a financial liability in AI ROI calculations","When comparing AI readiness across companies in the same industry"],"what_a_business_agent_can_learn":["How to diagnose whether an AI pilot failure is caused by model quality or by the operational environment","How to calculate the effective AI budget after the complexity tax is applied","How to use a value/effort matrix to sequence AI 
projects and avoid the worst-problem-first antipattern","Why delivering AI training through neutral third parties rather than internal IT teams reduces organizational resistance","How to frame the rebuild-vs-replicate decision as a financial and strategic choice rather than a purely technical one","How to identify the political dynamics that cause organizations to avoid the conversations necessary for AI readiness","How decades of architectural discipline create structural competitive advantages in AI adoption that cannot be replicated quickly","How to distinguish between technical AI failure and organizational AI failure when diagnosing stalled pilots"]},"argument_outline":[{"label":"The Stalled Pilot Pattern","point":"95% of generative AI pilots never reach production (MIT). 60% of companies generate no material value from AI despite better models and more experience (BCG, September 2025). 25% of AI budgets in mid-sized companies are consumed by integration and data cleaning before any model produces output (Freshworks).","why_it_matters":"These three data points converge on a single diagnosis: the bottleneck is not the AI model, it is the state of the environment where implementation is attempted."},{"label":"The Seagate Decision","point":"Seagate had three months to migrate 30,000 employees to a new service management platform. Instead of replicating existing configurations, the team rebuilt from scratch: restructured the service catalogue, standardized service levels across regions, and rewrote ticket category hierarchies. One year later, an AI agent deflects ~33% of incoming tickets and first-contact resolution is 27% above industry standard.","why_it_matters":"The decision to rebuild rather than replicate is the axis of the argument. 
It required a politically costly conversation acknowledging that inherited processes were an active obstacle, not just inefficient."},{"label":"The Complexity Tax","point":"When 25% of the AI budget is lost to integration and data cleaning, a $1M AI investment effectively purchases $750K of capacity. For large enterprises this fraction is tolerable; for companies with 500–20,000 employees and lean IT teams, it can be the difference between an initiative that scales and one that is quietly cancelled.","why_it_matters":"The complexity tax is a concrete financial mechanism, not a metaphor. It disproportionately affects the mid-market segment that represents the majority of the global business fabric."},{"label":"The SME Productivity Bet","point":"Mid-sized companies ('agile companies' in Woodside's framing) are where the aggregate productivity promise of AI will be won or lost. If AI does not work there, the macro productivity gains do not materialize regardless of what hyperscalers do with their own models.","why_it_matters":"The dominant AI narrative focuses on Fortune 500 transformations. The real test is in the segment that is less photogenic but economically larger."},{"label":"The Katz Media Playbook","point":"Robert Lyons (CTO, Katz Media Group) cleaned and labelled data before deploying any AI tool, and ran an AI introduction seminar delivered by an independent research firm rather than the IT team. He uses a value/effort matrix and starts in the high-value, low-effort quadrant. His explicit warning: do not start with your worst problem first.","why_it_matters":"Starting with the most ambitious AI project is counterproductive because it operates on the most disorganized data and least structured workflows. 
The pattern of treating AI as a solution to previously unsolvable problems is a reliable predictor of failure."},{"label":"New Balance, Nucor, and Steel Dynamics","point":"New Balance (9,000 employees) is gaining ground on Nike (80,000) by consolidating IT onto a single platform with a centralized source of truth. Nucor and Steel Dynamics have maintained decades of operational discipline that produces environments AI can optimize directly.","why_it_matters":"The common pattern across all successful cases is not model selection or consultant quality. It is that the operational environment was ready—consolidated data, defined workflows, systems capable of exchanging information without manual intervention."}],"one_line_summary":"Most AI pilots stall not because of bad models but because the operational environment—fragmented data, inherited processes, and unresolved technical debt—was never ready to receive them.","related_articles":[{"reason":"Directly addresses the same SME/mid-market gap in the AI narrative—argues that small businesses carry half the economic weight but receive a fraction of the AI conversation, which mirrors this article's thesis about where the AI productivity battle will actually be won or lost.","article_id":12757},{"reason":"Examines why organizations repeat the same AI adoption mistakes the Pentagon already learned to avoid, providing a complementary institutional perspective on the organizational (not technical) roots of AI implementation failure.","article_id":12646},{"reason":"The Solow Paradox framing—technology arriving decades before productivity gains materialize—provides the macro-economic context for why 60% of companies generate no material value from AI despite model improvements, directly supporting the article's central data point.","article_id":12738},{"reason":"Explores the governance gap in enterprise AI infrastructure, relevant to the article's argument that organizational readiness and governance conversations consistently 
lag behind technical deployment.","article_id":12830}],"business_patterns":["Rebuild-before-AI: organizations that clean and restructure their operational environment before deploying AI agents consistently outperform those that layer AI on top of existing fragmentation","Complexity tax: fragmented architectures consume about 25% of AI budgets in mid-sized companies on integration and data cleaning before any value is produced","Value/effort matrix prioritization: starting in the high-value, low-effort quadrant generates early wins that build organizational momentum for harder AI projects","Neutral-party change management: using independent firms rather than internal IT teams to introduce AI reduces resistance and improves adoption","Decades-long operational discipline as AI moat: companies like Nucor and Steel Dynamics that maintained architectural coherence over time now have structural competitive advantages in AI adoption","Worst-problem-first antipattern: organizations that treat AI as a solution to previously unsolvable problems consistently fail because those problems sit on the most disorganized data","Political cost avoidance as implementation blocker: the primary reason AI pilots stall is not technical but organizational; the conversation about operational debt is avoided because of its political cost"],"business_decisions":["Whether to replicate existing configurations or rebuild from scratch when migrating to a new platform under time pressure","Whether to start AI implementation with the highest-visibility problem or with high-value, low-effort use cases","Whether to deliver AI training internally (IT team) or through a neutral third party","Whether to consolidate IT infrastructure onto a single platform before deploying AI agents","Whether to acknowledge and pay down technical debt before allocating AI budget","Whether to treat AI pilots as experiments or as production commitments with defined operational prerequisites","Whether to prioritize operational continuity or 
architectural coherence in technology decisions"]}}