Mercor and the Cost of Building on Borrowed Sand

A $10 billion startup lost some 4 terabytes of confidential data through its reliance on unverified open-source tooling. The breach exposes systemic risks in how AI ventures are built.

Lucía Navarro · April 10, 2026 · 7 min read

On March 31, 2026, Mercor—a $10 billion artificial intelligence training startup—publicly confirmed what cybersecurity researchers already suspected: it had been compromised via LiteLLM, an open-source tool integrated into its infrastructure. This breach resulted in the exfiltration of approximately 4 terabytes of data, including 939 GB of source code from its platform, a 211 GB user database, nearly 3 terabytes of video interview recordings, identity verification documents, internal Slack communications, and personal information—including Social Security numbers—of over 40,000 independent contractors.

Within a week, at least five class-action lawsuits had been filed in federal courts in California and Texas. Meta indefinitely suspended all of its contracts with the company. MercorClaims.com appeared online almost immediately. And the group Lapsus$ put the stolen data up for auction on its leak site.

What I will analyze here is not the attack itself. The technical details are fascinating, but the underlying story matters more for any leader currently building a business on the promise of artificial intelligence.

How the Vulnerability Was Built Before the Attacker Arrived

The entry vector was a software supply chain attack. The group TeamPCP exploited a vulnerability in Trivy, an open-source security scanner, to steal maintainer credentials. With those credentials, they compromised two versions of LiteLLM, an AI gateway with 95 million monthly downloads; the compromise was logged as CVE-2026-33634. From there, the attackers gained lateral access to Mercor's infrastructure. The malicious versions of LiteLLM were live for somewhere between 40 minutes and 3 hours: enough time to wreak havoc.
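This is exactly the class of failure that artifact pinning exists to catch. If a build installs a dependency only when its cryptographic digest matches a value recorded when the release was vetted, a stolen maintainer credential cannot silently push tampered code downstream. Below is a minimal sketch of that check in Python; the file name and digest are hypothetical placeholders, and in production pip's own `--require-hashes` requirements mode enforces the same guarantee natively.

```python
import hashlib
from pathlib import Path

# Digests recorded when each release was first vetted. The file name and
# hash below are illustrative placeholders, not real LiteLLM values.
PINNED_SHA256 = {
    "litellm-1.0.0-py3-none-any.whl": "0" * 64,  # placeholder digest
}

def verify_artifact(path: Path) -> bool:
    """Return True only if the file's SHA-256 matches its pinned digest."""
    expected = PINNED_SHA256.get(path.name)
    if expected is None:
        return False  # unknown artifact: refuse by default
    actual = hashlib.sha256(path.read_bytes()).hexdigest()
    return actual == expected

if __name__ == "__main__":
    wheel = Path("litellm-1.0.0-py3-none-any.whl")
    if not wheel.exists() or not verify_artifact(wheel):
        raise SystemExit(f"Refusing to install {wheel.name}: missing, unpinned, or digest mismatch")
    print(f"{wheel.name} matches its pinned digest")
```

Nothing here is specific to LiteLLM. The point is that the trust decision happens once, at vetting time, instead of implicitly on every download.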

Cory Michal, AppOmni's CISO, described this as "a more consequential category" than prompt injection attacks, precisely because it compromises the infrastructure layer before any interaction occurs with the model. It is not an attack on the product; it is an attack on the foundation.

This highlights a structural issue that no press release from Mercor will solve: the company built a $10 billion value proposition on a critical dependency that it did not control, did not fund, and, according to the lawsuits, did not rigorously audit. LiteLLM is free and open-source. Mercor did not pay for it; it simply benefited from it. And when the tool failed, Mercor absorbed all the damage.

This is not just a problem for Mercor. It reflects the operational model of a significant portion of the sector. AI training startups build on layers of open tools because doing so reduces their variable costs in the short term. But that cost reduction is, more precisely, a risk transfer down the chain: to volunteer maintainers, to independent contractors, and, when the system fails, to workers like the 40,000 whose Social Security numbers are now circulating on underground markets.

The Extractive Model Hidden Behind the $10 Billion Valuation

Mercor operates in AI data labeling and training. Its proposition is to connect independent workers—contractors, in legal terminology—with companies such as Meta, OpenAI, Anthropic, and Google to perform the human-feedback tasks that language models need in order to learn. Essentially, it is a gig-work platform applied to the most strategic segment of the tech economy.

The CEO of Y Combinator stated that the exposed data represents "billions in value" and a national security risk, as it includes selection criteria, labeling protocols, and frontier reinforcement-learning strategies. This is not hyperbole. That information, in the wrong hands, is a direct competitive advantage for anyone building rival models.

But let's look at the structure from the inside: the contractors who generated that value—whose interview recordings, W-9 forms, and conversations with AI systems were stolen—were independent workers without standard labor protections. They provided biometric data, tax information, and hours of cognitive labor. In return, they received payment per task. When the system failed, they were the first to absorb the cost: their identities exposed, their incomes disrupted when Meta paused contracts, and now the expense of identity-risk mitigation, which one plaintiff quantifies as direct losses in the legal filings.

NaTivia Esson, one of the plaintiffs, worked for Mercor from March 2025 to March 2026. She submitted W-9 forms containing her personal information. Today she pays out of pocket for the identity protection services the company did not provide. This illustrates, in concrete operational terms, a model in which risk is externalized to the weakest links in the chain.

The financial architecture that enables a $10 billion valuation requires high margins. In gig platforms, high margins come in part from classifying workers as independent contractors, eliminating the costs of benefits, insurance, and data protection that would be mandatory for employees. That structural saving is precisely what makes the data breach a legal disaster: without a formal employment relationship, the company kept access to highly sensitive information without assuming the custody obligations such information requires.

What Regulatory Silence Amplified

On April 9, 2026, the Schubert Jonckheer & Kolbe law firm publicly announced that Mercor had not notified state attorneys general about the breach, which could constitute a violation of incident notification laws in several states. Mercor did not respond to requests for comment. Berrie AI, developer of LiteLLM, did not respond either. Delve Technologies, the firm that had certified Berrie AI’s regulatory compliance and is now facing accusations of "false compliance as a service" from an anonymous whistleblower, also remained silent.

The coordinated silence of three actors in the same failure chain is, in itself, strategic information. When no party speaks, it is generally because no party has a narrative that withstands scrutiny. The facts do: a compliance firm certified the security of a tool that was compromised. That automated compliance model, in which certification occurs without auditing, is GRC (governance, risk, and compliance) turned into theater.

This pattern has consequences that extend beyond Mercor. If security certifications in the AI sector can be purchased without genuine corresponding controls, then the market operates with structurally asymmetric information. Clients like Meta or OpenAI make integration decisions assuming that their providers have passed genuine audits. When those audits are symbolic, the risk does not disappear: it gets redistributed upstream until an incident makes it visible.

Meta has already absorbed that redistribution. The indefinite pause of all its contracts with Mercor—including projects from its superintelligence division, TBD Labs—is not merely a risk management decision. It signals that a company with Meta's operational sophistication cannot assume its vendors have the controls they claim. The cost of that verification, which Meta implicitly delegated to Mercor, now becomes an internal cost it will have to bear for every equivalent provider going forward.

The Model That Survives Its Own Failures

There exists a structural difference between a business that grows quickly by externalizing its risks and one that grows sustainably by internalizing and managing them as part of its value proposition. Mercor, with its $10 billion valuation, represented the first category. The question the sector must now grapple with is whether there is commercial space for the second.

The answer is yes, and there is business logic behind it. An AI training platform that classifies its workers as employees with guaranteed data protection, that pays for the infrastructure tools it uses—or actively contributes to their maintenance—and that submits its security certifications to genuine independent audits will have higher operating costs. It will also have significantly lower legal, reputational, and operational risk. In a sector where a single incident can pause contracts with the world's largest clients and trigger class-action lawsuits across multiple jurisdictions, that risk reduction has a calculable economic value.

The CEO of Y Combinator stated that the stolen data represents a national security risk. If that’s true—and there are reasons to take it seriously—then the business model that protects that data with paper certifications is not just ethically questionable. It is strategically unviable in the medium term.

Leaders currently building on unfunded open-source infrastructure, on unprotected independent workers, and on unverified compliance certifications are making a financial decision: they are opting for higher margins today at the cost of concentrating risk in a future event that, when it occurs, will be their problem alone. Mercor has just demonstrated how costly that event can be.

The mandate for C-level executives is straightforward: audit what portion of their valuation rests on dependencies they do not control, on workers they do not protect, and on compliance they do not verify. If their business model uses people and shared infrastructure merely as cheap inputs for shareholder value, they already have the answer to how much time remains before that model presents them with the real bill.
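That audit can begin with something as unglamorous as an inventory diff: enumerate what is actually installed and compare it against a reviewed allowlist. Here is a minimal sketch in Python, where the APPROVED manifest is hypothetical and would in practice come from a signed, reviewed source:

```python
from importlib.metadata import distributions

# Hypothetical approved inventory: package name -> exact vetted version.
# The entries below are illustrative pins, not real vetted versions.
APPROVED = {
    "litellm": "1.0.0",
    "requests": "2.31.0",
}

def audit_environment() -> list[str]:
    """Flag installed packages that are unapproved or drifted from their pin."""
    findings = []
    for dist in distributions():
        name = (dist.metadata["Name"] or "").lower()
        pinned = APPROVED.get(name)
        if pinned is None:
            findings.append(f"UNAPPROVED {name}=={dist.version}")
        elif dist.version != pinned:
            findings.append(f"DRIFT {name}=={dist.version} (pinned: {pinned})")
    return findings

if __name__ == "__main__":
    for finding in audit_environment():
        print(finding)
```

Tools such as pip-audit and SBOM generators go further, checking each package against known-vulnerability databases, but the discipline begins with knowing the list at all.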
