AI doesn't fix a broken process. It scales the breakage.

AI applied to a broken process doesn’t fix it. It executes the process faster, at higher volume, with less friction to catch the errors. That’s not a metaphor. That’s the mechanism.

Some time ago, I was brought in on what looked like a data-extraction problem. A law firm wanted to automate client billing — extract data from timesheets, process it, issue the invoice. One field gave me pause before we wrote a line of code: additional costs, entered as free text. The surface problem was parsing: unstructured input, inconsistent formatting. A technical fix.

But I spent several billing cycles watching the process run before we touched anything. What I found underneath: sometimes the currency wasn’t the billing currency. Sometimes costs were entered with VAT included, so the system would calculate VAT on top of VAT. Sometimes what looked like an additional cost was actually a separate line item — of counsel work that should have been billed differently. None of that was visible from the outside. None of it would have been caught by automating the extraction.
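
To make the VAT-on-VAT failure concrete, here is a minimal Python sketch. The 20% rate, the amounts, and the function names are illustrative assumptions, not details from the engagement. It only shows how a pipeline that treats every free-text cost as net of VAT will overbill an entry that was typed in gross.

```python
from decimal import Decimal

VAT_RATE = Decimal("0.20")  # assumed rate, for illustration only


def invoice_total(amount: Decimal, vat_included: bool) -> Decimal:
    """Return the gross amount to invoice for one additional-cost entry."""
    if vat_included:
        # The entry is already gross: adding VAT again would tax the tax.
        return amount
    return amount * (1 + VAT_RATE)


def naive_total(amount: Decimal) -> Decimal:
    """What a pipeline does when it assumes every entry is net of VAT."""
    return amount * (1 + VAT_RATE)


# Hypothetical entry: a 100.00 cost typed in with VAT already included.
entry = Decimal("120.00")
correct = invoice_total(entry, vat_included=True)
wrong = naive_total(entry)
overbilled = wrong - correct
print(correct, wrong, overbilled)
```

The arithmetic is the easy part. The `vat_included` flag is the hard part, because in the free-text field nothing recorded it: an experienced person supplied it from memory.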

The process had been working — just barely — because experienced people were quietly absorbing the inconsistencies at each step. Automate it, and you don’t remove the inconsistencies. You remove the humans who were catching them.

That’s the failure mode. In a manual process, a confused step produces a confused person who pauses, asks a question, maybe catches the error. Automated, the same step produces a wrong output at scale, consistently, until someone notices. The technology isn’t the problem. The sequence is.

The causes of failure are upstream of the model

The billing story is a specific instance of a general pattern: by the time you’re choosing which AI to use, most of the decisions that determine whether it will work have already been made — or avoided.

RAND’s 2024 study, based on interviews with 65 experienced practitioners, confirms the pattern. Across AI project failures, 84% of interviewees cited leadership-driven causes as primary: the wrong business problem was chosen, success metrics didn’t map to real outcomes, workflow fit was poor, and simpler tools would have worked better. The researchers’ summary is blunt: overcoming AI failure is “more about humans than the machines.”

Healthcare implementation literature shows the same structure. A systematic review of 92 studies found adoption barriers that were organizational before they were technical — workflow misalignment, inadequate training, weak leadership support, data quality. Model performance wasn’t the headline issue. It rarely is.

The European Commission’s Joint Research Centre goes further: there is a real gap between testing a system and embedding it in operational practice, and most organizations underestimate how much changes between a successful pilot and something that actually runs. The gap isn’t technical. It’s organizational, cultural, and governance-related.

None of this is surprising if you’ve sat inside a transformation program. What gets scaled is the existing logic of the process — its assumptions, its decision rules, its exceptions, its handoffs. If those are unclear, AI makes them execute more consistently. Consistently wrong is worse than intermittently wrong, because it takes longer to notice and longer to walk back.

The sequence I’ve learned to run

Here’s the order of operations I use. I’m not going to call it a framework. It’s closer to a discipline I’ve developed by watching the right things go wrong — across automation projects, not just AI ones. A VBA macro, an RPA deployment, an AI layer: the discipline is the same regardless of the technology. The failure mode is also the same. Which means the solution predates the current AI moment by a long way.

Understand before you build. Define the problem precisely — not “we want to use AI for claims processing” but: what is the actual decision being made? Who makes it? What information does it depend on? What does a good outcome look like, and how would you know if you got one? RAND’s finding that teams consistently optimize for the wrong objective is a direct consequence of skipping this step. The billing project forced me to spend time here before I could see the actual problem. That wasn’t delay — it was the work.

Make the information usable. You cannot model what you haven’t measured, and you cannot measure what you haven’t defined. French firm-level research on AI adoption found it was more likely to succeed where firms already had data security systems, ICT-trained staff, and digital channels for collecting customer information in place. In other words, adoption tends to follow a hierarchy: business information is typically digitized first, and AI comes after. The sequencing is empirical, not theoretical.

Simplify before you automate. Michael Hammer’s argument from 1990 still holds: technology investments disappoint when they mechanize old ways of working instead of redesigning them. The Ford accounts-payable case is the clean example — the breakthrough wasn’t faster invoice matching. It was redesigning the process so invoice matching largely disappeared. Complexity that exists because nobody examined it is a liability you do not want to automate.

Establish governance before scale. Clarify who is accountable for the output. Build in human override. Define what gets monitored and how often. Governance should be in place before deployment, not retrofitted after something goes wrong. In regulated environments, this isn’t optional hygiene — it’s the difference between a recoverable error and a systemic audit finding.
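
One way the override and monitoring pieces might look in code is sketched below. This is a sketch under stated assumptions, not a prescribed design: the 0.90 threshold, the `Output` record, and the override-rate metric are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass

REVIEW_THRESHOLD = 0.90  # assumed cut-off; set per use case and revisit it


@dataclass
class Output:
    item_id: str
    confidence: float          # system's score in [0, 1] (assumed available)
    overridden: bool = False   # set when a reviewer changes the result


def route(output: Output, threshold: float = REVIEW_THRESHOLD) -> str:
    """Send low-confidence outputs to a human reviewer instead of auto-issuing."""
    return "auto" if output.confidence >= threshold else "human_review"


def override_rate(outputs: list[Output]) -> float:
    """Share of outputs a human changed: one candidate monitoring metric."""
    if not outputs:
        return 0.0
    return sum(o.overridden for o in outputs) / len(outputs)


# Hypothetical batch of invoice outputs.
batch = [
    Output("inv-001", 0.97),
    Output("inv-002", 0.62, overridden=True),
    Output("inv-003", 0.91),
    Output("inv-004", 0.88),
]
routes = [route(o) for o in batch]
rate = override_rate(batch)
print(routes, rate)
```

A rising override rate is a signal to look at the process again, not only at the model.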

Only then choose the automation layer. Rules-based automation for stable, high-volume, low-variance tasks. AI where prediction, classification, language, or pattern recognition adds real value — and where the uncertainty that comes with probabilistic output can be governed. RAND is clear on this: many organizations reach for AI where simpler tools would work better. That’s not a technology failure. It’s a problem-definition failure that manifests as a technology choice.
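
The choice described above can be written down as a small decision function. The field names and return strings are assumptions made for this sketch. Real scoping involves judgment that a handful of booleans cannot capture.

```python
from dataclasses import dataclass


@dataclass
class Task:
    stable: bool                # decision rules rarely change
    high_volume: bool
    low_variance: bool          # inputs look alike from case to case
    needs_prediction: bool      # prediction, classification, language, patterns
    uncertainty_governed: bool  # override, monitoring, accountability in place


def choose_layer(task: Task) -> str:
    """Pick the simplest layer that fits; refuse when prerequisites are missing."""
    if task.stable and task.high_volume and task.low_variance:
        return "rules-based automation"
    if task.needs_prediction and task.uncertainty_governed:
        return "AI"
    if task.needs_prediction:
        return "not yet: govern the uncertainty first"
    return "not yet: revisit the problem definition"


# Hypothetical examples.
invoice_matching = Task(True, True, True, False, True)
free_text_costs = Task(False, True, False, True, False)
print(choose_layer(invoice_matching))
print(choose_layer(free_text_costs))
```

The ordering is deliberate: the rules-based branch is checked first, so AI is only considered when the simpler tool does not fit.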

I don’t have a name for this sequence, and I’m not sure it needs one. What it needs is to be run before the AI conversation starts — not during it.

The nuance worth keeping

There’s a version of “fix the process before you apply AI” that goes too far, and it’s worth naming so this piece isn’t read as a blanket refusal.

AI can sometimes help you see a broken process. That’s different from using it to run one.

A 2024 study by Brynjolfsson, Li, and Raymond found that a generative AI assistant in customer support increased productivity by 14% on average — and 34% for novice and lower-skilled workers. The researchers’ interpretation: the system captured the tacit knowledge of the best performers and distributed it to everyone else. It codified what good looked like and made it accessible. That’s AI improving a process, not scaling a broken one — because the goal was diagnostic and distributional, not just faster execution.

There’s also a real role for AI in messy documentation — clinical notes, requirements definitions, work artifacts. A 2025 early-adopter study on AI-assisted clinical documentation found that clinicians who benefited most were often those who hadn’t yet optimized their documentation workflows. The AI gave them enough structure to start. Not a finished process, but a scaffold.

So the more precise claim is this: AI can help reveal, diagnose, and codify a broken or implicit process. AI used to execute or scale that process before it’s understood is much more likely to amplify the defects than to fix them. The first is diagnostic. The second is operational. The failure mode lives in the second.

That distinction matters for how you scope a use case. If you’re using AI to analyze your current process and surface what’s inconsistent — that’s a reasonable starting point. If you’re using AI to run the process at scale before you’ve worked through the questions above — you’re accelerating toward a problem, not away from one.

The institutional version of this

In a regulated environment, the stakes are asymmetric in a specific way.

A slow, manual, slightly broken process produces localized failures. A person makes a wrong call. Someone catches it. It gets corrected. The exposure is bounded.

An automated or AI-assisted, slightly broken process produces consistent failures — at volume, with an audit trail that shows you knew what the system was doing. Australia’s Robodebt scheme is the clearest illustration. An automated data-matching system applied an income-averaging method that turned out to be unlawful, at scale. The Royal Commission found that officials knew the approach lacked legal basis and moved ahead anyway, enabling automated debt recovery with no human intervention. The automation didn’t create the legal problem. It removed the human checkpoints that had been — in whatever imperfect way — slowing things down enough for someone to notice.

The Dutch childcare-benefits scandal follows the same structure: an algorithmic risk-profiling system amplified a punitive enforcement process, including nationality as a risk factor, scaling discriminatory outcomes across thousands of families before the error became visible and undeniable.

These are extreme cases. But the mechanism is the same at smaller scale. If your process has unclear decision rules, contested data definitions, or exception-handling that lives in someone’s head, AI makes those problems faster, broader, and significantly harder to walk back. In a regulated environment, “significantly harder to walk back” can mean regulatory sanction, reputational damage, and a remediation program that costs many times what the automation saved.

The Copilot version of this problem sits one layer up. My piece “The model was never the problem — context was” looks at what happens when you put a capable AI retrieval layer on top of documentation that was never organized for retrieval — the same family of mistake, from a different angle. A capable layer on unprepared foundations underdelivers. Here, the foundations are process. There, they’re data. The pattern is the same.

What to do with this

The next time an AI initiative lands on your desk — as a sponsor, a skeptic, or the person accountable for delivery — run it through the sequence before the conversation reaches model selection.

Can you describe the process precisely, including who decides what and how you’d know if the output was wrong? Is the data clean and structured enough to model? Has the workflow been simplified, or are you automating existing complexity? Is governance in place before scale, or scheduled for after launch?

If the answer to any of those is no, you don’t have an AI problem yet. You have a process problem — and the AI will find it for you at a scale you won’t enjoy.
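
For teams that like an explicit gate, the four questions can be kept as a simple checklist. This is a minimal sketch: the keys and the helper names are hypothetical, and any question left unanswered is treated as a no.

```python
# The four questions from the sequence, keyed by a short hypothetical label.
SEQUENCE = {
    "process_described": "Can you describe the process, who decides what, "
                         "and how you'd know if the output was wrong?",
    "data_usable": "Is the data clean and structured enough to model?",
    "workflow_simplified": "Has the workflow been simplified?",
    "governance_in_place": "Is governance in place before scale?",
}


def open_questions(answers: dict[str, bool]) -> list[str]:
    """Return the keys still answered 'no' (a missing answer counts as 'no')."""
    return [key for key in SEQUENCE if not answers.get(key, False)]


def ready_for_model_selection(answers: dict[str, bool]) -> bool:
    """Only when every question is a 'yes' does model selection come up."""
    return not open_questions(answers)


# Hypothetical initiative: governance was never discussed.
answers = {
    "process_described": True,
    "data_usable": True,
    "workflow_simplified": False,
}
print(open_questions(answers))
print(ready_for_model_selection(answers))
```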

Solve the process first. This discipline is older than AI, and it will outlast whatever comes after it. The technology changes. The sequence doesn’t.
