You run Solibri. You get 500 issues. Then someone still has to click through Revit and fix each one by hand.
Almost every QC tool in AEC finds problems. Yet almost none of them fix anything.
That’s the gap nobody’s talking about.
The Report That Creates More Work
Here’s a scene that plays out in every large firm, every week: A BIM manager runs a model audit. As expected, the tool does its job beautifully, flagging naming violations, parameter mismatches, coordinate drift, and missing fire ratings. Eventually, the report lands with 847 issues flagged.
And then what?
Someone opens Revit and clicks to the first issue. After fixing it manually, they move to the second. Eight hours later, they’ve addressed maybe 200 items. Meanwhile, the other 647 will wait until the next deadline panic, when someone runs the checker again and discovers half of them are still there, plus 300 new ones.
This is the state of the art in BIM quality control. Consider this: Solibri’s customers found 2.5 billion issues in their projects last year. Two and a half billion flags raised. How many were fixed automatically? Notably, the marketing doesn’t say—because the answer is essentially zero.
Solibri’s own documentation describes the workflow explicitly: rules flag issues, issues become slides or BCF files, and then “someone needs to fix this problem.” In other words, the tool checks while humans fix. That’s the design.
Think about that for a moment. Your word processor doesn’t just underline misspelled words in red—it offers to correct them. One click, fixed. By contrast, the most advanced QC tools in our industry are spellcheckers that can only underline. Consequently, the fixing is still entirely on you.
Why We Stopped at Detection
This isn’t because vendors are lazy or because the technology doesn’t exist. Rather, the industry stopped at detection for reasons that made sense at the time—and still make sense if you don’t think carefully about what’s actually possible.
First, there’s liability. When Solibri flags a wall with the wrong fire rating, it’s making an observation. However, when a tool changes that wall’s fire rating, it’s making a decision. Observations are safe, whereas decisions carry consequences.
The automated compliance literature is explicit about this dynamic: legal liability and ambiguity in regulations make vendors conservative. According to a 2025 review on automated compliance checking, automating decisions requires regulations to be precisely digitizable and raises complex coordination between regulators, legal experts, and automation providers. Understandably, nobody wants to be the vendor whose tool changed a fire rating incorrectly on a building that later failed inspection—or worse.
Second, there’s design intent. Not every flagged issue is actually wrong. Sometimes a naming convention violation is intentional—a one-off condition that the designer knows about and accepts. Similarly, a parameter mismatch might reflect a design decision that hasn’t been documented yet.
Research on BIM quality control and code checking stresses that many rule violations are context-dependent. Specifically, constraint violations can represent deliberate design choices, and systems often require human judgment to interpret them. Therefore, detection tools can flag deviations from rules, but they can’t know whether those deviations are mistakes or choices.
Third, there’s multi-party coordination. In AEC, the model isn’t owned by one person. Instead, architects, structural engineers, MEP consultants, and contractors all touch the same data, often under different contracts with different scopes.
Clash detection guides consistently frame resolution as a multi-stakeholder process: multiple disciplines review clashes, decide who moves what, and agree on changes before models are updated. As a result, an automated fix that changes something in “your” part of the model might break something in “their” part. Detection is safe precisely because it doesn’t cross those boundaries, while remediation is dangerous because it does.
These are real constraints, and they’re not going away. Consequently, they explain why, even as AI gets smarter and more capable, our QC tools remain stuck in detect-only mode.
The Difference Between Flagging and Fixing
Here’s where it gets technical—but stay with me, because this is the part that matters.
When a QC tool flags an issue, it’s doing pattern matching. Essentially, it looks at the model, compares it to a set of rules, and reports where the model doesn’t match. This process is computationally straightforward since the tool doesn’t need to understand the model deeply—it just needs to check conditions.
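That pattern-matching step is simple enough to sketch. The snippet below is a toy checker over plain dictionaries, not any vendor’s API; the naming convention, element fields, and rule set are all invented for illustration:

```python
import re

# Hypothetical rule: wall type names must follow "WAL_<function>_<thickness>mm".
NAME_PATTERN = re.compile(r"^WAL_[A-Z]+_\d+mm$")

def check_model(elements):
    """Compare each element against the rules and report where it doesn't match.

    `elements` is a list of dicts standing in for model elements -- a stand-in
    for what a real checker would read from IFC or a Revit model.
    """
    issues = []
    for el in elements:
        if el["category"] == "Wall" and not NAME_PATTERN.match(el["type_name"]):
            issues.append((el["id"], "naming", f"'{el['type_name']}' violates convention"))
        if el["category"] == "Wall" and el.get("fire_rating") is None:
            issues.append((el["id"], "parameter", "missing fire rating"))
    return issues  # a report -- nothing in the model changes

model = [
    {"id": 1, "category": "Wall", "type_name": "WAL_EXT_200mm", "fire_rating": "60min"},
    {"id": 2, "category": "Wall", "type_name": "Basic Wall 1", "fire_rating": None},
]
print(check_model(model))
```

Note what the function returns: a list of flags. The model itself is untouched, which is exactly the detect-only posture described above.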
When a tool fixes an issue, however, it’s doing something fundamentally different. It’s not just identifying a gap between current state and desired state—it’s executing a transformation that moves the model from one to the other. Moreover, that transformation has to meet three critical requirements:
It must be deterministic. In practice, the same issue should produce the same fix every time. If you run the fixer twice on identical inputs, you should get identical outputs. Non-deterministic fixes—the kind that might do different things depending on context or randomness—are terrifying in a liability-heavy domain.
It must be bounded. Specifically, the fix should change only what needs to change, and nothing else. A fix for a naming violation shouldn’t accidentally modify geometry, nor should a fix for a parameter value cascade into unexpected changes elsewhere in the model.
It must be reversible. If the fix was wrong—perhaps that deviation was intentional, or the rule was misconfigured, or the fix broke something downstream—you need to be able to undo it cleanly. Not “restore yesterday’s backup,” but rather undo that specific change while keeping everything else that happened since.
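The three requirements can be made concrete in a few lines. This is a minimal sketch over plain dicts, assuming a hypothetical `ParameterFix` operation (not any real Revit or Solibri API): deterministic because the same inputs always produce the same output, bounded because it touches exactly one parameter on one element, and reversible because the before-value travels with the fix:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ParameterFix:
    """One remediation as a discrete operation with a before- and after-state."""
    element_id: int
    parameter: str
    before: object
    after: object

    def apply(self, model):
        el = model[self.element_id]
        # Bounded: only this one parameter changes; deterministic: refuse to
        # apply if the model no longer matches the state the fix was computed for.
        if el[self.parameter] != self.before:
            raise ValueError("model drifted since the fix was computed")
        el[self.parameter] = self.after

    def revert(self, model):
        el = model[self.element_id]
        if el[self.parameter] != self.after:
            raise ValueError("cannot revert: value changed after the fix")
        el[self.parameter] = self.before

model = {42: {"fire_rating": None}}
fix = ParameterFix(element_id=42, parameter="fire_rating", before=None, after="60min")
fix.apply(model)
assert model[42]["fire_rating"] == "60min"
fix.revert(model)  # clean undo of this one change, nothing else touched
assert model[42]["fire_rating"] is None
```

The guard clauses are the point: a fix that checks its preconditions and carries its own undo is auditable in a way that an opaque script run never is.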
This is where most people’s mental model breaks down. Typically, they think of “undo” as a file-level operation: you have today’s model and yesterday’s model, so if something goes wrong, you roll back to yesterday.
But that’s not rollback at the change level—that’s rollback at the time level. Unfortunately, it means you lose everything else that happened between yesterday and today: all the legitimate work, all the other fixes, all the design progress.
Cloud environments like BIM 360 and Autodesk Docs implement file-level version history, which means you can restore an earlier version of a model. Nevertheless, this is snapshot rollback, not operation-level patches. Research on BIM version control highlights exactly this limitation. For instance, papers by Esser and others propose graph-based diff-and-patch mechanisms for asynchronous BIM collaboration, explicitly modeling changes as patches that can be applied and merged. However, these remain research prototypes, not standard features in Revit, Solibri, or Navisworks today.
Real remediation requires transaction semantics. Each fix becomes a discrete operation with a before-state and an after-state. You can inspect what changed, approve or reject it, and reverse it independently of other changes. In essence, the model has a history of patches, not just a history of snapshots.
This is how databases work. This is how version control works. Unfortunately, this is not how BIM works today.
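The difference between snapshot rollback and operation-level patches is easy to demonstrate. In this toy sketch (plain dicts again, with conflict checking deliberately omitted), one automated fix can be reversed while unrelated work done afterwards survives—something a restore-yesterday’s-file rollback cannot do:

```python
# A toy model and a history of patches, not snapshots.
model = {"wall_1": {"fire_rating": "30min"}, "wall_2": {"name": "Basic Wall"}}
history = []

def apply_patch(model, element, key, after):
    before = model[element][key]
    model[element][key] = after
    history.append({"element": element, "key": key, "before": before, "after": after})

def revert_patch(model, index):
    p = history[index]
    # A real system would check that no later patch touched the same field;
    # this sketch assumes the patches are independent.
    model[p["element"]][p["key"]] = p["before"]

apply_patch(model, "wall_1", "fire_rating", "60min")   # automated fix
apply_patch(model, "wall_2", "name", "WAL_INT_100mm")  # unrelated legitimate work

revert_patch(model, 0)  # undo just the fire-rating fix...
print(model["wall_1"]["fire_rating"])  # "30min" again
print(model["wall_2"]["name"])         # ...while the rename survives
```

Swap the dicts for database rows and the list for a write-ahead log and this is, in miniature, the transaction semantics the section describes.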
Why “Just Let the AI Fix It” Is the Wrong Answer
Here’s where the AI hype crashes into reality.
Yes, large language models can write code, understand context, and propose solutions to problems. But “propose” is not “execute safely.”
When an LLM suggests a fix, it’s generating a plausible response based on patterns in its training data. It might be right. Alternatively, it might be subtly wrong in ways that don’t become apparent until the model goes to coordination, or to the contractor, or to the building inspector. Worse still, it might be confidently wrong in ways that look right on the surface.
Gartner recently predicted that over 40% of agentic AI projects will be canceled by the end of 2027, due to escalating costs, unclear business value, or inadequate risk controls. Additionally, they noted that many vendors are engaging in “agent washing”—rebranding existing products like chatbots and RPA tools without substantial agentic capabilities. Of the thousands of agentic AI vendors, Gartner estimates only about 130 are real.
The problem isn’t that AI can’t help with remediation. Instead, the problem is that AI without rails is AI without accountability. If you can’t see exactly what it changed, if you can’t verify the change against explicit rules, if you can’t reverse it cleanly—then you haven’t automated remediation. You’ve just created a new category of risk.
The firms that will actually deploy agents on production data aren’t the ones chasing the most powerful AI. Rather, they’re the ones building the most robust governance: preview before apply, explicit approval gates, full audit trails, and granular rollback.
Think of it this way: the AI is the engine, while the governance is the steering wheel and the brakes. Nobody wants a car that’s all engine.
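Those four governance elements—preview, approval gate, audit trail, rollback-ready records—can be wired around any fixer. The sketch below is illustrative only: the proposal tuple, the `approver` callback, and the model shape are all assumptions, not a real product’s interface:

```python
from datetime import datetime, timezone

audit_log = []

def governed_apply(model, proposal, approver):
    element, parameter, before, after = proposal
    # 1. Preview before apply: show exactly what would change.
    preview = f"{element}.{parameter}: {before!r} -> {after!r}"
    # 2. Explicit approval gate: a human (or a policy) must say yes.
    approved = approver(preview)
    # 3. Full audit trail: record the decision either way, with before-values
    #    kept so any applied change can later be rolled back.
    audit_log.append({
        "when": datetime.now(timezone.utc).isoformat(),
        "change": preview,
        "approved": approved,
    })
    if approved:
        model[element][parameter] = after
    return approved

model = {"wall_7": {"fire_rating": None}}
proposal = ("wall_7", "fire_rating", None, "60min")

# In production the approver would be a review UI; here it's a stand-in lambda.
governed_apply(model, proposal, approver=lambda preview: True)
print(model["wall_7"]["fire_rating"])  # "60min", with an audit entry behind it
```

The AI can generate the proposal however it likes; the steering wheel and brakes live entirely in `governed_apply`.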
What This Means for Your Stack
If you’re a digital practice leader looking at your QC and automation stack, here’s the honest assessment:
Detection is mature and commoditized. Solibri, Navisworks, model checkers, custom Dynamo scripts—you already have plenty of ways to find problems. The technology is established, and the workflows are standard. As a result, you’re not going to get order-of-magnitude gains from yet another checker.
Remediation, by contrast, is largely unsolved. Not because it’s impossible, but because it requires a different architecture than detection. It requires treating model changes as transactions, implementing deterministic and bounded operations, and building governance that lets humans stay in control while automation does the heavy lifting.
Admittedly, there are niche tools and research prototypes that implement limited forms of auto-correction—some Dynamo workflows, specialized plugins, and academic systems for geometric error correction in IFC. Nevertheless, there is no widely adopted, governed, patch-based remediation layer equivalent to what production systems in other industries take for granted.
The gap between detection and remediation is where all the manual labor lives. It’s where your BIM managers spend their days clicking through issues one by one. Furthermore, it’s where standards drift happens, because enforcement depends on human vigilance that can’t scale. According to industry studies, up to 70% of all rework in construction can be traced back to engineering and design-related errors—exactly the domain where governed remediation could make a difference.
If your “AI QC tool” generates a report but can’t execute a fix, it’s not automation. It’s simply a fancier way to create a to-do list.
The question isn’t whether AI will eventually help with remediation—it will. The real question is whether it will help through ungoverned, non-deterministic suggestions that someone still has to manually implement, or through governed, reversible, auditable patches that actually close the loop.
The spellchecker that can fix typos exists in every other domain. We just haven’t built it for BIM yet.