There is a lot of heat in the market about AI in compliance tooling, and not always a lot of light. This article is an attempt to describe, as honestly as we can, what AI does well in GRC work today, what it does not do well, and what a sensible governance posture looks like for organizations adopting AI-augmented tools in their compliance programs.

An honest state of the world

In 2026, large language models are genuinely useful for a specific set of compliance tasks: reading long regulatory documents, drafting structured requirements from them, suggesting mappings between controls and requirements, classifying evidence by type and category, flagging anomalies in structured data, and summarizing dense documents in plain language. They are not useful, or not reliable enough to be useful without heavy human review, for novel regulation interpretation, judgment calls about what is and is not compliant, negotiation with regulators or auditors, and anything that requires the model to take accountability for an outcome. The boundary between the two categories is not arbitrary: it tracks whether the task has a clear correct answer that can be verified from the source material.

Six tasks AI does well

  • Extraction: turning a regulator's prose into structured requirements with citations.
  • Summarization: producing a plain-language summary of a long document, with the original always available for verification.
  • Mapping suggestion: proposing which controls might satisfy a new requirement, based on an existing library.
  • Classification: tagging new evidence with a suggested control and category.
  • Drafting: producing a first draft of a control description, a finding, or a policy section for a human to edit.
  • Anomaly flagging: noticing when something in structured data looks unlike the patterns in the rest of the data.

All of these are tasks where a human review pass is both fast and valuable, and where the model's output can be verified against source material in minutes, not hours.
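To make that verification property concrete, here is a minimal sketch, in Python, of what extraction output can look like. The ExtractedRequirement shape and the verbatim-citation check are illustrative assumptions, not any particular vendor's schema; the point is that a reviewer, or a pre-review check, can confirm the citation against the source in seconds.

    from dataclasses import dataclass

    @dataclass
    class ExtractedRequirement:
        requirement_id: str   # stable ID assigned at extraction time (hypothetical)
        text: str             # the model-drafted structured requirement
        citation: str         # verbatim span from the regulator's document
        source_document: str  # which document the citation came from

    def citation_is_verifiable(req: ExtractedRequirement, source_text: str) -> bool:
        # The property that makes extraction reviewable in minutes:
        # the cited span must appear verbatim in the source document.
        return req.citation in source_text

A suggestion whose citation cannot be found in the document it claims to come from is rejected before a human ever spends time on it.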

Four tasks AI does not do well

  • Judgment: deciding whether a particular edge case is in scope or not.
  • Accountability: standing behind a decision in front of a regulator or an auditor.
  • Novel regulation interpretation: deciding what a regulator means by a newly published phrase that has no precedent.
  • Negotiation: talking to auditors or regulators about findings.

These tasks share a property: the correct answer cannot be verified purely from source material, because the answer depends on context the model does not have and cannot have.

The human in the loop pattern

The only responsible way to deploy AI in a compliance program, today, is with a human review gate. The pattern is simple: the AI produces an output. The output lands in a review queue, attributed to the model run that produced it. A named human, usually a compliance manager, reviews the output, edits it if needed, and either approves or rejects it. Only after approval does the output affect the canonical record. Rejection feedback is captured but not used to silently mutate the model's behavior. Every approval is timestamped and attributed. When an auditor later asks whether AI affected a specific control, the honest answer is in the audit trail.
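Here is a minimal sketch of the review gate in Python. Every name in it (AIOutput, approve, reject, the in-memory lists) is an illustrative assumption; a real system would persist these records and enforce the state transitions in its data layer.

    from dataclasses import dataclass
    from datetime import datetime, timezone
    from typing import Optional

    @dataclass
    class AIOutput:
        output_id: str
        model_run_id: str         # attribution: which model run produced this
        feature: str              # e.g. "extraction" or "mapping_suggestion"
        content: str              # the suggestion itself
        status: str = "pending"   # pending -> approved | rejected
        reviewer: Optional[str] = None
        reviewed_at: Optional[datetime] = None
        rejection_reason: Optional[str] = None

    CANONICAL_RECORD: list[AIOutput] = []  # stand-in for the system of record
    AUDIT_TRAIL: list[dict] = []           # timestamped, attributed decisions

    def approve(output: AIOutput, reviewer: str,
                edited_content: Optional[str] = None) -> None:
        # Only an approved output ever touches the canonical record.
        if edited_content is not None:
            output.content = edited_content
        output.status = "approved"
        output.reviewer = reviewer
        output.reviewed_at = datetime.now(timezone.utc)
        CANONICAL_RECORD.append(output)
        AUDIT_TRAIL.append({"output_id": output.output_id,
                            "model_run_id": output.model_run_id,
                            "feature": output.feature,
                            "decision": "approved",
                            "reviewer": reviewer,
                            "at": output.reviewed_at})

    def reject(output: AIOutput, reviewer: str, reason: str) -> None:
        # Rejection feedback is recorded for reporting, but nothing here
        # silently retrains or mutates the model.
        output.status = "rejected"
        output.reviewer = reviewer
        output.reviewed_at = datetime.now(timezone.utc)
        output.rejection_reason = reason
        AUDIT_TRAIL.append({"output_id": output.output_id,
                            "model_run_id": output.model_run_id,
                            "feature": output.feature,
                            "decision": "rejected",
                            "reviewer": reviewer,
                            "reason": reason,
                            "at": output.reviewed_at})

The design choice that matters is the separation: the review queue and the canonical record are distinct, and the only path between them is a named human's timestamped decision.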

A governance checklist

When adopting an AI-augmented GRC tool, ask these questions:

  • Does every AI output land in a review queue before affecting canonical data?
  • Is every AI output attributed to the run that produced it?
  • Can users reject AI suggestions with a reason, and is that reason recorded?
  • Is there an opt-out for each individual AI feature?
  • Is customer data kept out of shared model training unless you give explicit written consent?
  • Can you run a quality report on the AI features your organization uses? (A sketch follows this list.)
  • Are the vendor's incident response processes for AI behavior documented?

These questions do not need fancy answers; they need honest ones. A vendor that cannot answer them should not be shipping AI in a compliance tool.
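On the quality-report question, here is a minimal sketch of what such a report can be, assuming an audit trail whose entries carry the feature and decision fields from the review-gate sketch above. The shape is an assumption, not a vendor API.

    from collections import Counter

    def quality_report(audit_trail: list[dict]) -> dict[str, dict]:
        # Per-feature approval and rejection counts, computed from the
        # same audit trail the review gate writes. If the trail is real,
        # no extra instrumentation is needed.
        counts: Counter = Counter()
        for entry in audit_trail:
            counts[(entry["feature"], entry["decision"])] += 1
        report = {}
        for feature in {f for f, _ in counts}:
            approved = counts[(feature, "approved")]
            rejected = counts[(feature, "rejected")]
            total = approved + rejected
            report[feature] = {"approved": approved,
                               "rejected": rejected,
                               "approval_rate": approved / total if total else 0.0}
        return report

A vendor that cannot produce numbers like these, per feature, either does not keep the audit trail or does not look at it.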

What "safe AI" does not mean

Safe AI does not mean the model never makes a mistake. It does not mean the model is 100% accurate on every task. It does not mean the vendor can predict every failure mode. Safe AI means the mistakes are caught before they affect the canonical record, because a human is in the loop; it means failure modes are instrumented and monitored, because the vendor invests in governance; and it means the organization using the AI is never forced to take accountability for an AI output they did not see.

Key takeaways

  • In 2026, AI is useful in GRC for extraction, summarization, mapping suggestion, classification, drafting, and anomaly flagging.
  • It is not reliable for judgment, accountability, novel interpretation, or negotiation.
  • The human in the loop pattern is the only responsible deployment model today.
  • Ask vendors seven specific governance questions before deploying their AI features.
