Why Human Risk Management Is the Control Plane for AI at Work
Executive Summary

As AI moves from experimentation into everyday work, many leaders find themselves in a familiar place: a sense that something important is changing faster than their ability to see or manage it.
It’s not that organizations lack data. Most can tell you how accurate a model is, how often it runs, or how much it costs.
What’s harder is seeing what’s happening around the technology.
Where people rely on AI without checking it. Where unofficial tools appear because the sanctioned ones don’t quite fit. Where confidence quietly erodes, or escalation stops because “the system already decided.”
This is a common moment, not a failure.
Research from MIT Sloan Management Review shows that more than 70% of AI initiatives fail to deliver expected value, largely due to organizational and decision-design challenges rather than technical limitations (https://sloanreview.mit.edu/article/why-ai-projects-fail/).
This article explores what it actually means to measure human risk in AI‑driven work, why traditional metrics struggle here, and how organizations are beginning to surface the signals that matter — thoughtfully, proportionately, and without turning measurement into surveillance.
Most existing metrics were built for a world where technology behaved predictably and risk could be traced to discrete events.
Security teams learned to count incidents. Learning teams learned to count completions. IT teams learned to count uptime and performance.
AI changes the shape of risk.
Risk now emerges between people and systems — inside judgment calls, handoffs, trust decisions, and moments of pressure. These are not signs of failure; they are normal features of complex work.
Gartner’s research on AI governance notes that many AI-related risk events stem not from model malfunction, but from how humans interpret, trust, and act on AI outputs, particularly where accountability and escalation are unclear (https://www.gartner.com/en/articles/ai-governance-why-it-matters).
It’s why leaders often sense something is off long before they can prove it.
Measurement, quite naturally, lags reality.
This point matters, and it’s worth being explicit.
Measuring human risk is not about scoring individuals, tracking productivity, or policing behavior. Those concerns come up often — and they’re reasonable.
At its best, measurement focuses on systems of work, not people. It looks for patterns, trends, and conditions that make good decisions easier — or harder — to sustain.
This approach mirrors decades of human factors research in safety‑critical industries like aviation and healthcare, where understanding context has proven far more effective than assigning blame.
When measurement stays systemic and purpose‑driven, trust tends to follow.
As AI becomes part of daily decision‑making, several new classes of human risk signal begin to appear.
When official tools don’t quite fit the reality of work, people adapt.
They paste data into public models. They run parallel tools. They automate tasks quietly.
This behavior is often described as “shadow AI,” but it’s rarely malicious.
Deloitte’s State of AI in the Enterprise research shows that employees most often turn to unsanctioned tools to maintain productivity and meet expectations — not to bypass controls (https://www.deloitte.com/global/en/our-thinking/insights/topics/analytics/state-of-ai-in-the-enterprise.html).
Seen this way, shadow AI isn’t a failure of policy. It’s a signal — that work design, tooling, or governance hasn’t yet caught up with how work actually gets done.
Organizations that measure where and why shadow AI appears tend to learn far more than those that focus solely on shutting it down.
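As a minimal sketch of what "measuring where and why" can look like in practice, the snippet below tallies self-reported shadow AI use by workflow and by stated reason. The data source (a pulse survey) and the field names are assumptions for illustration, not a prescribed schema.

```python
# Tally where shadow AI shows up and why, from hypothetical self-reported data.
from collections import Counter

reports = [
    {"workflow": "client reporting", "tool": "public chatbot",  "reason": "approved tool too slow"},
    {"workflow": "client reporting", "tool": "public chatbot",  "reason": "approved tool too slow"},
    {"workflow": "code review",      "tool": "browser plugin",  "reason": "no sanctioned equivalent"},
    {"workflow": "data cleanup",     "tool": "personal script", "reason": "export limits in approved tool"},
]

# Where it appears: reports per workflow.
by_workflow = Counter(r["workflow"] for r in reports)

# Why it appears: stated reasons point at gaps in work design, tooling,
# or governance rather than at individuals.
by_reason = Counter(r["reason"] for r in reports)

print(by_workflow.most_common())
print(by_reason.most_common())
```

Even a rough tally like this shifts the conversation from "who broke policy" to "which workflows the sanctioned tools are failing."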
One of the most important — and least visible — signals is how often AI outputs are checked, questioned, or overridden.
In healthy environments, verification varies with risk. High‑impact decisions invite more scrutiny. Challenge feels acceptable. Escalation paths are used without stigma.
Gartner notes that over‑reliance on AI outputs, combined with insufficient human challenge, is a common contributor to AI risk incidents — especially where accountability is unclear (https://www.gartner.com/en/articles/ai-governance-why-it-matters).
Measuring verification behavior doesn’t slow work. It reveals whether trust is well‑calibrated or quietly drifting.
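One simple way to see whether trust is calibrated is to compare review and override rates across decision risk tiers. The sketch below assumes AI-assisted decisions are logged with a risk tier and flags for review and override; the field names and log format are assumptions for illustration.

```python
# Compare how often AI outputs are reviewed or overridden, by risk tier.
from collections import defaultdict

decisions = [
    {"risk": "high", "reviewed": True,  "overridden": False},
    {"risk": "high", "reviewed": True,  "overridden": True},
    {"risk": "high", "reviewed": False, "overridden": False},
    {"risk": "low",  "reviewed": False, "overridden": False},
    {"risk": "low",  "reviewed": True,  "overridden": False},
]

stats = defaultdict(lambda: {"total": 0, "reviewed": 0, "overridden": 0})
for d in decisions:
    s = stats[d["risk"]]
    s["total"] += 1
    s["reviewed"] += d["reviewed"]
    s["overridden"] += d["overridden"]

# Healthy pattern: review rates rise with risk. A high-risk tier with a
# review rate near zero suggests trust is drifting, not calibrated.
for risk, s in stats.items():
    print(risk,
          "review rate:", s["reviewed"] / s["total"],
          "override rate:", s["overridden"] / s["total"])
```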
AI is often positioned as reducing effort. In practice, it can shift cognitive load in subtle ways.
People may move faster while feeling less confident — or feel confident without fully understanding why. Both patterns matter.
Research in cognitive psychology suggests that rising mental load and declining decision confidence often precede errors, particularly in complex, AI‑supported tasks.
Signals such as hesitation, repeated confirmation, or avoidance of ownership are not performance issues. They are early indicators of risk — and opportunities to adjust before problems compound.
In AI‑driven work, the speed of escalation matters as much as the existence of escalation paths.
How long does it take for someone to raise a concern? What happens after the first signal?
Delays here are rarely technical. More often, they reflect culture, clarity, and psychological safety.
Measuring escalation latency helps organizations see where risk is being absorbed quietly instead of surfaced early. In practice, this often starts with simple questions: how long it takes from an AI‑related concern being noticed to it being raised; who it is raised with; and what happens next. Some organizations instrument this through incident and ticketing data, others through post‑decision reviews, pulse surveys, or facilitated debriefs that ask teams when they hesitated to escalate and why. The goal is not perfect precision, but enough visibility to spot patterns where culture, role clarity, or trust are slowing the flow of critical information.
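For teams that want to instrument this through ticketing data, a minimal sketch follows. It assumes AI-related concerns are recorded with "noticed" and "raised" timestamps; the schema and team names are hypothetical.

```python
# Estimate escalation latency per team from hypothetical ticket records.
from datetime import datetime
from statistics import median

tickets = [
    {"team": "ops",     "noticed": "2024-05-01T09:00", "raised": "2024-05-01T11:30"},
    {"team": "ops",     "noticed": "2024-05-03T14:00", "raised": "2024-05-06T10:00"},
    {"team": "finance", "noticed": "2024-05-02T08:15", "raised": "2024-05-02T08:45"},
]

def hours_between(start: str, end: str) -> float:
    fmt = "%Y-%m-%dT%H:%M"
    return (datetime.strptime(end, fmt) - datetime.strptime(start, fmt)).total_seconds() / 3600

latencies: dict[str, list[float]] = {}
for t in tickets:
    latencies.setdefault(t["team"], []).append(hours_between(t["noticed"], t["raised"]))

# Median latency per team is enough to spot where concerns are absorbed
# quietly instead of surfaced early; precision is not the goal.
for team, values in latencies.items():
    print(team, "median escalation latency (hours):", round(median(values), 1))
```

Qualitative inputs (post-decision reviews, pulse surveys, facilitated debriefs) can then explain why the slow spots are slow.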
Organizations further along the curve tend to shift away from blunt metrics and toward behavioral indicators: where shadow AI appears and why, how often AI outputs are verified or overridden, how cognitive load and decision confidence are shifting, and how quickly concerns are escalated.
These measures don’t promise certainty. They offer visibility.
This is where measurement connects back to Human Risk Management as the AI control plane.
Metrics are not the goal. They are the feedback loop. When leaders can see how humans and AI are actually interacting, they can adjust work design, tooling, and governance before small problems compound.
Without this loop, governance stays theoretical.
As organizations begin to explore measurement, there are a few patterns we see come up again and again — not because teams are careless, but because this is genuinely new territory.
One common trap is focusing on intent rather than behavior: what people say they do, instead of what actually happens under pressure. Another is collecting data without creating the space or mechanisms to act on it, which can leave teams feeling observed but unsupported.
We also see measurement struggle when it’s framed as enforcement rather than learning, or when a single set of indicators is applied uniformly across very different roles and contexts.
Measurement creates value when it’s treated as a feedback loop — something that helps teams reflect, adapt, and improve how work gets done over time.
Most organizations don’t need a perfect measurement system. They need a starting point.
The most effective first step is usually to identify where AI already touches high‑impact decisions, and which of the signals above would show those decisions drifting earliest.
From there, measurement can grow iteratively—guided by real risk, not abstract models.
Instead of asking: “What metrics should we report?”
Leaders are beginning to ask: “What signals would tell us early that AI‑driven work is drifting out of alignment?”
That shift is where meaningful measurement begins.
Frequently asked questions

Does measuring human risk mean monitoring or scoring individual employees?
No. Effective measurement focuses on systems, workflows, and conditions, not individual performance.

Can these signals be captured automatically?
Some can. Others require qualitative input and reflection.

How does measurement relate to AI governance?
Measurement provides the evidence governance needs to move from principle to practice.

Does this approach work alongside structured AI risk and culture models?
Yes. Measurement is the foundation that allows those models to function.
In AI‑driven work, risk doesn't announce itself. It shows up quietly, unless you know how to look.

This article is part of the AI Workforce Transformation series.