The Productivity Dividend vs the Judgment Deficit
AI is raising output. Leaders must ensure we aren’t lowering judgment.
Written by Adrian Maharaj
(Views are my own, not my employer’s.)
The newest usage datasets from OpenAI and Anthropic confirm what operators see every day: AI isn’t just for grunt work. It’s handling cognitively expensive work: writing, information interpretation, practical guidance, systems analysis, and programming. In OpenAI’s classification, Practical Guidance, Seeking Information, and Writing account for nearly four-fifths of conversations, with writing the dominant work task. Anthropic’s mapping of millions of Claude chats to O*NET skills shows a similar pattern: cognitive skills such as critical thinking and writing are everywhere; manual tasks are rare.
What the data means for leaders
This is positive. The biggest gains are landing where knowledge work actually creates value: thinking through ambiguity, structuring language, and turning analysis into action. But there’s a second-order effect that the dashboards don’t show: if we perpetually “ask the model” first, we risk outsourcing the very skills we need to evaluate, redirect, and sometimes reject its outputs.
We’ve seen this movie. Calculators improved accuracy and speed but eroded number sense in teams that stopped practicing mental math. GPS transformed fleet productivity and ETA reliability, but everyday wayfinding faded for drivers who never built a mental map. Spellcheck and autocomplete brought professional polish to more writers, while some lost fluency with grammar and structure when they stopped drafting before editing.
AI is poised to do both at once. It will dramatically raise the ceiling on throughput and polish while quietly lowering the floor on judgment if we let the tool become the thinker instead of the amplifier.
Three horizons
0–2 years: Throughput. Assistants draft, summarize, and pattern‑match across messy inputs. Not adopting is the bigger risk here; the guardrail is to preserve problem framing and sense‑making by doing a first pass before turning to the model.
3–5 years: Decision shape. As assistants compress options and propose “likely best” paths, they start to frame the decision. This is where counterfactual thinking and dissent become protective assets. Without them, fluent outputs launder thin logic.
5–10 years: Memory + simulation. Assistants become institutional memory and scenario engines. Organizations that keep judgment muscles strong will compound; those that don’t will generate immaculate artifacts built on stale priors.
Some open thoughts on preserving judgment
Manual‑mode sprints. One significant task per week is solved without AI, then compared to an AI‑assisted version. This preserves problem setup and builds calibration.
Explain‑your‑why. For material decisions, require a brief rationale plus at least one credible alternative. Reward teams that can articulate why the model’s frame might be wrong.
Apprenticeship loops. Juniors draft first, then use AI for critique and iteration, then get a human debrief. The goal is independent judgment, not prettier decks.
Provenance. Mark what the model did, what the human did, and why. This turns outputs into teachable assets and makes audits possible.
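One way to make provenance concrete is a lightweight record attached to each artifact. A minimal sketch follows; the record structure and field names are hypothetical, not a standard, and real teams would adapt them to their own templates.

```python
from dataclasses import dataclass, field
from datetime import date

# Hypothetical provenance record for one AI-assisted artifact.
# Field names are illustrative; adapt to your own document templates.
@dataclass
class ProvenanceNote:
    artifact: str
    model_contribution: str   # what the AI produced
    human_contribution: str   # what a person changed or decided
    rationale: str            # why the human kept, edited, or rejected the output
    logged_on: date = field(default_factory=date.today)

note = ProvenanceNote(
    artifact="Q3 pricing brief",
    model_contribution="First-draft summary of competitor pricing",
    human_contribution="Rewrote recommendation; added churn-risk caveat",
    rationale="Model's frame ignored contract renewals due in Q4",
)
print(note.artifact, "-", note.rationale)
```

Even a record this small turns an output into a teachable asset: the rationale field is the part a reviewer or auditor actually needs.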
The metric to watch
Track judgment density: the number of explicit trade‑offs, alternatives, and assumptions captured in the artifacts that matter (PRDs, exec briefs, client presentations). If judgment density falls while outputs look more polished, you’re borrowing cognition you don’t own. On the flip side, this is where the tools can accelerate learning: if users ask, on their own, “why that?” or “what does that mean?”, the optimist in me hopes AI steepens the curiosity curve.
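As a rough illustration of how judgment density might be approximated, the sketch below counts simple marker phrases per 100 words. The marker lists are entirely hypothetical; a real implementation would tune them to the team's own document conventions.

```python
import re

# Hypothetical marker phrases signalling explicit judgment in a document.
# Illustrative only; real teams would tune these to their own templates.
MARKERS = {
    "trade-offs": [r"trade-?off", r"at the cost of", r"we give up"],
    "alternatives": [r"alternativ", r"instead of", r"we also considered"],
    "assumptions": [r"assum", r"we believe", r"depends on"],
}

def judgment_density(text: str) -> float:
    """Count judgment markers per 100 words of the artifact."""
    words = len(text.split())
    if words == 0:
        return 0.0
    hits = sum(
        len(re.findall(pattern, text, flags=re.IGNORECASE))
        for patterns in MARKERS.values()
        for pattern in patterns
    )
    return 100.0 * hits / words

doc = ("We assume demand stays flat. We also considered a phased rollout "
       "instead of a big-bang launch; the trade-off is slower feedback.")
print(round(judgment_density(doc), 1))  # → 19.0
```

Tracking this number over time, per artifact type, is the point; the absolute value matters less than whether it falls as outputs get more polished.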
A final caution
Across domains, research on automation bias shows that over‑reliance on decision aids can reduce vigilance and induce both omission and commission errors. The lesson is not “avoid automation”; it’s to design workflows where humans routinely verify and reframe automated suggestions.
Sources (primary)
OpenAI, How People Use ChatGPT (NBER working paper and overview).
Anthropic, Which Economic Tasks Are Performed with AI? Evidence from Millions of Claude Conversations.
Aviation skill‑decay evidence (Office of Inspector General).
GPS and spatial memory (Nature).
“Google effect” on memory (Science).