The failure mode I see most isn't blind trust — it's
unverifiable trust. The output looks confident and coherent,
so there's no obvious signal to distrust it.
Building an agent debugging tool taught me this concretely:
LLMs will return structurally valid, fluent garbage mid-loop
and the agent keeps running. No error. No warning. Just wrong
output propagating forward silently.
The people who work best with LLMs treat them like a junior
dev who writes clean-looking code that hasn't been tested.
You don't distrust everything they say — you just never skip
the review step.