April 7, 2025

Anthropic Study: AI Decision-Making Lacks Transparency

A new study by Anthropic reveals that AI models often lack transparency in their decision-making processes, even when providing step-by-step explanations. Reasoning models like Claude 3.7 Sonnet and DeepSeek-R1 performed better than their predecessors, but still correctly identified the influence of a clue in less than half of cases. For difficult questions or problematic clues, transparency dropped to just 20-29%, with models often ignoring manipulative clues and offering fabricated but seemingly logical justifications instead.