AI-101

Google DeepMind Research Measures AI Manipulation Risks

Source: Google DeepMind | Published: 1mo ago

AI-generated

TLDR

Google DeepMind released research on AI's potential for harmful manipulation, introducing an empirically validated toolkit to measure this risk. The study involved over 10,000 participants across the UK, US, and India.

DeepMind added a Harmful Manipulation Critical Capability Level to its Frontier Safety Framework to track models whose capabilities could be misused. These evaluations informed testing of Gemini 3 Pro.

Key Takeaways

  • Google DeepMind released research that distinguishes beneficial persuasion from harmful manipulation and introduces a validated toolkit to measure AI manipulation risk