Jun 07, 2026 Can Activation Oracles Bypass Safety Training? Reading Harmful Knowledge from a Model That Refuses