January 2026
Quoting Boaz Barak, Gabriel Wu, Jeremy Chen and Manas Joglekar
OpenAI researchers are developing a "confessions" training method where AI models produce a second output that is rewarded solely for honesty. This approach creates an anonymous tip line where models...
Claude Cowork Exfiltrates Files
Researchers discovered a clever attack that bypasses Claude Cowork's security protections by exploiting its trusted domain whitelist to exfiltrate user files. The attack uses the victim's own AI...
AI News: Gemini UPGRADED, GPT-5.3 LEAKED, Claude Cowork, AI Doctors!
Major AI companies are making strategic moves toward personal, agentic, and specialized AI systems. Google introduced personal intelligence in Gemini, Anthropic launched Claude Cowork for autonomous...
Identity for AI Agents - Patrick Riley & Carlos Galan, Auth0
Patrick Riley and Carlos Galan from Auth0 present their approach to securing AI agents through identity management. They demonstrate four key pillars for agent security: AI needs to know who you are,...
Anthropic invests $1.5 million in the Python Software Foundation and open source security
Anthropic has committed $1.5 million over two years to the Python Software Foundation, with a focus on ecosystem security. This addresses a critical funding gap after the PSF withdrew from an NSF...