January 2026

Simon Willison's Weblog January 15, 2026

Quoting Boaz Barak, Gabriel Wu, Jeremy Chen and Manas Joglekar

OpenAI researchers are developing a "confessions" training method where AI models produce a second output that is rewarded solely for honesty. This approach creates an anonymous tip line where models...


Simon Willison's Weblog January 14, 2026

Claude Cowork Exfiltrates Files

Researchers discovered a clever attack that bypasses Claude Cowork's security protections by exploiting its trusted domain whitelist to exfiltrate user files. The attack uses the victim's own AI...


Universe of AI January 14, 2026

AI News: Gemini UPGRADED, GPT-5.3 LEAKED, Claude Cowork, AI Doctors!

Major AI companies are making strategic moves toward personal, agentic, and specialized AI systems. Google introduced personal intelligence in Gemini, Anthropic launched Claude Cowork for autonomous...


AI Engineer January 14, 2026

Identity for AI Agents - Patrick Riley & Carlos Galan, Auth0

Patrick Riley and Carlos Galan from Auth0 present their approach to securing AI agents through identity management. They demonstrate four key pillars for agent security: AI needs to know who you are,...


Simon Willison's Weblog January 13, 2026

Anthropic invests $1.5 million in the Python Software Foundation and open source security

Anthropic has committed $1.5 million over two years to the Python Software Foundation, with a focus on ecosystem security. This addresses a critical funding gap after the PSF withdrew from an NSF...
