Anthropic hires former OpenAI safety lead to head up new team

Jan Leike, a leading AI researcher who earlier this month resigned from OpenAI before publicly criticizing the company's approach to AI safety, has joined OpenAI rival Anthropic to lead a new "superalignment" team.

In a post on X, Leike said that his team at Anthropic will focus on various aspects of AI safety and security, specifically "scalable oversight," "weak-to-strong generalization" and automated alignment research.

A source familiar with the matter tells TechCrunch that Leike will report directly to Jared Kaplan, Anthropic's chief science officer, and that Anthropic researchers currently working on scalable oversight -- techniques to control large-scale AI's behavior in predictable and desirable ways -- will move to report to Leike as his team spins up.

https://twitter.com/janleike/status/1795497960509448617?s=46&t=HH_KyELe3gP37tViFmnCoQ

In many ways, Leike's team sounds similar in mission to OpenAI's recently dissolved Superalignment team. The Superalignment team -- which Leike co-led -- had the ambitious goal of solving the core technical challenges of controlling superintelligent AI in the next four years, but was often hamstrung by OpenAI's leadership.

Anthropic has often attempted to position itself as more safety-focused than OpenAI.

Anthropic's CEO, Dario Amodei, is a former VP of research at OpenAI, and reportedly split with OpenAI after a disagreement over the company's direction -- namely OpenAI's increasingly commercial focus. Amodei brought with him a number of OpenAI employees, including OpenAI's former policy lead Jack Clark.