Developing AI safety tests offers an opportunity to contribute meaningfully to the field while advancing our understanding of ...
On Friday, OpenAI announced o3, a new family of AI reasoning models that the startup claims is more advanced than o1 or ...
A new set of much more challenging evals has emerged in response, created by companies, nonprofits, and governments. Yet even ...
Experiments by Anthropic and Redwood Research show that Anthropic's model, Claude, is capable of strategic deceit ...
Meta is the world's standard-bearer for open-weight AI. In a fascinating case study in corporate strategy, while rivals like ...
Marc Carauleanu's vision is clear: AI can become more powerful and responsible by implementing self-other overlap and related ...
OpenAI's o1 model, which users can access on ChatGPT Pro, showed "persistent" scheming behavior, according to Apollo Research ...
A third-party lab caught OpenAI's o1 model attempting deception, even as OpenAI's own safety testing has been called into question.
OpenAI announced the release of a new family of AI models, dubbed o3. The company claims the new products are more advanced ...
A study from Anthropic's Alignment Science team shows that complex AI models may engage in deception to preserve their ...
As AI models rise in popularity and power, AI safety research seems increasingly relevant. At the same time, it has become more controversial: David Sacks, Elon Musk, and Marc Andreessen say some AI ...