News
Anthropic is treating its new Claude Opus 4 language model as safety-critical after tests revealed some troubling behavior, including escape attempts, blackmail, and autonomous whistleblowing. After ...
A research team from Meta and several French clinics has studied how children develop neural language representations—and found that AI models go through similar learning stages as the human brain.
OpenAI is rolling out Codex, a cloud-based AI agent for software development that automates tasks like bug fixes and feature implementation. The company says Codex is designed to eventually establish ...
ChatGPT can now access company data through a SharePoint connector. The beta feature is available for ChatGPT Plus, Pro, and Team users. With the new integration, ChatGPT can analyze and summarize ...
Anthropic has released its next generation of AI models, Claude Opus 4 and Claude Sonnet 4, and is introducing new safety measures designed to prevent their use in developing chemical, biological, ...
Meta has released OMol25, the largest open dataset to date for AI-driven chemistry, and introduced UMA, a universal AI model designed to predict chemical properties of molecules and materials. OMol25 ...
About 100 hours after its initial launch, Google is opening up its AI video model Veo 3 to users in 71 additional countries. The news came from Josh Woodward, Vice President of Gemini at Google, who ...
Google is reportedly getting ready to unveil several new AI features at its upcoming I/O developer conference. Among them is an AI assistant aimed at software development, known internally as the ...
Microsoft used its Build 2025 event to showcase a range of new tools for AI agents, open interfaces, and developer workflows. The focus this year is on new research platforms, multi-agent systems for ...
A recent AI research paper touting dramatic productivity boosts has come under fire for questionable data and unverified claims—prompting MIT to publicly distance itself from the study. Published in ...
OpenAI has introduced HealthBench, a new benchmark for evaluating AI in healthcare, built on 5,000 simulated doctor-patient conversations and measured by 48,000 medically relevant criteria. The latest ...
Google is testing a new experimental mode for Gemini 2.5 Pro that adds deeper reasoning capabilities and native audio output. The new mode, called "Deep Think," is designed to help the model evaluate ...
Some results have been hidden because they may be inaccessible to you
Show inaccessible results