OpenAI pledges to publish AI safety test results more often

May 14, 2025

OpenAI Chief Executive Officer Sam Altman speaks during the Kakao media day in Seoul.

OpenAI is moving to publish the results of its internal AI model safety evaluations more often, in what the company says is an effort to increase transparency.

On Wednesday, OpenAI launched the Safety Evaluations Hub, a web page showing how the company’s models score on various tests for harmful content generation, jailbreaks, and hallucinations. OpenAI says that it will use the hub to share metrics on an “ongoing basis” and that it intends to update the hub with “major model updates” going forward.

Introducing the Safety Evaluations Hub, a resource to explore safety results for our models.

While system cards share safety metrics at launch, the Hub will be updated periodically as part of our efforts to communicate proactively about safety. https://t.co/c8NgmXlC2Y

— OpenAI (@OpenAI) May 14, 2025

“As the science of AI evaluation evolves, we aim to share our progress on developing more scalable ways to measure model capability and safety,” wrote OpenAI in a blog post. “By sharing a subset of our safety evaluation results here, we hope this will not only make it easier to understand the safety performance of OpenAI systems over time, but also support community efforts to increase transparency across the field.”

OpenAI says that it may add more evaluations to the hub over time.

In recent months, OpenAI has raised the ire of some ethicists for reportedly rushing the safety testing of certain flagship models and failing to release technical reports for others. The company’s CEO, Sam Altman, also stands accused of misleading OpenAI executives about model safety reviews prior to his brief ouster in November 2023.

Late last month, OpenAI was forced to roll back an update to the default model powering ChatGPT, GPT-4o, after users began reporting that it responded in an overly validating and agreeable way. X became flooded with screenshots of ChatGPT applauding all kinds of problematic, dangerous decisions and ideas.

OpenAI said that it would implement several fixes and changes to prevent future such incidents, including introducing an opt-in “alpha phase” for some models that would allow certain ChatGPT users to test the models and give feedback before launch.
