
Frontier models are failing one in three production attempts, and getting harder to audit

VentureBeat · 04/15/2026 15:35 EDT

AI agents are now embedded in real enterprise workflows, and they're still failing roughly one in three attempts on structured benchmarks. That gap between capability and reliability is the defining operational challenge for IT leaders in 2026, according to Stanford HAI's ninth annual AI Index report. This uneven, unpredictable performance is what the AI Index calls the "jagged frontier," a term coined by AI researcher Ethan Mollick to describe the boundary where AI excels and then suddenly fails. “AI models.



