Monitoring LLM behavior: Drift, retries, and refusal patterns

VentureBeat
VentureBeat 2 place · 04/26/2026 11:13 EDT

The stochastic challenge

Traditional software is predictable: Input A plus function B always equals output C. This determinism allows engineers to develop robust tests. On the other hand, generative AI is stochastic and unpredictable. The exact same prompt often yields different results on Monday versus Tuesday, breaking the traditional unit testing that engineers know and love.

To ship enterprise-ready AI, engineers cannot rely on mere “vibe checks” that pass today but fail when customers use the product. Pr


News from the same source
VentureBeat VentureBeat
Silicon Valley
George Avalos @ Silicon Valley 1 place · 02/07/2106 01:28 EDT

Newark apartment complex bought for much less than prior value

An East Bay apartment complex has been bought at a price that's well below its prior value. Read more

0

Silicon Valley
George Avalos @ Silicon Valley 2 place · 02/07/2106 01:28 EDT

PG&E buys San Jose building to bolster South Bay operations

A PG&E Corp. unit has bought a San Jose building in a move to bolster the utility's South Bay operations. Read more

0

The Information
Valida Pau @ The Information 1 place · today 19:00 EDT

It may not seem like it now, as we barrel toward SpaceX’s public debut this week, but not every initial public offering will involve a $1.75 trillion rocket-ship company. Soon enough, public investors may get to judge a much more down-to-earth business: school bus fleets. Sequoia Capital-backed Zum, known for using its electric school buses to shuttle kids to class in school districts including Los Angeles and San Francisco, has... Read more

0 newcomer

TechRadar
TechRadar 1 place · today 19:00 EDT

NYT Strands hints and answers for Wednesday, June 10 (game #829)

Looking for NYT Strands answers and hints? Here's all you need to know to solve today's game, including the spangram. Read more

0 fresh

TechRadar
TechRadar 2 place · today 19:00 EDT

Quordle hints and answers for Wednesday, June 10 (game #1598)

Looking for Quordle clues? We can help. Plus get the answers to Quordle today and past solutions. Read more

0 fresh

Slashdot
BeauHD @ Slashdot 1 place · today 19:00 EDT

US Labels BYD, Baidu, Alibaba and Other Tech Giants As Aiding China's Military

The Pentagon has added Alibaba, BYD, Baidu, Unitree, and other Chinese companies to its list of firms it says support China's military, barring them from U.S. defense contracts. The companies and China's embassy deny the allegations. The Associated Press reports: Created in 2021 by a congressional mandate, the list (PDF) seeks to identify Chinese companies that the Pentagon considers to have links to the Chinese military -- not only those... Read more

0 fresh

TechRadar
TechRadar 3 place · today 19:00 EDT

NYT Connections hints and answers for Wednesday, June 10 (game #1095)

Looking for NYT Connections answers and hints? Here's all you need to know to solve today's game, plus my commentary on the puzzles. Read more

0 fresh

The Information
Michael Roddan @ The Information 2 place · today 18:57 EDT

Prediction markets platform Kalshi is asking customers in some wagers to provide the name of their employer, industry and job function before making bets, to help the company crack down on potential insider trading. “For markets with heightened insider or manipulation risk, we now collect ... Read more

0 newcomer

MacRumors
Juli Clover @ MacRumors 1 place · today 18:52 EDT

Apple Updates App Store Guidelines With Stricter Rules for Low-Quality Apps

Apple updated its App Store Review Guidelines this week, adding stricter language around low-quality apps. The 4.3 Spam rule already barred overly simple apps in saturated categories, but Apple now includes language saying low-effort apps could be pulled from the App Store. Apps in oversaturated categories that are not updated, improved, or do not attract customers may be removed, according to Apple. App Guideline 4.3(b) New Language: Don't submit apps... Read more

0 fresh

SlashGear
SlashGear 1 place · today 18:45 EDT

China's Destroyer Fleet Production Could Soon Outpace The US Navy

Think the U.S. military still has the largest maritime force? Discover why defense analysts are sounding the alarm over China's new destroyer fleet. Read more

0 fresh

Digital Trends
Faiz Aly @ Digital Trends 1 place · today 18:42 EDT

A guide to Sony’s 2026 TVs and home theater lineup

Sony’s 2026 home theater lineup includes the new BRAVIA 9 II, BRAVIA 7 II, BRAVIA 8 II, and Theater Trio. Here’s what True RGB brings to the table and which products stand out in the company’s latest lineup. Read more

0 fresh

Business Insider
Brent D. Griffiths @ Business Insider 1 place · today 18:40 EDT

What smart people are saying about OpenAI's IPO filing

Financial analysts, CEOs, and researchers dissect how OpenAI is positioned as it enters a new phase of the AI race: preparing to go public. Read more

0 fresh

Habr
nlaik @ Habr 1 place · today 18:38 EDT

Anthropic released Fable 5, and it made me wonder whether we are heading in the wrong direction

“On June 9, Anthropic released Claude Fable 5, the first publicly available Mythos-class model, the very one that in April was deemed too powerful for the public. The release is impressive on benchmarks, but the longer I read it, the stronger my feeling grew: the model costs twice as much, is slower, burns through limits at double the rate, and on some requests downgrades itself to Opus. I break down the release by the facts and consider why making models cheaper and faster may matter more than yet another... Read more

0 fresh

Gizmodo
Bruce Gil @ Gizmodo 2 place · today 18:37 EDT

The Apple Car Is Dead, and Waymo Just Bought Its Gravesite

The robotaxi company paid $220 million for the former proving ground where Apple tested its now dead self-driving car project. Read more

0 fresh

Digital Trends
Sudhanshu Kumar Mangalam @ Digital Trends 2 place · today 18:37 EDT

Rivian R2 SUV deliveries have begun, just not for the version most buyers may want

Rivian’s smaller electric SUV is finally on the road, but buyers waiting for the more affordable R2 trims will have to wait longer. Read more

0 fresh

TechRadar
TechRadar · today 18:35 EDT

'AI tools could lead to nothing less than the death of astrophysics': Researchers predict bleak future for thousands who study black holes, galaxies, and supernovae

Astrophysicists increasingly fear artificial intelligence could weaken scientific reasoning while transforming research, publishing, training, and academic culture worldwide. Read more

0 fresh

GSMArena.com
GSMArena.com 1 place · today 18:31 EDT

Apple launches personalized recommendations for the App Store

At its Worldwide Developers Conference, Apple announced a new feature coming to its App Store: Personalized Collections. In fact, these are app and game recommendations tailored to you. They will be based on your interests, and will come with App Notes that explain why specific apps are recommended. The tailored recommendations can appear on the Apps, Games, and Search tabs, and they will evolve over time based on your app... Read more

0 fresh

The most popular news from the same source for the last week
VentureBeat VentureBeat
VentureBeat
VentureBeat 2 place · 06/02/2026 21:55 EDT

Every new AI agent your team deploys starts from scratch: no memory of how the business works, where data lives, or what rules apply. And as agentic coding tools spin up applications faster than anyone can govern them, each one risks becoming another silo outside your data layer entirely. Microsoft is addressing both problems directly at Build 2026. According to VentureBeat's VB Pulse's Q1 2026 RAG Infrastructure Market Tracker, hybrid retrieval... Read more

0

VentureBeat
VentureBeat 1 place · 06/03/2026 14:49 EDT

While many AI open source model providers are pursuing larger and more powerful models, Google is still giving attention to the smaller, more local side of the market. Today, the tech giant released Gemma 4 12B, an 11.95-billion-parameter open-weights model with permissive Apache 2.0 license optimized to execute locally on a standard enterprise laptop using just 16GB of VRAM or unified memory. That means those enterprise users looking to keep working... Read more

0

VentureBeat
VentureBeat · 06/04/2026 16:25 EDT

Anthropic co-founder and CEO Dario Amodei said it was coming, but it still feels like a milestone: More than 80% of the code merged into Anthropic’s production codebase in May wasn't authored by humans, but by its own AI model, Claude, according to a new report shared by the record-breaking AI startup today. This transformation has triggered an 8x increase in the volume of code shipped per engineer per quarter compared... Read more

0

VentureBeat
VentureBeat · 06/05/2026 12:42 EDT

Meta's AI support agent bound recovery emails to accounts for whoever asked, and SOCs never saw an alert. An authorized agent writes a log of legitimate transactions, so nothing in the detection stack fired. Attackers asked the bot to make the change, took the one-time code it sent, and ran the password reset, 404 Media reported. No malware, no stolen credentials, and no prompt injection in the sense most security teams... Read more

0

VentureBeat
VentureBeat 3 place · 06/05/2026 13:51 EDT

When someone on a team corrects an AI agent (better prompts, better feedback, better context), that improvement disappears the moment a colleague opens the same tool. The correction doesn't transfer, and the next person starts from zero. The problem compounds in multi-agent workflows, where teams expect agents to share context across users and tasks. Without a shared memory layer, every team member effectively trains a different version of the... Read more

0

VentureBeat
VentureBeat 3 place · 06/05/2026 15:31 EDT

Microsoft used its Build 2026 conference this week to push a clear message: agents are rapidly moving into production throughout enterprise systems, and the winning platform will be the one that gives them reliable context, governance, identity, memory — and secure access to enterprise data. The company announced Microsoft IQ as a context layer across GitHub Copilot, Microsoft Foundry and Copilot Studio; Work IQ APIs coming June 16; Fabric IQ... Read more

0

VentureBeat
VentureBeat 2 place · 06/05/2026 18:55 EDT

For three years, Microsoft's artificial intelligence story has been inseparable from OpenAI. The partnership, cemented by a cumulative investment exceeding $13 billion, gave Microsoft early access to the most advanced AI models on the planet, catapulting its Copilot products into the enterprise mainstream and adding hundreds of billions of dollars to its market capitalization. To the outside world, Microsoft's AI strategy was OpenAI. Mustafa Suleyman wants to change that... Read more

0

VentureBeat
VentureBeat 1 place · 06/06/2026 00:00 EDT

Our system did one thing, and it did it well: It turned natural-language questions into API calls. The users were analysts, account managers, and operations leads. They knew what data they needed, but assembling it manually meant pulling from four dashboards, two BI tools, and a Salesforce report builder. With our system, they typed the request in plain English. A request like "Compile a report on sales volume for January through... Read more

0

VentureBeat
VentureBeat 2 place · 06/07/2026 12:00 EDT

Agentic AI is now a core part of the engineering process, driving massive execution leverage and helping us generate more code than ever before. Yet, a difficult question I’ve increasingly heard from business leaders is: if we’re shipping code faster than ever, why aren’t our products improving at the same rate? The reason is that writing code was never the rate limiter. Defining the right requirements, integrating with complex systems, and... Read more

0


09.06.2026 19:24
Last update: 19:15 EDT.
News rating updated: 02:10.

What is Times42?

Times42 brings you the most popular news from tech news portals in real-time chart.
Read about us in FAQ section.


Times42 © 2026