Sierra’s new benchmark reveals how well AI agents perform at real work

VentureBeat
Ken Yeung @ VentureBeat · 06/20/2024 14:09 EDT

Sierra releases TAU-bench, a new benchmark that claims to more accurately evaluate AI agent performance in the real world. Read how 12 popular LLMs fared.


News from the same source
Silicon Valley
George Avalos @ Silicon Valley 1 place · 02/07/2106 01:28 EDT

Newark apartment complex bought for much less than prior value

An East Bay apartment complex has been bought at a price that's well below its prior value.

Silicon Valley
George Avalos @ Silicon Valley 2 place · 02/07/2106 01:28 EDT

PG&E buys San Jose building to bolster South Bay operations

A PG&E Corp. unit has bought a San Jose building in a move to bolster the utility's South Bay operations.

Business Insider
Jordan Pandy @ Business Insider 1 place · today 05:01 EDT

A self-sustaining private island in Canada sold for $6 million — it can only be reached by boat or helicopter

A luxurious lodge on a private island in Nova Scotia with enough infrastructure to maintain life off the mainland was sold for $6 million.

Habr
borisalekseev1 (Raiffeisen Bank) @ Habr 1 place · today 05:00 EDT

A modern MQTT service in Python

In Python, when you choose a library for working with MQTT, you almost always end up with paho-mqtt. It is a mature client and the most popular one, but its API is built on callbacks, while a modern Python application lives in asyncio: FastAPI, background workers, asynchronous clients, all in one shared event loop. In one IoT project I ran into exactly this. I needed an MQTT client that fits into an asynchronous application without complex adaptation and makes it possible to work with subscriptions...

Eurogamer.net
Lottie Lynn @ Eurogamer.net 1 place · today 05:00 EDT

A long-buried secret hidden in a cosy game about books made me realise something essential

Hello! Eurogamer is once again marking Pride with a week of features celebrating the intersection of queer culture and gaming in all its guises. Things get underway today as Lottie plays historian in Tiny Bookshop and is moved by what she finds. And if you want to catch up on previous years' festivities, why not have a nose around our Pride Week hub?

Tech Wire Asia
Dashveenjit Kaur @ Tech Wire Asia 1 place · today 05:00 EDT

Why Apple is lobbying Washington to buy China’s memory chips

Apple is lobbying the Trump administration for clearance to buy Chinese memory chips from CXMT, as an AI-driven price surge eats into its hardware margins. CXMT sits on the Pentagon’s military blacklist, and Apple wants assurance it won’t face tighter export curbs before it commits. Apple’s push for memory chips from China comes down to ...

TechRadar
TechRadar 1 place · today 05:00 EDT

The AI infrastructure boom is bigger than GPUs

AI infrastructure is evolving beyond GPUs into the operational backbone of enterprise business systems.

Business Insider
Geoff Weiss @ Business Insider 2 place · today 05:00 EDT

Nvidia is quietly staffing up around its AI ambitions in outer space

Nvidia expands its space computing team for the Space-1 system, focusing on AI software development for low-Earth orbit data centers.

UK Tech News
Kirstie Pickering @ UK Tech News 1 place · today 05:00 EDT

UK venture capital firm Osney Capital has announced the final close of its oversubscribed debut cybersecurity fund at £60m, marking one of the largest debut seed funds in the UK. In the year since its first close at over £50m, Osney has completed seven investments into early-stage UK cybersecurity companies at pre-seed and seed stage, ...

Habr
mimfort074 @ Habr 2 place · today 04:57 EDT

How to connect a task tracker to a codebase via RAG without going crazy over token costs

The main problem of working with an LLM in a real project is not model quality but context. I explain how to use a RAG index of the repository (vectors + call graph) and a plugin for Claude Code to automatically assemble the right context for a task from the tracker, without manual collection or wasted tokens.

Habr
hag19 @ Habr 3 place · today 04:55 EDT

Hajiz: a rootless Linux sandbox in Rust with namespaces, Landlock, seccomp-BPF, and eBPF auditing without sudo

I wrote Hajiz, a tool for isolating Linux applications, from scratch in Rust. It is a learning project in systems security that grew into something you can actually use. I want to describe how it works inside: why the order in which isolation mechanisms are applied is critical, how eBPF auditing with privilege separation works, and why any of this is needed when Docker exists. → github.com/hag19/hajiz

Silicon Canals
Silicon Canals Editorial Team @ Silicon Canals 1 place · today 04:52 EDT

Adults who reread the same handful of novels every few years aren’t avoiding new books, they’re returning to the version of themselves who first met those pages, before life asked them to be useful to everyone else

Adults who reread the same novels every few years aren't avoiding new books; psychology suggests they're using familiar pages to reconnect with the version of themselves that existed before life turned them into everyone else's reliable one.

Habr
danin @ Habr · today 04:52 EDT

Coherence cascade: why advertising works

How the consumer's own information architecture explains disparate theories of advertising, generates testable predictions, and reveals a structural boundary between advertising that integrates and advertising that damages.

The Information
Qianer Liu @ The Information 1 place · today 04:47 EDT

China’s ChangXin Memory Technologies has signed a long-term deal to supply Tencent Holdings with more than 20 billion yuan ($2.94 billion) of server DRAM chips, Reuters reported, citing three people familiar with the matter. DRAM, or dynamic random-access memory, helps servers quickly access ...

The most popular news from the same source for the last week
VentureBeat
VentureBeat · 06/22/2026 10:23 EDT

Not every company can or should build its own frontier AI language model. However, the harness controlling the model is something that most enterprises can and should customize for their specific purposes. Of course, this is easier said than done. Agent harnesses are still largely tuned through manual, ad hoc debugging, a process that relies heavily on intuition rather than systematic feedback loops, making it difficult to keep pace with...

VentureBeat
VentureBeat 3 place · 06/22/2026 11:00 EDT

Presented by Splunk. Every day, organizations learn things their AI systems never get to use. A security analyst corrects an AI-generated investigation. A network engineer identifies the root cause of a recurring outage. An observability team discovers that a pattern of latency, logs and infrastructure changes predicts service degradation. A customer operations team learns which signals indicate an escalation is likely. Each moment contains valuable organizational knowledge. But in most enterprises, that knowle

VentureBeat
VentureBeat 3 place · 06/22/2026 12:13 EDT

Last night, the increasingly enterprise-focused AI startup Sakana launched Fugu, a multi-agent orchestration system that delivers frontier-level AI performance through a single, OpenAI-compatible API. Designed for developers, enterprises, and nations seeking resilience against vendor lock-in and geopolitical export controls, Fugu (Japanese for "pufferfish") bypasses the traditional monolithic model structure by dynamically routing queries to a swappable pool of specialized AI agents. Sakana CEO and co-foun

VentureBeat
VentureBeat 3 place · 06/22/2026 16:22 EDT

Alibaba Cloud on Sunday released HappyHorse 1.1, a major upgrade to its AI video generation model that the company says delivers production-ready video synthesis across core content creation scenarios. The model is now live on Alibaba Cloud Model Studio with full API access for enterprise customers and developers, accompanied by a 40% sitewide launch discount for the first two weeks. The release arrives at a moment of remarkable upheaval in the...

VentureBeat
VentureBeat 3 place · 06/23/2026 03:00 EDT

Presented by F5. When enterprises move AI workloads from pilot to production, data delivery often becomes the factor that determines whether those systems can scale reliably. Point-to-point architectures connecting storage directly to compute hold up under demonstration conditions, but they often break down under sustained, concurrent production traffic. The result is stalled inference pipelines, delayed RAG systems, underutilized GPUs, and SLA violations, all of which carry direct business consequences. "Org

VentureBeat
VentureBeat · 06/23/2026 13:00 EDT

Anthropic on Tuesday launched Claude Tag, a new product that embeds its most advanced AI model directly inside Slack as a persistent, shared teammate that anyone on a team can delegate work to by simply typing @Claude. The product, available today in beta for Claude Enterprise and Team customers, replaces Anthropic's existing Claude in Slack app and represents the company's most aggressive move yet to colonize the enterprise collaboration layer...

VentureBeat
VentureBeat · 06/23/2026 14:53 EDT

While many enterprises have already begun integrating AI-generated images, visuals, graphics and videos into their production workflows, there is also a growing pool of data and subjective commentary indicating AI imagery ultimately looks non-distinct, monotonous, and too unoriginal to ensure a brand and its assets stand out from the pack. That it's "AI slop," in other words. AI creative tools startup Krea is hoping to change that trend by...

VentureBeat
VentureBeat · 06/23/2026 15:35 EDT

The security implications of advanced AI models were immediately clear to Visa’s technology team when they began testing Anthropic’s Mythos model. Just weeks into Project Glasswing, the team observed how quickly attackers can identify and weaponize vulnerabilities in critical code bases, creating security risks, explained Rajat Taneja, Visa’s president of technology, during a call to prepare for his session at VB Transform 2026, VentureBeat’s upcoming agentic AI event. Visa is among...

VentureBeat
VentureBeat · 06/24/2026 11:14 EDT

OpenAI and Broadcom this morning unveiled their first custom AI accelerator chip named "Jalapeño," positioning it as a purpose-built processor for large language model (LLM) inference, rather than the more general GPUs offered by the likes of Nvidia or AMD. According to its creators, Jalapeño is designed to support workloads behind ChatGPT, Codex, the API and future agentic products, though notably, both OpenAI's and Broadcom's news releases position it...

VentureBeat
VentureBeat · 06/24/2026 11:34 EDT

Customer expectations have shifted from simple, fast conversational interactions to complex agentic AI-powered tasks that legacy IT architectures simply can’t handle. To address this, Intuit made the bold decision to overhaul its technical infrastructure for its business platform. The company moved away from its multi-agent setup, which prioritized broad capabilities, to a granular, skill-and-tool-based architecture while embedding human experts directly into the workflow alongside AI. This shift involved d

