70 place 84 fresh

80 Databricks' OfficeQA uncovers disconnect: AI agents ace abstract tests but stall at 45% on enterprise docs

VentureBeat
VentureBeat 1 place · today 11:00 EDT

There is no shortage of AI benchmarks in the market today, with popular options like Humanity's Last Exam (HLE), ARC-AGI-2 and GDPval, among numerous others. AI agents excel at solving abstract math problems and passing PhD-level exams that most benchmarks are based on, but Databricks has a question for the enterprise: Can they actually handle the document-heavy work most enterprises need them to do? The answer, according to new research from the data and AI platform company, is sobering. Even the best-perfor

News from the same source
VentureBeat
Tom's Hardware
Tom's Hardware 1 place · today 09:21 EDT

This GitHub script claims to wipe all of Windows 11's AI features in seconds — "RemoveWindowsAI" can disable every single AI feature in the OS, from Copilot to Recall and more

If you've been unhappy with the direction Microsoft has taken Windows, offering no meaningful improvements beyond AI and aesthetics, then, well, not much can be done about that. But, at least you can disable all the AI features that seem to have populated every corner of the OS, with a simple script from GitHub. Read more ›

2,197 fresh

Slashdot
msmash @ Slashdot 1 place · today 13:15 EDT

Microsoft To Invest $17.5 Billion in India

Microsoft announced on Tuesday its largest-ever investment in Asia -- $17.5 billion over four years starting in 2026 -- to expand cloud and AI infrastructure across India, fund skilling programs, and support ongoing operations in the country. The commitment adds to a $3 billion investment the company announced in January 2025 that is on track to be spent by the end of 2026. A new hyperscale cloud region in Hyderabad... Read more ›

1,783 fresh

Gizmodo
Gayoung Lee @ Gizmodo 1 place · today 12:20 EDT

Tiny Robot Lost Under Antarctic Ice for 8 Months Comes Back With Rare Data

“Against the enormity of such a wild region, this is an amazing story of the little float that could.” Read more ›

1,320 fresh

Tom's Hardware
Tom's Hardware 2 place · today 12:12 EDT

Arduino Uno Q Review: The board with two brains

Qualcomm’s recent acquisition of Arduino has introduced the Arduino Uno Q, a board that combines a Linux SBC powered by Qualcomm’s Dragonwing with an STM32 microcontroller. But are two brains better than one? Read more ›

1,298 fresh

Engadget
Lawrence Bonk @ Engadget 1 place · today 12:39 EDT

Microsoft Flight Sim 2024 now has a Stranger Things expansion

Microsoft Flight Simulator 2024 just got a fairly bizarre expansion inspired by the Netflix show Stranger Things. If you've ever wanted to fly over a fictional Indiana town in the 1980s, this is the update for you. That's right. The game now lets folks fly over Hawkins, Indiana and check out more than 40 iconic locations from the series, including Starcourt Mall, the junkyard, the government lab and, of course,... Read more ›

1,092 fresh

CNET
Alex Valdes @ CNET 1 place · today 12:53 EDT

Australia Bans Social Media for Kids Under 16. Which Sites Are Blocked?

The new Australian law restricts some social media platforms, but other services and AI chatbots are exempt. Read more ›

1,016 fresh

Business Insider
Polly Thompson @ Business Insider 1 place · today 12:02 EDT

Accenture struck a deal with Anthropic, 8 days after saying it would partner with OpenAI

Accenture and Anthropic are the latest firms to partner as corporations rush to use AI tools to serve both staff and clients. Read more ›

863 fresh

Wired
Will Knight @ Wired 1 place · today 12:06 EDT

OpenAI, Anthropic, and Block Are Teaming Up to Make AI Agents Play Nice

American AI giants are backing a new effort to establish open standards for building agentic software and tools. Read more ›

786 fresh

Wired
Ryan Waniata, Parker Hall @ Wired 2 place · today 13:05 EDT

The Best Karaoke Speakers from Small and Portable to Massive

Looking to make karaoke night a regular thing? We’ve tested everything from Bluetooth speakers to full-blown PAs. Read more ›

743 fresh

Business Insider
Henry Chandonnet @ Business Insider 2 place · today 11:53 EDT

Figma CEO says he was initially a 'bad manager.' Here's how he turned it around.

Dylan Field had no management experience before cofounding Figma. He had to learn a "whole new skillset," he said. Read more ›

565 fresh

Business Insider
Thibault Spirlet @ Business Insider 3 place · today 12:15 EDT

The US shut down a chip-smuggling ring that involved swapping Nvidia labels with a fake company name

Prosecutors alleged the network hid Nvidia hardware behind bogus paperwork and fake companies to bypass export rules and ship them to China. Read more ›

517 fresh

Eurogamer.net
Ed Nightingale @ Eurogamer.net 1 place · today 10:23 EDT

Video game workers across Europe fight back against exploitation, AI and layoffs in "historic milestone" for union efforts

Video game workers from multiple unions across Europe have released a joint statement pledging a "united front" against industry exploitation. Read more ›

459 fresh

Business Insider
Bryan Metzger @ Business Insider · today 10:07 EDT

White House AI Czar says Trump isn't trying to force data centers on communities that don't want them

David Sacks said the Trump administration's effort to restrict state AI regulation won't "force communities to host data centers they don't want." Read more ›

404 fresh

Business Insider
Ayelet Sheffey @ Business Insider · today 09:10 EDT

Tell us what you think about the future of capitalism

Business Insider is exploring the future of capitalism in the US, and we want to hear from you. Read more ›

380 fresh

MacRumors
Hartley Charlton @ MacRumors 1 place · today 12:59 EDT

Apple to Make More Foldable iPhones Than Expected

Apple has ordered 22 million OLED panels from Samsung Display for the first foldable iPhone, signaling a significantly larger production target than the display industry had previously anticipated, ET News reports. In the now-seemingly deleted report, ET News claimed that Samsung plans to mass-produce 11 million inward-folding OLED displays for Apple next year, as well as 11 million accompanying external displays. With Samsung Display serving as the exclusive supplier of... Read more ›

371 fresh

Eurogamer.net
Matt Wales @ Eurogamer.net 2 place · today 11:46 EDT

Ubisoft just can't keep its Assassin's Creed Black Flag remake a secret, and it's now been spotted again ahead of The Game Awards

Ubisoft's long-rumoured Black Flag remake is seemingly even closer to reality after a listing for Assassin's Creed Black Flag Resynced was spotted on the Pan European Game Information (PEGI) ratings board website. Read more ›

361 fresh

The most popular news from the same source for the last week
VentureBeat
VentureBeat
VentureBeat 1 place · 12/05/2025 08:00 EDT

Three years ago this week, ChatGPT was born. It amazed the world and ignited unprecedented investment and excitement in AI. Today, ChatGPT is still a toddler, but public sentiment around the AI boom has turned sharply negative. The shift began when OpenAI released GPT-5 this summer to mixed reviews, mostly from casual users who, unsurprisingly, judged the system by its surface flaws rather than its underlying capabilities. Since then, pundits... Read more ›

73

VentureBeat
VentureBeat 1 place · 12/04/2025 00:00 EDT

Model providers want to prove the security and robustness of their models, releasing system cards and conducting red-team exercises with each new release. But it can be difficult for enterprises to parse through the results, which vary widely and can be misleading. Anthropic's 153-page system card for Claude Opus 4.5 versus OpenAI's 60-page GPT-5 system card reveals a fundamental split in how these labs approach security validation. Anthropic discloses in... Read more ›

54

VentureBeat
VentureBeat 1 place · 12/04/2025 18:00 EDT

OpenAI researchers have introduced a novel method that acts as a "truth serum" for large language models (LLMs), compelling them to self-report their own misbehavior, hallucinations and policy violations. This technique, "confessions," addresses a growing concern in enterprise AI: Models can be dishonest, overstating their confidence or covering up the shortcuts they take to arrive at an answer. For real-world applications, this technique evolves the creation of more transparent and... Read more ›

29

VentureBeat
VentureBeat 2 place · 12/08/2025 03:00 EDT

Presented by Design.com. For most of history, design was the last step in starting a business, something entrepreneurs invested in once the idea was proven. Today, it’s one of the first. The rise of generative AI has shifted how small businesses imagine, launch, and grow, turning what used to be a months-long creative process into something interactive, iterative, and accessible from day one. Search data tells the story. Since 2022,... Read more ›

28

VentureBeat
VentureBeat · 12/02/2025 21:00 EDT

Vector databases emerged as a must-have technology foundation at the beginning of the modern gen AI era. What has changed over the last year, however, is that vectors, the numerical representations of data used by LLMs, have increasingly become just another data type in all manner of different databases. Now, Amazon Web Services (AWS) is taking the next leap forward in the ubiquity of vectors with the general availability of... Read more ›

2

VentureBeat
VentureBeat 2 place · 12/03/2025 03:00 EDT

Presented by Celonis. When tariff rates change overnight, companies have 48 hours to model alternatives and act before competitors secure the best options. At Celosphere 2025 in Munich, enterprises demonstrated how they’re turning that chaos into competitive advantage, with quantifiable results that separate winners from losers. Vinmar International: The global plastics and chemicals distributor created a real-time digital twin of its $3B supply chain, cutting default expedites by more than 20% and... Read more ›

2

VentureBeat
VentureBeat 2 place · 12/04/2025 09:02 EDT

Amazon Web Services on Wednesday introduced Kiro powers, a system that allows software developers to give their AI coding assistants instant, specialized expertise in specific tools and workflows, addressing what the company calls a fundamental bottleneck in how artificial intelligence agents operate today. AWS made the announcement at its annual re:Invent conference in Las Vegas. The capability marks a departure from how most AI coding tools work today. Typically, these... Read more ›

1

VentureBeat
VentureBeat 3 place · 12/03/2025 00:00 EDT

Presented by Indeed. As AI continues to reshape how we work, organizations are rethinking what skills they need, how they hire, and how they retain talent. According to Indeed’s 2025 Tech Talent report, tech job postings are still down more than 30% from pre-pandemic highs, yet demand for AI expertise has never been greater. New roles are emerging almost overnight, from prompt engineers to AI operations managers, and leaders are under... Read more ›

0

VentureBeat
VentureBeat · 12/03/2025 00:00 EDT

One problem enterprises face is getting employees to actually use the AI agents their dev teams have built. Google, which has already shipped many AI tools through its Workspace apps, has made Google Workspace Studio generally available to give more employees access to design, manage and share AI agents, further democratizing agentic workflows. This puts Google directly in competition with Microsoft’s Copilot and undercuts some integrations that brought OpenAI’s ChatGPT... Read more ›

0

VentureBeat
VentureBeat · 12/03/2025 17:00 EDT

Just a few short weeks ago, Google debuted its Gemini 3 model, claiming it scored a leadership position in multiple AI benchmarks. But the challenge with vendor-provided benchmarks is that they are just that — vendor-provided. A new vendor-neutral evaluation from Prolific, however, puts Gemini 3 at the top of the leaderboard. This isn't on a set of academic benchmarks; rather, it's on a set of real-world attributes that actual... Read more ›

0

Most popular sources

Business Insider 22% 6
Wired 17% 16
Gizmodo 12% 9
Tom's Hardware 11% 0
Android Authority 5% 3

09.12.2025 13:54
Last update: 13:46 EDT.
News rating updated: 20:41.

What is Times42?

Times42 brings you the most popular news from tech news portals in real-time chart.
Read about us in FAQ section.


Times42 © 2025