VentureBeat #413


Can AI really compete with human data scientists? OpenAI’s new benchmark puts it to the test

Michael Nuñez @ VentureBeat · 10/10/2024 15:47 EDT

OpenAI's new MLE-bench challenges AI systems with real-world data science tasks, revealing both the progress and limitations of AI in machine learning engineering compared to human experts.

News from the same source
VentureBeat

1

Newark apartment complex bought for much less than prior value

George Avalos @ Silicon Valley 1 place · 02/07/2106 01:28 EDT

An East Bay apartment complex has been bought at a price that's well below its prior value. Read more ›

2

PG&E buys San Jose building to bolster South Bay operations

George Avalos @ Silicon Valley 2 place · 02/07/2106 01:28 EDT

A PG&E Corp. unit has bought a San Jose building in a move to bolster the utility's South Bay operations. Read more ›

3

Hackers leak facial recognition records tied to millions of Madison Square Garden visitors

Shimul Sood @ Digital Trends 1 place · today 15:44 EDT

A cybercriminal group has published what it claims are millions of records stolen from Madison Square Garden Entertainment. The leak is drawing attention not just because of its size, but because it includes facial recognition data, internal threat assessments, and detailed visitor profiles. Read more ›

4

CD Projekt Red co-CEO admits it "indefinitely" "lost the faith" of some fans after Cyberpunk 2077

Vikki Blake @ Eurogamer.net 1 place · today 15:42 EDT

The Witcher 4 developer CD Projekt Red believes it's yet to complete its "full redemption arc" after the disastrous 2020 launch of its open-world action-adventure game, Cyberpunk 2077. Read more ›

5

Cops Keep Getting Arrested for Using Flock's Cameras to Stalk People

EditorDavid @ Slashdot 1 place · today 15:40 EDT

404 Media remembers how a Florida police officer looked up his ex-girlfriend's license plate in the Flock automated license plate reader system at least 69 times in 2024, even searching for her mom's license plate at least 24 times. The police officer was charged with stalking and hacking-related offenses, serving one day in prison with five years of probation, but his case "was not a one-off." [Alternate link... Read more ›

6

3

Ubisoft co-founder Claude Guillemot dies in plane crash

Vikki Blake @ Eurogamer.net 2 place · today 15:18 EDT

Ubisoft co-founder Claude Guillemot has died in a plane crash. Read more ›

7

3

Why Do Utah State Highway Signs Have A Beehive?

SlashGear 1 place · today 15:15 EDT

Utah's highway signs stand out from those used in most states, and the unusual symbol is a reflection of an important part of the state's history. Read more ›

8

3

Algorithm audit: how a Boyer-Moore implementation with 190K GitHub stars turned out to be brute force

StudyQA @ Habr 1 place · today 15:12 EDT

I checked the Boyer-Moore implementation in TheAlgorithms/Python (190K+ stars). It turned out that the bad-character shift is written to the for-loop variable, which has no effect in Python. The algorithm produces correct results but runs as brute-force O(nm) instead of O(n/m). Plus two more findings: an infinite loop in typical full BM implementations, and an error in the original 1977 paper that was only corrected in 1980. Read more ›

9

3

Thanks to AI, a Chinese startup has figured out the priciest fusion energy bottleneck

Shimul Sood @ Digital Trends 2 place · today 15:08 EDT

Fusion energy has spent decades trapped in an expensive cycle of trial and error. Now, a Chinese startup believes AI-powered simulation software could dramatically accelerate reactor development by helping scientists test designs virtually before committing to costly real-world experiments. Read more ›

10

3

How to watch Uruguay vs Cape Verde: Free Streams & TV Channels for FIFA World Cup 2026

TechRadar 1 place · today 15:05 EDT

Here's how to watch Uruguay vs Cape Verde for free online and from anywhere as World Cup 2026 underdogs Cape Verde look to spring another surprise. Read more ›

11

The Best LED Face Masks That Will Improve Your Skin's Appearance

Nasha Addarich Martínez @ CNET 1 place · today 15:00 EDT

We tested popular FDA-cleared LED face masks to find the best ones for your home needs. Read more ›

12

4

Bose thinks it can be a media company for some reason

Terrence O’Brien @ The Verge 1 place · today 14:53 EDT

The history books are littered with the corpses of corporate record labels started by companies that had no business being in the music industry. Bose thinks it can be the exception to the rule. It thinks it can be Red Bull. And, while Bose has more of a right to dip its toes into the […] Read more ›

13

4

I left the Navy SEALs to have more time with my 3 kids. What I learned in the military helped me raise confident kids.

Kelly Burch @ Business Insider 1 place · today 14:51 EDT

Former Navy SEAL Brandon Webb says lessons from sniper training helped him teach his children confidence, resilience, and independence. Read more ›

14

4

Marvel’s New Comics Universe Is Starting All at Once

Justin Carter @ Gizmodo 1 place · today 14:50 EDT

Marvel wants its new 'Midnight' books to feel like a big deal, so they're getting a full week all to themselves. Read more ›

15

4

The John Ternus Era at Apple Will Reportedly Revive Apple’s Focus on Bold Design

Mike Pearl @ Gizmodo 2 place · today 14:47 EDT

Apple products used to be huggable toys. Are those days coming back? Read more ›

16

4

RICE, ICE, MoSCoW: when a prioritization framework drowns you

Akhmadaliev @ Habr 2 place · today 14:47 EDT

When I joined Instameal, we had a backlog of forty tasks and not a single clear criterion for why one mattered more than another. We tried RICE. Then ICE. Then MoSCoW. Then RICE again with different weights. The problem was not that we were choosing the wrong framework. The problem was that we thought: pick the right tool and the priorities will sort themselves out. They won't. What each of the three is. RICE: Reach × Impact × Confidence... Read more ›

17

4

Is The Leatherman Arc Worth The Price? Owners Have This To Say About It

SlashGear 2 place · today 14:45 EDT

The Leatherman Arc is one of the most expensive multitools on the market, so it's no surprise that owners and reviewers have strong opinions about its value. Read more ›

18

4

Popular free VPN, streaming apps bombard business networks with 'laundered' traffic used by criminals to 'blend into normal consumer noise' — here's how to keep safe

TechRadar 2 place · today 14:40 EDT

Residential proxies are both a boon for threat actors and a detriment for their victims, and many of them exist due to a lack of awareness Read more ›

19

4

Code conversion services are "eating" open source

GlobalSign_admin (GlobalSign) @ Habr 3 place · today 14:37 EDT

A refactoring service called Malus.sh recently launched online, offering to "cleanse code of open-source licenses". It positions itself as a "clean room" where software is stripped of its licensing burden. The manifest of a free project is uploaded, and an LLM rewrites the code for a small fee while preserving its functionality. The idea is that the new code can be used however one likes, without complying with the requirements of the free licenses (AGPL, MIT, Apache, and others) under which the original was published. Unscrupulous developers get... Read more ›

20

4

When is the best time to publish on Habr? A statistical analysis of the link between publication time and article reach

TatarnikovEgor @ Habr · today 14:32 EDT

Competition among authors for readers' attention is currently high on Habr. According to Habr itself, in 2025 the site had more than 10,000 unique content authors, and the number of publications exceeded 51,000. This means that even high-quality material may fail to get noticeable reach because of the sheer number of publications in the feed. There is a widespread belief that articles should be published shortly before lunchtime, so that people can read them during their lunch break, and then the reach will be... Read more ›

The most popular news from the same source for the last week
VentureBeat

1

Attackers scale deception with AI. Defenders need truth at machine speed.

VentureBeat · 06/15/2026 03:00 EDT

Presented by Splunk. AI has changed the economics of cyber deception. An attacker can now generate thousands of convincing phishing lures, fake identities, and tailored pretexts before a defender finishes a single change-control cycle. That is the new security challenge: deception got faster and cheaper, while verification did not. Much of the discussion around AI for defense centers on detection models. Detection matters, but it is not the only bottleneck. The deeper constraint... Read more ›

2

Vibe coding can build your pipeline. It can't explain it six months later

VentureBeat · 06/15/2026 11:14 EDT

AI coding agents are rapidly accelerating data engineering by generating transformations, pipelines, orchestration workflows, validation tests, and infrastructure configurations from prompts. However, enterprise data platforms have long operated across fragmented systems owned by different teams and built on different technologies. As these systems evolve independently, organizations increasingly struggle with inconsistent business logic, duplicated implementations, difficult downstream impact analysis, and Read more ›

3

85% of IT teams claim every AI agent is under control. Only 42% actually know who owns them.

VentureBeat 3 place · 06/15/2026 13:19 EDT

Organizational leaders are nearly twice as likely to hide their AI use compared to all other employees, at 42% versus 23%, according to new Ivanti research surveying 3,900 employees across six countries. Among leaders who conceal that usage, 52% say they do it for a "secret advantage." The same research found 85% of IT professionals claim a named owner exists for every AI agent. Only 42% say ownership is actually... Read more ›

4

When deep research isn't enough for your business: Sakana AI launches 'ultra deep research' agent for 100+ page reports in 8 hours

VentureBeat 3 place · 06/15/2026 15:30 EDT

Tokyo-based AI startup Sakana AI has officially launched its first commercial product, Sakana Marlin. Billed as a "Virtual CSO" (Chief Strategy Officer), Marlin is an autonomous, B2B research agent that deliberately abandons the instantaneous text generation of modern chatbots in favor of deep, long-horizon reasoning. What sets Marlin apart from the current ecosystem of AI tools is its temporal scale: instead of returning an answer in seconds, it runs continuous,... Read more ›

5

Satya Nadella warns that AI could hollow out entire industries, echoing the damage done by globalization

VentureBeat 2 place · 06/15/2026 15:49 EDT

Microsoft CEO Satya Nadella published a sweeping essay on Sunday laying out what he describes as the defining economic challenge of the AI era: the risk that a handful of frontier models will absorb the expertise of entire industries and commoditize it, leaving businesses stripped of their competitive moats. "The last thing any of us want is a world where every company across every sector is ceding value to a few... Read more ›

6

Stanford's DeLM cuts multi-agent task costs 50% — without a central orchestrator

VentureBeat · 06/16/2026 13:47 EDT

One of the assumptions behind today’s AI frameworks is that agents require a “boss” at the center; this orchestrator runs the show, routes requests, and makes sure the whole system doesn’t descend into chaos. That assumption may be wrong, and the cost of carrying it could be measured in inference dollars and coordination latency. A new Stanford framework called a decentralized language model, or DeLM, is built on the premise... Read more ›

7

Databricks says it solved the decades-old data pipeline problem that's been slowing AI agents

VentureBeat · 06/16/2026 16:04 EDT

For decades, data professionals have struggled with the challenge of managing both operational and analytical databases in a unified approach that doesn't introduce latency and performance degradation. Agents made the problem structural. A system that reasons continuously and acts on live data cannot tolerate a pipeline between itself and the information it needs to act on. At the Data + AI Summit on Tuesday, Databricks announced two products aimed at collapsing that... Read more ›

8

Z.ai’s open-weights GLM-5.2 beats GPT-5.5 on multiple long-horizon coding benchmarks for 1/6th the cost

VentureBeat 3 place · 06/16/2026 17:26 EDT

Today, Chinese AI startup Z.ai (formerly Zhipu AI) announced the immediate release of GLM-5.2, a 753-billion parameter open-weights large language model (LLM) engineered specifically to dominate "long-horizon" autonomous coding and engineering tasks. Available immediately on Hugging Face, the Z.ai API, and more than 20 third-party coding environments, the model boasts a highly stable 1-million-token context window alongside enterprise subscription tiers starting at just $12.60 per month. In excellent news f Read more ›

9

Why Weibo’s tiny VibeThinker-3B has the AI world arguing over benchmarks again

VentureBeat 3 place · 06/16/2026 20:32 EDT

On Sunday, a team of nine researchers at Sina Weibo — the Chinese social media giant better known for its microblogging platform than for cutting-edge artificial intelligence — quietly posted a 14-page technical report to arXiv that sent shockwaves through the AI research community. Their claim: a language model with just 3 billion parameters can match or exceed the reasoning performance of flagship systems from Google DeepMind, OpenAI, Anthropic, and... Read more ›

10

Anthropic ships major Claude Design overhaul with design system imports, code round-trips, and a fix for its token-burning problem

VentureBeat · 06/17/2026 15:00 EDT

When Anthropic quietly released Claude Design in April as a "research preview," it generated the kind of instant traction most product teams dream about: more than one million users in its first week. It also generated a problem. The tool consumed tokens so voraciously that a PCWorld reviewer burned through 80 percent of his weekly Claude Pro allowance in roughly 25 minutes, producing just three variations of a single webpage... Read more ›

21.06.2026 15:54
Last update: 15:45 EDT.
News rating updated: 22:40.

What is Times42?

Times42 brings you the most popular news from tech news portals in a real-time chart.
Read about us in the FAQ section.

Times42 © 2026