Can AI really compete with human data scientists? OpenAI’s new benchmark puts it to the test

VentureBeat
Michael Nuñez @ VentureBeat · 10/10/2024 15:47 EDT

OpenAI's new MLE-bench challenges AI systems with real-world data science tasks, revealing both the progress and limitations of AI in machine learning engineering compared to human experts.


News from the same source
VentureBeat
Silicon Valley
George Avalos @ Silicon Valley 1 place · 02/07/2106 01:28 EDT

Newark apartment complex bought for much less than prior value

An East Bay apartment complex has been bought at a price that's well below its prior value. Read more

Silicon Valley
George Avalos @ Silicon Valley 2 place · 02/07/2106 01:28 EDT

PG&E buys San Jose building to bolster South Bay operations

A PG&E Corp. unit has bought a San Jose building in a move to bolster the utility's South Bay operations. Read more

Digital Trends
Shimul Sood @ Digital Trends 1 place · today 15:44 EDT

Hackers leak facial recognition records tied to millions of Madison Square Garden visitors

A cybercriminal group has published what it claims are millions of records stolen from Madison Square Garden Entertainment. The leak is drawing attention not just because of its size, but because it includes facial recognition data, internal threat assessments, and detailed visitor profiles. Read more

Eurogamer.net
Vikki Blake @ Eurogamer.net 1 place · today 15:42 EDT

CD Projekt Red co-CEO admits it "indefinitely" "lost the faith" of some fans after Cyberpunk 2077

The Witcher 4 developer CD Projekt Red believes it's yet to complete its "full redemption arc" after the disastrous 2020 launch of its open-world action-adventure game, Cyberpunk 2077. Read more

Slashdot
EditorDavid @ Slashdot 1 place · today 15:40 EDT

Cops Keep Getting Arrested for Using Flock's Cameras to Stalk People

404 Media remembers how a Florida police officer looked up his ex-girlfriend's license plate in the Flock automated license plate reader system at least 69 times in 2024, even searching for her mom's license plate at least 24 times. The police officer was charged with stalking and hacking-related offenses, serving one day in prison with five years of probation, but his case "was not a one-off." [Alternate link... Read more

SlashGear
SlashGear 1 place · today 15:15 EDT

Why Do Utah State Highway Signs Have A Beehive?

Utah's highway signs stand out from those used in most states, and the unusual symbol is a reflection of an important part of the state's history. Read more

Habr
StudyQA @ Habr 1 place · today 15:12 EDT

Algorithm audit: how a Boyer-Moore implementation with 190K GitHub stars turned out to be brute force

I checked the Boyer-Moore implementation in TheAlgorithms/Python (190K+ stars). It turned out that the bad-character shift is written to the for-loop variable, which has no effect in Python. The algorithm produces correct results but runs as brute-force O(nm) instead of O(n/m). Plus two more findings: an infinite loop in typical implementations of full BM, and an error in the original 1977 paper that was corrected only in 1980. Read more

Digital Trends
Shimul Sood @ Digital Trends 2 place · today 15:08 EDT

Thanks to AI, a Chinese startup has figured out the priciest fusion energy bottleneck

Fusion energy has spent decades trapped in an expensive cycle of trial and error. Now, a Chinese startup believes AI-powered simulation software could dramatically accelerate reactor development by helping scientists test designs virtually before committing to costly real-world experiments. Read more

TechRadar
TechRadar 1 place · today 15:05 EDT

How to watch Uruguay vs Cape Verde: Free Streams & TV Channels for FIFA World Cup 2026

Here's how to watch Uruguay vs Cape Verde for free online and from anywhere as World Cup 2026 underdogs Cape Verde look to spring another surprise. Read more

CNET
Nasha Addarich Martínez @ CNET 1 place · today 15:00 EDT

The Best LED Face Masks That Will Improve Your Skin's Appearance

We tested popular FDA-cleared LED face masks to find the best ones for your home needs. Read more

The Verge
Terrence O’Brien @ The Verge 1 place · today 14:53 EDT

Bose thinks it can be a media company for some reason

The history books are littered with the corpses of corporate record labels started by companies that had no business being in the music industry. Bose thinks it can be the exception to the rule. It thinks it can be Red Bull. And, while Bose has more of a right to dip its toes into the […] Read more

Business Insider
Kelly Burch @ Business Insider 1 place · today 14:51 EDT

I left the Navy SEALs to have more time with my 3 kids. What I learned in the military helped me raise confident kids.

Former Navy SEAL Brandon Webb says lessons from sniper training helped him teach his children confidence, resilience, and independence. Read more

Gizmodo
Justin Carter @ Gizmodo 1 place · today 14:50 EDT

Marvel’s New Comics Universe Is Starting All at Once

Marvel wants its new 'Midnight' books to feel like a big deal, so they're getting a full week all to themselves. Read more

Habr
Akhmadaliev @ Habr 2 place · today 14:47 EDT

RICE, ICE, MoSCoW: when a prioritization framework drowns you

When I joined Instameal, we had a backlog of forty tasks and not a single clear criterion for why one mattered more than another. We tried RICE. Then ICE. Then MoSCoW. Then RICE again with different weights. The problem was not that we were choosing the wrong framework. The problem was that we thought: pick the right tool and the priorities will line up by themselves. They won't. What each of the three is. RICE: Reach × Impact × Confidence... Read more

SlashGear
SlashGear 2 place · today 14:45 EDT

Is The Leatherman Arc Worth The Price? Owners Have This To Say About It

The Leatherman Arc is one of the most expensive multitools on the market, so it's no surprise that owners and reviewers have strong opinions about its value. Read more

Habr
GlobalSign_admin (GlobalSign) @ Habr 3 place · today 14:37 EDT

Code conversion services are "eating" open source

The refactoring service Malus.sh recently launched online, offering to "clean code of open-source licenses". It positions itself as a "clean room" where software is freed of its licensing burden. A free project's manifest is uploaded, and an LLM rewrites the code for a small fee while preserving its functionality. The idea is that the new code can be used however one likes, without complying with the requirements of the free licenses (AGPL, MIT, Apache, and others) under which the original was published. Unscrupulous developers get... Read more

Habr
TatarnikovEgor @ Habr · today 14:32 EDT

When is the best time to publish on Habr? A statistical analysis of the relationship between publication time and article reach

Competition among authors for readers' attention on Habr is currently high. According to Habr itself, in 2025 the site had more than 10 thousand unique content authors, and the number of publications exceeded 51 thousand. This means that even high-quality material may fail to get noticeable reach because of the large number of publications in the feed. There is a common belief that articles should be published just before lunchtime so that people can read them during their lunch break, and then the reach will be... Read more

The most popular news from the same source for the last week
VentureBeat
VentureBeat
VentureBeat · 06/15/2026 03:00 EDT

Presented by Splunk

AI has changed the economics of cyber deception. An attacker can now generate thousands of convincing phishing lures, fake identities, and tailored pretexts before a defender finishes a single change-control cycle. That is the new security challenge: deception got faster and cheaper, while verification did not. Much of the discussion around AI for defense centers on detection models. Detection matters, but it is not the only bottleneck. The deeper constraint... Read more

VentureBeat
VentureBeat · 06/15/2026 11:14 EDT

AI coding agents are rapidly accelerating data engineering by generating transformations, pipelines, orchestration workflows, validation tests, and infrastructure configurations from prompts. However, enterprise data platforms have long operated across fragmented systems owned by different teams and built on different technologies. As these systems evolve independently, organizations increasingly struggle with inconsistent business logic, duplicated implementations, difficult downstream impact analysis, and Read more

VentureBeat
VentureBeat 3 place · 06/15/2026 13:19 EDT

Organizational leaders are nearly twice as likely to hide their AI use compared to all other employees, at 42% versus 23%, according to new Ivanti research surveying 3,900 employees across six countries. Among leaders who conceal that usage, 52% say they do it for a "secret advantage." The same research found 85% of IT professionals claim a named owner exists for every AI agent. Only 42% say ownership is actually... Read more

VentureBeat
VentureBeat 3 place · 06/15/2026 15:30 EDT

Tokyo-based AI startup Sakana AI has officially launched its first commercial product, Sakana Marlin. Billed as a "Virtual CSO" (Chief Strategy Officer), Marlin is an autonomous, B2B research agent that deliberately abandons the instantaneous text generation of modern chatbots in favor of deep, long-horizon reasoning. What sets Marlin apart from the current ecosystem of AI tools is its temporal scale: instead of returning an answer in seconds, it runs continuous,... Read more

VentureBeat
VentureBeat 2 place · 06/15/2026 15:49 EDT

Microsoft CEO Satya Nadella published a sweeping essay on Sunday laying out what he describes as the defining economic challenge of the AI era: the risk that a handful of frontier models will absorb the expertise of entire industries and commoditize it, leaving businesses stripped of their competitive moats. "The last thing any of us want is a world where every company across every sector is ceding value to a few... Read more

VentureBeat
VentureBeat · 06/16/2026 13:47 EDT

One of the assumptions behind today’s AI frameworks is that agents require a “boss” at the center; this orchestrator runs the show, routes requests, and makes sure the whole system doesn’t descend into chaos. That assumption may be wrong, and the cost of carrying it could be measured in inference dollars and coordination latency. A new Stanford framework called a decentralized language model, or DeLM, is built on the premise... Read more

VentureBeat
VentureBeat · 06/16/2026 16:04 EDT

For decades, data professionals have struggled with the challenge of managing both operational and analytical databases in a unified approach that doesn't introduce latency and performance degradation. Agents made the problem structural. A system that reasons continuously and acts on live data cannot tolerate a pipeline between itself and the information it needs to act on. At the Data + AI Summit on Tuesday, Databricks announced two products aimed at collapsing that... Read more

VentureBeat
VentureBeat 3 place · 06/16/2026 17:26 EDT

Today, Chinese AI startup Z.ai (formerly Zhipu AI) announced the immediate release of GLM-5.2, a 753-billion parameter open-weights large language model (LLM) engineered specifically to dominate "long-horizon" autonomous coding and engineering tasks. Available immediately on Hugging Face, the Z.ai API, and more than 20 third-party coding environments, the model boasts a highly stable 1-million-token context window alongside enterprise subscription tiers starting at just $12.60 per month. In excellent news f Read more

VentureBeat
VentureBeat 3 place · 06/16/2026 20:32 EDT

On Sunday, a team of nine researchers at Sina Weibo — the Chinese social media giant better known for its microblogging platform than for cutting-edge artificial intelligence — quietly posted a 14-page technical report to arXiv that sent shockwaves through the AI research community. Their claim: a language model with just 3 billion parameters can match or exceed the reasoning performance of flagship systems from Google DeepMind, OpenAI, Anthropic, and... Read more

VentureBeat
VentureBeat · 06/17/2026 15:00 EDT

When Anthropic quietly released Claude Design in April as a "research preview," it generated the kind of instant traction most product teams dream about: more than one million users in its first week. It also generated a problem. The tool consumed tokens so voraciously that a PCWorld reviewer burned through 80 percent of his weekly Claude Pro allowance in roughly 25 minutes, producing just three variations of a single webpage... Read more

21.06.2026 15:54
Last update: 15:45 EDT.
News rating updated: 22:40.
