Apple Researchers Challenge AI Reasoning Claims With Controlled Puzzle Tests

Slashdot
msmash @ Slashdot · 06/09/2025 10:00 EDT

Apple researchers have found that state-of-the-art "reasoning" AI models such as OpenAI's o3-mini, Gemini (with thinking mode enabled), Claude 3.7, and DeepSeek-R1 face complete performance collapse [PDF] beyond certain complexity thresholds when tested on controllable puzzle environments. The finding raises questions about the true reasoning capabilities of large language models.

The study, which examined models using Tower of Hanoi, checker jumping, river crossing, and blocks world puzzles rather than standard
