15 place 1

136 Why your LLM bill is exploding — and how semantic caching can cut it by 73%

VentureBeat
VentureBeat 1 place · 01/10/2026 14:00 EDT

Our LLM API bill was growing 30% month-over-month. Traffic was increasing, but not that fast. When I analyzed our query logs, I found the real problem: users ask the same questions in different ways. "What's your return policy?", "How do I return something?", and "Can I get a refund?" were all hitting our LLM separately, generating nearly identical responses, each incurring full API costs. Exact-match caching, the obvious first solution, captured only 18% of these redundant calls. The same semantic question, ...
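To make the gap between exact-match and semantic caching concrete, here is a minimal Python sketch. It is not the article's implementation: `toy_embed`, `SemanticCache`, and the 0.5 threshold are hypothetical names and values chosen for illustration, and the bag-of-words vector is only a stand-in for a real embedding model. A word-count vector matches rewordings that share vocabulary; catching paraphrases such as "Can I get a refund?" would presumably require a learned embedding.

```python
import math
import re
from collections import Counter


def toy_embed(text):
    """Hypothetical stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class SemanticCache:
    """Returns a stored response when a new query is similar enough to an old one."""

    def __init__(self, threshold=0.5):
        # The threshold is an assumed value; a real system would tune it.
        self.threshold = threshold
        self.entries = []  # list of (embedding, response) pairs

    def put(self, query, response):
        self.entries.append((toy_embed(query), response))

    def get(self, query):
        q = toy_embed(query)
        best_score, best_response = 0.0, None
        # Linear scan for the most similar stored query.
        for emb, response in self.entries:
            score = cosine(q, emb)
            if score > best_score:
                best_score, best_response = score, response
        return best_response if best_score >= self.threshold else None


if __name__ == "__main__":
    exact_cache = {"What's your return policy?": "cached answer"}
    semantic_cache = SemanticCache()
    semantic_cache.put("What's your return policy?", "cached answer")

    reworded = "What is your return policy"
    print("exact-match hit:", reworded in exact_cache)
    print("semantic hit:", semantic_cache.get(reworded) is not None)
```

The exact-match dictionary misses the reworded query because the strings differ, while the similarity lookup can still reuse the stored response. The linear scan is fine for a sketch; a production cache would more likely use a vector index.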

News from the same source
VentureBeat VentureBeat
Business Insider
Madison Hoff,Juliana Kaplan @ Business Insider 1 place · today 04:31 EDT

Here are the winners and losers in the frozen US job market

New data showed which sectors added more jobs than others in 2025. Healthcare had a lot of job growth, while the federal government lost a lot. Read more

4,682 fresh

Slashdot
EditorDavid @ Slashdot 1 place · today 03:34 EDT

C# (and C) Grew in Popularity in 2025, Says TIOBE

For a quarter century, the TIOBE Index has attempted to rank the popularity of programming languages by the number of search engine results they bring up, and this week they had an announcement. Over the last year the language showing the largest increase in its share of TIOBE's results was C#. TIOBE founder/CEO Paul Jansen looks back at how C# evolved: From a language-design perspective, C# has often been... Read more

728 fresh

Business Insider
Taylor Rains @ Business Insider 2 place · today 03:45 EDT

What it's like working as a private jet flight attendant who makes over $100,000 a year

Private jet flight attendant Kelley Lokensgard earns over $100,000 and travels the world for essentially free. But it is a lot of work and long hours. Read more

570 fresh

Engadget
Jackson Chen @ Engadget 1 place · 01/10/2026 14:21 EDT

An Instagram data breach reportedly exposed the personal info of 17.5 million users

If you received a bunch of password reset requests from Instagram recently, you're not alone. As reported by Malwarebytes, an antivirus software company, there was a data breach revealing the "sensitive information" of 17.5 million Instagram users. Malwarebytes added that the leak included Instagram usernames, physical addresses, phone numbers, email addresses and more. The company added that the "data is available for sale on the dark web and can be... Read more

97

Mashable
Mashable 1 place · 01/10/2026 22:00 EDT

Wordle today: Answer, hints for January 11, 2026

Here's the answer for "Wordle" #1666 on January 11 as well as a few hints, tips, and clues to help you solve it yourself. Read more

94 fresh

Business Insider
Kelsey Vlamis @ Business Insider · 01/10/2026 16:10 EDT

See the list of California's 200-plus billionaires who could be hit by the proposed wealth tax

California has over 200 billionaires that could be hit by a proposed wealth tax. Several have recently moved assets out of the state. Read more

87

Slashdot
EditorDavid @ Slashdot 2 place · 01/10/2026 17:34 EDT

Four More Tech Bloggers are Switching to Linux

Is there a trend? This week four different articles appeared on various tech-news sites with an author bragging about switching to Linux. "Greetings from the year of Linux on my desktop," quipped the Verge's senior reviews editor, who finally "got fed up and said screw it, I'm installing Linux." They switched to CachyOS, just like this writer for the videogame magazine Escapist: I've had a fantastic time gaming on... Read more

83

Tom's Hardware
Tom's Hardware 1 place · 01/10/2026 06:50 EDT

This $2,000 Bitcoin mining water heater can pay for itself by slashing your energy bills, company claims — can rake in $1,000 a year in BTC, offset 80% of electricity and water costs

Superheat was at CES 2026 to showcase what it describes as “a water heater that pays for itself.” Instead of a resistive heating element, it warms your H2O with heat generated by a Bitcoin ASIC miner. Read more

82

Business Insider
Nathan Rennolds @ Business Insider · 01/10/2026 12:35 EDT

Exxon CEO calls Venezuela 'uninvestable' during meeting with Trump

President Donald Trump is pushing for major US oil companies to pump at least $100 billion into Venezuela. Read more

71

GSMArena.com
GSMArena.com 1 place · today 01:37 EDT

Weekly poll: will you buy the Motorola Signature?

Motorola has been pretty inconsistent with flagship releases – its Edge series had an Ultra model in some years but not in others. Now it introduces the Signature line to serve the premium market and the first model in it is quite interesting. The Motorola Signature is not an all-out flagship, but it costs a good deal less than those – you can pick up a 12/512GB unit for... Read more

70 fresh

ScienceDaily
ScienceDaily 1 place · 01/10/2026 23:02 EDT

Scientists have discovered an enormous stream of super-hot gas erupting from a nearby galaxy, driven by a powerful black hole at its center. The jets stretch farther than the galaxy itself and spiral outward in a rare, never-before-seen pattern. NASA’s James Webb Space Telescope pierced through thick dust to reveal this violent outflow. The process is so intense it’s robbing the galaxy of star-forming gas at a staggering rate. Read more

66 fresh

Slashdot
EditorDavid @ Slashdot 3 place · 01/10/2026 15:34 EDT

AI Fails at Most Remote Work, Researchers Find

A new study "compared how well top AI systems and human workers did at hundreds of real work assignments," reports the Washington Post. They add that at least one example "illustrates a disconnect three years after the release of ChatGPT that has implications for the whole economy." AI can accomplish many impressive tasks involving computer code, documents or images. That has prompted predictions that human work of many kinds could... Read more

66

Mashable
Mashable 2 place · 01/10/2026 22:00 EDT

NYT Strands hints, answers for January 11, 2026

The NYT Strands hints and answers you need to make the most of your puzzling experience. Read more

66 fresh

Business Insider
Talia Lakritz @ Business Insider · 01/10/2026 09:19 EDT

Karoline Leavitt called her age-gap marriage an 'atypical love story.' Here's what to know about her life and career.

White House press secretary Karoline Leavitt, 28, is expecting her second child with her husband, 60. Here's what to know about her life and career. Read more

65

Engadget
Jackson Chen @ Engadget 2 place · 01/10/2026 16:06 EDT

GameStop reportedly shuts down more than 400 US stores

Your neighborhood GameStop might be on the chopping block, along with more than 400 other retail locations across the US. As first reported by Polygon, the retailer is pursuing a severe cost-saving measure by closing up several hundred physical locations. According to a blog that keeps track of GameStop closures, there are 410 locations that are confirmed to be closing or are already closed, along with another 11 that are... Read more

49

The most popular news from the same source for the last week
VentureBeat VentureBeat
VentureBeat
VentureBeat 1 place · 01/08/2026 11:14 EDT

Anthropic has released Claude Code v2.1.0, a notable update to its "vibe coding" development environment for autonomously building software, spinning up AI agents, and completing a wide range of computer tasks, according to Head of Claude Code Boris Cherny in a post on X last night. The release introduces improvements across agent lifecycle control, skill development, session portability, and multilingual output, all bundled in a dense package of 1,096 commits.... Read more

88

VentureBeat
VentureBeat 1 place · 01/06/2026 15:11 EDT

In the fast-moving world of AI development, it is rare for a tool to be described as both "a meme" and AGI, artificial general intelligence, the "holy grail" of a model or system that can reliably outperform humans on economically valuable work. Yet, that is exactly where the Ralph Wiggum plugin for Claude Code now sits. Named after the infamously high-pitched, hapless yet persistent character on "The Simpsons," this newish... Read more

58

VentureBeat
VentureBeat 1 place · 01/07/2026 21:40 EDT

Joining the ranks of a growing number of smaller, powerful reasoning models is MiroThinker 1.5 from MiroMind, with just 30 billion parameters, compared to the hundreds of billions or trillions used by leading foundation large language models (LLMs). But MiroThinker 1.5 stands out among these smaller reasoners for one major reason: it offers agentic research capabilities rivaling trillion-parameter competitors like Kimi K2 and DeepSeek, at a fraction of the inference cost. The... Read more

39

VentureBeat
VentureBeat 3 place · 01/07/2026 15:00 EDT

Nous Research, the open-source artificial intelligence startup backed by crypto venture firm Paradigm, released a new competitive programming model on Monday that it says matches or exceeds several larger proprietary systems, trained in just four days using 48 of Nvidia's latest B200 graphics processors. The model, called NousCoder-14B, is another entry in a crowded field of AI coding assistants, but arrives at a particularly charged moment: Claude Code, the agentic... Read more

3

VentureBeat
VentureBeat 1 place · 01/09/2026 18:14 EDT

Anthropic has confirmed the implementation of strict new technical safeguards preventing third-party applications from spoofing its official coding client, Claude Code, in order to access the underlying Claude AI models for more favorable pricing and limits, a move that has disrupted workflows for users of popular open source coding agent OpenCode. Simultaneously but separately, it has restricted usage of its AI models by rival labs including xAI (through the... Read more

3

VentureBeat
VentureBeat 2 place · 01/05/2026 19:00 EDT

A new study from researchers at Stanford University and Nvidia proposes a way for AI models to keep learning after deployment without increasing inference costs. For enterprise agents that have to digest long docs, tickets, and logs, this is a bid to get “long memory” without paying attention costs that grow with context length. The approach, called “End-to-End Test-Time Training” (TTT-E2E), reframes language modeling as a continual learning problem: Instead... Read more

1

VentureBeat
VentureBeat 2 place · 01/06/2026 05:30 EDT

The arms race to build smarter AI models has a measurement problem: the tests used to rank them are becoming obsolete almost as quickly as the models improve. On Monday, Artificial Analysis, an independent AI benchmarking organization whose rankings are closely watched by developers and enterprise buyers, released a major overhaul to its Intelligence Index that fundamentally changes how the industry measures AI progress. The new Intelligence Index v4.0 incorporates 10... Read more

1

VentureBeat
VentureBeat · 01/09/2026 00:00 EDT

The big news this week from Nvidia, splashed in headlines across all forms of media, was the company's announcement about its Vera Rubin GPU. This week, Nvidia CEO Jensen Huang used his CES keynote to highlight performance metrics for the new chip. According to Huang, the Rubin GPU is capable of 50 PFLOPs of NVFP4 inference and 35 PFLOPs of NVFP4 training performance, representing 5x and 3.5x the performance of Blackwell. But... Read more

2

VentureBeat
VentureBeat · 01/09/2026 00:00 EDT

Presented by SAP. SAP consulting projects today involve a vast amount of documentation, multiple stakeholders, and compressed timelines, which often require manual knowledge retrieval from online SAP documentation. At the same time, cloud ERP programs now demand faster design cycles, continuous enhancements rather than big-bang rollouts, and near-real-time decision-making. Joule for Consultants, SAP's conversational AI solution, was designed to help meet these expectations and support consultants throughout t Read more

2

VentureBeat
VentureBeat 3 place · 01/09/2026 13:29 EDT

Enterprise security teams are losing ground to AI-enabled attacks, not because defenses are weak, but because the threat model has shifted. As AI agents move into production, attackers are exploiting runtime weaknesses where breakout times are measured in seconds, patch windows in hours, and traditional security has little visibility or control. CrowdStrike's 2025 Global Threat Report documents breakout times as fast as 51 seconds. Attackers are moving from initial access... Read more

2

Most popular sources

Business Insider 34% 15
Tom's Hardware 23% 18
Engadget 7% 0
Slashdot 6% 2
The Verge 5% 4
11.01.2026 04:59
Last update: 04:50 EDT.
News rating updated: 11:51.

What is Times42?

Times42 brings you the most popular news from tech news portals in a real-time chart.
Read about us in the FAQ section.


Times42 © 2026