Enterprises building voice-enabled workflows have had limited options for production-grade transcription: closed APIs with data residency risks, or open models that trade accuracy for deployability. Cohere's new open-weight ASR model, Transcribe, is built to compete on all four key differentiators — contextual accuracy, latency, control and cost.Cohere says that Transcribe outperforms current leaders on accuracy — and unlike closed APIs, it can run on an organization's own infrastructure.Cohere, which can... Read more ›
0
Last week, one of our product managers (PMs) built and shipped a feature. Not spec'd it. Not filed a ticket for it. Built it, tested it, and shipped it to production. In a day.A few days earlier, our designer noticed that the visual appearance of our IDE plugins had drifted from the design system. In the old world, that meant screenshots, a JIRA ticket, a conversation to explain the intent,... Read more ›
0
Many people have tried AI tools and walked away unimpressed. I get it — many demos promise magic, but in practice, the results can feel underwhelming.That’s why I want to write this not as a futurist prediction, but from lived experience. Over the past six months, I turned my engineering organization AI-first. I’ve shared before about the system behind that transformation — how we built the workflows, the metrics, and... Read more ›
0
Processing 200,000 tokens through a large language model is expensive and slow: the longer the context, the faster the costs spiral. Researchers at Tsinghua University and Z.ai have built a technique called IndexCache that cuts up to 75% of the redundant computation in sparse attention models, delivering up to 1.82x faster time-to-first-token and 1.48x faster generation throughput at that context length.The technique applies to models using the DeepSeek Sparse Attention... Read more ›
0
Processing 200,000 tokens through a large language model is expensive and slow: the longer the context, the faster the costs spiral. Researchers at Tsinghua University and Z.ai have built a technique called IndexCache that cuts up to 75% of the redundant computation in sparse attention models, delivering up to 1.82x faster time-to-first-token and 1.48x faster generation throughput at that context length.The technique applies to models using the DeepSeek Sparse Attention... Read more ›
0
Intercom is taking an unusual gamble for a legacy software company: building its own AI model.The 15-year-old, Dublin, Ireland-based massive customer service platform announced Fin Apex 1.0 on Thursday, a small, purpose-built AI model that the company claims outperforms leading frontier models from OpenAI and Anthropic on the metrics that matter most for customer support. The model powers Intercom's existing Fin AI agent, which already handles over one million customer... Read more ›
0
The enterprise voice AI market is in the middle of a land grab. ElevenLabs and IBM announced a collaboration just this week to bring premium voice capabilities into IBM's watsonx Orchestrate platform. Google Cloud has been expanding its Chirp 3 HD voices. OpenAI continues to iterate on its own speech synthesis. And the market underpinning all of this activity is enormous — voice AI crossed $22 billion globally in 2026,... Read more ›
0
Presented by OutSystems After two years of flashy AI demos, rushed agent prototypes, and breathless predictions, enterprise technology leaders are striking a more pragmatic tone in 2026. In a recent webinar hosted by OutSystems, a panel of software executives and enterprise practitioners made the case that the most consequential AI work happening now is focused on the practical matters of governance, orchestration, and iteration, along with integrating agents into the... Read more ›
0
As Large Language Models (LLMs) expand their context windows to process massive documents and intricate conversations, they encounter a brutal hardware reality known as the "Key-Value (KV) cache bottleneck."Every word a model processes must be stored as a high-dimensional vector in high-speed memory. For long-form tasks, this "digital cheat sheet" swells rapidly, devouring the graphics processing unit (GPU) video random access memory (VRAM) system used during inference, and slowing the... Read more ›
0
Enterprise data teams moving agentic AI into production are hitting a consistent failure point at the data tier. Agents built across a vector store, a relational database, a graph store and a lakehouse require sync pipelines to keep context current. Under production load, that context goes stale. Oracle, whose database infrastructure runs the transaction systems of 97% of Fortune Global 100 companies by the company's own count, is now making... Read more ›
0
Standard RAG pipelines break when enterprises try to use them for long-term, multi-session LLM agent deployments. This is a critical limitation as demand for persistent AI assistants grows.xMemory, a new technique developed by researchers at King’s College London and The Alan Turing Institute, solves this by organizing conversations into a searchable hierarchy of semantic themes.Experiments show that xMemory improves answer quality and long-range reasoning across various LLMs while cutting inference... Read more ›
0
OpenAI is shuttering Sora, its stand-alone AI video generation app and social network, and the availability for developers to access the Sora 2 video model family through its application programming interface (API) to rely on it for their own products or video generation pipelines.The announcement came abruptly this afternoon with OpenAI posting a message on X and not giving an exact shutdown date for the services, instead promising "timelines for... Read more ›
0
Anthropic on Monday launched the most ambitious consumer AI agent to date, giving its Claude chatbot the ability to directly control a user's Mac — clicking buttons, opening applications, typing into fields, and navigating software on the user's behalf while they step away from their desk.The update, available immediately as a research preview for paying subscribers, transforms Claude from a conversational assistant into something closer to a remote digital operator.... Read more ›
0
Web infrastructure giant Cloudflare is seeking to transform the way enterprises deploy AI agents with the open beta release of Dynamic Workers, a new lightweight, isolate-based sandboxing system that it says starts in milliseconds, uses only a few megabytes of memory, and can run on the same machine — even the same thread — as the request that created it. Compared with traditional Linux containers, the company says Dynamic Workers... Read more ›
0 newcommer
Web infrastructure giant Cloudlflare is seeking to transform the way enterprises deploy AI agents with the open beta release of Dynamic Workers, a new lightweight, isolate-based sandboxing system that it says starts in milliseconds, uses only a few megabytes of memory, and can run on the same machine — even the same thread — as the request that created it. Compared with traditional Linux containers, the company says that makes... Read more ›
0
Getting AI agents to perform reliably in production — not just in demos — is turning out to be harder than enterprises anticipated. Fragmented data, unclear workflows, and runaway escalation rates are slowing deployments across industries.“The technology itself often works well in demonstrations,” said Sanchit Vir Gogia, chief analyst with Greyhound Research. “The challenge begins when it is asked to operate inside the complexity of a real organization.” Burley Kawasaki,... Read more ›
0
Engineers building browser agents today face a choice between closed APIs they cannot inspect and open-weight frameworks with no trained model underneath them. Ai2 is now offering a third option.The Seattle-based nonprofit behind the open-source OLMo language models and Molmo vision-language family today is releasing MolmoWeb, an open-weight visual web agent available in 4 billion and 8 billion parameter sizes. Until now, no open-weight visual web agent shipped with the... Read more ›
0
Presented by SolidigmLiquid cooling is rewriting the rules of AI infrastructure, but most deployments have not fully crossed the line. GPUs and CPUs have moved to liquid cooling, while storage has depended on airflow, creating an operationally inefficient hybrid architecture.What appears to be a pragmatic transition strategy is, in practice, a structural liability. “A hybrid cooling approach is an operationally inefficient situation,” explains Hardeep Singh, thermal-mechanical hardware team manager at... Read more ›
0
ByteDance, the Chinese tech giant behind TikTok, last month released what may be one of the most ambitious open-source AI agent frameworks to date: DeerFlow 2.0. It's now going viral across the machine learning community on social media. But is it safe and ready for enterprise use?This is a so-called "SuperAgent harness" that orchestrates multiple AI sub-agents to autonomously complete complex, multi-hour tasks. Best of all: it is available under... Read more ›
0
The Innovation Showcase is back at Transform 2026: The Orchestration of Enterprise Agentic AI at Scale, taking place July 14 and 15 in Menlo Park.This year, we are moving beyond generative AI to autonomous agents, focusing on enterprise agentic orchestration, LLM observability and evaluation (LLMOps), RAG infrastructure, inference platforms and optimization, and agentic AI security and identity.We’re on the hunt for the 10 most innovative autonomous agent technologies poised to... Read more ›
0
Most popular sources
|
|
0% |
|
|
0% |
|
|
0% |
|
|
0% |
|
|
0% |
| View sources » | |
LIKE us on Facebook so you won't miss the most important news of the day!
10.06.2026 16:22
Last update: 16:11 EDT.
News rating updated: 23:12.
What is Times42?
Times42 brings you the most popular news from tech news portals in real-time chart.
Read about us in FAQ section.