Scaling Inference Time Compute

How Test-Time Compute Can Help Scale AI

For years, it seemed obvious that the best way to scale up artificial intelligence models was to throw more upfront computing resources at them. The theory was that performance improvements are ...

16h on MSN

OpenAI Taps Cerebras to Speed Up Real-Time AI at Scale

OpenAI partners with Cerebras to add 750 MW of low-latency AI compute, aiming to speed up real-time inference and scale ...

Network World

OpenAI turns to Cerebras in a mega deal to scale AI inference infrastructure

The multibillion-dollar deal shows how the growing importance of inference is changing the way AI data centers are designed ...

NextBigFuture

OpenAI Strawberry LLM Reasoning Needs More Compute and Energy for Inference

Jim Fan is one of Nvidia’s senior AI researchers. The shift could mean that many orders of magnitude more compute and energy are needed for inference that can handle the improved reasoning in the OpenAI ...

VentureBeat

How test-time scaling unlocks hidden reasoning abilities in small language models (and allows them to outperform LLMs)

Very small language models (SLMs) can ...

DDN Powers Integrated Compute, Data, and Offload at Scale for NVIDIA Rubin Platform

DDN, the world’s leading AI data platform provider, today announced deep collaboration with NVIDIA to support the company’s ...

NextBigFuture

Test Time Training Will Take LLM AI to the Next Level

MIT researchers achieved 61.9% on ARC tasks by updating model parameters during inference. Is this key to AGI? We might reach the 85% AGI doorstep by scaling and integrating it with COT (Chain of ...

Geeky Gadgets

Google DeepMind Unlocks the Future of AI Efficiency

Google DeepMind’s recent research offers a fresh perspective on optimizing large language models (LLMs) like OpenAI’s ChatGPT-o1. Instead of merely increasing model parameters, the study emphasizes ...

Forbes

Scaling Laws And AI's Future: Lessons From Building Large Systems

Yi Shi is the founder of FlashIntel, pioneering AI agent software. A computer science expert & e/acc proponent shaping transformative tech. After years of building AI products and autonomous agents, ...

TMCnet

How to rent a GPU server for AI, rendering, and heavy compute tasks

Modern compute-heavy projects place demands on infrastructure that standard servers cannot satisfy. Artificial intelligence ...

InfoQ

OpenAI Presents Research on Inference-Time Compute to Better AI Security

VentureBeat

When AI reasoning goes wrong: Microsoft Research shows more tokens can mean more problems

Large language models (LLMs) are ...
