Tagged: inference scaling

2 articles on inference scaling.

January 8, 2026

The LLM Year in Review: What Actually Mattered in 2025 (And What Was Noise)

The prediction was: bigger models win. The reality was: DeepSeek R1 rewrote the rules in January and nothing was the same after that. What 2025 taught us about reasoning, inference-time compute, and the economics of intelligence.

Engineering

October 17, 2024

Trading Speed for Quality: A Practical Guide to Inference-Time Scaling

Inference-time scaling lets you tune the latency-quality tradeoff at runtime instead of at training time. When to use Best-of-N sampling, beam search, iterative refinement, or one-shot generation, with real examples from clinical AI.

Engineering

Clint Johnson

1Put Health