Tagged: inference scaling

2 articles on inference scaling.

The LLM Year in Review: What Actually Mattered in 2025 (And What Was Noise)
January 8, 2026

The prediction was: bigger models win. The reality was: DeepSeek R1 rewrote the rules in January and nothing was the same after that. Here is what 2025 actually taught us about reasoning, inference-time compute, and the changing economics of intelligence.

Engineering

Trading Speed for Quality: A Practical Guide to Inference-Time Scaling
October 17, 2024

Inference-time scaling lets you tune the latency-quality tradeoff at runtime rather than at training time. Here is a practical framework for deciding when to use Best-of-N sampling, beam search, iterative refinement, or one-shot generation — with real examples from clinical AI.

Engineering

Clint Johnson

I build stuff for healthcare companies. Sometimes it works, sometimes I learn something. Always caffeinated, usually in Nashville.

Connect

  • 1Put Health: Healthcare innovation studio

© 2026 Clint Johnson. All rights reserved.