Articles from 2024

17 articles published in 2024.

December 29, 2024

Every Failed AI Product Has the Same Root Cause

The same failure pattern shows up everywhere: teams shipping fast and iterating on vibes instead of building systematic evaluation. Evals aren't a nice-to-have. They're the core competency of any serious AI product team.

Product

December 10, 2024

The 6 Ways I've Watched GenAI Projects Fail (And How to Avoid Them)

GenAI projects in healthcare go sideways in predictable ways, sometimes with real patient consequences. Six failure modes that come up over and over again, and what to do instead.

Product

November 11, 2024

When to Look Beyond Standard LLMs (And When to Stop Overthinking It)

Most teams should use a frontier API and move on. There are specific situations where alternative architectures matter: extreme latency, long-context scale, cost walls, privacy constraints. The decision framework.

Engineering

November 4, 2024

When Recommendations Meet Language: The LLM-RecSys Convergence

Most AI stacks treat the recommendation engine and the language model as two separate systems that hand off to each other. A new class of hybrid models eliminates that seam. The implications for domain-specific AI are significant.

Engineering

October 17, 2024

Trading Speed for Quality: A Practical Guide to Inference-Time Scaling

Inference-time scaling lets you tune the latency-quality tradeoff at runtime instead of at training time. When to use Best-of-N sampling, beam search, iterative refinement, or one-shot generation, with real examples from clinical AI.

Engineering

September 28, 2024

Inside the Black Box: What Mechanistic Interpretability Means for Builders

Healthcare AI requires explainability. 'The model said so' is not a clinical rationale. Mechanistic interpretability is the research field trying to change that. What it offers practitioners today, where the gap is, and what to do in the meantime.

Engineering

September 10, 2024

How to Actually Test If Your AI Will Say Something Dangerous

Most teams treat jailbreak testing as a vibe check. StrongREJECT achieves 0.90 Spearman correlation with human judgment. Automated safety evaluation is real, and there's no excuse not to build it into your pipeline.

Engineering

August 23, 2024

The Attack Your LLM App Is Definitely Vulnerable To

Prompt injection ranks first in the OWASP Top 10 for LLM Applications, and most teams aren't taking it seriously. What the attack looks like, why it's hard to stop, and how to harden your system.

Engineering

August 4, 2024

The Honest Guide to LLM Evals: What Actually Works

Most teams skip real evals and wonder why their AI products degrade in production. The framework that holds up: from 30-minute manual reviews to binary scoring to knowing when your eval suite is finally doing its job.

Engineering

July 17, 2024 · 6 min read

5 Reasons to Solve for Adoption Before Building Your Digital Health Tool

Clinicians love the idea but no one's buying. That gap is a pattern, and it almost never comes down to the technology. Five adoption problems to solve before you build the product.

Healthcare

June 29, 2024

Why Your LLM Evaluator Is Lying to You

LLM-as-judge evaluators feel like quality assurance but behave like rubber stamps. They fail hardest on the outputs that matter most: edge cases, safety-critical errors, domain-specific nuance. What to do instead.

Engineering

June 11, 2024

Why I Stopped Using RAG for Coding Agents (And What I Do Instead)

The instinct when building a coding agent is 'I need RAG to handle large codebases.' The better instinct is giving the agent tools to explore code the way a senior engineer would: reading files, following imports, tracing execution.

Engineering

June 4, 2024 · 6 min read

React Tooling 2024: Stop Using the Wrong Shit

20+ React apps built this year. What works, what's a waste of time, and why you're probably overengineering.

React

April 17, 2024

The Neural Net Training Recipe That Actually Works

I spent months chasing architecture fixes when my real problem was bad debugging hygiene. The training recipe that works: start simple, visualize everything, tune last. The unglamorous discipline that separates working models from expensive experiments.

Engineering

March 30, 2024

You Don't Need GPT-4 for That: Small Models and Edge Agents

Frontier models aren't required for agentic function calling. For healthcare AI, assuming they are can also be a compliance liability. When a fine-tuned 7B model is the right architecture, and when it isn't.

Engineering

March 11, 2024

Multi-Agent Orchestration in Practice: What I Learned Building Parallel Agent Systems

The orchestrator/worker pattern is the key mental model for multi-agent systems. How to structure orchestrators, spawn and manage workers, aggregate results, and avoid the coordination failures that sink most implementations.

Engineering

January 16, 2024

What It Actually Takes to Build a Real LLM Agent

Everyone's talking about agents. Few have shipped one that works in production. The failure modes, memory tradeoffs, and tool design decisions the architecture papers skip.

React

Clint Johnson

1Put Health