
Scott Clark
The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)
How to Find the Agent Failures Your Evals Miss with Scott Clark
- Published: May 7, 2026
- Duration: 47:05
- Summary source: description
- Last updated: May 10, 2026
Discusses LLMs and evals.
Show notes
In this episode, Scott Clark, co-founder and CEO of Distributional, joins us to explore how teams can reliably operate and improve complex LLM systems and agents in production. Scott introduces a Maslow’s hierarchy of observability: telemetry for logging, monitoring for known signals, and post-production or online analytics to surface unknown unknowns. We dig into examples of real-world failures Scott’s team has seen in production systems, such as “lazy” tool-use hallucinations that standard eva
Themes
- llm
- evals