Complete Beginner's Course on AI Evaluations: Step by Step (2025) | Aman Khan

Today, I want to share a new episode with Aman Khan. The best way to learn about AI evaluations is to watch two PMs build them live from scratch. In our new episode, Aman and I walk through creating evals for an AI customer support agent, from labeling a golden dataset to aligning LLM judges. This is the complete beginner's AI eval course you've been waiting for.

Aman and I talked about:

(00:00) What are AI evals and how to get good at them

(02:52) The 4 types of AI evaluations everyone should know

(06:08) Live demo: Building evals for a customer support agent

(10:29) Using Anthropic's console to generate great prompts

(15:13) Creating the evaluation criteria

(17:40) Adding human labels to the golden dataset

(31:05) Scaling evals with LLM-judge prompts

(38:21) How to align LLM judges with human judgment

Get the takeaways: https://creatoreconomy.so/p/complete-beginner-course-on-ai-evaluations-aman-khan

Where to find Aman:

LinkedIn: https://www.linkedin.com/in/amanberkeley/

Website: https://arize.com/

📌 Subscribe to this channel – more interviews coming soon!
