TBD
Hub Room
Business

INSIDE THE TALK

AI Evaluations for Platform Engineering: A Domain-Driven Framework for Proving Value at Scale


"Every vendor claims their AI features will transform your platform. Every internal team believes their AI tooling will reduce toil. But without rigorous evaluation, we're flying blind—unable to distinguish genuine capability from marketing promises or well-intentioned but ineffective implementations. This session introduces a practical framework for evaluating AI capabilities in platform engineering contexts, grounded in Domain-Driven Platform Engineering principles. You'll learn how to assess AI tools across four critical categories - IaC Generation - Incident Response - Documentation - Policy Enforcement These 4 are taken while accounting for the fundamentally different definitions of ""good"" held by developers, SREs, security engineers, and engineering leaders. We'll explore the Evaluation Tradeoff Triangle (code-based, LLM-as-judge, and human review) and demonstrate how to build a hybrid evaluation strategy that optimizes for speed, cost, and accuracy. Most importantly, you'll see how to connect AI evaluation results to business outcomes—proving ROI, de-risking adoption decisions, and building the governance structures AI-native platforms demand. Attendees will leave with actionable tools: a scoring framework, persona-weighted evaluation matrices, and a method selection guide they can apply immediately to their vendor assessments and internal AI initiatives."
LANGUAGE
English
LEVEL
Intermediate
FORMAT
Talk

SPEAKERS

Ajay Chankramath
Founder & CEO
@Platformetrics