The real talk on agent evaluation

Join Kristopher Overholt, Aja Hammerly, and Jason Davenport as they unpack how software engineers can practically approach building AI agents—without the hype. From defining what an agent is (“a program with a job”) to creating useful proofs of concept using tools like Gemini 2.5 Pro and grounding, the group shares approachable strategies for getting started. Kristopher explains how his own "Hello World" agent used long-term memory and web search to mirror real developer workflows, while Aja and Jason chime in with perspectives on coding styles, debugging, and how agent development fits into everyday engineering.
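
For a flavor of what a starter agent like that might look like, here is a minimal sketch using the Agent Development Kit's Python API. The model ID and wiring are illustrative, and the long-term memory setup is omitted (see the ADK resources linked below for the current syntax):

# Minimal sketch of a "Hello World" agent with web search grounding.
# Model name is an assumption; memory configuration is not shown here.
from google.adk.agents import Agent
from google.adk.tools import google_search

root_agent = Agent(
    name="hello_world_agent",
    model="gemini-2.5-pro",  # assumed model ID; swap in the Gemini model you have access to
    description="A starter agent that can answer questions using web search.",
    instruction="Answer the user's question. Use the search tool when you need fresh information.",
    tools=[google_search],
)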

The trio also explores what it means to “vibe code” live at a developer event, and how to balance creativity with structure across teams and languages. In the final stretch, they tackle a critical but often overlooked topic: agent evaluation. They emphasize that effective measurement isn’t about perfection—it’s about closing the feedback loop, starting with a clear metric, and understanding that agents are layered systems built on tools, models, and reasoning. Whether you’re experimenting with frontend tools or building backend logic, this episode delivers real-world insight into the evolving role of AI in software development.
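
If evaluation is the part you want to try first, that feedback loop can start as small as a handful of test cases and one clear pass/fail metric. A rough Python sketch, where run_agent is a hypothetical stand-in for whatever agent you are testing:

# Tiny evaluation loop: a few test cases, one metric (pass rate),
# and a score you can track as the agent changes.
test_cases = [
    {"prompt": "What is the capital of France?", "expected": "Paris"},
    {"prompt": "Who wrote The Hobbit?", "expected": "Tolkien"},
]

def run_agent(prompt: str) -> str:
    # Hypothetical stand-in: replace with a call to your own agent.
    return "stub answer"

def pass_rate(cases) -> float:
    # Count a case as passed if the expected string appears in the agent's answer.
    passed = sum(
        1 for case in cases
        if case["expected"].lower() in run_agent(case["prompt"]).lower()
    )
    return passed / len(cases)

print(f"Pass rate: {pass_rate(test_cases):.0%}")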

Resources:
Get started with Agent Development Kit and memory → https://goo.gle/3HyTz5F and https://goo.gle/4kFJt1e

Watch more Real Terms for AI → https://goo.gle/AIwordsExplained
Subscribe to Google Cloud Tech → https://goo.gle/GoogleCloudTech

#AIAgent #DevTips #AI

Speakers: Aja Hammerly, Jason Davenport, and Kristopher Overholt
Products Mentioned: Gemini, Vertex AI, AI Infrastructure

Date: June 13, 2025