Eval-driven agent development
Start from a working-but-flaky agent. Write a 10-case eval set, run it, watch it fail, iterate the system prompt, run again, watch the score move. Ends with the eval wired as a CI gate so regressions can't ship.
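The loop described above can be sketched in a few lines. This is a minimal illustration, not the session's actual material: `EVAL_CASES`, `stub_agent`, `run_eval`, and `gate` are hypothetical names, the set has four cases instead of ten, the substring check stands in for whatever grader the real eval uses, and the 0.9 threshold is an arbitrary assumption.

```python
from typing import Callable, List, Tuple

# Hypothetical eval cases: (prompt, substring the reply must contain).
EVAL_CASES: List[Tuple[str, str]] = [
    ("What is 2 + 2?", "4"),
    ("What is the capital of France?", "Paris"),
    ("Spell 'cat' backwards.", "tac"),
    ("How many days are in a week?", "7"),
]


def stub_agent(prompt: str) -> str:
    """Stand-in for the real agent call (assumption: swap in your own)."""
    canned = {
        "What is 2 + 2?": "The answer is 4.",
        "What is the capital of France?": "Paris is the capital.",
        "Spell 'cat' backwards.": "tac",
    }
    # The flaky part: anything not covered above gets a non-answer.
    return canned.get(prompt, "I don't know.")


def run_eval(agent: Callable[[str], str], cases: List[Tuple[str, str]]) -> float:
    """Return the fraction of cases whose reply contains the expected text."""
    passed = sum(1 for prompt, expected in cases if expected in agent(prompt))
    return passed / len(cases)


def gate(score: float, threshold: float) -> int:
    """Map a score to a process exit code: 0 passes CI, 1 fails it."""
    return 0 if score >= threshold else 1


def main() -> int:
    score = run_eval(stub_agent, EVAL_CASES)
    print(f"pass rate: {score:.0%}")
    return gate(score, threshold=0.9)
```

A CI step would run this as a script and end with `raise SystemExit(main())`, so a score below the threshold fails the build. Iterating on the system prompt then means changing the agent, rerunning, and watching the pass rate move.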
Details
City
Date
Time
Agenda
08:00AM – 09:00AM
Check in and breakfast
10:00AM – 10:30AM
Morning break
Morning sessions
12:30PM – 02:00PM
Lunch
Afternoon sessions
Evening
06:00PM – 08:00PM
Anthropic's developer conference
Join us for a day of hands-on workshops, live demos of new capabilities and conversations with the teams behind Claude. Watch live from anywhere, or apply for an in-person seat in San Francisco, London, or Tokyo.