Platmosphere | A Mia-Platform Invitation

How do we know whether our application behaves functionally the way we need it to? How do we guide agents so that they do what we need? My colleague Birgitta Böckeler observes:

At the moment, most people who give high autonomy to their coding agents do this:

- Feed-forward: A functional specification (of varying levels of detail, from a short prompt to multi-file descriptions)
- Feed-back: Check if the AI-generated test suite is green, has reasonably high coverage, some might even monitor its quality with mutation testing. Then combine that with manual testing.

You might expect that the AI-generated tests are being carefully read and reviewed; after all, they are our first line of defense, aren't they? But in practice, reading a lot of test code is even harder than reading a lot of generated production code. So we end up glancing at the tests and skimming the test names. As a result, many practitioners who are otherwise very proficient with coding agents do not feel as confident about the behaviour of their applications as we might expect.

In this presentation, I argue that the problem is not with the tests, but with test-as-code, and I propose test-as-fixtures as a better alternative. Several approaches converge here. In the Go testing tradition, most tests are tabular, and the part you will want to read is not the test code but the table; when the table grows, it can be moved to text files. Something similar happens in the approval testing tradition, where custom printers remove extraneous detail from an application's output so that it can be compared with an approved version. This is captured in the Approved Fixtures pattern, published by Ivett Ördög in the Augmented Coding Patterns.

The advantage of fixtures is that they are essentially pairs of (input, expected output) expressed in a human-friendly notation, which makes them much easier and more fun to verify manually than test code.
Because fixtures are easy to read, they restore a tests-first workflow: whether written by the developer or generated by the agent, the fixtures are reviewed and approved before implementation begins, and executed whenever the application changes, thus restoring our faith in the behaviour of our applications.
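
To make the idea of fixtures as (input, expected output) pairs concrete, here is a minimal sketch in Python. It is not taken from the talk: the fixture notation (`input:` / `expected:` blocks separated by blank lines), the `title_case` function under test, and the helper names `parse_fixtures` and `run_fixtures` are all hypothetical, chosen only for illustration.

```python
# Hypothetical fixture notation: one block per case, blocks separated by a
# blank line. In a real project this text would probably live in its own file
# so that it can be reviewed and approved without reading any test code.
FIXTURES = """\
input: hello world
expected: Hello World

input: tdd in the age of ai
expected: Tdd In The Age Of Ai
"""


def title_case(text: str) -> str:
    """Hypothetical function under test: capitalise every word."""
    return " ".join(word.capitalize() for word in text.split())


def parse_fixtures(text: str) -> list[tuple[str, str]]:
    """Turn the fixture text into a list of (input, expected) pairs."""
    pairs = []
    for block in text.strip().split("\n\n"):
        # Each line has the form "key: value"; split only on the first ": ".
        fields = dict(line.split(": ", 1) for line in block.splitlines())
        pairs.append((fields["input"], fields["expected"]))
    return pairs


def run_fixtures(text: str, function) -> list[tuple[str, str, str]]:
    """Return (input, expected, actual) for every fixture that fails."""
    failures = []
    for given, expected in parse_fixtures(text):
        actual = function(given)
        if actual != expected:
            failures.append((given, expected, actual))
    return failures


if __name__ == "__main__":
    failures = run_fixtures(FIXTURES, title_case)
    for given, expected, actual in failures:
        print(f"FAIL {given!r}: expected {expected!r}, got {actual!r}")
    print("all fixtures pass" if not failures else f"{len(failures)} failing")
```

In this sketch the runner stays small and generic, so the only thing a reviewer needs to read when behaviour changes is the fixture text itself.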

LANGUAGE

English

LEVEL

Intermediate

FORMAT

Talk

SPEAKERS

Matteo Vaccari

Technical Principal

@thoughtworks

INSIDE THE TALK

A harness for behaviour: ensuring AI-generated code does what we want - TDD in the age of AI
