Simon Willison has a way of making AI tools feel inevitable rather than hyped. His latest is scan-for-secrets, a Python package that scans log files and text files for accidentally committed API keys and credentials. The tool itself is narrow in scope: configure it with a config file at ~/.scan-for-secrets.conf.sh, point it at a directory, and it flags credentials even when they have been obscured by one of six encodings, including JSON string escaping, URL percent-encoding, HTML entities, backslash-doubled strings, and Unicode escapes. Released April 5 on his blog, it is free and open source.
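The multi-encoding idea is worth pausing on, because a secret pasted into a log rarely survives verbatim. A minimal sketch of the approach, not Willison's actual implementation (the function names here are hypothetical): generate the encoded variants of a known secret, then search the text for any of them.

```python
import html
import json
import urllib.parse


def encoded_variants(secret: str) -> set[str]:
    """Hypothetical sketch: the encoded forms a scanner might check for."""
    return {
        secret,                                            # plain
        json.dumps(secret)[1:-1],                          # JSON string escaping
        urllib.parse.quote(secret, safe=""),               # URL percent-encoding
        html.escape(secret),                               # HTML entities
        secret.replace("\\", "\\\\"),                      # backslash-doubled
        secret.encode("unicode_escape").decode("ascii"),   # Unicode escapes
    }


def contains_secret(text: str, secret: str) -> bool:
    """True if the secret appears in the text under any checked encoding."""
    return any(variant in text for variant in encoded_variants(secret))
```

With this sketch, `contains_secret("key=sk%2Fabc%2Bx%3D", "sk/abc+x=")` matches because the percent-encoded variant of the secret appears in the text, even though the plain secret never does. The real tool has to do more (scan for unknown secrets by pattern, not just known ones), but the variant-generation idea is the interesting part.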
But the tool is not the point. The point is how he built it.
"I built this tool using README-driven development," Willison wrote. He wrote the full specification for scan-for-secrets in a README describing exactly how it should behave, then handed that document to Claude Code and told it to implement the tool using red/green test-driven development. The GitHub repository contains three test files: test_cli.py, test_escaping.py, and test_scanner.py. Those tests are the evidence that this is a real workflow, not a demo.
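The red/green rhythm is simple: the test exists before the implementation does, fails ("red"), and then the implementer writes just enough code to make it pass ("green"). A self-contained sketch of one such cycle, with a hypothetical `scan_text` function standing in for whatever the README actually specifies:

```python
import urllib.parse


# Step 1 (red): the test is written first, derived directly from the spec.
# It fails until an implementation of scan_text exists and behaves correctly.
def test_finds_url_encoded_secret():
    hits = scan_text("GET /cb?t=sk%2Flive", secrets=["sk/live"])
    assert hits == ["sk/live"]


# Step 2 (green): the implementer (here, the AI) writes just enough code
# to make the failing test pass. scan_text is a hypothetical name.
def scan_text(text, secrets):
    decoded = urllib.parse.unquote(text)
    return [s for s in secrets if s in text or s in decoded]


test_finds_url_encoded_secret()  # now passes
```

The developer's job in this loop is not typing the implementation; it is writing tests precise enough that a passing suite actually means the spec is satisfied.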
This matters because Willison has spent the past year thinking out loud about how AI coding tools actually work in practice, and his conclusions are more useful than most takes in this space. In a Lenny Newsletter interview published this year, he argued that November 2025 was the inflection point when AI coding agents crossed from "mostly works" to "actually works." That framing is worth taking seriously. Willison is not selling a bull case; he is a builder who has been running AI-assisted development daily and documenting what he finds.
His three daily agentic patterns, as he calls them, are: red/green TDD (write the failing test first, let the AI implement against it), templates (reusable prompt structures for common tasks), and hoarding (accumulating context and tools that make the AI more effective over time). The README-driven development pattern is an application of the first. The spec is the test, the AI is the implementer, and the developer is the reviewer who decides whether the output is correct.
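The "templates" pattern can be as lightweight as a reusable prompt skeleton with slots for the task-specific parts. A hypothetical sketch, assuming nothing about Willison's actual templates beyond the red/green framing above:

```python
# Hypothetical sketch of the "templates" pattern: a reusable prompt
# skeleton for spec-driven TDD tasks; only the slot values change.
TDD_PROMPT = """\
Read the spec in {spec_path}.
Implement {module} so that every test in {test_path} passes.
Work red/green: run the tests, fix one failure at a time, rerun.
Do not modify the tests."""


def render(spec_path: str, module: str, test_path: str) -> str:
    """Fill the template's slots for one concrete task."""
    return TDD_PROMPT.format(
        spec_path=spec_path, module=module, test_path=test_path
    )


print(render("README.md", "scan_for_secrets", "tests/test_scanner.py"))
```

The value is consistency: the constraints that matter ("do not modify the tests") ship with every prompt instead of being retyped, which is also where the "hoarding" pattern connects, since accumulated templates are exactly the kind of context that compounds over time.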
The scan-for-secrets repo is the concrete example. The test files exist because a developer insisted on them before accepting the AI's output. That is a workflow other builders can replicate: write the spec, hand it to an AI, run the tests, iterate. Whether the tool itself is useful to any given developer is beside the point. The methodology is the product.
Willison has published his patterns in detail at simonwillison.net/guides/agentic-engineering-patterns. Red/green TDD is the anchor: it gives the AI a clear target and gives the developer a fast way to detect when the AI has gone off the rails. Templates and hoarding are the scaffolding that makes the pattern scale across a project.
The counterargument is straightforward. This workflow requires a developer who knows what they want well enough to write the spec first, and that is not a small requirement. On this reading, the hard skill lives in the specification, not in reviewing the AI's output, and a developer who can write a good spec will often be faster writing the code themselves.
But that critique misses the real shift. The bottleneck in software development has never been implementation speed for simple tasks. It has always been translating a vague idea into a precise description of desired behavior. If AI coding tools are getting good enough to implement against a precise spec, the bottleneck shifts to the quality of the spec writer. Willison is building in the open at exactly the moment that question is becoming urgent for every engineering team that has started using these tools.
Scan-for-secrets is a credential scanner. It is also a proof of concept for a workflow that more engineering teams are going to have to develop opinions about.