What model checking taught me about evaluating AI coding agents
Why a unit test is the wrong tool for a nondeterministic agent.
May 25, 20263 min read
Search for a command to run...
Why a unit test is the wrong tool for a nondeterministic agent.
Claude Code, Codex, and Windsurf now look alike: all three have skills, hooks, and plugins. So it's natural to assume your setup ports between them. It mostly doesn't. I maintain a repo that targets a