Eval-Driven Development
AI systems don’t fail because they’re dumb, they fail because we never define “good.”
AI systems don’t fail because they’re dumb, they fail because we never define “good.”
The three levers for steering pre-trained models and the evals that guide them.
When answers replace clicks, publishers, brands, and platforms face a new incentive structure.
Every week I see pitches that follow a similar pattern: “X for Y, powered by AI.” I call this promptware.
Why technical leadership just got exponentially harder.
Apple’s Illusion of Thinking reminds us that transformers may asymptote to human ability and that may be enough to transform the world.