Idea, running test after implementation #31

Durafen · 2025-07-18T16:28:16Z

Durafen
Jul 18, 2025

Claude often doesn't rerun its tests after an implementation to confirm that they now pass.

We could force it to do so on task change, or even just print a message that reminds Claude to do it.

Is this a good idea in a TDD flow?

nizos · 2025-07-19T06:10:11Z

nizos
Jul 19, 2025
Maintainer

Thanks for raising this! The agent should verify tests before moving on.

I have seen similar behavior, where it added a test and then another test without running the previous one or making it pass. It definitely breaks the TDD flow.

I have only seen it do that once though. I wonder if it is language-specific (more confident in Python?) or related to the recent model changes.

The current design ensures that the validator never blocks the agent from adding a single new test. I wanted the agent to always be able to move forward by creating a new test, but this seems to allow it to break the TDD cycle.

Your suggestion is definitely an improvement. However, Claude Code sometimes ignores PostToolUse hook messages.

Could we address this programmatically instead? We could enrich the validator's context with an action history:

"""

Added "should multiply two numbers" to calculator.test.ts.
Ran tests in calculator.test.ts: 1 failure - "should multiply two numbers" expected 9, received undefined.
Added multiply method to calculator.ts.
"""

This gives the validator context about what happened before each modification. We can start simple, recording only which file changed and whether the tests passed or failed, and enrich it later as needed.
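To make the action history idea concrete, here is a minimal sketch of how such entries might be modeled and rendered into text for the validator. The `Action` type, its fields, and the `formatAction` / `formatHistory` helpers are all hypothetical names chosen for illustration; nothing here reflects the project's actual code.

```typescript
// Hypothetical entry shapes for an action history (illustration only).
type Action =
  | { kind: 'test_added'; file: string; testName: string }
  | { kind: 'tests_run'; file: string; failures: { testName: string; message: string }[] }
  | { kind: 'file_modified'; file: string; description: string };

// Render a single entry as one human-readable line.
function formatAction(action: Action): string {
  switch (action.kind) {
    case 'test_added':
      return `Added "${action.testName}" to ${action.file}.`;
    case 'tests_run': {
      if (action.failures.length === 0) {
        return `Ran tests in ${action.file}: all passed.`;
      }
      const noun = action.failures.length === 1 ? 'failure' : 'failures';
      const details = action.failures
        .map((f) => `"${f.testName}" ${f.message}`)
        .join('; ');
      return `Ran tests in ${action.file}: ${action.failures.length} ${noun} - ${details}.`;
    }
    case 'file_modified':
      return `Modified ${action.file}: ${action.description}.`;
  }
}

// Render the whole history, oldest entry first, one line per entry.
function formatHistory(history: Action[]): string {
  return history.map(formatAction).join('\n');
}
```

Starting with a small discriminated union like this would keep the "start simple" option open: new entry kinds can be added later without changing how existing ones are rendered.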

This would also let us create integration tests for this behavior.
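As a rough sketch of what such a test could exercise, the function below scans a recorded history and reports the first step that was left unverified when a new test was added. The `TddEvent` type and `findUnverifiedStep` are hypothetical names, and the rule itself (every added test or implementation change must be followed by a test run before the next test is added) is an assumption about what the validator would enforce.

```typescript
// Hypothetical event shape, kept deliberately small (illustration only).
type TddEvent =
  | { kind: 'test_added'; testName: string }
  | { kind: 'tests_run' }
  | { kind: 'implementation_changed' };

// Returns a description of the step that was never followed by a test run
// before the next test was added, or null if no such step exists.
function findUnverifiedStep(events: TddEvent[]): string | null {
  let unverified: string | null = null; // last step not yet followed by a test run
  for (const event of events) {
    switch (event.kind) {
      case 'tests_run':
        unverified = null; // a test run verifies whatever came before it
        break;
      case 'implementation_changed':
        unverified = 'implementation change';
        break;
      case 'test_added':
        if (unverified !== null) return unverified; // moved on without running tests
        unverified = `test "${event.testName}"`;
        break;
    }
  }
  return null;
}
```

With a check like this, the two cases from the thread (a second test added before the first was run, and an implementation never re-verified) become plain inputs and expected outputs in a test.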

What are your thoughts? :)


Durafen · 2025-07-19T19:01:25Z

Durafen
Jul 19, 2025
Author

I have been experimenting with the idea that the first time Claude tries to finish a task, it sees a message (it can retry to get past it) and then moves on to the next task:
⎿ Error: TodoWrite operation blocked by hook:
- 💡 About to update todos! Remember to run tests after implementing changes to verify they pass (Green phase verification).
Click again to proceed.
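The "block once, then let the retry through" behavior could be sketched roughly as below. This is only the decision logic: `HookDecision`, `createTodoReminder`, and the in-memory flag are hypothetical, and a real hook would probably need to persist the flag somewhere between invocations, since this sketch assumes a single long-lived process.

```typescript
// Hypothetical decision type; not part of any Claude Code API.
type HookDecision = { block: boolean; message?: string };

const REMINDER =
  'About to update todos! Remember to run tests after implementing changes ' +
  'to verify they pass (Green phase verification).';

// Creates a decider that blocks the first TodoWrite attempt with a reminder
// and allows the immediate retry, then re-arms for the next todo update.
function createTodoReminder(): (toolName: string) => HookDecision {
  let reminded = false; // true once the reminder has been shown for this update
  return (toolName: string): HookDecision => {
    if (toolName !== 'TodoWrite') return { block: false }; // other tools pass through
    if (reminded) {
      reminded = false; // retry goes through; re-arm for the next update
      return { block: false };
    }
    reminded = true;
    return { block: true, message: REMINDER };
  };
}
```

Because the retry always succeeds, this only nudges the agent; it does not check that the tests were actually run.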

For now I have mixed feelings about the experiment, because I noticed Claude tries to cheat the system a lot, even in the regular TDD test flow.
One benefit, though, is that Claude just reformats its todo list so it can continue with the current todo sequence. I find that a big plus, because Claude tends not to update its todo list even when it is really needed.


Durafen · 2025-07-19T19:02:53Z

Durafen
Jul 19, 2025
Author

I do like your implementation idea; it's going to be a lot more "Claude-proof" than my experiment.
