Skip to content

Conversation

@ulfgebhardt
Copy link
Member

@ulfgebhardt ulfgebhardt commented Sep 26, 2025

🍰 Pullrequest

enforce coverage

Enforce a certain percentage of coverage. If this requirement is not met the test fails.

You can see it fail here, as the requirement is set to 100%.

image

You can see it succeed here.
image


This PR also fixes broken unit tests.

Issues

Todo

  • Unit tests are not enforced to pass before merge
image Example: image
  • Discovered flaky broken tests - fixed in 118f8f1
image image image image

@ulfgebhardt ulfgebhardt changed the title feat(workflow): enforce coverage [WIP] feat(workflow): enforce coverage Sep 26, 2025
coverage requirement on the right place
@ulfgebhardt ulfgebhardt self-assigned this Sep 26, 2025
@ulfgebhardt ulfgebhardt added bug something is wrong in Cypht tests test related labels Sep 26, 2025
more fixes
find clover.xml

find clover.xml

find clover.xml

found clover.xml

remove clutter
@ulfgebhardt
Copy link
Member Author

@marclaporte Please have a look. I am done here.

As this fixes the unit tests I would like to call priority on this. If you have doubts about the coverage enforcement, I would separate the fix from the feature.

Also I would like to have a talk about enforcing some workflows, so broken tests can't happen anymore and the coverage does not dwindle further.

@ulfgebhardt ulfgebhardt changed the title [WIP] feat(workflow): enforce coverage feat(workflow): enforce coverage Sep 26, 2025
public function test_message_controls() {
$mod = new Hm_Output_Test(array('msg_controls_extra' => 'foo', 'foo' => 'bar', 'bar' => 'foo'), array('bar'));
$this->assertEquals('<a class="toggle_link" href="#"><i class="bi bi-check-square-fill"></i></a><div class="msg_controls fs-6 d-none gap-1 align-items-center"><div class="dropdown on_mobile"><button type="button" class="btn btn-outline-success btn-sm dropdown-toggle" id="coreMsgControlDropdown" data-bs-toggle="dropdown" aria-haspopup="true" aria-expanded="true">Actions</button><ul class="dropdown-menu" aria-labelledby="coreMsgControlDropdown"><li><a class="dropdown-item msg_read core_msg_control btn btn-sm btn-light text-black-50" href="#" data-action="read">Read</a></li><li><a class="dropdown-item msg_unread core_msg_control btn btn-sm btn-light text-black-50" href="#" data-action="unread">Unread</a></li><li><a class="dropdown-item msg_flag core_msg_control btn btn-sm btn-light text-black-50" href="#" data-action="flag">Flag</a></li><li><a class="dropdown-item msg_unflag core_msg_control btn btn-sm btn-light text-black-50" href="#" data-action="unflag">Unflag</a></li><li><a class="dropdown-item msg_delete core_msg_control btn btn-sm btn-light text-black-50" href="#" data-action="delete">Delete</a></li><li><a class="dropdown-item msg_archive core_msg_control btn btn-sm btn-light text-black-50" href="#" data-action="archive">Archive</a></li><li><a class="dropdown-item msg_junk core_msg_control btn btn-sm btn-light text-black-50" href="#" data-action="junk">Junk</a></li></ul></div><a class="msg_read core_msg_control btn btn-sm btn-light no_mobile border text-black-50" href="#" data-action="read">Read</a><a class="msg_unread core_msg_control btn btn-sm btn-light no_mobile border text-black-50" href="#" data-action="unread">Unread</a><a class="msg_flag core_msg_control btn btn-sm btn-light no_mobile border text-black-50" href="#" data-action="flag">Flag</a><a class="msg_unflag core_msg_control btn btn-sm btn-light no_mobile border text-black-50" href="#" data-action="unflag">Unflag</a><a class="msg_delete core_msg_control btn btn-sm btn-light no_mobile border text-black-50" href="#" data-action="delete">Delete</a><a class="msg_archive core_msg_control btn btn-sm btn-light no_mobile border text-black-50" href="#" data-action="archive">Archive</a><a class="msg_junk core_msg_control btn btn-sm btn-light no_mobile border text-black-50" href="#" data-action="junk">Junk</a>foo</div>', message_controls($mod));
$this->assertEquals('<a class="toggle_link" href="#"><i class="bi bi-check-square-fill"></i></a><div class="msg_controls fs-6 d-none gap-1 align-items-center"><div class="dropdown on_mobile"><button type="button" class="btn btn-outline-secondary btn-sm dropdown-toggle" id="coreMsgControlDropdown" data-bs-toggle="dropdown" aria-haspopup="true" aria-expanded="true">Actions</button><ul class="dropdown-menu" aria-labelledby="coreMsgControlDropdown"><li><a class="dropdown-item msg_read core_msg_control btn btn-sm btn-light text-black-50" href="#" data-action="read">Read</a></li><li><a class="dropdown-item msg_unread core_msg_control btn btn-sm btn-light text-black-50" href="#" data-action="unread">Unread</a></li><li><a class="dropdown-item msg_flag core_msg_control btn btn-sm btn-light text-black-50" href="#" data-action="flag">Flag</a></li><li><a class="dropdown-item msg_unflag core_msg_control btn btn-sm btn-light text-black-50" href="#" data-action="unflag">Unflag</a></li><li><a class="dropdown-item msg_delete core_msg_control btn btn-sm btn-light text-black-50" href="#" data-action="delete">Delete</a></li><li><a class="dropdown-item msg_archive core_msg_control btn btn-sm btn-light text-black-50" href="#" data-action="archive">Archive</a></li><li><a class="dropdown-item msg_junk core_msg_control btn btn-sm btn-light text-black-50" href="#" data-action="junk">Junk</a></li></ul></div><a class="msg_read core_msg_control btn btn-sm btn-light no_mobile border text-black-50" href="#" data-action="read">Read</a><a class="msg_unread core_msg_control btn btn-sm btn-light no_mobile border text-black-50" href="#" data-action="unread">Unread</a><a class="msg_flag core_msg_control btn btn-sm btn-light no_mobile border text-black-50" href="#" data-action="flag">Flag</a><a class="msg_unflag core_msg_control btn btn-sm btn-light no_mobile border text-black-50" href="#" data-action="unflag">Unflag</a><a class="msg_delete core_msg_control btn btn-sm btn-light no_mobile border text-black-50" href="#" data-action="delete">Delete</a><a class="msg_archive core_msg_control btn btn-sm btn-light no_mobile border text-black-50" href="#" data-action="archive">Archive</a><a class="msg_junk core_msg_control btn btn-sm btn-light no_mobile border text-black-50" href="#" data-action="junk">Junk</a>foo</div>', message_controls($mod));
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The diff is hard to read here.

The difference in all failing tests is a html class.

btn-outline-success becomes btn-outline-secondary

@marclaporte marclaporte requested a review from kroky September 27, 2025 02:30
@kroky
Copy link
Member

kroky commented Sep 29, 2025

@marclaporte Please have a look. I am done here.

As this fixes the unit tests I would like to call priority on this. If you have doubts about the coverage enforcement, I would separate the fix from the feature.

Also I would like to have a talk about enforcing some workflows, so broken tests can't happen anymore and the coverage does not dwindle further.

Our coverage is not good and enforcing a percent will further deepen the problem with tests - it will require developers to add tests just for the sake of coverage instead of thinking what really should be tested. Imagine adding some simple code that drops the coverage from 70.5% to 69.5%. You are forced to add some tests to cover the 70% threshold but the code you added is simple and doesn't need testing. You are forced to either add tests for other code (which done in the same MR has a number of issues by itself) or add tests to cover your code which is sometimes counter-productive. Having too many tests is also not something desirable - we have more and more code to maintain and having many tests doesn't mean they are quality tests - i.e. they are really catching bugs when they should.

Thus, I am not a fan of coverage reports and am against enforcing any percent here. Instead, we should think about critical or complicated code blocks that should be tested. We can add a requirement for adding proper unit or integration tests for any new feature added or major refactoring. We can and should enforce passing the test suite before merging but this should be done when the tests are stable - recently, we have had too many intermittent selenium test failures, for example. Overall, a common sense should be used when thinking about what items to test and what tests to add - a digit percentage is not suitable enough to make this decision.

@ulfgebhardt
Copy link
Member Author

ulfgebhardt commented Sep 29, 2025

@marclaporte Please have a look. I am done here.
As this fixes the unit tests I would like to call priority on this. If you have doubts about the coverage enforcement, I would separate the fix from the feature.
Also I would like to have a talk about enforcing some workflows, so broken tests can't happen anymore and the coverage does not dwindle further.

Our coverage is not good and enforcing a percent will further deepen the problem with tests - it will require developers to add tests just for the sake of coverage instead of thinking what really should be tested. Imagine adding some simple code that drops the coverage from 70.5% to 69.5%. You are forced to add some tests to cover the 70% threshold but the code you added is simple and doesn't need testing. You are forced to either add tests for other code (which done in the same MR has a number of issues by itself) or add tests to cover your code which is sometimes counter-productive. Having too many tests is also not something desirable - we have more and more code to maintain and having many tests doesn't mean they are quality tests - i.e. they are really catching bugs when they should.

Thus, I am not a fan of coverage reports and am against enforcing any percent here. Instead, we should think about critical or complicated code blocks that should be tested. We can add a requirement for adding proper unit or integration tests for any new feature added or major refactoring. We can and should enforce passing the test suite before merging but this should be done when the tests are stable - recently, we have had too many intermittent selenium test failures, for example. Overall, a common sense should be used when thinking about what items to test and what tests to add - a digit percentage is not suitable enough to make this decision.

I disagree on all points with you here.

  • Forcing Developers to write tests is good to ensure they also learn and cover this part of software development.
  • Even a Snapshot render test can proof useful for other developers as they see that something changed where they did not expect it
  • Unit tests are very cheap (compared to integration & fullstack tests)
  • The projects I work with all use this mechanic and all have coverage of 90%+ consistently.
  • I can also observe when implementing this mechanic and starting even with a low coverage like 5-10% it will lead to increased coverage over time
  • You can always lower the coverage requirement

@marclaporte
Copy link
Member

Instead, we should think about critical or complicated code blocks that should be tested.

Is there a way to help detect this or it's pretty much based on the feedback of experienced developers?

@kroky
Copy link
Member

kroky commented Sep 29, 2025

You think that quantitative measure is good for ensuring a quality test suite? Forcing developers write tests will lead to bloatware if developers are not careful enough and if they are - they will already write quality tests - then, why do we force them? Sorry but I don't see the point here.

@kroky
Copy link
Member

kroky commented Sep 29, 2025

@ulfgebhardt have you actually compared the tests we have against the tons of module code we have - I am not even including the complicated JS we have? I think the actual "coverage" of the code is closer to 30% than 78% at the moment just from common judgement based on the fact that:

  1. I have worked on cypht's codebase in the past several years
  2. I have done pretty major refactorings and have reviewed most of the testing code

This percentage enforcement is very misleading.

@kroky
Copy link
Member

kroky commented Sep 29, 2025

Instead, we should think about critical or complicated code blocks that should be tested.

Is there a way to help detect this or it's pretty much based on the feedback of experienced developers?

I am not sure there is an automated way to detect this but we can establish strong quidelines and make PR reviewers check them besides the constributors.

@ulfgebhardt
Copy link
Member Author

ulfgebhardt commented Sep 29, 2025

@kroky
I did not compare what is covered and what not - and its not my task to do this. There might be portions of the software which are not covered (e.g. all javascript as this are php unit tests) and not detected as to be covered, but that is something you would notice during development and change rather then me actively checking this with an unknown codebase.

My method to verify it works: Coverall (a tool you already use) reports 80%, this PR shows 78%, which is sufficiently close in my opinion.


This also gives the developer new tools for discovering uncovered code. Here is an example from a different project, but this can also be made available to developers in this project if #1710 is solved.

See overview what files are covered and how much
image

Detect uncovered lines and branches:
image


quoting @kroky

I think the actual "coverage" of the code is closer to 30% than 78% at the moment just from common judgement based on the fact that:

I have worked on cypht's codebase in the past several years
I have done pretty major refactorings and have reviewed most of the testing code

I do not doubt your competence regarding cypht. I just want to provide a common solution to a problem the project currently faces and I happen to know that this method has achieved its goal on other repositories after being implemented.


The metric of a "good review" is obviously flawed, as you have broken unit tests in your master and coverage has decreased over time from 95% to now 78% (assuming the number is actually correct).

This approach gives a structural solution to exactly this problem; it is the standard all around github; it is proven to work and makes software better and more reliable.

And while I understand that it is very annoying in the beginning to be forced to write tests, it is how it is done. I started with the same mindset, but that has changed - now I actually like ensuring things work with tests and I often discover flaws and shortcomings or edge cases I have not thought about when implementing the tests.


I propose this, because the software currently requires a lot of manual testing - which I am doing and am reported to be one of the few - discovering a lot of flaws.

This solves this problem so I am no longer needed and software quality increases.

@kroky
Copy link
Member

kroky commented Sep 30, 2025

@ulfgebhardt thanks for your notes and advises! I appreciate you taking the time to do all this manual testing, reporting bugs and dealing with consequences. I can assure you there are others doing the same. Major part of the current situation with broken tests in master and dealing with a lot of bugs (which, btw, are not caught by the existing unit tests) is the lack of reviews that Cypht used to have for at least a year. I have requested to do a review of all code merged in master for a month or so in order to make it stable again and release stable versions. Things are progressing smoothly but maybe more slowly than we anticipated.

I can give you several PR request examples that contained tests that would have increased the coverage but which tests actually tested almost nothing - they contained copy-pasted chunks of code from the modules and tested code running in the tests themselves rather than the module code. This needed a manual review and advice from a more senior developer, so that tests become useful and actually contribute to the coverage. Additionally, the developers gained experience and learned how to deal with such situations in the future, so this manual review resulted in many benefits and I can't consider it as "flawed". My main point here is that numeric percentage is not the one we need here to deal with the current situation. It might be harmful and misleading if we have to deal with sloppy written tests. That's why I am against forcing it.

However, I would definitely want to:

  • add a requirement for unit and integration tests to pass before being able to merge code in master
  • show code coverage report in the CI and warn/advise developer to consider updating their code if the coverage drops. It will at least be a nice reminder to add tests to something you have forgotten to do
  • establish commit and code quality guidelines that includes do's and don'ts for automated tests

@kroky
Copy link
Member

kroky commented Oct 1, 2025

@kroky
Copy link
Member

kroky commented Oct 3, 2025

I have pushed a fix for the tests, so they now pass in master. I have also added a restriction for merge requests to have latest master, pass the checks and resolve the conversations before merging.
Next thing is to add a visual representation of the coverage percent as a check and make it show to the developer.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug something is wrong in Cypht tests test related

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants