Unit tests rely on stale chemkin files and test database #2789

@sevyharris

Bug Description

While adding some new unit tests, I ran into the problem that the test database didn't have any autogenerated trees (with multiple groups specified). More generally, a lot of the chemkin files in the test_data folder are several years stale, and I realized that it would be very easy for someone to break various parts of RMG's post-processing code without the unit tests catching it.

Consider these examples:

  1. Someone changes how the chemkin writer denotes flux pairs in the comments. The fluxdiagramTest doesn't detect the change because it's relying on this 9-year-old chemkin file.
  2. Someone makes an awesome new tree type for estimating kinetics. The uncertainty tool can no longer parse the kinetics source from the comments, but uncertaintyTest thinks it's fine because the test database is still stuck at the same version from ~8 years ago.
  3. Someone adds a new option for estimating species thermo (or something that involves reading from the database). They go to add unit tests and it turns out they need to update the test database to a more recent version that supports their new feature. They update the test database and 10 completely unrelated unit tests now fail because they depended on the old version, which nobody has updated in years (partially because the effort to do so keeps increasing as more unit tests get added). In order to push the new changes, the developer needs to fix 10 other parts of RMG that they may not understand very well.
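
To make example 1 concrete, here is a toy sketch of a round-trip test. It is not RMG code: `write_reaction_line` and `parse_flux_pairs` are hypothetical stand-ins for the chemkin writer and the comment parser used by the flux diagram, and the comment format shown is only an assumption for illustration. The point is that the parser is fed output produced by the current writer, so a change to the comment format breaks the test immediately instead of being hidden by a stale file on disk.

```python
import re


def write_reaction_line(reactants, products, flux_pairs):
    """Hypothetical stand-in for the chemkin writer (not the RMG API)."""
    equation = "+".join(reactants) + "=" + "+".join(products)
    pairs = "; ".join(f"{a}, {b}" for a, b in flux_pairs)
    # Assumed comment format, for illustration only.
    return f"! Flux pairs: {pairs};\n{equation}"


def parse_flux_pairs(text):
    """Hypothetical stand-in for the reader used by the flux diagram."""
    match = re.search(r"^! Flux pairs: (.*)$", text, flags=re.MULTILINE)
    if match is None:
        return []
    pairs = []
    for chunk in match.group(1).split(";"):
        chunk = chunk.strip()
        if chunk:
            first, second = (name.strip() for name in chunk.split(","))
            pairs.append((first, second))
    return pairs


def round_trip(reactants, products, flux_pairs):
    """Write with the current writer, then read back with the current parser."""
    return parse_flux_pairs(write_reaction_line(reactants, products, flux_pairs))
```

If the writer's comment format changes and the parser is not updated, `round_trip` stops returning the original pairs, which is exactly the failure a 9-year-old fixture cannot show.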

Discussion

I think there are two main problems here:

  1. It's a pain to update the test database, so nobody ever does it.
  2. A lot of code that should be tested on fresh RMG models/chemkin files is tested on stale chemkin files instead.

I have no idea what to suggest for 1. It's hard to automate because we need the test database to be small enough to load quickly, but complex enough to handle all the test cases in the unit tests.

For 2, I think we need a separate class of functional tests where RMG generates some fresh chemkin files using the latest database and then runs functional tests with those files. I don't know whether it makes sense to just tack these on to the regression tests, given that RMG is already running full models for those. On the other hand, functional tests for things like the flux diagram and the uncertainty estimator might be difficult to shoehorn into the regression testing framework. They probably deserve a separate calling line in the CI.yml, possibly reusing the chemkin files generated during the regression tests.
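
As a rough sketch of the "reuse the regression output" idea, a functional test could first look for chemkin files that an earlier CI step just wrote, and skip itself when none are there. Everything named here is an assumption for illustration: the `RMG_FRESH_OUTPUT_DIR` environment variable is hypothetical, the `chem_annotated.inp` file name is my assumption about what the regression runs leave behind, and the one-day freshness window is arbitrary.

```python
import os
import time
from pathlib import Path


def fresh_output_dir():
    """Directory holding freshly generated models (hypothetical env variable)."""
    return os.environ.get("RMG_FRESH_OUTPUT_DIR")


def find_fresh_chemkin_files(output_dir, max_age_seconds=24 * 3600, now=None):
    """Return chemkin files under output_dir modified within max_age_seconds.

    The file name searched for is an assumption, not a confirmed convention.
    """
    now = time.time() if now is None else now
    root = Path(output_dir)
    if not root.is_dir():
        return []
    fresh = []
    for path in sorted(root.rglob("chem_annotated.inp")):
        # Reject files that are older than the allowed window.
        if now - path.stat().st_mtime <= max_age_seconds:
            fresh.append(path)
    return fresh
```

A flux diagram or uncertainty test could call `find_fresh_chemkin_files` on the directory from `fresh_output_dir()` and skip (for example with `pytest.skip`) when the list is empty. That way the functional tests add no model-generation time of their own, and they never silently fall back to a stale file.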

Does anyone have any suggestions for tackling these problems with the testing? Specifically, how can we implement functional tests that use fresh RMG output files without dramatically increasing the CI runtime?