-
-
Notifications
You must be signed in to change notification settings - Fork 417
Support CDMS all species option; fix format for CDMS linelist reading; fix CDMS quantum numbers parsing #3302
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Thanks for these fixes. Regarding the failed example cases, I'll want to more carefully review and note them in tests. I'll do that now, just letting you know it's work in progress. |
I'm trying to hunt down a molecule that actually uses the negative values for quantum numbers. Any ideas for a good one to use? |
I built more tests, and in doing so, discovered more problems & corner cases. I have the feeling that CDMS is changing their fixed formatting on me every week, but the lack of test coverage means I have no evidence for that. |
CH3CN is a good example of molecule, where lower case characters are used to signal negative quantum numbers smaller than –9. |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #3302 +/- ##
==========================================
- Coverage 69.87% 69.82% -0.06%
==========================================
Files 232 232
Lines 19769 19797 +28
==========================================
+ Hits 13814 13823 +9
- Misses 5955 5974 +19 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needs a changelog, also I see test failures with pytest -P linelists -R
. Could you @keflavich have another look?
(also, I see a couple of commits even with a "needs to be squashed" message, so do go ahead and give it a squash, too :) )
…; fix CDMS quantum numbers parsing 1) Adding support for CDMS queries with lines of all species 2) Fixing the CMDS lines list parsing Support CDMS all species option; fix format for CDMS linelist reading; fix CDMS quantum numbers parsing Adding test for a new functionality when all species are requested from CDMS
trivial formatting fix my refactor; it was incorrect oops, fix to last one (yes, this needs to be squashed; pushing fast to skip tests... and spam my inbox...)
shift tag back one spot. Fix tests to accommodate more complete "tag" name add the b1 = -21 test whitespace
I swear CDMS is trolling me.... this was working 3 weeks ago. I'll look more closely again. |
Yeap, I'm sorry that it took some for me to get back to this for a final review. On the bright side, it's not |
The default value for molecule parameter of query_lines function is 'All'. One, therefore, can expect that the following example code should work and return the lines of all species:
However, this code does not work. This PR adds support for such a use case and makes the code above work. The returned line list can include the species with bad formatting. This PR adds a filter to exclude such species from the list (there are species with bad formatting in the frequency range in the example code above). Maybe it can be a good idea to warn a user that some lines can be excluded from the query results (I think one can discuss on how to do it in a proper way). This PR also adds some test of the new functionality. This test does not follow a pattern of other tests in that it does not utilize the the row number to check the test query result. The intent is to avoid the test breakage due to CDMS data updates.
PR2385 fixed some errors of line list parsing. However, some errors still remained. The starts value for MOLWT field should be decreased as this field occupies a larger number of characters that it is assumed in the current code version. An example is contained in the results of the example code above, where MOLWT field has an extra character for 1-CN-2-CCH-C6H4.
In the current code version, parse_letternumber function does not work correctly for parameters like a0, b0 etc. as it should according to CDMS documentation.
This PR fixes both these parsing errors.