Skip to content

Support CDMS all species option; fix format for CDMS linelist reading; fix CDMS quantum numbers parsing #3302

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

ParfenovS
Copy link

The default value for molecule parameter of query_lines function is 'All'. One, therefore, can expect that the following example code should work and return the lines of all species:

import astropy.units as u
from astroquery.linelists.cdms import CDMS
CDMS.query_lines(min_frequency=100.3 * u.GHz,max_frequency=100.5 * u.GHz,min_strength=-5)

However, this code does not work. This PR adds support for such a use case and makes the code above work. The returned line list can include the species with bad formatting. This PR adds a filter to exclude such species from the list (there are species with bad formatting in the frequency range in the example code above). Maybe it can be a good idea to warn a user that some lines can be excluded from the query results (I think one can discuss on how to do it in a proper way). This PR also adds some test of the new functionality. This test does not follow a pattern of other tests in that it does not utilize the the row number to check the test query result. The intent is to avoid the test breakage due to CDMS data updates.

PR2385 fixed some errors of line list parsing. However, some errors still remained. The starts value for MOLWT field should be decreased as this field occupies a larger number of characters that it is assumed in the current code version. An example is contained in the results of the example code above, where MOLWT field has an extra character for 1-CN-2-CCH-C6H4.
In the current code version, parse_letternumber function does not work correctly for parameters like a0, b0 etc. as it should according to CDMS documentation.
This PR fixes both these parsing errors.

@keflavich
Copy link
Contributor

Thanks for these fixes.

Regarding the failed example cases, I'll want to more carefully review and note them in tests. I'll do that now, just letting you know it's work in progress.

@keflavich
Copy link
Contributor

I'm trying to hunt down a molecule that actually uses the negative values for quantum numbers. Any ideas for a good one to use?

@keflavich
Copy link
Contributor

I built more tests, and in doing so, discovered more problems & corner cases. I have the feeling that CDMS is changing their fixed formatting on me every week, but the lack of test coverage means I have no evidence for that.

@ParfenovS
Copy link
Author

CH3CN is a good example of molecule, where lower case characters are used to signal negative quantum numbers smaller than –9.

Copy link

codecov bot commented Apr 28, 2025

Codecov Report

Attention: Patch coverage is 43.18182% with 25 lines in your changes missing coverage. Please review.

Project coverage is 69.82%. Comparing base (4ddb0aa) to head (c94ad8d).
Report is 7 commits behind head on main.

Files with missing lines Patch % Lines
astroquery/linelists/cdms/core.py 43.18% 25 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3302      +/-   ##
==========================================
- Coverage   69.87%   69.82%   -0.06%     
==========================================
  Files         232      232              
  Lines       19769    19797      +28     
==========================================
+ Hits        13814    13823       +9     
- Misses       5955     5974      +19     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@bsipocz bsipocz added the cdms label Apr 28, 2025
@bsipocz bsipocz added this to the v0.4.11 milestone Apr 28, 2025
Copy link
Member

@bsipocz bsipocz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs a changelog, also I see test failures with pytest -P linelists -R. Could you @keflavich have another look?

(also, I see a couple of commits even with a "needs to be squashed" message, so do go ahead and give it a squash, too :) )

ParfenovS and others added 6 commits May 21, 2025 18:19
…; fix CDMS quantum numbers parsing

1) Adding support for CDMS queries with lines of all species
2) Fixing the CMDS lines list parsing

Support CDMS all species option; fix format for CDMS linelist reading; fix CDMS quantum numbers parsing

Adding test for a new functionality when all species are requested from CDMS
trivial formatting

fix my refactor; it was incorrect

oops, fix to last one (yes, this needs to be squashed; pushing fast to skip tests... and spam my inbox...)
shift tag back one spot.  Fix tests to accommodate more complete "tag" name

add the b1 = -21 test

whitespace
@keflavich
Copy link
Contributor

I swear CDMS is trolling me.... this was working 3 weeks ago. I'll look more closely again.

@bsipocz
Copy link
Member

bsipocz commented May 21, 2025

I swear CDMS is trolling me.... this was working 3 weeks ago. I'll look more closely again.

Yeap, I'm sorry that it took some for me to get back to this for a final review. On the bright side, it's not main that is broken now, only the PR 😅

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants