disdrodb review #156

Submitting Author: Gionata Ghiggi (@ghiggi)
All current maintainers: (@ghiggi)
Package Name: disdrodb
One-Line Description of Package: disdrodb - A software for the decentralized archiving and standardization of global disdrometer data
Repository Link: https://github.com/ltelab/disdrodb
Version submitted: v.0.0.21
EIC: @isabelizimm
Editor: @Zeitsperre
Reviewer 1: TBD
Reviewer 2: TBD
Archive: TBD
JOSS DOI: TBD
Version accepted: TBD
Date accepted (month/day/year): TBD


Code of Conduct & Commitment to Maintain Package

Description

The raindrop size distribution (DSD) describes the concentration and size distributions of raindrops in a volume of air. It is a crucial piece of information to model the propagation of microwave signals through the atmosphere (key for telecommunication and weather radar remote sensing calibration), to improve microphysical schemes in numerical weather prediction models, and to understand land surface processes (rainfall interception, soil erosion).

Recognizing the importance of understanding DSD's spatial and temporal variability, scientists worldwide have initiated efforts to "count the drops" by deploying disdrometers—specialized instruments designed to record DSD. Numerous measurement campaigns have been conducted by meteorological services, national agencies (e.g., NASA, ARM, NCAR), and university research groups. Despite these efforts, a significant portion of the collected data remains difficult to access. These data are often stored in diverse formats with inadequate documentation and metadata, posing challenges in sharing, analyzing, comparing, and reusing the data.

In response to these challenges, the disdrodb Python package offers:

  1. A Decentralized Data Archive Infrastructure: The disdrodb package establishes a decentralized data archive, fostering the exchange and retrieval of raw disdrometer data within the scientific community. This infrastructure addresses the issues of data accessibility and documentation, and promotes collaborative research.

  2. Standardization of Raw Data: The disdrodb package provides tools to convert heterogeneous raw data into a uniform netCDF4 format, known as the DISDRODB L0 product. This standardization is a significant step forward, ensuring that data from different sources become compatible and easier to analyze, compare, and share, thereby enhancing the overall utility and reusability of the data.

Scope

  • Data retrieval
  • Data extraction
  • Data processing/munging
  • Data deposition
  • Data validation and testing
  • Data visualization
  • Workflow automation
  • Citation management and bibliometrics
  • Scientific software wrappers
  • Database interoperability

Domain Specific & Community Partnerships

  • Geospatial
  • Education
  • Pangeo

How and why the package falls under the categories you indicated above

Data Retrieval

The disdrodb package facilitates the retrieval of raw measurements acquired by disdrometer stations that are included in the DISDRODB Decentralized Data Archive. This remote archive comprises public cloud repositories such as Zenodo. The disdrodb package tracks the available stations through the DISDRODB Metadata Archive, which is hosted on GitHub.

Data Munging

After downloading the desired data, users can use disdrodb to convert the heterogeneous raw data into a uniform netCDF4 format (DISDRODB L0) with a single command. This conversion facilitates subsequent scientific analysis and product generation. For each disdrometer station, the disdrodb Python package has a specialized reader that accurately parses the raw sensor data.
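
As a rough illustration of the per-station reader idea, here is a minimal sketch that uses only the Python standard library. It is not the disdrodb API: the two raw layouts, the station names, the reader functions, and the output field names (`time`, `rain_rate`, `n_drops`) are all assumptions made up for this example. It only shows how station-specific readers can map differently formatted raw files onto one common record layout.

```python
import csv
import io
from datetime import datetime, timezone

# Hypothetical raw files from two stations. Neither layout is taken from disdrodb.
RAW_STATION_A = "2020-01-01 00:00:30;0.12;35\n2020-01-01 00:01:00;0.20;48\n"
RAW_STATION_B = "time,drops,rain_mm_h\n1577836830,35,0.12\n1577836860,48,0.20\n"


def read_station_a(text):
    """Parse semicolon-separated lines: timestamp; rain rate; drop count."""
    records = []
    for row in csv.reader(io.StringIO(text), delimiter=";"):
        time = datetime.strptime(row[0], "%Y-%m-%d %H:%M:%S").replace(tzinfo=timezone.utc)
        records.append({"time": time, "rain_rate": float(row[1]), "n_drops": int(row[2])})
    return records


def read_station_b(text):
    """Parse a CSV with a header row, epoch timestamps, and another column order."""
    records = []
    for row in csv.DictReader(io.StringIO(text)):
        time = datetime.fromtimestamp(int(row["time"]), tz=timezone.utc)
        records.append(
            {"time": time, "rain_rate": float(row["rain_mm_h"]), "n_drops": int(row["drops"])}
        )
    return records


# Hypothetical registry: one specialized reader per station.
READERS = {"STATION_A": read_station_a, "STATION_B": read_station_b}


def standardize(station_name, raw_text):
    """Dispatch to the station's reader and return records in the common layout."""
    return READERS[station_name](raw_text)


if __name__ == "__main__":
    for name, raw in (("STATION_A", RAW_STATION_A), ("STATION_B", RAW_STATION_B)):
        print(name, standardize(name, raw))
```

In this sketch both stations end up with identical records even though their raw files differ in delimiter, timestamp encoding, and column order. That is the property a uniform product such as DISDRODB L0 is meant to provide.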

Data Deposition

The disdrodb package offers a workflow for users who wish to contribute their disdrometer measurements to the DISDRODB community. This workflow ensures the long-term documentation of the data and simplifies the data upload process to the DISDRODB Decentralized Data Archive. Users must perform three main tasks:

  • Create a reader that reads the raw data into a dataframe, adhering to the DISDRODB guidelines.

  • Provide the metadata of the disdrometer station, which will be included in the DISDRODB Metadata Archive.

  • Upload the station's raw data to a remote repository and insert the station data URL into the DISDRODB Metadata Repository. The disdrodb package can automate this final step if the chosen remote repository is Zenodo.
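
To make the last step more concrete, the sketch below shows one way a station metadata record could carry the data URL, and how a tool could then select the stations whose raw data can be downloaded. The record layout and every field name (`data_source`, `campaign_name`, `station_name`, `data_url`) are assumptions for illustration only. They are not necessarily the actual DISDRODB metadata schema.

```python
# Hypothetical station metadata records. Field names are illustrative only.
STATION_METADATA = [
    {
        "data_source": "EXAMPLE_SOURCE",
        "campaign_name": "EXAMPLE_CAMPAIGN",
        "station_name": "STATION_A",
        "data_url": "https://example.org/STATION_A.zip",
    },
    {
        "data_source": "EXAMPLE_SOURCE",
        "campaign_name": "EXAMPLE_CAMPAIGN",
        "station_name": "STATION_B",
        "data_url": "",  # raw data not uploaded yet
    },
]


def stations_with_data(metadata_records):
    """Return the names of stations whose metadata contains a non-empty data URL."""
    return [rec["station_name"] for rec in metadata_records if rec.get("data_url")]


if __name__ == "__main__":
    print(stations_with_data(STATION_METADATA))
```

Under these assumptions, inserting the URL into the metadata record is what makes a station discoverable for download.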

Who is the target audience and what are scientific applications of this package?

The primary audience for this package includes researchers and students in the fields of remote sensing and atmospheric science, specifically those focused on precipitation. The package is designed to support applications in remote sensing and atmospheric science.

Are there other Python packages that accomplish the same thing? If so, how does yours differ?

To our knowledge, there are no other packages that offer an integrated infrastructure for retrieving, sharing, archiving, reading, and standardizing disdrometer data.

However, the pyDSD package exists for studying the DSD. It provides methods for high-level scientific analysis of disdrometer raw data, such as computing DSD parameters and simulating weather radar reflectivities.

The DISDRODB Working Group plans to leverage and adapt pyDSD code in the future to generate uniform, high-level scientific products for all stations within the DISDRODB Global Archive.

Technical checks

For details about the pyOpenSci packaging requirements, see our packaging guide. Confirm each of the following by checking the box. This package:

  • does not violate the Terms of Service of any service it interacts with.
  • uses an OSI approved license.
  • contains a README with instructions for installing the development version.
  • includes documentation with examples for all functions.
  • contains a tutorial with examples of its essential functions and uses.
  • has a test suite.
  • has continuous integration set up, such as GitHub Actions, CircleCI, and/or others.

Publication Options

JOSS Checks
  • The package has an obvious research application according to JOSS's definition in their submission requirements. Be aware that completing the pyOpenSci review process does not guarantee acceptance to JOSS. Be sure to read their submission requirements (linked above) if you are interested in submitting to JOSS.
  • The package is not a "minor utility" as defined by JOSS's submission requirements: "Minor ‘utility’ packages, including ‘thin’ API clients, are not acceptable." pyOpenSci welcomes these packages under "Data Retrieval", but JOSS has slightly different criteria.
  • The package contains a paper.md matching JOSS's requirements with a high-level description in the package root or in inst/.
  • The package is deposited in a long-term repository with the DOI:

Note: JOSS accepts our review as theirs. You will NOT need to go through another full review. JOSS will only review your paper.md file. Be sure to link to this pyOpenSci issue when a JOSS issue is opened for your package. Also be sure to tell the JOSS editor that this is a pyOpenSci reviewed package once you reach this step.

Are you OK with Reviewers Submitting Issues and/or pull requests to your Repo Directly?

This option will allow reviewers to open smaller issues that can then be linked to PRs rather than submitting a denser, text-based review. It will also allow you to demonstrate addressing the issue via PR links.

  • Yes I am OK with reviewers submitting requested changes as issues to my repo. Reviewers will then link to the issues in their submitted review.

Confirm each of the following by checking the box.

  • I have read the author guide.
  • I expect to maintain this package for at least 2 years and can help find a replacement for the maintainer (team) if needed.

Please fill out our survey

P.S. Have feedback/comments about our review process? Leave a comment here

Editor and Review Templates

The editor template can be found here.

The review template can be found here.

Status: on-hold-or-maintainer-unresponsive