pMD design considerations

The preliminary implementation of data-raw/sampleMetadata.R (https://github.com/ASAP-MAC/parkinsonsMetagenomicData/pull/7/) contains hard-coded file names, dataset names, and joining code that is repeated for each dataset. This will be cumbersome to maintain as data are added or updated. Some design considerations for pMD:

We need to choose "reference" cross-platform data that will be routinely and automatically updated as data become available or are revised. The simplest choice would be the files output directly by the NF pipeline, but further derived data can be used if that is needed for computational efficiency for package users.
Break down big functions into small, single-purpose functions. Small functions intended for internal use only can be left unexported (the easiest way is to start their names with a .).
Don't hard-code the names of individual datasets or files; functions should work independently of which data are available. Use some form of iteration to join the samples requested by the user.
Consider where sampleMetadata should come from. Keeping it as a .rda file in data/ has always been a maintenance hassle in cMD; it could be much more convenient to make it a function with a version argument that pulls metadata directly from the source.
Add roxygen2 markup for all exported functions.
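
As one possible illustration of the points above, here is a minimal sketch in base R. It joins whatever tables it is given by iterating over a list rather than naming datasets, keeps the per-step helper unexported by starting its name with a ., and documents the exported function with roxygen2. The names .joinOne and joinSampleTables, the uuid key column, and the toy tables are all assumptions made for illustration; none of them come from the package.

```r
# Hypothetical internal helper (name starts with "." so it is easy to keep
# unexported): full outer join of two tables on a shared key column.
.joinOne <- function(acc, tbl, by = "uuid") {
    merge(acc, tbl, by = by, all = TRUE)
}

#' Join per-dataset tables for the requested samples
#'
#' Illustrative only: assumes every table has a `uuid` column.
#'
#' @param tables Named list of data frames, each with a `uuid` column.
#' @param uuids Character vector of sample identifiers to keep.
#' @return A data frame with one row per requested sample that was found.
#' @export
joinSampleTables <- function(tables, uuids) {
    stopifnot(is.list(tables), length(tables) > 0, is.character(uuids))
    # Filter each table to the requested samples; no dataset is named here.
    kept <- lapply(tables, function(tbl) {
        tbl[tbl$uuid %in% uuids, , drop = FALSE]
    })
    # Fold the list into a single table, one join at a time.
    Reduce(.joinOne, kept)
}

# Toy input standing in for two data sources (made-up values).
tables <- list(
    metadata = data.frame(uuid = c("s1", "s2", "s3"), age = c(60, 71, 55)),
    taxa     = data.frame(uuid = c("s2", "s3", "s4"), n_taxa = c(10, 20, 30))
)
res <- joinSampleTables(tables, c("s2", "s4"))
```

Because the function only sees a list, adding or removing a dataset changes the input rather than the code.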

Use this wiki space to propose design/implementation in more detail.

Data sources

Data Processing Results

Raw pipeline output stored at gs://metagenomics-mac

The vignette provides an example of downloading a single file given a sample UUID and an output file type. Downloading multiple files at once should also be supported, but this has not yet been validated.
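
A multi-file request could be expressed by expanding every combination of sample UUID and output file type into one row per file, as in the rough sketch below. The helper name .objectPaths and, in particular, the "<bucket>/<uuid>/<file_type>" object layout are assumptions made for illustration; the actual layout of the bucket is not described here and may differ.

```r
# Hypothetical helper: builds one candidate object path per (uuid, file type)
# pair. The "<bucket>/<uuid>/<file_type>" layout is an ASSUMPTION, not the
# documented structure of the bucket.
.objectPaths <- function(uuids, file_types,
                         bucket = "gs://metagenomics-mac") {
    stopifnot(is.character(uuids), is.character(file_types))
    # expand.grid() yields every combination; the first column varies fastest.
    combos <- expand.grid(uuid = uuids, file_type = file_types,
                          stringsAsFactors = FALSE)
    combos$path <- paste(bucket, combos$uuid, combos$file_type, sep = "/")
    combos
}

# Placeholder identifiers and file types, not real ones.
paths <- .objectPaths(c("u1", "u2"), c("typeA", "typeB"))
```

A download function could then iterate over the rows of this table, which keeps the single-file and multi-file cases on the same code path.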

Curated Metadata

Currently maintained manually in parkinsonsManualCuration; transitioning to being sourced from ODM and the curation team.
