This repository stores the metadata templates in use at SciLifeLab, organized by data type. The diagram below sketches the information flow between this repository, the data-producing platforms and the data submitter, with the end goal of submitting the data to a public end repository.
A template has a title, a description and a semantic version number, as well as a list of associated attribute fields. Each attribute field needs to have:
- name
- description
- type
- list of controlled vocabulary terms if applicable
- level of requirement/cardinality (mandatory vs optional)
- Potentially in the future: end_repository_alias (if applicable; can be multiple if multiple relevant end repositories are considered)
- Potentially in the future: reference_ontology (if exists)
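
The template structure described above could be modelled roughly as follows. This is only a minimal sketch: the class names, the `validate_value` helper and the example field `library_strategy` with its terms are assumptions made for illustration, and are not taken from the actual template files in this repository.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class AttributeField:
    """One attribute field of a template (hypothetical model)."""
    name: str
    description: str
    type: str                       # e.g. "string", "integer", "date"
    mandatory: bool                 # level of requirement: mandatory vs optional
    controlled_vocabulary: Optional[List[str]] = None  # None means free text


@dataclass
class Template:
    """A template: title, description, semantic version and its fields."""
    title: str
    description: str
    version: str                    # semantic version, e.g. "0.1.0"
    fields: List[AttributeField] = field(default_factory=list)


def validate_value(attr: AttributeField, value: Optional[str]) -> List[str]:
    """Return a list of problems for one value; an empty list means valid."""
    problems = []
    if value is None or value == "":
        if attr.mandatory:
            problems.append(f"{attr.name}: mandatory field is empty")
        return problems
    if attr.controlled_vocabulary is not None and value not in attr.controlled_vocabulary:
        problems.append(f"{attr.name}: '{value}' is not a controlled vocabulary term")
    return problems


# Hypothetical example field and template, for illustration only
library_strategy = AttributeField(
    name="library_strategy",
    description="Sequencing technique intended for the library",
    type="string",
    mandatory=True,
    controlled_vocabulary=["WGS", "RNA-Seq", "AMPLICON"],
)
example_template = Template(
    title="Example template",
    description="Illustrative only",
    version="0.1.0",
    fields=[library_strategy],
)
```

Calling `validate_value(library_strategy, "WGS")` returns an empty list, while an empty value or a term outside the vocabulary yields one problem message each.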
In addition to the data-type-specific fields capturing the technical metadata itself, all templates include additional organizational metadata such as:
- SciLifeLab infrastructure platform and unit
- Unit internal project ID(s)
- Associated order ID
- Experimental Sample IDs (as assigned by the unit; one experimental sample corresponds to one data file or data file pair)
- Associated Sample IDs (as shared by the researcher with the unit)
- Delivery date
- Template name
- Template version
A row entry for an individual sample then has the following layout:
`<data_type_specific_field1>` | ... | `<data_type_specific_fieldM>` | `<data_file_name_R1>` | ... | `<data_file_name_RP>` | `<orga_meta_field1>` | ... | `<orga_meta_fieldN>` |
---|---|---|---|---|---|---|---|---|
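
As a rough illustration of this column order, the sketch below assembles one sample row from its three groups of values and serializes it as tab-separated text. The helper names `build_row` and `rows_to_tsv`, as well as all column names and values, are hypothetical and only serve the example. They do not reflect the contents of any actual template.

```python
import csv
import io


def build_row(technical, data_files, organizational):
    """Concatenate the three value groups in the column order shown above."""
    return list(technical) + list(data_files) + list(organizational)


def rows_to_tsv(header, rows):
    """Serialize a header and rows as tab-separated text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical column names: technical fields, data files, organizational fields
header = build_row(
    ["library_strategy", "instrument_model"],
    ["data_file_name_R1", "data_file_name_R2"],
    ["unit", "order_id", "template_name", "template_version"],
)

# Hypothetical values for a single experimental sample
row = build_row(
    ["WGS", "Example instrument"],
    ["sample1_R1.fastq.gz", "sample1_R2.fastq.gz"],
    ["Example unit", "ORDER-0001", "Example template", "0.1.0"],
)

tsv_text = rows_to_tsv(header, [row])
```

Keeping the header and the row built by the same function makes it harder for the two to drift out of order when fields are added.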
Templates are provided as .tsv, .json and .xlsx. The .json and .xlsx files include controlled vocabulary terms where available.
Title | Description | Link |
---|---|---|
SciLifeLab Genomics Technical Metadata Template | This template aims to capture technical metadata for genomics data produced at the Genomics platform, compatible with submission requirements from ENA and ArrayExpress. | genomics/README.md |