A template to create and work with Masterdata definitions in the BAM Data Store.
- Click on Use this template button on the top right and create a new Masterdata repository.
- The next screen will prompt you to chose a Github organization or profile for the new repository, as well as a name. We recommend you naming the repository starting with
masterdata-
, so it is easy to be recognized. - After creating your new repository, you can:
- Open in Github Codespace
- Clone it locally
- You need to install
cruft
. - Run the following command to create the structure of the new Masterdata project:
cruft create https://github.com/BAMresearch/masterdata-generator
This will prompt you with some questions to be filled in:
[1/9] author (Erika Mustermann):
[2/9] email (erika.mustermann@bam.de):
[3/9] Github organization or profile name (BAMResearch):
[4/9] project_name (masterdata-example):
[5/9] The name of the folder src/<module_name>, default is the
<project_name> separated with underscores (masterdata_example):
[6/9] short_description (A short description for the masterdata project.):
[7/9] include_python_masterdata [y/n] (y):
[8/9] include_excel_masterdata [y/n] (y):
[9/9] include_rdfxml_masterdata [y/n] (y):
Note
Not filling anything on the questions above will result on chosing the default. The default value for each question is defined in between parenthesis next to the question.
- The files generated by
cruft
are created under./<project-name>
. To move all the structure one level up, run:
sh <project-name>/move_generated_files.sh
- After filling the questionnaire and running the
move_generated_files.sh
script, you will have a new repository with:
masterdata-<name>/
├── LICENSE (MIT)
├── README.md
├── pyproject.toml
├── src/
│ └── masterdata-<name>/
# if Python is selected as an option
│ ├── __init__.py
│ ├── object_types.py
│ ├── vocabulary_types.py
# if Excel is selected as an option
│ ├── masterdata.xlsx
# if RDF/XML is selected as an option
│ └── masterdata.owl
└── tests/
├── __init__.py
└── test.py
- If you are using Python, you can install the package and its dependencies by creating a virtual environment (either with
conda
orvenv
) and running:
python3 -m venv .venv
source .venv/bin/activate
pip install --upgrade pip
pip install uv
uv pip install -e '.[dev]'
Note
Step 8 is defined using venv
for creating a virtual environment in an Ubuntu OS. Please, modify the commands if you use conda
or you are installing in Windows/MacOS.
Once you have the skeleton of the Masterdata repository, you can define new classes in the corresponding file under src/<module-name>/
.
In case of Python, these are different modules for Object Types and Vocabularies. For Excel or RDF/XML, this is a single file named masterdata.xlsx
/masterdata.owl
, respectively. In all the cases, a dummy entity called ExampleObjectType
is created to serve as an example.
You can export from Python to the other formats using the bam-masterdata
CLI. You can also generate Python code from Excel or RDF/XML with the same CLI. Read more in the bam-masterdata
documentation.
A set of Github actions are defined in .github/workflows/
. These will:
- Ensure typing and formatting consistency in Python code following PEP-8 standards using
mypy
andruff
rules defined in thepyproject.toml
. - Check consistency of the Masterdata definitions using the
bam-masterdata
checker CLI functionalities.
If the pipeline passes, it means your Masterdata definitions are formatted properly and do not clash with the current bam-masterdata
definitions.
Important
In bam-masterdata
, there is a set of Masterdata entities already defined. These correspond to the general definitions (to be used across MSE sub-communities). If you want to know how to define new Masterdata entities using inheritance from the general ones, read the corresponding documentation.
The action .github/workflows/publish.yml
already implements the publishing of your repository to PyPI. When you are ready, you need:
- Define a secret and name it
PYPI_API_TOKEN
from pypi.org. - Create a Release from the Github page of your repository.
Read more here: Github PyPI documentation.