
CMIP7 ScenarioMIP GHG Concentrations

Generation of GHG concentration inputs (i.e. forcings) for CMIP7's ScenarioMIP.

Status

  • development: the project is actively being worked on

Installation

We do all our environment management using pixi. To get started, you will need to make sure that pixi is installed (instructions here; we found that using the pixi-provided script was best on a Mac).
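For reference, at the time of writing the pixi documentation suggests an install command along the following lines (check the pixi docs for the current recommended command before running it):

curl -fsSL https://pixi.sh/install.sh | bash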

To create the virtual environment, run

pixi install
pixi run pre-commit install

These steps are also captured in the Makefile, so if you want a single command, you can instead simply run make virtual-environment.

Having installed your virtual environment, you can now run commands in your virtual environment using

pixi run <command>

For example, to run Python within the virtual environment, run

pixi run python

As another example, to run a notebook server, run

pixi run jupyter lab

Creating the files

Process

In short

  1. Receive data from the emissions team
  2. Do a new run where you update --emissions-file, --run-id, --esgf-version and --input4mips-cvs-source
  3. Send data to the publication team using scripts/upload-to-llnl.py

In long

  1. Receive markers from the emissions team
    • the markers are defined in a variable in scripts/generate-concentration-files.py. If the markers change, make sure you update this variable.
  2. Receive emissions from the emissions team
    • they should send two files. They produce these files with the script here (hopefully merged into main soon). The two files are:
      1. the emissions for each scenario, except for emissions of species that we derive from our inversions of sources like WMO (2022) (where we use only a single concentration projection, rather than having variation across scenarios)
      2. emissions for each scenario at the fossil/biosphere level. This is used for some extrapolations of latitudinal gradients. It's the same data as above, just at slightly higher sectoral detail.
  3. Put the received emissions in data/raw/input-scenarios
  4. Update the emissions file you use for your run. There are two options for how to do this:
    1. specify this from the command line via the --emissions-file option
    2. change the value of the emissions_file variable in scripts/generate-concentration-files.py
  5. Run with a new run ID and ESGF version (using the command-line arguments --run-id and --esgf-version). Pick whatever makes sense here (we don't have strong rules about our versioning yet)
    • This also requires creating entries for the controlled vocabularies (CVs): update this file to include source IDs of the form "CR-scenario-esgf-version". In practice, simply copy the existing "CR-scenario-esgf-version" entries, update their version to match the ESGF version you used above, then push the change to GitHub.
    • When you run, you will also need to update the value of --input4mips-cvs-source, either via the command-line argument --input4mips-cvs-source or by updating the value in scripts/generate-concentration-files.py. The value should be of the form "gh:[commit-id]" e.g. "gh:c75a54d0af36dbedf654ad2eeba66e9c1fbce2a2". An example command combining these options is shown after this list.
  6. When the run is finished, upload the results for the publication team with
    • pixi run python scripts/upload-to-llnl.py --unique-upload-id-dir <unique-value-here> output-bundles/<run-id>/data/processed/esgf-ready/input4MIPs
    • e.g. pixi run python scripts/upload-to-llnl.py --unique-upload-id-dir cr-scenario-concs-20250701-1 output-bundles/v0.1.0a2/data/processed/esgf-ready/input4MIPs
  7. Tell the publication team that the results are uploaded and the folder in which to find them i.e. the value of --unique-upload-id-dir
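Putting the steps above together, a run command will look something like the sketch below. The flags are those described above; the emissions file name, run ID, ESGF version and commit ID are placeholders that you need to replace with your own values.

pixi run python scripts/generate-concentration-files.py --emissions-file data/raw/input-scenarios/<emissions-file> --run-id <run-id> --esgf-version <esgf-version> --input4mips-cvs-source gh:<commit-id>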

Parallelisation

By default, this all runs serially. You can make use of extra cores with the flags below:

  • --n-workers: the number of threaded (i.e. parallel) workers to use for submitting jobs
    • note: this doesn't result in true parallelism. A full explanation is beyond the scope of this document (if you want to dig in, look into the difference between thread-based and process-based parallelism in Python)
  • --n-workers-multiprocessing: the number of multiprocessing (i.e. parallel) workers to use, excluding any tasks that require running MAGICC
  • --n-workers-multiprocessing-magicc: the number of multiprocessing (i.e. parallel) workers to use for tasks that run MAGICC
  • --n-workers-per-magicc-notebook: the number of MAGICC workers to use in each MAGICC-running task.
    • note: the total number of MAGICC workers is the product of --n-workers-multiprocessing-magicc and --n-workers-per-magicc-notebook

In general, you want:

  • --n-workers: equal to the number of cores on your CPU (or more)
  • --n-workers-multiprocessing: equal to the number of cores on your CPU (or more)
  • --n-workers-multiprocessing-magicc, --n-workers-per-magicc-notebook: the product of the two should be equal to the number of cores on your CPU (or more)

For example, for an eight core machine you might do something like

pixi run python scripts/generate-concentration-files.py --n-workers 8 --n-workers-multiprocessing 8 --n-workers-multiprocessing-magicc 2 --n-workers-per-magicc-notebook 4

Specific gases

If you need/want to run only for specific gases, you can use the --ghg flag (once per gas) as shown below.

pixi run python scripts/generate-concentration-files.py --ghg ccl4 --ghg cfc113

Development

TODO: update this section as we add:

  • tests
  • anything else

Install and run instructions are the same as the above (this is a simple repository, without tests etc., so there are no development-only dependencies).

Contributing

TODO: update as we figure out the structure

Repository structure

TODO: update as we figure out the structure

We have a basic Makefile which captures key commands in one place (for more thoughts on why this makes sense, see general principles: automation). For an introduction to make, see this introduction from Software Carpentry. Having said this, if you're not interested in make, you can just copy the commands out of the Makefile by hand and you will be 90% as happy.

Tools

In this repository, we use the following tools:

  • git for version-control (for more on version control, see general principles: version control)
  • Pixi for environment management (for more on environment management, see general principles: environment management)
    • there are lots of environment management systems. Pixi works well in our experience and, for projects that need conda, it is the only solution we have tried that worked really well.
    • we track the pixi.lock file so that the environment is completely reproducible on other machines or by other people (e.g. if you want a colleague to take a look at what you've done)
  • pre-commit with some very basic settings to get some easy wins in terms of maintenance, specifically:
    • code formatting with ruff
    • basic file checks (removing unneeded whitespace, not committing large files etc.)
    • (for more thoughts on the usefulness of pre-commit, see general principles: automation)
  • jupytext to track notebooks (for more thoughts on the usefulness of Jupytext, see tips and tricks: Jupytext)
    • this avoids nasty merge conflicts and incomprehensible diffs
  • prefect for workflow orchestration

Original template

This project was generated from this template: basic python repository. copier is used to manage and distribute this template.
