Settings

Description of LPT settings

Options are defined as Python dictionaries. There are seven dictionaries, each with options that can be set:

dataset
plotting
output
lpo_options
lpt_options
merge_split_options
mjo_id_options

The option values are set directly by Python scripts. After the values are set, the main function lpt_driver.py is called.

Default values for all the options are set in lpt/default_options.py.
Many of these options are overridden in your [RUN]/lpt_run.py.

For a brief overview/reminder of what the settings are, see the comments in lpt/default_options.py, MASTER_RUN/lpt_run.py and [RUN]/lpt_run.py. More details about each setting are provided in this wiki.
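As a minimal sketch of this pattern, the snippet below shows defaults being set and then overridden by direct assignment, the way a run script might do it. The default values here are hypothetical stand-ins, not the contents of lpt/default_options.py, and the call into lpt_driver.py is left out because its signature is not described on this page.

```python
import copy

# Hypothetical stand-ins for defaults that would come from lpt/default_options.py.
default_dataset = {"label": "imerg", "data_time_interval": 3, "verbose": True}
default_lpo_options = {"do_lpo_calc": True, "thresh": 12.0, "accumulation_hours": 72}

# Start from copies so the defaults themselves are left untouched.
dataset = copy.deepcopy(default_dataset)
lpo_options = copy.deepcopy(default_lpo_options)

# Override only the options that differ for this run.
dataset["label"] = "cmorph"
dataset["verbose"] = False
lpo_options["thresh"] = 14.0

print(dataset)
print(lpo_options)
```

Options that are not reassigned keep their default values.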

Dataset options

Table 1. dataset dictionary options.

Option	Data Type	Description
dataset['label']	string	Used in the output file names. For example, "imerg" in lpt_systems_imerg_2023111000_2024020823.nc
dataset['raw_data_parent_dir']	string	Parent directory, which is in common with all the files, for the input data. It can be a relative path. Subdirectories, such as by date, can be set using the file_name_format option.
dataset['raw_data_format']	string	Controls which lpt/readdata.py function is used to read in the raw data. The value must match a valid data format in the if/elif/else block at the top of readdata.py. See the current list of options in Table 2 below.
dataset['file_name_format']	string	The path for filenames under the raw_data_parent_dir. This is a Python format string such as would be used with datetime.strftime(). For example, for 00 UTC 2024-01-10, "%Y/%m/gridded_rain_rates_%Y%m%d%H.nc" would be converted to "2024/01/gridded_rain_rates_2024011000.nc".
dataset['data_time_interval']	integer	The time between input files. Units: Hours.
dataset['verbose']	True or False	Whether to print more detailed information about the files to the screen.
dataset['longitude_variable_name']	string	Longitude variable name for generic_netcdf. NOTE: The readdata.py functions are set up to convert -180 to 180 longitude to 0 - 360.
dataset['latitude_variable_name']	string	Latitude variable name for generic_netcdf
dataset['time_variable_name']	string	Time variable name for generic_netcdf. Ignored if there is no time dimension.
dataset['field_variable_name']	string	Name of the variable to use for feature identification, a Python string (e.g., "rainfall" for LPT).
dataset['field_units']	string	Units of data. This is mainly used for generating plots, not for calculations. It is OK to set it to "" if plots are not being created.
dataset['area']	list of Floats	Geographical area of data to use. A Python list of float values for [lon_begin, lon_end, lat_begin, lat_end], e.g., [0.0, 360.0, -50.0, 50.0]. The input data will be subsetted to this region. NOTE: The readdata.py functions are set up to convert -180 to 180 longitude to 0 - 360.
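To illustrate how raw_data_parent_dir, file_name_format, and data_time_interval fit together, here is a small sketch that lists the input files expected for a time range. The directory name and the `input_file_paths` helper are hypothetical and are not part of LPT.

```python
from datetime import datetime, timedelta

# Example values; the parent directory is illustrative only.
dataset = {
    "raw_data_parent_dir": "./data/raw",
    "file_name_format": "%Y/%m/gridded_rain_rates_%Y%m%d%H.nc",
    "data_time_interval": 3,  # hours between input files
}


def input_file_paths(dataset, dt_begin, dt_end):
    """List the expected input file paths from dt_begin to dt_end, inclusive."""
    step = timedelta(hours=dataset["data_time_interval"])
    paths = []
    dt = dt_begin
    while dt <= dt_end:
        # strftime() fills in the date fields of the format string.
        relative = dt.strftime(dataset["file_name_format"])
        paths.append(f"{dataset['raw_data_parent_dir']}/{relative}")
        dt += step
    return paths


paths = input_file_paths(dataset, datetime(2024, 1, 10, 0), datetime(2024, 1, 10, 6))
for path in paths:
    print(path)
```

With a 3-hour interval, this yields one path each for 00, 03, and 06 UTC on 2024-01-10.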

Table 2. Raw data format options.

Raw data option value	Description
generic_netcdf	NetCDF data. The intended variable must have dimensions (lat, lon) or (time, lat, lon), or similar variables. The specific variable names are set by the dataset dictionary options named like "*_variable_name". NOTE: These options are ignored for the other raw data formats.
cmorph	CMORPH data in binary format. NOTE: For NetCDF format data, you can use generic_netcdf instead.
imerg_hdf5	IMERG V6 data in HDF5 format.
cfs_forecast	CFS Forecast data in Grib2 format.
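The longitude convention noted in Table 1 (-180 to 180 converted to 0 - 360) can be sketched as follows. This is only an illustration of the convention using plain Python lists; the helper names are hypothetical, and the actual implementation in readdata.py may differ.

```python
def to_0_360(lons):
    """Map longitudes from the -180 to 180 convention onto 0 to 360."""
    return [lon % 360.0 for lon in lons]


def reorder_ascending(lons, row):
    """Sort longitudes in ascending order and reorder one data row to match."""
    order = sorted(range(len(lons)), key=lambda i: lons[i])
    return [lons[i] for i in order], [row[i] for i in order]


lons_in = [-180.0, -90.0, 0.0, 90.0]
row_in = [1, 2, 3, 4]  # one latitude row of data, aligned with lons_in

lons_out, row_out = reorder_ascending(to_0_360(lons_in), row_in)
print(lons_out)
print(row_out)
```

Because the data are handled on a 0 - 360 grid, values for dataset['area'] and the plotting ranges are most naturally given in that convention as well.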

Plotting options

Table 3. plotting dictionary options.

Option	Data Type	Description
plotting['do_plotting']	True or False	Whether to generate plots. This applies only to the LPO and LPT steps, e.g., lpo_options['do_lpo_calc'] (map plots of rainfall and LPO) and lpt_options['do_lpt_calc'] (time-longitude plot). The other plotting options are ignored if this is set to False. NOTE: This is best used as a "gut check" for a short time period to determine whether the code is doing what you expect. Plotting over a long period consumes considerable resources, so consider setting it to False for "production" runs.
plotting['plot_area']	list of Floats	Geographical area of data for map plots. A Python list of float values for [lon_begin, lon_end, lat_begin, lat_end], e.g., [0.0, 360.0, -50.0, 50.0]
plotting['time_lon_range']	list of Floats	Longitude range for time-longitude plots. Does not need to be the same as the longitude range of plotting['plot_area']. A Python list of float values for [lon_begin, lon_end], e.g., [40.0, 200.0]

Output options

The output path has several components, depending on the dataset label, accumulation/averaging period, spatial filtering, and threshold value.

The convention for LPO data output, expressed as a Python formatted string, is like this:

fout = (f"{output['data_dir']}"
    + f"/{dataset['label']}"
    + f"/g{lpo_options['filter_stdev']}"
    + f"_{lpo_options['accumulation_hours']}h"
    + f"/thresh{lpo_options['thresh']}"
    + "/objects/"
    + dt_this.strftime(output['sub_directory_format'])
    + "/" + dt_this.strftime('objects_%Y%m%d%H.nc'))

for example: ./data/imerg/g50_72h/thresh12/objects/2024/01/20240110/objects_2024011000.nc.

For images, replace "data" with "images" and ".nc" with ".png".
For systems, replace "objects" with "systems" and omit the date-based subdirectory (output['sub_directory_format'] is ignored).
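The convention above can be checked with a short, self-contained sketch. The option values below are chosen to reproduce the example path; the `lpo_output_path` helper and the './images' value for output['img_dir'] are assumptions made for illustration.

```python
from datetime import datetime

dataset = {"label": "imerg"}
output = {
    "data_dir": "./data",
    "img_dir": "./images",  # assumed value for illustration
    "sub_directory_format": "%Y/%m/%Y%m%d",
}
# thresh is given as an int here so it renders as "thresh12";
# with this formatting, a float 12.0 would render as "thresh12.0".
lpo_options = {"filter_stdev": 50, "accumulation_hours": 72, "thresh": 12}


def lpo_output_path(dt_this, top_dir, extension):
    """Build the LPO output path for one time, following the convention above."""
    return (f"{top_dir}"
        + f"/{dataset['label']}"
        + f"/g{lpo_options['filter_stdev']}"
        + f"_{lpo_options['accumulation_hours']}h"
        + f"/thresh{lpo_options['thresh']}"
        + "/objects/"
        + dt_this.strftime(output['sub_directory_format'])
        + "/" + dt_this.strftime('objects_%Y%m%d%H') + extension)


dt_this = datetime(2024, 1, 10, 0)
data_path = lpo_output_path(dt_this, output["data_dir"], ".nc")
image_path = lpo_output_path(dt_this, output["img_dir"], ".png")
print(data_path)
print(image_path)
```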

Table 4. output dictionary options.

Option	Data Type	Description
output['img_dir']	string	directory for plotting outputs. Can be a relative path.
output['data_dir']	string	directory for data outputs (text/NetCDF). Can be a relative path.
output['sub_directory_format']	string	The subdirectory beneath the img_dir or data_dir. This is a Python format string such as would be used with datetime.strftime(). For example, for 00 UTC 2024-01-10, '%Y/%m/%Y%m%d' is converted to '2024/01/20240110'. This pertains to LPO output data and LPO map plots.

LPO Settings

Table 5.1. lpo_options dictionary options for LPO identification.

These options control the identification of large-scale precipitation objects (LPOs).

Option	Data Type	Description
lpo_options['do_lpo_calc']	True or False	Whether to go through the LPO identification stage of the calculation. If set to False, the LPO step is skipped and all other lpo_options dictionary options are ignored.
lpo_options['lpo_calc_n_cores']	integer	How many processes to use. If it is > 1, the LPO calculations will run in parallel with one time stamp per processor. Make sure you have the resources if you use many processors.
lpo_options['overwrite_existing_files']	True or False	Whether to calculate LPO and re-write files that already exist.
lpo_options['multiply_factor']	float	A factor to multiply the raw data by to get it into the units you want. For example, use 24.0 to convert from mm/h to mm/day.
lpo_options['field_units']	string	The units of the data after applying the multiply_factor. For example, 'mm d-1'. This is used for NetCDF output.
lpo_options['thresh']	float	Threshold value to use for LPO identification. The units are for the data after multiply_factor is applied.
lpo_options['accumulation_hours']	integer	Accumulation/running average period. Units: hours. Set to 0 to use instantaneous values without any time averaging.
lpo_options['filter_stdev']	integer	Number of grid points for the standard deviation of the Gaussian spatial filter. Set to 0 for no spatial smoothing.
lpo_options['filter_n_stdev_width']	integer	How many standard deviations to use for the Gaussian spatial filter. For example, if filter_stdev is 20 and filter_n_stdev_width is 3, the filter extends out 60 points.
lpo_options['min_points']	integer	Minimum number of contiguous grid points to keep as an LPO, e.g., 400.
lpo_options['cold_start_mode']	True or False	Whether to use cold start mode. Cold start mode is mainly for model runs, for which you may not have data going back in time to calculate a running average/accumulation at the beginning of the run. In cold start mode, the averaging period is ramped up from an initial value (lpo_options['cold_start_const_period']) to the intended averaging period. For time 0 to lpo_options['cold_start_const_period'] hours, a constant average value is used, so the same, stationary LPOs will be identified during this period. For cold_start_const_period to accumulation_hours, the average from the initial time to the valid time is used.
lpo_options['cold_start_const_period']	integer	Time period during which constant averaged data (and stationary LPO) is used at the beginning of the cold start. See above.
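One way to read the cold start description is as a rule for which time window is averaged at each valid time. The sketch below encodes that reading. The `averaging_window` helper is hypothetical, and it assumes the "constant average value" is the average over the first cold_start_const_period hours, which the table does not state explicitly.

```python
def averaging_window(hours_since_start, accumulation_hours, cold_start_const_period):
    """Return (begin_hour, end_hour) of the averaging window, relative to the run start.

    This is an interpretation of cold start mode, not LPT's implementation.
    """
    if hours_since_start <= cold_start_const_period:
        # Constant period: the same window is used, so the LPOs are stationary.
        return (0, cold_start_const_period)
    if hours_since_start < accumulation_hours:
        # Ramp-up: average from the initial time to the valid time.
        return (0, hours_since_start)
    # Enough data exists for the full accumulation period.
    return (hours_since_start - accumulation_hours, hours_since_start)


for hour in (0, 12, 48, 72, 96):
    print(hour, averaging_window(hour, accumulation_hours=72, cold_start_const_period=24))
```

With accumulation_hours = 72 and cold_start_const_period = 24, the window stays fixed for the first 24 hours, grows until hour 72, and then slides forward at full length.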

Table 5.2. lpo_options dictionary options for LPO spatio-temporal masks.

Option	Data Type	Description
lpo_options['do_lpo_mask']	True or False	Whether to generate LPO mask files. If set to False, the rest of the LPO mask settings in this table are ignored. Note: This does not require lpo_options['do_lpo_calc'] = True, but if do_lpo_calc is False, the LPO files must already have been generated.
lpo_options['mask_detailed_output']	True or False	Whether to use detailed mask output. By default (False), a single variable "mask" is output. If this is set to True, up to four mask variables are output. This is mainly useful for understanding what each step of LPO does and for development. See the Output section below for more details.
lpo_options['mask_include_rain_rates']	True or False	Whether to include masked rain rates in mask output files. The masked rain is simply the rain (or whatever variable is used for LPO step) with values outside of the mask set to missing.

The remaining LPO mask options are shown here as Python assignments:

lpo_options['mask_calc_volrain'] = True # Whether to calculate a volumetric rain and include it with the mask files.
lpo_options['mask_calc_with_filter_radius'] = True # Whether to calculate the mask with filter variables. (Takes much longer to run)
lpo_options['mask_calc_with_accumulation_period'] = True # Whether to calculate the mask with the accumulation period included. (Takes much longer to run)
lpo_options['mask_coarse_grid_factor'] = 0 # If > 0, a coarsened grid is used to calculate masks. Good for high-resolution data.
lpo_options['target_memory_for_writing_masks_MB'] = 10000 # Target to limit memory demand from writing masks to files. The more memory, the faster it can run.
lpo_options['mask_n_cores'] = 1 # How many processors to use for LPO mask calculations.
