hamishwp/ODDRIN

ODDRIN

Oxford Disaster Displacement Real-time Information Network

Figure 1: ODDRIN interactive data visualisation platform - ODD-Mapping

Welcome to the ODDRIN code, which comprises a front-end visualisation component, known as ODD-Mapping, and the back-end statistical engine and real-time updating software, IIDIPUS - Integrated Internal DIsplaced PopUlation Sampler.

The aim of this software is to predict the number of people displaced, the number of fatalities, and the number of buildings damaged in the early phases of rapid-onset natural (and human-generated climate-change related) hazards. Predictions are made as accurate as possible by training the model on hundreds of events and hundreds of thousands of damaged buildings, across a broad demographic of countries and hazard severities.

Authors

ODDRIN was designed, developed, made operational and is managed by Dr. Hamish Patten @HamishPatten, Max Anderson Loake @MaxLoake and Professor David Steinsaltz @DavidSteinsaltz, as part of a project developed at the Department of Statistics, University of Oxford.

Documentation

A Bayesian Approach to Disaster Impact Modelling - This pre-print, submitted to the Royal Statistical Society (RSS), details the model, method, and results.

Data-Driven Earthquake Multi-impact Modeling: A Comparison of Models - Published in the International Journal of Disaster Risk Science, this paper compares a range of machine-learning approaches to earthquake impact modelling.

IDMC GRID 2021 Background Paper - A non-peer-reviewed article written for the Internal Displacement Monitoring Centre (IDMC) in early 2021, to be included with the Global Report on Internal Displacement (GRID).


Code Layout

The code can be decomposed into several sections:

  1. Main
  2. Model
  3. Method
  4. Data
  5. Object Orientated Programming (OOP) class formation
  6. Additional functions

Here we try to explain without too much detail the most important files from each section.

Figure 2: an example of ODDRIN output - a surface plot of the predicted displaced population from the Haiti earthquake on the 14/08/2021, including contour lines of the earthquake shakemap intensity (MMI)

Main

Key Files & Roles:
  1. Main.R - This is where the ODDRIN model parameterisation occurs. Here we extract the pre-formatted data, model formulas and structuring, and then run the model-training algorithm to parameterise the model.
  2. AutoQuake.R - With minimal input, this file automates the extraction of everything needed to predict the spatial distribution and magnitude of mortality, population displacement, and building damage in the immediate aftermath of earthquakes, including foreshocks and aftershocks.
  3. RealTimeIIDIPUS.R (not yet included) - Real-time tracking of the occurrence of rapid-onset hazards, including predicting the magnitude and spatial distribution of the displaced population, then broadcasting this to partners, such as the IFRC GO Platform.
Key Functions:
  • IIDIPUSModelTraining - Extract data, model and methodology, then train the model using an Adaptive Markov Chain Monte Carlo (AMCMC) algorithm or Sequential Monte Carlo (SMC) algorithm.
  • AutoQuake - Extracts the earthquake intensity data, creates an object that automatically extracts the relevant exposure and vulnerability information, then makes a prediction on the fatalities, population displacement and building damage, per gridpoint.

Model

Key Files & Roles:
  1. Model.R - Here we can find everything model-related. This includes damage function equation definitions, declaring the chosen imported vulnerability indicators required, and, finally, the pseudo-marginal log-likelihood, prior and posterior distribution equations for population displacement and also satellite building damage estimations. Also includes the linear predictor terms that parameterise the systemic vulnerability.
Key Functions:
  • HighLevelPriors - Approximate Bayesian Computation (ABC) rejection method
  • GetLP - Calculate the exposure-related component of the vulnerability over all grid-cells (e.g. using the SHDI data)
  • GetLP_single - Calculate the exposure-related component of the vulnerability for a single grid cell
  • getLP_event - Calculate the hazard-related component of the vulnerability (e.g. using the night time indicator)
  • addTransfParams - Transforms parameters to reduce correlation between parameters
  • SamplePolyImpact - Sample the impact for each event in the provided event set
  • SamplePointImpact - Sample the impact for each building in the point building dataset
  • CalcDist - Calculates the loss function comparing the sampled and observed data
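
The likelihood-free idea behind functions such as SamplePolyImpact and CalcDist can be illustrated with a toy ABC rejection sampler. This is only a sketch under stated assumptions, not ODDRIN's model: the simulator, the distance, the prior, the tolerance and every name in it (simulate_impact, calc_dist_toy) are hypothetical and use base R only.

```r
set.seed(1)

# Hypothetical toy simulator: impact counts given a single damage-rate parameter
simulate_impact <- function(theta, exposure) {
  rpois(length(exposure), lambda = theta * exposure)
}

# Hypothetical loss function: mean absolute difference on a log scale
calc_dist_toy <- function(sampled, observed) {
  mean(abs(log1p(sampled) - log1p(observed)))
}

exposure <- c(1000, 5000, 20000)               # exposed population per event
observed <- simulate_impact(0.01, exposure)    # stand-in for observed impacts

# Draw parameters from a uniform prior and simulate once per draw
n_draws   <- 5000
theta     <- runif(n_draws, min = 0, max = 0.05)
distances <- vapply(
  theta,
  function(th) calc_dist_toy(simulate_impact(th, exposure), observed),
  numeric(1)
)

# ABC rejection: keep the 5% of prior draws closest to the observations
tolerance <- quantile(distances, 0.05)
accepted  <- theta[distances <= tolerance]
summary(accepted)
```

The accepted draws approximate the posterior of the toy parameter; tightening the tolerance trades acceptance rate for accuracy.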

Method

Key Files & Roles:
  1. Method.R - Defines the two algorithms that are used to parameterise the model via likelihood-free Bayesian statistics. The options are the ABC-SMC algorithm described in Del Moral, Doucet and Jasra, 2012, and the Adaptive MCMC algorithm described in Spencer, 2021.
  2. GetInitialValues.R - This file allows the initialisation of the AMCMC algorithm, either by using samples from past model runs or samples from the prior to estimate an appropriate proposal covariance.
Key Functions:
  • Proposed2Physical and Physical2Proposed - link functions for the model parameters, to ensure that the model parameters sampled by the MCMC proposal distribution are on the real line with infinite support.
  • multvarNormProp - proposal parameter set generation function (multivariate normal distribution)
  • AMCMC - Runs the adaptive MCMC algorithm
  • ABCSMC - Runs the ABC-SMC algorithm
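
To make the role of the link functions and the multivariate normal proposal concrete, here is a minimal sketch in base R. It assumes two hypothetical parameters, a positive scale sigma (log link) and a probability p (logit link); the real parameter set, links and signatures in Method.R may well differ, and the function names below are illustrative only.

```r
# Hypothetical links: log for positive parameters, logit for parameters in (0, 1)
physical_to_proposed <- function(phys) {
  c(sigma = log(phys[["sigma"]]), p = qlogis(phys[["p"]]))
}
proposed_to_physical <- function(prop) {
  c(sigma = exp(prop[["sigma"]]), p = plogis(prop[["p"]]))
}

# Multivariate normal proposal on the unconstrained scale,
# generated from the Cholesky factor of the proposal covariance
mvn_proposal <- function(current, cov_mat) {
  z   <- rnorm(length(current))
  out <- current + drop(t(chol(cov_mat)) %*% z)
  names(out) <- names(current)
  out
}

set.seed(42)
phys     <- c(sigma = 0.5, p = 0.2)
prop     <- physical_to_proposed(phys)              # move to the real line
new_prop <- mvn_proposal(prop, diag(c(0.1, 0.1)))   # propose there
new_phys <- proposed_to_physical(new_prop)          # map back to valid values
new_phys
```

Because the proposal is made on the real line, the back-transformed values always respect the parameter constraints, whatever the proposal covariance.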

Data

Key Files & Roles:
  • GetPopDemo.R - Population and demography data extraction, mostly built around the CIESIN data, but now includes the Facebook Data for Good Population Mapping data.
  • GetSatDamage.R - Given the location of UNOSAT-UNITAR or COPERNICUS building damage assessment data files, this extracts the buildings and harmonises the data format so that it can later be used by ODDRIN to initialise a BD object.
  • GetDisaster.R - Extract the hazard intensity data, the source of which depends on the hazard type. For example, earthquakes rely on USGS.
  • GetGDACS.R - This file is really hideous, I apologise; I was learning to code in R at the time. It accesses the Global Disaster Alert and Coordination System (GDACS) database, which is key to the real-time component of ODDRIN.
  • GetUSGS.R - Access earthquake shakemaps and other information automatically from the United States Geological Survey (USGS).
  • AddVulnerability.R - Extract and add the vulnerability data related to the exposure, such as the Vs30 data (from USGS) and the SHDI data (from Global Data Lab)
  • GetWorldBank.R - All things World Bank, as nationally aggregated values but with temporal trends. For example, it is easy to access temporal trends of population count for most countries around the world, and even to see when the data was last updated via national surveys.
  • GetOSM.R - The file that accesses OpenStreetMap, including downloading buildings and roads located within a certain bounding box, country or region polygon. Be careful what you wish for here... if your search is too broad, you'll never be able to access anything! Go in small chunks and slowly cover the area you want.
  • GetBuildingCounts.R - The file that accesses building footprint data from Microsoft/Bing Building Footprint datasets.
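
The advice to query OpenStreetMap in small chunks can be followed by tiling the area of interest first. The helper below is a hypothetical base-R sketch (split_bbox is not part of ODDRIN) and assumes a bounding box ordered as min longitude, min latitude, max longitude, max latitude; the coordinates in the example are illustrative.

```r
# Split a bounding box c(min_lon, min_lat, max_lon, max_lat) into nx * ny tiles
split_bbox <- function(bbox, nx, ny) {
  lon_breaks <- seq(bbox[1], bbox[3], length.out = nx + 1)
  lat_breaks <- seq(bbox[2], bbox[4], length.out = ny + 1)
  idx <- expand.grid(i = seq_len(nx), j = seq_len(ny))
  data.frame(
    min_lon = lon_breaks[idx$i],
    min_lat = lat_breaks[idx$j],
    max_lon = lon_breaks[idx$i + 1],
    max_lat = lat_breaks[idx$j + 1]
  )
}

# Example: a 2.5 x 1 degree region split into 10 tiles of 0.5 x 0.5 degrees
tiles <- split_bbox(c(-74.5, 18.0, -72.0, 19.0), nx = 5, ny = 2)
tiles
```

Each row can then be passed, one at a time, to whichever download routine is in use.
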
Key Functions:
  • ExtractData - This is the function that extracts everything we need for ODDRIN for a given hazard occurrence. However, this is currently only automated for earthquakes. Provided a collection of estimates of the maximum observed displaced population of an event (e.g. from IDMC), including the date of the event and the country, this function will find the matching value in GDACS, then create ODD classes, then do the same for the satellite image-based building assessment data.
  • GetPopulationBbox - Provided only a bounding box and the folder name for the population data, this function extracts the population data from CIESIN in a memory efficient way, also ensuring things like continuity across the longitude=0 plane.
  • InterpPopWB and InterpGDPWB - These functions extract the nationally aggregated population and GDP values, respectively, from the World Bank for exactly the date provided (through interpolation/extrapolation techniques). These are used to make sure that the CIESIN and Kummu population and GDP values are updated to reflect the value on the day of the hazard.
  • ExtractBDfiles - Provided the location of the folder where the satellite image-based building damage assessment data is kept, this function will extract all UNOSAT and all Copernicus data ready to create an instance of the BD class.
  • GetDisaster - This is the function that, provided with only minimal input (bounding box, start and end date, hazard type), can extract hazard intensity raster data, and output HAZARD objects made from the data.
  • GetEarthquake - Automated extraction for earthquakes from USGS, forming a list of HAZARD objects.
  • FilterGDACS - Extract data from GDACS through their API, then filter it to get only what we need.
  • GetUSGS - For earthquakes, GetDisaster depends entirely on this function, which accesses the USGS database, given minimal input, and extracts the important data and forms a HAZARD class instance from it.
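
The date-matching idea behind InterpPopWB and InterpGDPWB can be sketched with linear interpolation inside the observed range and linear extrapolation outside it. This is an assumption about the general technique rather than a copy of ODDRIN's implementation; the function name interp_to_date and the population values are hypothetical and are not real World Bank data.

```r
# Hypothetical annual national population values (not real World Bank data)
wb <- data.frame(
  date  = as.Date(c("2019-07-01", "2020-07-01", "2021-07-01")),
  value = c(11.0e6, 11.2e6, 11.4e6)
)

interp_to_date <- function(wb, target) {
  wb <- wb[order(wb$date), ]
  x  <- as.numeric(wb$date)
  t  <- as.numeric(as.Date(target))
  # Inside the observed range: ordinary linear interpolation
  if (t >= x[1] && t <= x[length(x)]) {
    return(approx(x, wb$value, xout = t)$y)
  }
  # Outside the range: extend the straight line through the two nearest end points
  k     <- if (t < x[1]) 1:2 else (length(x) - 1):length(x)
  slope <- diff(wb$value[k]) / diff(x[k])
  wb$value[k[1]] + slope * (t - x[k[1]])
}

interp_to_date(wb, "2020-01-15")  # interpolated between 2019 and 2020
interp_to_date(wb, "2021-08-14")  # extrapolated beyond the last record
```

A ratio of such a date-specific national value to the national total of a gridded dataset could then be used to rescale the grid to the day of the hazard.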

Object Orientated Programming (OOP) class formation

Key Files & Roles:
  • ODDobj.R - The principal ODDRIN class, whereby hazard intensities (from all hazards included), exposed population and exposed buildings, as well as vulnerability information, are all included as fields/attributes. The methods of the class greatly facilitate automating the initialisation of objects with only minimal data provided as input; for example, the interpolation of hazard intensities onto the population grid is automated.
  • BDobj.R - This class is for the satellite image-based building damage assessment data. The main difference from the ODD class is that the data is not on a grid but can be considered as a list of points in space onto which the hazard intensities and other information are interpolated.
  • HAZARDobj.R - Hazard intensity data is read in and then structured into the correct form to be provided to the ODD or BD classes. This class is used heavily by GetUSGS.R, for example.
Key Functions:
  • initialize - Initialises the objects, with a unique initialisation function per object mentioned above
  • DispX - Predict the number of people displaced, based on the hazard, the model and the parameterisation
  • readODD - read in a saved ODD file (stored as RDS). For BD objects and HAZARD objects, the relevant functions are readBD() and readHAZ() respectively.
  • saveODD - save an ODD file (stored as RDS). For BD objects and HAZARD objects, the relevant functions are saveBD() and saveHAZ() respectively.
  • BDX - Predict the building damage classification level, based on the hazard, the model and the parameterisation
  • BDinterpODD - Creating an instance of the BD class is facilitated by providing the instance of the ODD class that corresponds with the same hazard(s).
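
Since ODD, BD and HAZARD objects are stored as RDS files, the save/read round trip rests on base R's saveRDS() and readRDS(). The sketch below uses a plain list as a stand-in for an ODD object and a hypothetical file name; the actual saveODD() and readODD() functions may do additional work.

```r
# Stand-in for an ODD object: any R object can be serialised with saveRDS()
odd_like <- list(hazard = "EQ", iso3 = "HTI", displaced = c(120, 45, 0))

# Hypothetical file name, written to a temporary folder
path <- file.path(tempdir(), "EQ20210814HTI_example.rds")

saveRDS(odd_like, path)      # write the object to disk
restored <- readRDS(path)    # read it back into a new variable

identical(odd_like, restored)
```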

Additional Functions

Key Files & Roles:
  • Functions.R - Includes all the miscellaneous functions that are required by the ODDRIN code.
Key Functions:
  • convRaster2SPDF - Converts rasters that are imported from a .tif file into the SpatialPixelsDataFrame format which ODDRIN relies on.
  • convMat2SPDF - Same but from matrix format to SpatialPixelsDataFrame format
  • coords2country - Given a longitude and latitude coordinate, which country does it belong to?
  • countriesbbox - Given the ISO3C code for a given country, what is the country's bounding box?

Installation

Before doing anything, please change the directory location environment variable dir and directory (these two must be equal) in the GetEnv.R file. The simplest installation of ODDRIN is to download and load only the most fundamentally important packages and to source only the files that you will need. To do this, change packred<-T in the file GetEnv.R. Installation in RStudio is simple, just run the following:

  # Extract Environment Variables
  source('RCode/GetEnv.R')
  # Download and install the necessary packages:
  source('RCode/GetODDPackages.R')  

This will error if you have not already installed the necessary data (see the end of installation instructions - 'Data to be downloaded manually').

For a full installation of ODDRIN, the problem is getting the R package rJava to work. For Linux and Mac distributions, follow the installation instructions in 'Linux and Mac Installation' below. For Windows, follow the instructions under 'Windows Installation'.

Linux and Mac Installation

In order to install the full ODDRIN package for Linux and Mac distributions, open a terminal and run the following:

    sudo apt-get install libcurl4-openssl-dev libxml2-dev libjq-dev libprotobuf-dev \
                         libv8-dev protobuf-compiler openjdk-8-jdk libssh-dev libssl-dev \
                         libgdal-dev libudunits2-dev libopenmpi-dev

This installs all sorts of important software, not just what rJava needs, but rJava is what causes the problems - note that the openjdk-8-jdk package is the difficult one. The next step is to make sure that you have an environment variable for the location of your Java libraries. In your /etc/environment file, add the Java libraries environment variable (using sudo nano /etc/environment):

    LD_LIBRARY_PATH=/usr/lib/jvm/java-8-openjdk-amd64/lib/amd64/:/usr/lib/jvm/java-8-openjdk-amd64/lib/amd64/

Please check that the folder /usr/lib/jvm/java-8-openjdk-amd64/lib/amd64/ actually exists! Otherwise, insert the folder location you find (another example could be LD_LIBRARY_PATH=/usr/lib/jvm/jre/lib/amd64:/usr/lib/jvm/jre/lib/amd64/default). NOW RESTART YOUR COMPUTER! Follow this up with:

    source /etc/environment
    sudo R CMD javareconf JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/bin/jar
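
Before moving on, it may help to confirm from within R that the folders listed in LD_LIBRARY_PATH are visible and actually exist. The helper below is a hypothetical base-R sketch (check_ld_path is not part of ODDRIN) and assumes the colon-separated path format used above.

```r
# Report which entries of LD_LIBRARY_PATH exist on disk
check_ld_path <- function(ld_path = Sys.getenv("LD_LIBRARY_PATH")) {
  entries <- strsplit(ld_path, ":", fixed = TRUE)[[1]]
  entries <- unique(entries[nzchar(entries)])   # drop empty and repeated entries
  data.frame(
    path   = entries,
    exists = dir.exists(entries),
    stringsAsFactors = FALSE
  )
}

check_ld_path()

# Once the packages are installed, rJava itself can be tested with:
# requireNamespace("rJava", quietly = TRUE)
```

An empty table means R does not see the variable at all, which usually indicates that the session was started before the environment file was re-read.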

Finally, with packred<-F in the file GetEnv.R, run the following:

  # Extract Environment Variables
  source('RCode/GetEnv.R')
  # Download and install the necessary packages:
  source('RCode/GetODDPackages.R')  

Windows Installation

To install the full ODDRIN package on Windows, first install Ubuntu from the Microsoft Store. This allows Linux command syntax to be run on a Windows machine. Follow all instructions as per the 'Linux and Mac Installation' section above EXCEPT the instruction to set the LD_LIBRARY_PATH. Ubuntu does not permit the user to set LD_LIBRARY_PATH in the '/etc/environment' file, so run the following in Ubuntu instead:

    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/lib/jvm/java-8-openjdk-amd64/lib/amd64/:/usr/lib/jvm/java-8-openjdk-amd64/lib/amd64/
    echo 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/lib/jvm/java-8-openjdk-amd64/lib/amd64/:/usr/lib/jvm/java-8-openjdk-amd64/lib/amd64/' >> ~/.bashrc

Follow all the remaining steps after the setting of the LD_LIBRARY_PATH instruction as detailed in the 'Linux and Mac Installation' section above, i.e. check that the folder actually exists, restart your computer, etc. For the sudo R CMD, please ensure that R is installed for use on Ubuntu, otherwise you will get a command not found error.

Data to be Downloaded Manually

In addition to installing the necessary packages, you are also required to manually download several datasets due to licensing and access restrictions. Please follow the instructions below carefully:

  • Global Data Lab Vulnerability Data (SHDI/SGDI)
    You will need to download two sets of files:

    1. The GDL shapefiles from this link (requires free account). Extract and place all files in:
      Demography_Data/SocioEconomic/GlobalDataLab/GDL Shapefiles V6/. Ensure that the file shdi2022_World_large.shp is in that folder.
    2. The CSV data containing SHDI/SGDI values from this link (also requires free account). Name the file exactly:
      SHDI-SGDI-Total 7.0.csv and place it in:
      Demography_Data/SocioEconomic/GlobalDataLab/.
  • VS30 dataset (soil shear wave velocity)
    Download the dataset from the USGS VS30 page. Place the extracted global_vs30.tif and any auxiliary files in:
    Hazard_Data/global_vs30_tif/.

  • Global Earthquake Hazard Frequency Data (PGA)
    Download the most recent version of the PGA hazard data from the Global Earthquake Model's GSHM page. Extract the file named:
    v2023_1_pga_475_rock_3min.tif
    and place it in:
    Hazard_Data/GEM-GSHM_PGA-475y-rock_v2023/.

  • High-resolution population count dataset
    Note that the CIESIN data does not currently seem to be available online. This dataset is not necessary as, if not downloaded, coords2country() is instead used to label the country of each grid cell. If the data does become available again, the download instructions are as follows:

    Download the GPWv4 population count dataset from CIESIN. Use the Single Year option and choose the ASCII format at 30 arc-second resolution. You will need to download files year by year (e.g., 2000, 2005, 2010, 2015), ensuring that each year has its own folder (e.g., Demography_Data/Population/gpw-v4-population-count-2015/). For the model, only the 2015 dataset is required, particularly the file named:
    gpw_v4_population_count_adjusted_to_2015_unwpp_country_totals_rev11_2015_30_sec_1.asc
    Make sure this file exists at:
    file.exists(paste0(dir,"Demography_Data/Population/gpw-v4-population-count-2015/gpw_v4_population_count_adjusted_to_2015_unwpp_country_totals_rev11_2015_30_sec_1.asc")) == TRUE.
    Each folder should contain 8 .asc files (...30_sec_1.asc to ...30_sec_8.asc).
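
    The check above can be extended to all eight tiles of the 2015 dataset. This is a hypothetical helper (check_gpw_files is not part of ODDRIN) that assumes the folder layout and file naming quoted in this section; root plays the role of the dir variable from GetEnv.R.

```r
# List the eight expected 2015 GPWv4 tiles and whether each one is present
check_gpw_files <- function(root) {
  folder <- file.path(root, "Demography_Data", "Population",
                      "gpw-v4-population-count-2015")
  files <- paste0(
    "gpw_v4_population_count_adjusted_to_2015_unwpp_country_totals_rev11_2015_30_sec_",
    1:8, ".asc"
  )
  data.frame(
    file    = files,
    present = file.exists(file.path(folder, files)),
    stringsAsFactors = FALSE
  )
}

# Example call; replace tempdir() with the ODDRIN root folder
status <- check_gpw_files(tempdir())
status[!status$present, ]   # rows printed here are still missing
```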

Figure 3: screenshot of the CIESIN website download options for the population count data used in the ODDRIN software

Environment Variables

To run this software, you will need to add the following environment variables to the GetEnv.R file:

directory = dir (for the lazy writers), FBdirectory, packred

Note that FBdirectory is optional, but is important when using data extracted from the Facebook Data for Good platform.
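
As a rough illustration, the relevant lines of GetEnv.R might look like the sketch below. All paths are hypothetical placeholders, and the trailing slash is an assumption inferred from the way dir is pasted directly onto sub-folder names elsewhere in these instructions.

```r
# Root folder of the ODDRIN repository (hypothetical path; note the trailing slash)
dir <- directory <- "/home/user/ODDRIN/"

# Optional: folder holding Facebook Data for Good files (hypothetical path)
FBdirectory <- "/home/user/FacebookData/"

# TRUE  = reduced installation (only the fundamentally important packages)
# FALSE = full installation (requires a working rJava)
packred <- TRUE

# Sanity checks: the two directory variables must be equal
stopifnot(identical(dir, directory), is.logical(packred))
```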

Installation Checks

Please run the InstallationChecks.R file:

  source('RCode/InstallationChecks.R')

Usage/Examples

The files InstallationChecks.R, Main.R, and AutoQuake.R provide usage examples.
