A repository to accompany selected applied sessions of the 2020 VADA Program Summer School
Wrangling and visualizing temporal data presents unique challenges to data analysts. This 1-hour workshop addresses the key tasks involved in exploring time series and walks participants through an applied example: obtaining COVID-19 mortality data from a live source (https://opendata.ecdc.europa.eu/covid19), computing relative timelines and various temporal metrics for individual countries, and designing information displays for understanding global trends. Data and scripts are provided. Software requirements: R, RStudio, and tidyverse packages.
By Andriy Koval, Ph.D.
After this workshop participants should be able to:
- Plot time series of COVID-19 cases using the ggplot2 package
- Add interactive highlights to trajectories using the plotly package
- Compute indicators for key epidemiological events in each country (e.g. day of the first death)
- Construct country-specific timelines relative to key epidemiological events
- Visualize the sequence of key events for a group of countries
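As a minimal sketch of the two computational objectives above (an event indicator and a timeline relative to it), the snippet below uses dplyr on a small toy data set. The column names (country, date, n_deaths) and all values are hypothetical and chosen only for illustration; they are not the names used in the workshop scripts or in the ECDC source.

```r
library(dplyr)

# Toy data: hypothetical column names and values, for illustration only
ds <- data.frame(
  country  = rep(c("A", "B"), each = 3),
  date     = as.Date(rep(c("2020-03-01", "2020-03-02", "2020-03-03"), times = 2)),
  n_deaths = c(0, 1, 2, 0, 0, 3)
)

ds_timeline <- ds %>%
  group_by(country) %>%
  arrange(date, .by_group = TRUE) %>%
  mutate(
    # running total of deaths within each country
    n_deaths_cum = cumsum(n_deaths),
    # key event: the first date with at least one cumulative death
    # (assumes every country in the data has at least one death)
    date_first_death = min(date[n_deaths_cum > 0]),
    # relative timeline: negative before the event, 0 on the day, positive after
    days_since_first_death = as.integer(date - date_first_death)
  ) %>%
  ungroup()

ds_timeline
```

Plotting against days_since_first_death instead of the calendar date aligns countries on the same event, which is one way to read "relative timeline" in the objectives above.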
During the session, we will walk through creating three graphs:
Goal 1 | Goal 2 | Goal 3 |
---|---|---|
Timeseries with interactive highlights | Trajectories with relative timelines | Sequence of key epidemiological events |
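For orientation, Goal 1 could be approached roughly as follows. This is a sketch under stated assumptions, not the code from the session script: the data are a toy data set with hypothetical column names (country, date, n_cases), and the highlighting relies on plotly's highlight_key(), ggplotly(), and highlight() functions.

```r
library(ggplot2)
library(plotly)

# Toy data: hypothetical column names and values, for illustration only
ds <- data.frame(
  country = rep(c("A", "B"), each = 3),
  date    = as.Date(rep(c("2020-03-01", "2020-03-02", "2020-03-03"), times = 2)),
  n_cases = c(1, 4, 9, 2, 3, 5)
)

# Declare country as the unit to be highlighted
ds_shared <- highlight_key(ds, ~country)

# One line per country
g <- ggplot(ds_shared, aes(x = date, y = n_cases, group = country)) +
  geom_line()

# Convert to an interactive widget; hovering over a line highlights that country
p <- highlight(ggplotly(g), on = "plotly_hover")
p
```

Printing p in RStudio opens the widget in the Viewer pane.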
Please open the ./analysis/live-in-session/live-in-session.R script in RStudio. Depending on your familiarity with programming, you have two options to do so:
- Option 1. Launch RStudio, start a new R script, and copy-paste the contents of this file.
- Option 2. Clone this repo, launch the project in RStudio, and open ./analysis/live-in-session/live-in-session.R. If this instruction confuses you, please use Option 1.
The data come from the European Centre for Disease Prevention and Control, with the source available from here. I demonstrate the preparation of this data for analysis in the ./manipulation/ellis-covid.R script of this repository.
If you are considering an analytical project involving worldwide COVID-19 data, consider cloning the repository and studying the following scripts:
- I have prepared a narrated version of this workshop, which you can study later on your own.
- ./manipulation/ellis-geography - prepares a reference table with an exhaustive list of countries, their two- and three-letter codes, continents, and other useful info. If you are doing cross-country comparisons, consider investing time in studying this data prep script; it may save you a lot of time and frustration.
- Check out my workshop from the VADA 2019 summer school, in which I demonstrate the technique for functionalizing graphing functions.
Andriy Koval is an Assistant Professor in the Department of Health Management and Informatics at the University of Central Florida. Dr. Koval earned degrees in Quantitative Methods (Ph.D., Vanderbilt), Psychology (M.A., MTSU) and Mass Communication (B.S., MTSU). His research combines longitudinal modeling, reproducible analytics and data visualization to study how people engage health systems and services over the course of their lives. He lives in Orlando, FL.
Check out my blog post with online books and courses on rstats that I find particularly useful.