ggduo: pairs plots for multiple regression, cca, time series
The functions ggpairs and ggscatmat in GGally provide generalized pairs plots for a data frame in R. All pairs of variables are displayed in a matrix layout, with the default plot for each pair depending on the types of the variables involved. The diagonal contains univariate displays. These functions extend the classic pairs function in base R, which only handles real-valued variables, to handle different variable types flexibly and to use the ggplot2 graphics package.
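As a brief illustration of the two existing functions, the sketch below draws both kinds of pairs plot for the built-in iris data. The choice of data set and the colour mapping are assumptions made for this example only.

```r
library(GGally)

# Generalized pairs plot: the panel type for each pair is chosen from the
# variable types (continuous or discrete); the diagonal is univariate.
p <- ggpairs(iris, mapping = ggplot2::aes(colour = Species))

# Scatterplot matrix restricted to the four continuous columns.
q <- ggscatmat(iris, columns = 1:4, color = "Species")

print(p)
print(q)
```

Here ggpairs returns a ggmatrix object (a grid of separate plots), while ggscatmat returns a single ggplot object.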
This is appropriate for multivariate data, where we want to see each variable plotted against every other. But in many problems, such as regression or multiple time series, there are two groups of variables (e.g. response variables and explanatory variables), and we would like to see one group plotted against the other. New functions are needed to accomplish this.
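To make the two-group idea concrete, here is a minimal sketch that plots one group of variables against another using plain ggplot2 facets. It is not the proposed function: `two_group_long` is a hypothetical helper written for this example, the mtcars columns are an arbitrary choice of responses and explanatory variables, and the sketch handles numeric variables only.

```r
library(ggplot2)

# Hypothetical helper (not part of GGally): stack the data into long format,
# with one block of rows per (x variable, y variable) combination.
two_group_long <- function(data, x_vars, y_vars) {
  combos <- expand.grid(x_var = x_vars, y_var = y_vars,
                        stringsAsFactors = FALSE)
  pieces <- lapply(seq_len(nrow(combos)), function(i) {
    data.frame(
      x_var = combos$x_var[i],
      y_var = combos$y_var[i],
      x = data[[combos$x_var[i]]],
      y = data[[combos$y_var[i]]]
    )
  })
  out <- do.call(rbind, pieces)
  # Fix the factor levels so panels appear in the order requested.
  out$x_var <- factor(out$x_var, levels = x_vars)
  out$y_var <- factor(out$y_var, levels = y_vars)
  out
}

# Assumed grouping: two responses (rows) against three explanatory variables (columns).
long <- two_group_long(mtcars,
                       x_vars = c("wt", "hp", "disp"),
                       y_vars = c("mpg", "qsec"))

p <- ggplot(long, aes(x, y)) +
  geom_point() +
  facet_grid(y_var ~ x_var, scales = "free") +
  labs(x = NULL, y = NULL)

print(p)
```

The result is a 2 x 3 grid rather than the full 5 x 5 matrix a pairs plot of the same variables would give. A generalized version would additionally need to choose the panel type from the variable types, as ggpairs does.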
GGally description.
The outcomes of the project are:
- An R function implementing a generalized version of pairs plots for two groups of variables
- A vignette illustrating usage
Once you have a solution to the medium and/or the hard problem, please get in touch with Dianne Cook.
Potential students can complete the following tests to demonstrate their capabilities for this particular project.
- Easy: Install the GGally package from GitHub (you might have to install the devtools package first). Run one of the examples, put the chart in a knitr/R Markdown document, and write a paragraph explaining the chart.
- Medium: Merge two ggmatrix objects to produce a new ggmatrix object.
- Hard: Present all ggmatrix objects as a faceted ggplot object, rather than an ad hoc print. Make a pairs plot of the four-variable iris data with strip labels at the top and side to show that this has been accomplished.
Students, please post a link to your test results here.
- Emerson, John W., Walton A. Green, Barret Schloerke, Di Cook, Heike Hofmann, and Hadley Wickham (2012). “The Generalized Pairs Plot.” Journal of Computational and Graphical Statistics, 22 (1), 79-91; doi: 10.1080/10618600.2012.694762.
- Wickham, H. (2009). ggplot2: Elegant Graphics for Data Analysis. Use R!, Springer.
- Wilkinson, L. (1999). The Grammar of Graphics. Statistics and Computing, Springer.
- multiple regression
- multiple time series
- cognostics