Mosaicplots in the ggplot2 framework: ggmosaic
** Background
Mosaic plots and, in particular, proper spine plots are missing from ggplot2. The productplots package implements them using ggplot2 graphics, but it does not support the full functionality of ggplot2, such as facetting (by a variable not included in the prodplot) or additional layers (for example, to show the `density' of points within each category). Within ggplot2, the position='fill' option in bar charts comes closest to showing a conditional feature allocation. This option is no longer supported for histograms (although I have found spinograms to be quite useful at times). With ggplot2 version 2.0.0 (or, shortly, 2.1.0), the way that geoms are supported has been completely overhauled, which makes extensions much easier to write. We propose to add a mosaic geom to ggplot2 so that mosaic plots can make use of the full functionality of ggplot2.
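As a rough illustration of the bar chart approach mentioned above, the following sketch draws a filled bar chart. The built-in Titanic data and the variable names are chosen here purely for illustration and are not part of the proposal.

```r
library(ggplot2)

# Titanic is a built-in contingency table; convert it to a data frame
# with one row per cell and the cell count in the column Freq.
titanic <- as.data.frame(Titanic)

# position = "fill" rescales every bar to height 1, so the fill colours
# show the share of each Survived level within each Class.
# All bars keep the same width, so the size of each Class is not visible.
p <- ggplot(titanic, aes(x = Class, fill = Survived, weight = Freq)) +
  geom_bar(position = "fill")

print(p)
```

A spine plot would additionally make the bar widths proportional to the number of observations in each class, which is the part this chart cannot show.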
** Related work
Mosaic plots have been implemented in a variety of packages: =mosaicplot()= is one of the base graphics functions in the =graphics= package, and =mosaic()= is part of the =vcd= package. Also part of the =vcd= package is =strucplot()=, which provides an extension to =mosaic()=. =qmosaic()= is an interactive implementation of mosaic plots in the =cranvas= package, and the =productplots= package is an implementation based on the =ggplot2= framework. Why do we need another implementation? I don't want to talk down any of the existing solutions, but each of them has some unresolved issues, e.g.
- Default spacing and labels are not quite right in the =mosaicplot()= implementation. From a data visualization point of view, it makes sense to make the best use of the space available, and the default spacing choices do not do that.
- In the =vcd= implementation, the formula is resolved in some unintuitive ways, e.g. =mosaic(Improved ~ Treatment | Sex, data = Arthritis, zero_size = 0, highlighting_direction = "right")= gives the same result as =mosaic(Improved ~ Treatment + Sex, data = Arthritis, zero_size = 0, highlighting_direction = "right")=. Statistically, the two formulas do not describe the same thing, and the chart should reflect that.
- The =qmosaic= implementation in =cranvas= allows very powerful interactions with the chart, but the dependency on Qt makes =cranvas= very hard to install (besides a specific version of Qt with tricky paths, it also needs both the =qtbase= and the =qtpaint= packages to work).
- The implementation in the =productplots= package comes closest to the envisioned result of this project. However, =prodplot= is functionality on top of the =ggplot2= package and is not integrated with it as a =geom=, which makes it impossible to use additional =ggplot2= tools such as facetting and layering except in very special cases.
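The =vcd= comparison in the list above can be rerun with a short sketch like the following. It assumes the =vcd= package is installed and only repeats the two calls quoted above; the comments give our reading of the formulas.

```r
library(vcd)  # provides mosaic() and the Arthritis data set

data("Arthritis", package = "vcd")

# Conditional formula: Improved by Treatment, conditioned on Sex.
mosaic(Improved ~ Treatment | Sex, data = Arthritis,
       zero_size = 0, highlighting_direction = "right")

# Additive formula: Improved by Treatment and Sex.
mosaic(Improved ~ Treatment + Sex, data = Arthritis,
       zero_size = 0, highlighting_direction = "right")
```

Drawing the two charts one after the other makes it easy to check whether the installed version of =vcd= distinguishes the two formulas.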
** Details of your coding project
- An R package for a generalized version of mosaic plots, implemented as a =geom= for the =ggplot2= package.

What exactly do you want your student to code in the 3-month deadline? What functions? What do they do? Docs? Tests? Vignettes?
** Expected impact
I don't want to dissuade anybody from using their package of choice when drawing mosaic plots. Rather, we want to enable the wider community of ggplot2 users to draw mosaic plots.
Mentors, please explain how this project will produce a useful package for the R community.
** Mentors
Once you have a solution to the medium and/or the hard test, please get in touch with [[https://github.com/heike][Heike Hofmann]] (hofmann@iastate.edu) and/or [[https://github.com/dicook][Dianne Cook]].
** Tests
Below are several tests that potential students can do to demonstrate their capabilities for this particular project. Please modify the suggestions below to make them specific for your project.
- Easy: Install the [[https://github.com/hadley/productplots][productplots package]] from GitHub (you might have to install the devtools package first). Run one of the examples, put the chart in a knitr/R Markdown document, and write a paragraph to explain the chart.
- Medium: Write a shiny app that shows a mosaic plot (using =prodplot=) of a few variables and allows the user to interactively change at least one aspect of the mosaic.
- Hard: Based on Hadley Wickham's introduction to [[http://docs.ggplot2.org/dev/vignettes/extending-ggplot2.html][extending =ggplot2=]], write a function that implements a geom of your choice. Document the function using Roxygen, and include it in an R package.
** Solutions of tests
Students, please post a link to your test results here.