Rperform: Performance analysis of R package code

Akash Tandon edited this page Mar 9, 2016 · 13 revisions

Background

Various tools and software exist to help developers across different languages test the performance of their code. This analysis can be performed in terms of quantifiable metrics such as time and memory. R, however, lacks a standard tool for performing such an analysis. Rperform started as a GSoC 2015 project to tackle this problem. From its README:

Rperform is an R package that makes it easy for R package developers to track quantitative performance metrics of their code, over time. It focuses on providing changes, related to time and memory, brought over in the package’s performance metrics over subsequent development versions and across git branches. Also, you can use Rperform with travis-ci to do performance testing during your travis build by adding some sample scripts to your git repository.

This project will mainly deal with improving and expanding Rperform's functionality, as well as implementing a coherent user interface (UI) through which package developers can view the performance analysis of their code.

Related work

Rperform is a first-of-its-kind package for detailed performance analysis of R packages.

Details of your coding project

Several focus areas will be worked on in this project. They include:

  • Improvement of the visualization functions: Among the most prominent and helpful features of Rperform are the visualization functions it provides. Here's an example plot generated by Rperform after analyzing the runtime performance of a unit test from Hadley Wickham's stringr package.

[Figure: Rperform plot]

Details about it can be found on the Rperform Wiki. The visualization functions need to be improved, and more functions need to be added. The plots could perhaps be made interactive using packages such as animint. The direction this aspect of the project takes will depend heavily on how the UI is implemented (see below).

  • Creating a coherent and useful user interface: Currently, Rperform has a function which generates a webpage composed of multiple plots obtained after analyzing the package code. However, a proper user interface is needed so that the package developer can interact meaningfully with the results. Inspiration can be taken from projects such as asv and codespeed.

  • Improvement and addition of functionality: Rperform currently provides metrics for two cases. The first compares multiple commits in a single branch. The second provides metrics across two specified branches, which can then be compared. The latter case isn't fully developed and needs work. Also, being able to compare specific commits separated by a sizable number of intervening commits would be useful. This is explored in an issue here. Currently, the performance analysis of commits is done sequentially. More features can be proposed and added.

  • Development of a test suite: Rperform currently lacks an adequate number of unit tests. As functionality increases, working on this aspect will become even more important.

Expected impact

Rperform can have a significant positive impact on the R community. Package developers will find it extremely useful to have such a handy tool at their disposal. It has the potential to become the standard performance analysis tool for the R package ecosystem.

Mentors

Toby Dylan Hocking initiated the project last year and was one of its mentors. He has agreed to be a mentor again this year. The project is currently looking for another mentor. The mentor would ideally have experience in developing performance analysis or similar tools, and/or in developing tools for the R ecosystem.

Tests

  • Easy: something that any useR should be able to do, e.g. download some existing package listed in the Related Work, and run it on some example data.
  • Medium: something a bit more complicated. You can encourage students to write a script or some functions that show their R coding abilities.
  • Hard: Can the student write a package with Rd files, tests, and vignettes? If your package interfaces with non-R code, can the student write in that other language?

Solutions of tests

Students, please post a link to your test results here.