Commit 8efe4ab

psolymos authored and alexellis committed
Revised post for R
Signed-off-by: Peter Solymos <psolymos@gmail.com>
1 parent 8aaab58 commit 8efe4ab

3 files changed: +22 -14 lines changed

_posts/2021-02-26-r-templates.md renamed to _posts/2021-03-05-r-templates.md

Lines changed: 22 additions & 14 deletions
@@ -2,7 +2,7 @@
22
title: "Functions for data science with R templates for OpenFaaS"
33
description: "Let's bring R to the cloud! Use the power of R for data science serverless-style."
44
date: 2021-02-26
5-
image: /images/2021-02-r/background.jpg
5+
image: /images/2021-03-r/background.jpg
66
categories:
77
- kubernetes
88
- r
@@ -15,10 +15,18 @@ Let's bring R to the cloud! Use the power of R for data science serverless-style
1515

1616
## Introduction
1717

18-
[R](https://www.r-project.org/) is one of the most popular languages for data science. R's strength is in _statistical computing_ and _graphics_. Its use is most prominent in disciplines relying on classical statistical approaches, such as environmental sciences, public health, finance, just to mention a few. In this post first I will introduce you to the R templates for OpenFaaS. Then I will build a function that pulls data from a COVID-19 API, fits a time series model to the data, and makes a forecast for the future case counts.
18+
In this post I will first introduce the [R](https://www.r-project.org/) templates for OpenFaaS, then build a function that pulls data from a COVID-19 API, fits a time series model to the data, and forecasts future case counts.
1919

2020
> This post is written for existing OpenFaaS users. If you're new, you should [try deploying OpenFaaS](https://docs.openfaas.com/deployment/) and follow a tutorial to get a feel for how everything works. Why not start with this course? [Introduction to Serverless course by the Linux Foundation](https://www.openfaas.com/blog/introduction-to-serverless-linuxfoundation/)
2121
22+
### Why R?
23+
24+
[R](https://www.r-project.org/) is one of the most popular languages for data science. R's strength is in _statistical computing_ and _graphics_. Its popularity has been growing steadily as measured by the increase in [Stack Overflow questions](https://stackoverflow.blog/2017/10/10/impressive-growth-r/). R's use is most prominent in disciplines relying on classical statistical approaches, such as academia, healthcare/biostatistics, environmental sciences, and finance, just to mention a few. What makes R ideal in these disciplines is immediate access to the latest statistical methods, which often get published as extension packages alongside journal articles.
25+
26+
R's huge ecosystem of extension packages (currently more than 14 thousand in [CRAN](https://cran.r-project.org/) repositories, not counting packages on GitHub etc.) is driving the growth of R and the community around it. Some of the most well-known packages include [ggplot2](https://CRAN.R-project.org/package=ggplot2) for the grammar of graphics, and [dplyr](https://CRAN.R-project.org/package=dplyr) for the grammar of data manipulation, both part of the [tidyverse](https://www.tidyverse.org/).
27+
28+
Besides the interactive data wrangling use cases, R is also a capable scripting language from the command line, and has powerful web server frameworks for building interactive applications or publishing APIs with minimal effort. This post naturally ties into these frameworks. So let's dive right into how to turn your first R script into a serverless function using `faas-cli` templates.
29+
2230
### The R templates
2331

2432
Use the [`faas-cli`](https://github.com/openfaas/faas-cli) and pull R templates:
@@ -27,19 +35,19 @@ Use the [`faas-cli`](https://github.com/openfaas/faas-cli) and pull R templates:
2735
faas-cli template pull https://github.com/analythium/openfaas-rstats-templates
2836
```
2937

30-
Now `faas-cli new --list` should give you a list with the available R/rstats templates to choose from (rstats refers to the Twitter hashtag used for R related posts). The templates differ with respect to the Docker base image, the OpenFaaS watchdog type, and the server framework used.
38+
Now `faas-cli new --list` should give you a list of the available R/rstats templates to choose from (rstats refers to the #rstats Twitter hashtag used for R-related posts). The templates differ with respect to the Docker base image, the OpenFaaS watchdog type, and the server framework used.
3139

3240
You can choose between the following base images:
3341

3442
- Debian-based `rocker/r-base` Docker image from the [rocker](https://github.com/rocker-org/rocker/tree/master/r-base) project for bleeding edge,
3543
- Ubuntu-based `rocker/r-ubuntu` Docker image from the [rocker](https://github.com/rocker-org/rocker/tree/master/r-ubuntu) project for long term support (uses [RSPM](https://packagemanager.rstudio.com/client/) binaries for faster R package installs),
3644
- Alpine-based `rhub/r-minimal` Docker image from the [r-hub](https://github.com/r-hub/r-minimal) project for smallest image sizes.
3745

38-
> The use of Docker with R is discussed in the original article introducing the [Rocker](https://journal.r-project.org/archive/2017/RJ-2017-065/RJ-2017-065.pdf) project and also in a recent review of the [Rockerverse](https://journal.r-project.org/archive/2020/RJ-2020-007/RJ-2020-007.pdf).
46+
> The use of Docker with R is discussed in the original article introducing the [Rocker](https://journal.r-project.org/archive/2017/RJ-2017-065/RJ-2017-065.pdf) project and also in a recent review of the [Rockerverse](https://journal.r-project.org/archive/2020/RJ-2020-007/RJ-2020-007.pdf), packages and applications for containerization with R.
3947
4048
The template naming follows the pattern `rstats-<base_image>-<server_framework>`. Templates without a server framework (e.g. `rstats-base`) use the classic [watchdog](https://github.com/openfaas/faas/tree/master/watchdog), which passes in the HTTP request via STDIN and reads an HTTP response via STDOUT. The other templates use the HTTP model of the [of-watchdog](https://github.com/openfaas-incubator/of-watchdog), which provides more control over your HTTP responses and is more performant due to caching and pre-loading data and libraries.
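As an illustration of the naming pattern, here is a minimal shell sketch. The helper `rstats_template` is hypothetical and not part of `faas-cli` or the templates repository; it only assembles a name from a base image and an optional server framework.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of faas-cli): compose a template name
# following the pattern rstats-<base_image>-<server_framework>.
rstats_template() {
  local base_image="$1"
  local framework="${2:-}"   # empty: no server framework, classic watchdog
  if [ -n "$framework" ]; then
    printf 'rstats-%s-%s\n' "$base_image" "$framework"
  else
    printf 'rstats-%s\n' "$base_image"
  fi
}

rstats_template base plumber   # the template used in this post
rstats_template base           # classic watchdog variant
```

The result can be passed to `faas-cli new`, for example `faas-cli new --lang "$(rstats_template base plumber)" covid-forecast`, where the function name `covid-forecast` is assumed from the `covid-forecast.yml` file used later in this post.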
4149

42-
R has an ever increasing number of server frameworks available. There are templates for the following frameworks (R packages): [httpuv](https://CRAN.R-project.org/package=httpuv), [plumber](https://www.rplumber.io/), [fiery](https://CRAN.R-project.org/package=fiery), [beakr](https://CRAN.R-project.org/package=beakr), [ambiorix](https://ambiorix.john-coene.com/). Each of these frameworks have their own pros and cons for building standalone applications. But for serverless purposes, the most important aspect of picking one comes down to support and ease of use.
50+
R has an ever-increasing number of server frameworks available. There are templates for the following frameworks (R packages): [httpuv](https://CRAN.R-project.org/package=httpuv), [plumber](https://www.rplumber.io/), [fiery](https://CRAN.R-project.org/package=fiery), [beakr](https://CRAN.R-project.org/package=beakr), [ambiorix](https://ambiorix.john-coene.com/). Each of these frameworks has its own pros and cons for building standalone applications. But for our serverless purposes, the most important aspect of picking one comes down to support and ease of use.
4351

4452
In this post I focus on the [plumber](https://www.rplumber.io/) R package and the `rstats-base-plumber` template. Plumber is one of the oldest of these frameworks. It has gained popularity and corporate adoption, and there are many [examples](https://github.com/rstudio/plumber/tree/master/inst/plumber) and tutorials out there to get you started.
4553

@@ -164,9 +172,9 @@ The result of the call is a list with six elements, all elements are vectors of
164172

165173
The following plot combines the historical daily case counts and the 30-day forecast for Canada. The point forecast is the white line, the 80% and 95% forecast intervals are the blue shaded areas. I made two forecasts, the first on December 1st, 2020, the second on February 18th, 2021:
166174

167-
![COVID-19 Canada](/images/2021-02-r/covid-canada-2021-02-18.png)
175+
![COVID-19 Canada](/images/2021-03-r/covid-canada-2021-02-18.png)
168176

169-
The last part of the script defines the Plumber endpoint `/` for a GET request. One of the nicest features of Plumber is that it allows you to create a web API by [decorating the R source code](https://www.rplumber.io/articles/quickstart.html) with special `#*` comments. These annotations will tell Plumber how to handle the requests, what kind of parsers and formatters to use, etc. The current setup will treat the function arguments as URL parameters. The default content type for the response is JSON, thus we do not need to specify it.
177+
The last part of the script defines the Plumber endpoint `/` for a GET request. One of the nicest features of Plumber is that you can create a web API by [decorating the R source code](https://www.rplumber.io/articles/quickstart.html) with special `#*` comments. These annotations will tell Plumber how to handle the requests, what kind of parsers and formatters to use, etc. The current setup will treat the function arguments as URL parameters. The default content type for the response is JSON, thus we do not need to specify it.
170178

171179
```R
172180
#* COVID
@@ -178,17 +186,17 @@ function(region, cases, window, last) {
178186
}
179187
```
180188

181-
The `covid_forecast` arguments can be missing except for region. This makes the corresponding URL parameters optional. We have to remember that URL form encoded parameters will be of character type, thus checking type and making appropriate type conversions is necessary (i.e. `as.numeric()` for the `window` argument).
189+
The `covid_forecast` arguments can be missing except for `region`. This makes the corresponding URL parameters optional. We have to remember that URL form encoded parameters will be of type character, thus checking the type and making appropriate type conversions is necessary (e.g. `as.numeric()` for the `window` argument).
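From the caller's side, optional parameters simply mean that they can be left out of the query string. The following is a minimal shell sketch of that idea, not part of the original function: the helper `forecast_url` and the fallback gateway address `http://127.0.0.1:8080` are assumptions made for illustration. Every value it places in the URL is plain text, which is why the R function has to convert types on its side.

```bash
#!/usr/bin/env bash
# Hypothetical helper: build the request URL for the covid-forecast function.
# region is required; cases, window, and last are appended only when given.
forecast_url() {
  # Assumed fallback gateway address when OPENFAAS_URL is not set
  local base="${OPENFAAS_URL:-http://127.0.0.1:8080}"
  local region="${1:-}" cases="${2:-}" window="${3:-}" last="${4:-}"
  if [ -z "$region" ]; then
    echo "region is required" >&2
    return 1
  fi
  local url="${base}/function/covid-forecast?region=${region}"
  if [ -n "$cases" ]; then url="${url}&cases=${cases}"; fi
  if [ -n "$window" ]; then url="${url}&window=${window}"; fi
  if [ -n "$last" ]; then url="${url}&last=${last}"; fi
  printf '%s\n' "$url"
}
```

Calling `forecast_url canada-combined "" 4 2021-02-18` leaves out `cases` and keeps `window` and `last`, while `forecast_url canada-combined` sends only the required `region` parameter.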
182190

183191
### Build, push, and deploy the function
184192

185-
Now you can use `faas-cli up` to build, push, and deploy the COVID-19 forecast function to the OpenFaaS cluster:
193+
Now you can use `faas-cli up` to build, push, and deploy the COVID-19 forecast function to your OpenFaaS cluster:
186194

187195
```bash
188196
faas-cli up -f covid-forecast.yml
189197
```
190198

191-
You can test the function's deployed instance with curl:
199+
Test the function's deployed instance with curl:
192200

193201
```bash
194202
curl -X GET -G \
@@ -199,7 +207,7 @@ curl -X GET -G \
199207
- last=2021-02-18
200208
```
201209

202-
Or simply by visiting the URL `$OPENFAAS_URL/function/covid-forecast?region=canada-combined&window=4&last=2021-02-18`. The output should be something like this (depending on the day you make the request):
210+
Or simply by visiting the URL `$OPENFAAS_URL/function/covid-forecast?region=canada-combined&window=4&last=2021-02-18`. The output should be something like this:
203211

204212
```bash
205213
{
@@ -214,10 +222,10 @@ Or simply by visiting the URL `$OPENFAAS_URL/function/covid-forecast?region=cana
214222

215223
### Wrapping up
216224

217-
In this post I showed how to use the R templates for OpenFaaS. We built a serverless function that consumes data from an external APIs, fits exponential smoothing model, and makes a forecast. The data API with the forecasting function can be added to web applications to provide timely updates on the fly.
218-
219-
The function presented here could be extended to a microservice that might also provide a summary of past case counts in a [dynamic document](https://rmarkdown.rstudio.com/) building on R's powerful authoring tools.
225+
In this post I showed how to use the R templates for OpenFaaS. We built a serverless function that consumes data from an external API, fits an exponential smoothing model, and makes a forecast. The data API with the forecasting function can be added to web applications to provide timely updates on the fly. Here are some resources for taking you further:
220226

221227
- [Learn about alternative ways of passing parameters to the COVID-19 function](https://github.com/analythium/openfaas-rstats-examples/tree/main/02-time-series-forecast)
222228
- [See the list of available R templates for OpenFaaS](https://github.com/analythium/openfaas-rstats-templates#readme)
223229
- [Check out other R examples with OpenFaaS](https://github.com/analythium/openfaas-rstats-examples)
230+
231+
Now you might ask why you should care about R in a serverless landscape. R is being used by data science teams. The OpenFaaS templates I presented here can reduce time to production by wrapping R code into serverless functions or microservices without significant overhead. If you are an R user, go ahead and try the templates with one of your use cases. If you are not an R user but this introduction piqued your interest, you can find great introductory courses on [freeCodeCamp](https://www.freecodecamp.org/news/tag/r-programming/). The [_R for data science_](https://r4ds.had.co.nz/) book gives a concise and modern overview of working with data in R.
File renamed without changes.
