See how to test drive R 4.0 in a Docker container, plus a look at three new R 4.0 features for color palettes and strings
There are some interesting changes and updates in R 4.0. Here I'll take a look at three of them. Plus I'll give you step-by-step instructions on installing R 4.0 so it won't interfere with your existing R installation, by running R with Docker.
Docker is a platform for creating "containers," which are completely self-contained, isolated environments on your computer. Think of them like a mini system on your system. They include their own operating system, and then anything you want to add to that: application software, scripts, data, etc. Containers are useful for a lot of things, but here I'll focus on just one: testing new versions of software without screwing up your current local setup.
Running R 4.0 and the latest preview release of RStudio in a Docker container is pretty easy. If you don't want to follow along with the Docker part of this tutorial, and you just want to see what's new in R, scroll down to the "Three new R 4.0 features" section.
Run R 4.0 in a Docker container
If you would like to follow along, install desktop Docker on your system if you don't already have it: Head to https://www.docker.com/products/docker-desktop and download the right desktop version for your computer (Windows, Mac, or Linux). Then, launch it. You should see a whale Docker icon running somewhere on your system.
Sharon Machlis, IDG
Docker icon
Next, we need a Docker image for R 4.0. You can think of a Docker image as a set of instructions to create a container with specific software included. Thanks go to Adelmo Filho (a data scientist in Brazil) and the Rocker R Docker project, who provide some very useful Docker images. I modified their Docker images just slightly to make the one I used in this tutorial.
Here is the syntax to run a Docker image on your own system to create a container.
docker run --rm -p 8787:8787 -v /path/to/local/dir:/home/rstudio/newdir username/docker_image_name:image_tag
docker is how you need to start any Docker command. run means I want to run an image and create a container from that image. The --rm flag means remove the container when it's finished. You don't have to include --rm, but if you run a lot of containers and don't delete them, they'll start taking up a lot of disk space. The -p 8787:8787 is only needed for images that have to run on a system port, which RStudio does (as does Shiny if you plan to include that someday). The command above specifies port 8787, which is RStudio's usual default.
The -v creates a volume. Remember when I said Docker containers are self-contained and isolated? That means isolated. By default, the container can't access anything outside of it, and the rest of your system can't access anything inside the container. But if you set up a volume, you can link a local folder with a folder inside the container. Then they automatically sync up. The syntax:
-v /path/to/local/directory:/path/to/container/directory
With RStudio, you usually use /home/rstudio/name_of_new_directory for the container directory.
At the end of the docker run command is the name of the image you want to run. My image, like many Docker images, is stored on Docker Hub, a service set up by Docker for sharing images. Like with GitHub, you access a project by specifying a username/reponame. In this case you also usually add :the_tag, which helps if there are different versions of the same image.
Below is code you can modify to run my image with R 4.0 and the latest preview release of RStudio on your system. Make sure to substitute a path to one of your directories for /Users/smachlis/Documents/MoreWithR. You can run this in a Mac terminal window or Windows command prompt or PowerShell window.
docker run --rm -p 8787:8787 -v /Users/smachlis/Documents/MoreWithR:/home/rstudio/morewithr sharon000/my_rstudio_image:version1
When you run this command for the first time, Docker will need to download the image from Docker Hub, so it might take a while. After that, unless you delete your local copy of the image, it should be much faster.
Now when you open localhost:8787 in a browser, you should see RStudio.
RStudio running in a browser window via a Docker container.
The default user name and password are both rstudio, which of course would be terrible if you were running this in the cloud. But I think it's fine on my local machine, since I don't normally have any password on my regular RStudio desktop.
If you check the R version in your containerized R/RStudio, you'll see it's version 4.0. RStudio should be version 1.3.947, the latest preview release at the time this article was first published. Those are both different versions from those installed on my local machine.
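One way to run that check from the RStudio console inside the container is sketched below. It uses only base R. The folder path is an assumption taken from the docker run command above, so change it if you picked a different container directory.

```r
# Report the version of R that this session is running.
R.version.string

# TRUE when the session is R 4.0.0 or later.
getRversion() >= "4.0.0"

# Assumption: the volume was mounted at /home/rstudio/morewithr,
# as in the docker run command used in this article.
mounted_dir <- "/home/rstudio/morewithr"
if (dir.exists(mounted_dir)) {
  # Files from the linked local folder should be listed here.
  print(list.files(mounted_dir))
} else {
  message("No folder at ", mounted_dir, "; check the -v part of docker run.")
}
```

If the folder is missing, the most likely cause is a typo in the local path or the container path given to -v.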
Three new R 4.0 features
So now let's look at a few new features of R 4.0.
New stringsAsFactors default
In the code below, I'm creating a simple data frame with info about four cities and then checking the structure.
City <- c("New York", "San Francisco", "Boston", "Seattle")
State <- c("NY", "CA", "MA", "WA")
PopDensity <- c(26403, 18838, 13841, 7962)
densities <- data.frame(City, State, PopDensity)
str(densities)
'data.frame':	4 obs. of  3 variables:
 $ City      : chr  "New York" "San Francisco" "Boston" "Seattle"
 $ State     : chr  "NY" "CA" "MA" "WA"
 $ PopDensity: num  26403 18838 13841 7962
Notice anything unexpected? City and State are character strings, even though I didn't specify stringsAsFactors = FALSE. Yes, at long last, the R data.frame default is stringsAsFactors = FALSE. If I run the same code in an older version of R, City and State will be factors.
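If the same script has to behave identically in both old and new versions of R, one option is to state the argument explicitly instead of relying on the default. This is a minimal sketch; the object names as_strings and as_factors are made up for illustration.

```r
City <- c("New York", "San Francisco", "Boston", "Seattle")
PopDensity <- c(26403, 18838, 13841, 7962)

# Spelling out the argument gives the same result before and after R 4.0.
as_strings <- data.frame(City, PopDensity, stringsAsFactors = FALSE)
as_factors <- data.frame(City, PopDensity, stringsAsFactors = TRUE)

class(as_strings$City)  # "character"
class(as_factors$City)  # "factor"
```

Older code that depended on automatic factors can keep working in R 4.0 by adding stringsAsFactors = TRUE, or by converting only the columns that need it with factor().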
New color palettes and functions
Next, let's look at a new built-in function in R 4.0: palette.pals(). It lists the names of the built-in color palettes.
palette.pals()
 [1] "R3"              "R4"              "ggplot2"         "Okabe-Ito"
 [5] "Accent"          "Dark 2"          "Paired"          "Pastel 1"
 [9] "Pastel 2"        "Set 1"           "Set 2"           "Set 3"
[13] "Tableau 10"      "Classic Tableau" "Polychrome 36"   "Alphabet"
Another new function, palette.colors(), returns the colors in a built-in palette as hex codes.
palette.colors(palette = "Tableau 10")
     blue    orange       red lightteal     green    yellow    purple
"#4E79A7" "#F28E2B" "#E15759" "#76B7B2" "#59A14F" "#EDC948" "#B07AA1"
     pink     brown lightgray
"#FF9DA7" "#9C755F" "#BAB0AC"
If you then run the scales package's show_col() function on the results, you get a nice color display of the palette.
scales::show_col(palette.colors(palette = "Tableau 10"))
Results of scales::show_col(palette.colors(palette = "Tableau 10")).
I made a small function combining the two that could be useful for looking at some of the built-in palettes in a single line of code:
display_built_in_palette <- function(my_palette) {
  scales::show_col(palette.colors(palette = my_palette))
}
display_built_in_palette("Okabe-Ito")
Viewing the Okabe-Ito palette with my function display_built_in_palette().
None of this code works in earlier versions of R, since only scales::show_col() is available before R 4.0.
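Beyond previewing a palette, the colors can go straight into a plot. The sketch below uses base graphics and reuses the city figures from the data frame example. The variable names are illustrative, and it needs R 4.0 or later because of palette.colors().

```r
# Take the first four colors of the Okabe-Ito palette.
city_colors <- palette.colors(n = 4, palette = "Okabe-Ito")

# Same cities and population densities as the data frame example.
city_names <- c("New York", "San Francisco", "Boston", "Seattle")
pop_density <- c(26403, 18838, 13841, 7962)

# unname() drops any color names so only the hex codes are passed to col.
barplot(pop_density, names.arg = city_names, col = unname(city_colors),
        ylab = "Population density")
```

Asking for n colors keeps the palette in its defined order, so the first bar gets the palette's first color.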
Escaping characters within strings
Finally, let's look at a new syntax that makes it easier to include characters that usually need to be escaped in strings.
The syntax is r"(my string here)". Here is one example:
string1 <- r"("I no longer need to escape these " double quotes inside a quote," they said.)"
That string includes an un-escaped quotation mark inside a pair of double quotes. If I display that string, I get this:
cat(string1)
"I no longer need to escape these " double quotes inside a quote," they said.
I can also print a literal \n inside the new r"()" syntax.
string2 <- r"(Here is a backslash n \n)"
cat(string2)
Here is a backslash n \n
Without the special r"()" syntax, that \n is read as a line break and doesn't display.
string3 <- "Here is a backslash n \n"
cat(string3)
Here is a backslash n
Before this in base R, you needed to escape that backslash with a second backslash.
string4 <- "Usual escaped \\n"
cat(string4)
Usual escaped \n
That's not a big deal in this example, but it can get complicated when you're working on something like complex regular expressions.
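To make the regular expression case concrete, here is a small sketch comparing the two styles. The pattern, the sentence and the variable names are invented for illustration, and the code needs R 4.0 or later to parse.

```r
# The same regular expression written two ways.
escaped_pattern <- "\\d+\\.\\d+"  # classic string: every backslash doubled
raw_pattern <- r"(\d+\.\d+)"      # raw string: backslashes typed once

# Both literals produce exactly the same character string.
identical(escaped_pattern, raw_pattern)

# Pull the version number out of a sentence.
sentence <- "This container runs R 4.0 and RStudio."
regmatches(sentence, regexpr(raw_pattern, sentence, perl = TRUE))

# If the text itself contains )" you can switch delimiters,
# for example to square brackets, or add dashes around the parentheses.
tricky <- r"[The sequence )" would end an r"(...)" string early.]"
windows_path <- r"-(C:\Users\me\Documents)-"
cat(tricky, windows_path, sep = "\n")
```

The raw version reads the way the pattern would appear in a regex reference, which is easier to check by eye than counting doubled backslashes.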
There's lots more that's new in R 4.0. You can check out all the details at the R project website.
For more on using Docker with R, check out rOpenSci Labs' short but excellent R Docker Tutorial.
And for more R tips, head to the InfoWorld Do More With R page!


