Track your goals with color-coded calendars created with R packages ggplot2 and ggcal
A color-coded calendar can be a quick and easy way to see whether you're achieving a daily goal. Did you meet a daily business metric like sales or social-media posts? Or, how are you doing with personal goals, like exercising every day? With one glance, you can get a feel for how you've been doing. It's great for tracking those New Year's resolutions, and a whole lot more.
R can help. For this example, I'll create a calendar that tracks daily exercise: more specifically, whether you did cardio, did strength training, or rested each day.
You need to get your data before you can visualize it. For simple manual data entry, I usually use Microsoft Excel or Google Sheets. (As much of an R enthusiast as I am, R generally isn't ideal for data entry.)
One way to set up the spreadsheet is with two columns: one for Day and another for Activity. What I don't want to do, however, is enter freeform text into a column where R expects specific categories. Even if I'll remember that the exact format for strength training is "strength training" and not "weights," there's always the risk of typos. So, I suggest either creating a form for spreadsheet data entry or adding data validation to the category column (in this case, Activity).

Acceptable data-entry options in Excel
For a task like this, I prefer data validation instead of complicating things with a separate form. An easy way to set up the validation is to create a column of acceptable options in another tab (in this case, Cardio and Strength Training). Next, select the cells where you want to restrict data entry: in this case, the whole Activity column except for the header.
Then choose Data Validation in the Excel Data ribbon, select List, and enter the cells with the acceptable options in the Source field. Now you can enter the data you want to use in R.

Excel data validation
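Spreadsheet validation catches typos at entry time, but a second check on the R side can be cheap insurance. The sketch below assumes the two allowed categories named above and uses a made-up activity vector; the helper name find_invalid_activities() is hypothetical, not part of any package. It reports any non-missing values that fall outside the allowed set.

```r
# Allowed categories, matching the spreadsheet's validation list
allowed_activities <- c("Cardio", "Strength Training")

# Hypothetical helper: return the distinct values that are neither NA
# nor one of the allowed categories
find_invalid_activities <- function(activity, allowed = allowed_activities) {
  present <- unique(activity[!is.na(activity)])
  setdiff(present, allowed)
}

# Illustrative data with one deliberate mismatch ("weights")
activity <- c("Cardio", "Strength Training", NA, "weights", "Cardio")
find_invalid_activities(activity)
```

An empty result means every logged value matches a known category; anything else is worth fixing in the spreadsheet before plotting.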
To make an easy color-coded calendar, I'll use the ggplot2 library and the ggcal package by Jay Jacobs on GitHub. I'll also load dplyr, because I almost always end up using dplyr, whatever I'm doing; readxl to read the spreadsheet; and lubridate to work with dates.
Install the ggcal package if it's not yet on your system with devtools::install_github("jayjacobs/ggcal") or remotes::install_github("jayjacobs/ggcal").
Here's code to load needed packages and import data from a spreadsheet called tracker.xlsx into an R object called daily_exercise:
library(ggplot2)
library(ggcal)
library(dplyr)
library(readxl)
library(lubridate)
daily_exercise <- readxl::read_xlsx("tracker.xlsx", col_types = c("date", "text"))
If you want to follow along with the sample data I'm using but don't want to set up a tracker.xlsx spreadsheet right now, there's code to create that initial daily_exercise object at the end of this article. (You'll need the tibble package installed.)
The readxl package imports dates as POSIXct objects, but the ggcal function wants them as Dates. You'll need to change the column class with daily_exercise <- mutate(daily_exercise, Day = as.Date(Day)).
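To see what that conversion does, here is a minimal base R sketch on a single value. It assumes the timestamp is stored in UTC, as in the sample data at the end of this article; the variable names are illustrative only. Passing tz explicitly is a cautious choice so the calendar day doesn't depend on defaults.

```r
# A date-time like the ones in the sample data's Day column
imported_day <- as.POSIXct("2019-01-01", tz = "UTC")
class(imported_day)   # "POSIXct" "POSIXt"

# Convert to Date; tz is passed explicitly so the calendar day is
# taken in UTC rather than depending on defaults
converted_day <- as.Date(imported_day, tz = "UTC")
class(converted_day)  # "Date"
```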
The daily_exercise data frame only has a few days of the month. If you want an entire month's calendar to print, you'll need to fill in the rest of the month with additional code. Here's one way to do that (explanation below the code):
last_day_in_file <- max(daily_exercise$Day)
end_this_month <- as.Date(cut(last_day_in_file, "month")) + months(1) - 1
alldates <- data.frame(Day = seq.Date(min(daily_exercise$Day), end_this_month, by = "1 day"))
daily_exercise <- left_join(alldates, daily_exercise, by = "Day")
Line 1 finds the latest date in the data frame. Line 2 calculates the last day of the month for that date, in a bit of a roundabout way. First, I calculate the first day of the month for that last date in the file (that would be January 1 for any date in January) and set it to be a Date class. I then add one month to the result; in this case, the value is February 1 for any date in January. I don't want February 1, though; I want one day earlier than that. So I subtract 1 (which means one day), and then I've got the end of the month. (Why? It's a lot easier to find the beginning of a month, which is always the first, than the end of a month, which can be the 28th, 29th, 30th, or 31st.)
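The same "first of next month, minus one day" idea can be sketched in base R alone. This is a variant under my own assumptions, not the code used above: the helper name end_of_month() is hypothetical, it swaps lubridate's months(1) for a two-step seq() by month, and it handles one Date at a time.

```r
# Hypothetical helper: last day of the month containing `day` (a single Date)
end_of_month <- function(day) {
  # First day of the same month, e.g. 2019-01-08 becomes 2019-01-01
  first_of_month <- as.Date(format(day, "%Y-%m-01"))
  # Two-element sequence: first of this month, then first of next month
  first_of_next_month <- seq(first_of_month, by = "month", length.out = 2)[2]
  # One day before the first of next month is the end of this month
  first_of_next_month - 1
}

end_of_month(as.Date("2019-01-08"))  # 2019-01-31
end_of_month(as.Date("2020-02-10"))  # 2020-02-29 (leap year)
end_of_month(as.Date("2019-12-15"))  # 2019-12-31
```

Starting from the first of the month sidesteps the question of how many days a given month has, including leap-year Februaries and the December-to-January rollover.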
Line 3 generates all dates starting with the earliest date in my data and ending with the end of the month that we just calculated. I can use base R's seq.Date() function, creating a sequence incrementing by 1 day. I store that in a new data frame with one column.
Why did I create a data frame of 1 column instead of a vector? Because now I can use a dplyr left_join() to combine the two data frames. A left join keeps everything in the left, or first, data frame (in this case alldates) and merges it with a second data frame (daily_exercise) by a common column (Day). Now, the data is ready for ggcal.
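A tiny example may make the fill-in effect of the left join easier to see. The data frames below are made up for illustration (the _demo names are not from the tracker file); naming the join column with by = "Day" is optional but avoids dplyr's message about guessing it.

```r
library(dplyr)

# Every day we want on the calendar
alldates_demo <- data.frame(
  Day = seq.Date(as.Date("2019-01-01"), as.Date("2019-01-05"), by = "1 day")
)

# Only the days that were actually logged
logged_demo <- data.frame(
  Day = as.Date(c("2019-01-01", "2019-01-03")),
  Activity = c("Cardio", "Strength Training")
)

# Keep all rows of alldates_demo; unmatched days get NA for Activity
joined_demo <- left_join(alldates_demo, logged_demo, by = "Day")
joined_demo
```

The result has all five days, with NA in Activity for the three days that had no entry. Those NA rows are what show up as blank blocks on the calendar.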
The syntax for the ggcal function is ggcal(myDateVector, myDataVector): in other words, dates as the first argument and values as the second argument. The values can be categories, like we're using now, or numbers, if you want a calendar heatmap. Run
ggcal(daily_exercise$Day, daily_exercise$Activity)
and you should see a color-coded calendar visualization with ggplot2 default colors.

A color-coded calendar created with the ggcal package using default ggplot2 colors.
Customize colors
If you want to set your own color scheme, you can use the same functions you'd use for other ggplot2 visualizations. For example, below I used scale_fill_manual() and added a legend name, color values for each category, and a lighter grey color for NA values. That last theme() line adds back a title for the legend.
ggcal(daily_exercise$Day, daily_exercise$Activity) +
  scale_fill_manual(
    name = "Exercise",
    values = c(
      "Cardio" = "steelblue",
      "Strength Training" = "forestgreen"
    ),
    na.value = "grey88"
  ) +
  theme(legend.title = element_text())

A color-coded calendar with customized colors
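One thing to watch with a named color vector: as I understand ggplot2's behavior, a category that has no matching name in values is drawn with the na.value color, which makes it look like a rest day. A quick base R check, sketched here with made-up data and illustrative variable names, can flag categories that lack a color before you plot.

```r
# Named color vector, as passed to scale_fill_manual(values = ...)
activity_colors <- c(
  "Cardio" = "steelblue",
  "Strength Training" = "forestgreen"
)

# Illustrative data containing a category ("Yoga") with no assigned color
activity <- c("Cardio", "Strength Training", NA, "Yoga")

# Categories present in the data but absent from the color vector
uncolored <- setdiff(unique(activity[!is.na(activity)]), names(activity_colors))
uncolored
```

If uncolored is not empty, either add a color for each listed category or correct the data.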
Calendar heatmap
I set up another Excel worksheet that includes minutes in addition to categories for daily exercise, so I can demonstrate a calendar heatmap. Code to create that second daily_exercise object is at the end of the article.
I process that data for ggcal in the same way that I did for the first version: changing the Day column to Date objects and merging it with my alldates data frame to fill in blank values for the rest of the current month.
daily_exercise <- mutate(daily_exercise, Day = as.Date(Day))
daily_exercise <- left_join(alldates, daily_exercise, by = "Day")
Here's what a heatmap of minutes looks like with ggcal defaults:
ggcal(daily_exercise$Day, daily_exercise$Minutes)

A calendar heatmap with ggcal and ggplot2 default colors
I'd rather have the darkest color for the highest number of minutes, though, not the lowest. And, I'd like a lighter gray for the empty blocks. Here's code for that:
ggcal(daily_exercise$Day, daily_exercise$Minutes) +
  scale_fill_gradient(low = "#f7fbff", high = "#08519c", na.value = "grey75")

A ggcal heatmap with a color palette going from light for low numbers to dark with high numbers
Other ggplot2 customizations work as well, such as the scale_fill_distiller() function to use an RColorBrewer palette for continuous, numerical data. Below, I use a yellow-to-orange-to-red palette.
ggcal(daily_exercise$Day, daily_exercise$Minutes) +
  scale_fill_distiller(palette = "YlOrRd", na.value = "grey75")

A calendar heatmap created with ggcal and an RColorBrewer palette.
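If the palette comes out with the colors running the opposite way from what you want, the direction argument of scale_fill_distiller() is the place to look. As I read the ggplot2 documentation, it defaults to -1, which reverses the RColorBrewer order. The sketch below uses a plain geom_tile() plot with made-up data as a stand-in for the ggcal() output, since both are ggplot objects that accept the same fill scale.

```r
library(ggplot2)

# Stand-in data; with ggcal you would add the scale to the ggcal() call instead
demo_minutes <- data.frame(
  day = 1:5,
  minutes = c(0, 25, 40, 60, NA)
)

heat_demo <- ggplot(demo_minutes, aes(x = day, y = 1, fill = minutes)) +
  geom_tile() +
  # direction = 1 keeps the palette's native light-to-dark order,
  # so higher minute counts map to darker reds
  scale_fill_distiller(palette = "YlOrRd", direction = 1, na.value = "grey75")

heat_demo
```

The same scale_fill_distiller() call can be added to ggcal(daily_exercise$Day, daily_exercise$Minutes) in place of the one shown earlier.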
Code to create the first daily_exercise object
# Originally generated with datapasta::df_paste()
daily_exercise <- tibble::tibble(
  Day = as.POSIXct(c("2019-01-01", "2019-01-02", "2019-01-03", "2019-01-04",
                     "2019-01-05", "2019-01-06", "2019-01-07", "2019-01-08"), tz = "UTC"),
  Activity = c("Cardio", "Strength Training", "Cardio", "Cardio", NA,
               "Strength Training", "Cardio", "Cardio")
)
Code to create the second daily_exercise object with minutes
daily_exercise <- tibble::tibble(
  Day = as.POSIXct(c("2019-01-01", "2019-01-02", "2019-01-03", "2019-01-04",
                     "2019-01-05", "2019-01-06", "2019-01-07", "2019-01-08"), tz = "UTC"),
  Activity = c("Cardio", "Strength Training", "Cardio", "Cardio", NA,
               "Strength Training", "Cardio", "Cardio"),
  Minutes = c(40, 35, 30, 60, 0, 25, 45, 40)
)


