Batch exporting plots in R

Author

Matthew Harris

Published

February 2, 2021

Abstract
Exporting multiple plots created by {ggplot2}

Creating insightful plots with a couple of lines of code is what makes ggplot such a powerful tool. There are times when I want to export multiple versions of the same plot based upon slightly different criteria. Batch processing plots is super easy with the map() family of functions from the {purrr} package.

Packages

Code
library(dplyr)
library(tidyr)
library(purrr)
library(ggplot2)
library(scales)
library(glue)

Data Import

The data used for this analysis can be found at the following link.

Formula One Data
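
The code in this post touches only three columns of f1_data: race_year, c_name and positionOrder. For anyone following along without the download, here is a minimal stand-in with those columns. The team names and finishing positions are invented placeholders, not real results, and the real file will likely have more columns and far more rows.

```r
# Hypothetical stand-in for f1_data; every value here is made up.
f1_data <- data.frame(
  race_year     = c(2009, 2010, 2010, 2010, 2011, 2011),
  c_name        = c("Team A", "Team A", "Team B", "Team C", "Team A", "Team B"),
  positionOrder = c(1, 1, 2, 9, 3, 1)
)

# Quick look at the column names and types
str(f1_data)
```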

Data Wrangling

For this demonstration I’ll be revisiting my Formula 1 data. The code below identifies the top ten constructors by total podiums from 2010 onward, then totals the podiums and wins for each of those constructors by year.

Code
# Top ten constructors by podium count since 2010.
# A podium is a finish in the top three, so positionOrder <= 3.
top_constructors <- f1_data %>% 
  filter(race_year >= 2010) %>% 
  mutate(podium = positionOrder <= 3) %>% 
  group_by(c_name) %>% 
  summarize(total_podiums = sum(podium, na.rm = TRUE),
            .groups = "drop") %>% 
  slice_max(total_podiums, n = 10, with_ties = FALSE) %>% 
  pull(c_name)

# Podiums and wins per constructor per year, for those ten constructors only
top_podiums <- f1_data %>% 
  filter(c_name %in% top_constructors, race_year >= 2010) %>% 
  mutate(podium = positionOrder <= 3,
         win = positionOrder == 1) %>%
  group_by(c_name, race_year) %>% 
  summarize(total_podiums = sum(podium, na.rm = TRUE),
            total_wins = sum(win, na.rm = TRUE),
            .groups = "drop")
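
Summing a logical column counts its TRUE values, which is how the podium and win flags become totals. Here is a small sketch of that step on invented finishing positions (the team names are placeholders), treating a podium as a finish of third or better:

```r
library(dplyr)

# Made-up results for two placeholder teams in a single season
toy_results <- tibble(
  c_name        = c("Team A", "Team A", "Team A", "Team B", "Team B"),
  positionOrder = c(1, 3, 7, 2, 12)
)

toy_totals <- toy_results %>%
  mutate(podium = positionOrder <= 3,      # TRUE for 1st, 2nd or 3rd
         win    = positionOrder == 1) %>%  # TRUE for 1st only
  group_by(c_name) %>%
  summarize(total_podiums = sum(podium),   # sum() counts the TRUEs
            total_wins    = sum(win),
            .groups = "drop")

toy_totals
```

Team A ends up with two podiums and one win, Team B with one podium and no wins.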

Creating the First Plot

Before I can batch process these plots I want to fine-tune what one plot will look like. I’ll start by filtering for race_year 2010 and creating a nice plot detailing each constructor’s performance.

Code
top_podiums %>% 
  filter(race_year == 2010) %>% 
  mutate(avg_podiums = mean(total_podiums, na.rm = TRUE)) %>% 
  ggplot(aes(reorder(c_name, -total_podiums), total_podiums, fill = total_wins)) + 
  geom_col() +
  geom_hline(aes(yintercept = max(avg_podiums)),
             linetype = 3,
             col = "black",
             size = 1.2) +
  scale_y_continuous(breaks = breaks_width(5)) +
  scale_fill_gradient(low = "blue", high = "red") + 
  labs(x = "Constructor", y = "Podiums", 
       title = "Total Podiums - 2010",
       fill = "Wins") +
  theme(plot.title = element_text(hjust = 0.5)) + 
  coord_flip()
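
Since the same plot is about to be repeated for every year, one option is to wrap it in a function first. This is only a sketch of that idea, not the approach used in the next section: plot_podiums() is a hypothetical helper, the data is invented, and linewidth assumes ggplot2 3.4.0 or newer (older versions use size instead).

```r
library(ggplot2)
library(scales)
library(glue)

# Hypothetical helper: draws the podium chart for one year's worth of rows.
plot_podiums <- function(data, year) {
  avg_podiums <- mean(data$total_podiums, na.rm = TRUE)

  ggplot(data, aes(reorder(c_name, -total_podiums), total_podiums,
                   fill = total_wins)) +
    geom_col() +
    # Dotted reference line at the average podium count
    geom_hline(yintercept = avg_podiums, linetype = 3,
               col = "black", linewidth = 1.2) +
    scale_y_continuous(breaks = breaks_width(5)) +
    scale_fill_gradient(low = "blue", high = "red") +
    labs(x = "Constructor", y = "Podiums",
         title = glue("Total Podiums - {year}"),
         fill = "Wins") +
    theme(plot.title = element_text(hjust = 0.5)) +
    coord_flip()
}

# Invented numbers for three placeholder teams
toy_year <- data.frame(
  c_name        = c("Team A", "Team B", "Team C"),
  total_podiums = c(12, 7, 3),
  total_wins    = c(5, 1, 0)
)

p <- plot_podiums(toy_year, 2010)
```

Calling the helper once per year would then only need the year's rows and the year itself.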

Batch Processing

I’m happy with my plot. Now comes the fun part. I have a couple of options for displaying the information for the other years. I could facet the additional plots, but that gets cluttered way too fast. Batch processing is a great solution if I need to use these plots in another presentation or share them outside of R.

Code
top_podiums %>% 
  nest(data = -race_year) %>% 
  mutate(avg_podiums = map_dbl(.x = data,
                               .f = ~mean(.x$total_podiums, 
                                          na.rm = TRUE))) %>% 
  mutate(plts = pmap(.l = list(data, avg_podiums, race_year),
                     .f = ~ggplot(data = ..1, 
                                  aes(reorder(c_name, -total_podiums), 
                                      total_podiums, fill = total_wins)) + 
                       geom_col() +
                       geom_hline(aes(yintercept = ..2),
                                  linetype = 3,
                                  col = "black",
                                  size = 1.2) +
                       scale_y_continuous(breaks = breaks_width(5)) +
                       scale_fill_gradient(low = "blue", high = "red") +
                       labs(x = "Constructor", y = "Podiums", 
                            title = glue("Total Podiums - {..3}"),
                            fill = "Wins") +
                       theme(plot.title = element_text(hjust = 0.5)) + 
                       coord_flip())) %>% 
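  # The braces keep the pipe from also passing the whole data frame to walk2().
  # ggsave() expects the local_data/ folder to exist already.
  {
    walk2(.x = .$plts, .y = .$race_year,
          .f = ~ggsave(glue("local_data/{.y}_Performance.png"),
                       plot = .x))
  }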
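
For comparison, the same kind of export can be written with split() and purrr's imap() and iwalk(), which pass each list element's name (here the year) as .y. This is a minimal sketch under assumptions, not a replacement for the pipeline above: the data is invented, the plot is simplified, and the files go to a temporary folder rather than local_data/. It also creates the output folder up front, since ggsave() generally expects the directory to exist.

```r
library(ggplot2)
library(purrr)
library(glue)

# Invented totals for two placeholder teams over two seasons
toy_podiums <- data.frame(
  race_year     = c(2010, 2010, 2011, 2011),
  c_name        = c("Team A", "Team B", "Team A", "Team B"),
  total_podiums = c(12, 5, 9, 7)
)

# Hypothetical output location; swap in "local_data" for the real thing
out_dir <- file.path(tempdir(), "local_data")
dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)

# split() returns a list named by year, so .y is "2010", "2011", ...
plots <- imap(split(toy_podiums, toy_podiums$race_year),
              ~ggplot(.x, aes(c_name, total_podiums)) +
                geom_col() +
                labs(title = glue("Total Podiums - {.y}")))

# iwalk() calls ggsave() once per plot, purely for its side effect
iwalk(plots, ~ggsave(file.path(out_dir, glue("{.y}_Performance.png")),
                     plot = .x, width = 6, height = 4))

list.files(out_dir)
```

Keeping the plots in a named list also makes it easy to inspect a single year, for example plots[["2010"]], before writing anything to disk.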