Nintendo Switch Games - 2022

Author

Matthew Harris

Published

March 19, 2022

Abstract
Nintendo Switch video game release trends.

Packages

Code
library(dplyr)
library(purrr)
library(stringr)
library(readr)
library(httr2)
library(rvest)
library(lubridate)
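library(glue)
library(tidyr)   # drop_na(), separate_rows()
library(ggplot2) # plotting
library(scales)  # breaks_width()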

Load Custom Palette

This loads the custom palette that I built in my “Creating a Cyberpunk 2077 color palette” post.

Code
raw_source <- "?raw=true"
source(glue("https://github.com/mhdemo/custom_palette_collection/blob/main/palettes/cyberpunk2077.R{raw_source}"))

Scraping

It looks like changes were made to how the Nintendo Switch games library is split across URLs. Each URL is saved as a string that will be fed into wiki_scrape(). There’s a good chance that the URLs or naming convention will change again in the future.

Code
games_url_0a <- "https://en.wikipedia.org/wiki/List_of_Nintendo_Switch_games_(0-9_and_A)"
games_url_b <- "https://en.wikipedia.org/wiki/List_of_Nintendo_Switch_games_(B)"
games_url_cg <- "https://en.wikipedia.org/wiki/List_of_Nintendo_Switch_games_(C-G)"
games_url_hp <- "https://en.wikipedia.org/wiki/List_of_Nintendo_Switch_games_(H-P)"
games_url_qz <- "https://en.wikipedia.org/wiki/List_of_Nintendo_Switch_games_(Q-Z)"

wiki_scrape <- function(wiki_url) {
  # Be polite
  # Sleep between requests for 1-3 seconds
  Sys.sleep(sample(1:3, 1))
  
  wiki_url %>% 
    read_html() %>%
    html_nodes(css = "#softwarelist") %>% 
    html_table(fill = TRUE) %>% 
    as.data.frame() %>%
    as_tibble()
}

game_data <- list(games_url_0a, games_url_b, games_url_cg,
                  games_url_hp, games_url_qz) %>% 
  map_df(.f = ~wiki_scrape(.x))
Code
game_data %>%
  head()
# A tibble: 6 × 6
  Title                 Genre.s.                   Devel…¹ Publi…² Relea…³ Ref. 
  <chr>                 <chr>                      <chr>   <chr>   <chr>   <chr>
1 0 Degrees             Action, platformer, puzzle EastAs… EastAs… May 19… [1][…
2 #1 Anagrams           Board game, edutainment, … Eclips… Eclips… May 14… [5][…
3 #1 Crosswords         Board game, edutainment, … Eclips… Eclips… Februa… [8][…
4 1-2-Switch            Party                      Ninten… Ninten… March … <NA> 
5 10 Second Ninja X     Action platformer, puzzle  Four C… Thalam… July 3… <NA> 
6 10 Second Run Returns Party, racing              Blue P… Blue P… Decemb… [11]…
# … with abbreviated variable names ¹​Developer.s., ²​Publisher.s., ³​Release.date
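
Since the URLs may change again, one option is to wrap the scraper so that a single broken page does not abort the whole run. The following is only a minimal sketch and not part of the original analysis: fake_scrape() is a hypothetical stand-in for wiki_scrape() that avoids network calls, and the example inputs are made up.

```r
library(purrr)
library(dplyr)
library(tibble)

# Hypothetical stand-in for wiki_scrape(): errors on anything that is not an https URL
fake_scrape <- function(wiki_url) {
  if (!grepl("^https://", wiki_url)) stop("Could not read: ", wiki_url)
  tibble(Title = basename(wiki_url))
}

# possibly() returns the fallback value instead of raising an error
safe_scrape <- possibly(fake_scrape, otherwise = tibble())

# Made-up inputs: one valid-looking URL and one broken string
scraped <- map(c("https://example.org/List_A", "not-a-url"), safe_scrape)

# Empty tibbles from failed pages contribute no rows
combined <- bind_rows(scraped)
```

With the real scraper, the same pattern would be possibly(wiki_scrape, otherwise = tibble()), at the cost of failures being silent unless the row counts are checked afterwards.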

Data Wrangling

Some additional wrangling is necessary. I’m also choosing to “explode” the data frame by genre, which repeats each game title once for every genre it lists. This “exploded” format allows me to count how often each genre is mentioned. Another approach would be to determine which single genre best describes each game.

Code
switch_library <- game_data %>% 
  select(-Ref.) %>% 
  janitor::clean_names() %>% 
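  # Drop the trailing "_s" that clean_names() leaves on genre_s, developer_s, publisher_s
  setNames(str_remove_all(names(.), "_s$")) %>% 
  mutate(release_date = as.Date(release_date, format = "%B %d, %Y"),
         release_ym = floor_date(release_date, "month"),
         release_year = year(release_date),
         release_day = yday(release_date),
         # Map the day of year onto the current year so release years can share one axis
         common_date = as.Date(release_day, origin = glue("{year(Sys.Date()) - 1}-12-31")))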

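# Separate each genre string into its own rows
switch_library <- switch_library %>% 
  drop_na() %>% 
  filter(release_date <= Sys.Date()) %>%
  # Lowercase and strip hyphens so that "role-playing" becomes the single token "roleplaying"
  mutate(genre = tolower(genre),
         genre = str_remove_all(genre, "-")) %>% 
  # The default separator splits on any non-alphanumeric run, so "board game" becomes two rows
  separate_rows(genre) %>%
  mutate(genre_title_case = stringr::str_to_title(genre))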

switch_library %>%
  head()
# A tibble: 6 × 10
  title   genre devel…¹ publi…² release_…³ release_ym relea…⁴ relea…⁵ common_d…⁶
  <chr>   <chr> <chr>   <chr>   <date>     <date>       <dbl>   <dbl> <date>    
1 0 Degr… acti… EastAs… EastAs… 2021-05-19 2021-05-01    2021     139 2022-05-19
2 0 Degr… plat… EastAs… EastAs… 2021-05-19 2021-05-01    2021     139 2022-05-19
3 0 Degr… puzz… EastAs… EastAs… 2021-05-19 2021-05-01    2021     139 2022-05-19
4 #1 Ana… board Eclips… Eclips… 2021-05-14 2021-05-01    2021     134 2022-05-14
5 #1 Ana… game  Eclips… Eclips… 2021-05-14 2021-05-01    2021     134 2022-05-14
6 #1 Ana… edut… Eclips… Eclips… 2021-05-14 2021-05-01    2021     134 2022-05-14
# … with 1 more variable: genre_title_case <chr>, and abbreviated variable
#   names ¹​developer, ²​publisher, ³​release_date, ⁴​release_year, ⁵​release_day,
#   ⁶​common_date
# ℹ Use `colnames()` to see all variable names
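
As the output shows, a multi-word genre such as “board game” ends up as two rows because separate_rows() splits on spaces as well as commas by default. The toy example below is a sketch with made-up data (toy_games is hypothetical) comparing the default separator with a comma-only separator, in case keeping multi-word genres intact is preferred.

```r
library(tibble)
library(tidyr)

# Made-up data for illustration only
toy_games <- tibble(
  title = c("Game A", "Game B"),
  genre = c("action, puzzle", "board game, party")
)

# Default separator: splits on any run of non-alphanumeric characters, spaces included
by_word <- separate_rows(toy_games, genre)

# Comma-only separator: multi-word genres stay together
by_comma <- separate_rows(toy_games, genre, sep = ",\\s*")
```

Here by_word has one row each for “board” and “game”, while by_comma keeps “board game” as a single genre. Switching separators would change the genre counts reported in the rest of this post.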

Next I want to identify the top genres by count.

Code
switch_library <- switch_library %>%
  group_by(genre_title_case) %>%
  add_count() %>%
  ungroup() %>%
  mutate(genre_rank = dense_rank(desc(n))) %>%
  select(-n)

switch_library %>%
  filter(genre_rank <= 4) %>%
  distinct(genre_title_case)
# A tibble: 4 × 1
  genre_title_case
  <chr>           
1 Action          
2 Puzzle          
3 Adventure       
4 Roleplaying     
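
One detail worth keeping in mind is that dense_rank() gives tied counts the same rank, so a filter such as genre_rank <= 4 can return more than four genres. A small sketch with made-up data (toy_genres is hypothetical) illustrates this:

```r
library(dplyr)
library(tibble)

# Made-up genre mentions: Party and Racing are tied with one mention each
toy_genres <- tibble(
  genre = c("Action", "Action", "Action", "Puzzle", "Puzzle", "Party", "Racing")
)

genre_ranks <- toy_genres %>%
  count(genre, name = "n") %>%
  mutate(genre_rank = dense_rank(desc(n))) %>%
  arrange(genre_rank, genre)

# Filtering to the "top 3" ranks keeps four genres because of the tie
top_three <- filter(genre_ranks, genre_rank <= 3)
```

In the scraped data the top four ranks happened to map to exactly four genres, so the tie behaviour did not affect the result shown above.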

Visualizations

Code
switch_library %>% 
  filter(genre_rank <= 4) %>% 
  group_by(release_ym, genre_title_case) %>% 
  count() %>% 
  ggplot(aes(release_ym, n, col = genre_title_case)) + 
  geom_line(size = 1) + 
  geom_point(size = 2.5, col = "white") +
  geom_point(size = 2) +
  facet_grid(rows = vars(genre_title_case)) +
  scale_color_cp_2077_d() +
  scale_x_date(breaks = breaks_width("1 year"), date_labels = "%b %Y") +
  labs(x = "Release Date", y = "Release Count", col = "Genre") +
  theme_minimal(base_size = 14) +
  theme(legend.position = "none", axis.title.x = element_blank())

Code
switch_library %>% 
  mutate(common_floor_month = floor_date(common_date, "month")) %>%
  group_by(release_year, common_floor_month) %>% 
  count() %>% 
  ggplot(aes(common_floor_month, n, col = factor(release_year))) + 
  geom_line(size = 1.5) +
  geom_point(size = 4, col = "white") +
  geom_point(size = 3) +
  scale_color_cp_2077_d() +
  scale_x_date(breaks = breaks_width("1 month"), date_labels = "%b") +
  labs(x = "Release Month", y = "Release Count", col = "Release Year") +
  theme_minimal(base_size = 14) +
  theme(legend.position = "bottom", axis.title.x = element_blank()) +
  guides(col = guide_legend(nrow = 1))

Tableau

I also decided to create a Tableau dashboard using the same data. Feel free to check that out.

Tableau Public Dashboard