You picked a challenging problem for your first steps with the magrittr pipe and the map functions! I'll do my best to give you a helpful answer, but I would also recommend that you find some easier data to work with as you practice. A good place to learn about the pipe %>% is with the "Pipes" chapter in Hadley Wickham's book. The chapter on iteration also offers a good intro into the map_* functions. You can return to more complex problems once you have a firmer conceptual understanding. I think Hadley explains these tools better than I ever could, so I won't go into great detail about them here, and instead focus on explaining why your code doesn't work, and why mine does.
An analysis of your code
Map functions allow a couple of useful shortcuts, one of which you've already discovered - namely, if you pass in vectors or lists as the function argument, they are automatically converted into extractor functions. So, you're on the right track!
The thing to remember is that map functions return a vector that is the same length, and has the same names, as the input vector. Your input vector is jsonData, which has 5 elements with names [1] "copyright" "allPlays" "currentPlay" "scoringPlays" "playsByInning". When you run jsonData %>% map("playEvents") %>% map("hitData"), data is being extracted, but R still returns a vector with five elements and the same names as the original vector. If you take a look at the following code, you'll see that your code is, indeed, peeling away the uppermost layers, but the length remains the same, which isn't very helpful:
> unlist(map(jsonData, class))
copyright allPlays currentPlay scoringPlays playsByInning
"character" "data.frame" "list" "integer" "data.frame"
> unlist(map(jsonData %>% map("playEvents"), class))
copyright allPlays currentPlay scoringPlays playsByInning
"NULL" "list" "data.frame" "NULL" "NULL"
> unlist(map(jsonData %>% map("playEvents") %>% map("hitData"), class))
copyright allPlays currentPlay scoringPlays playsByInning
"NULL" "NULL" "data.frame" "NULL" "NULL"
The final output, and what you are trying to combine with your call to bind_rows above, is this:
> jsonData %>% map("playEvents") %>% map("hitData")
$copyright
NULL
$allPlays
NULL
$currentPlay
launchSpeed launchAngle totalDistance trajectory hardness location coordinates.coordX coordinates.coordY
1 NA NA NA <NA> <NA> <NA> NA NA
2 81.3 61.92 187.5 popup medium 6 75.78 167.97
$scoringPlays
NULL
$playsByInning
NULL
Obviously that's not what you want. After some tinkering I came up with the following solution.
My own strategy
The libraries:
library(jsonlite)
library(purrr)
library(dplyr)
library(readr)
library(stringr)
library(magrittr)
I use a slightly different method to download and parse the JSON because I need to see the structure. I'll include it just in case you might find it useful:
url <- paste0("http://statsapi-prod-alt-968618993.us-east-1.elb.amazonaws",
".com/api/v1/game/565711/playByPlay")
url %>% read_file() %>% prettify() %>% write_file("bball.json")
jsonData <- fromJSON("bball.json")
I first extract and clean the hitData dataframes. I know they can all be found in playEvents, so I can skip a few steps by using the $ syntax. The first call to map extracts hitData from each element of the list playEvents. The hitData dataframes are nested (they contain other dataframes), so the second call to map with jsonlite::flatten flattens them. The function safely ensures that R doesn't throw an error when something other than a dataframe is encountered (only 46 elements contain hitData). Many of the hitData dataframes contain rows full of NAs, so the third call to map uses an anonymous function (again in safely) to get rid of those. The fourth call to map then extracts the dataframe from each element's result variable, which was created by safely (along with an error variable that we don't need):
hitdata_list <- jsonData$allPlays$playEvents %>%
map("hitData") %>%
map(safely(jsonlite::flatten)) %>%
map(safely(~.$result[complete.cases(.$result),])) %>%
map("result")
Now I have a list of hitData dataframes. As I mentioned above, only 46 of 80 entries contain hitData, so I need a way to get the corresponding values from atBatIndex. I can do that by generating a logical vector with TRUE when an element in hitdata_list contains a dataframe, and FALSE otherwise. I use map_lgl to return a logical vector instead of a list:
lgl_index <- map_lgl(hitdata_list, ~ !is.null(.))
atbatindex_vec <- jsonData$allPlays$atBatIndex[lgl_index]
I then use a stringr function to get game_pk from the URL. I'm not sure if it would work with every URL, but it works fine in this case:
game_pk_vec <- str_match(url, "/(\\d+)/")[2] %>%
as.integer()
Finally, I combine atBatIndex and game_pk in a tibble, then combine that tibble with with the hitData data using bind_cols. The hitData dataframes are still in a list, so I'll need to combine those first with bind_rows. The set_colnames function is from the magrittr package and does just what it says. I need to set the column names because some compound names were created when I flattened the hitData dataframes:
hitdata_df <- tibble(game_pk = game_pk_vec, atBatIndex = atbatindex_vec) %>%
bind_cols(bind_rows(hitdata_list)) %>%
set_colnames(str_extract(names(.), "\\w+$"))
The only thing I didn't do was extract pitchNumber. Calling jsonData$allPlays$playEvents %>% map("pitchNumber") returns a list of sequences 1 through n, where each vector has length > 1. I assume you only need the final number in each sequence, but I'm not sure so I'll spare myself the effort. You can do what I did with atBatIndex to get the relevant elements, and then extract what you need. Here's the final dataframe:
# A tibble: 46 x 10
game_pk atBatIndex launchSpeed launchAngle totalDistance trajectory hardness location coordX coordY
<chr> <int> <dbl> <dbl> <dbl> <chr> <chr> <chr> <dbl> <dbl>
1 565711 4 76.6 2.74 188. ground_ball medium 9 178. 145.
2 565711 5 101. 15.4 328. line_drive hard 8 145. 62.2
3 565711 6 103. 29.4 382. line_drive medium 9 237. 79.4
4 565711 8 109. 15.6 319. line_drive hard 9 181. 102.
5 565711 9 75.8 47.8 239. fly_ball medium 7 99.8 103.
6 565711 10 91.6 44.1 311. fly_ball medium 8 140. 69.3
7 565711 12 79.1 23.4 246. line_drive medium 7 52.3 126.
8 565711 13 67.3 -21.3 124. ground_ball medium 6 108. 156.
9 565711 14 89.9 -21.6 7.41 ground_ball medium 6 108. 152.
10 565711 15 110. 27.7 420. fly_ball medium 9 250. 69.0
# … with 36 more rows
mapexpects a vector and a function. Could you elaborate further what you would like to do? Maybe show the expected result?