How to convert list of lists to dataframe in R

Question

I am using quantmod to download option chains, which come in the form of nested lists.

However, for my purposes I would rather have the information in a data frame in which the name of each list is stored in a column. Two columns would be needed: one containing the date of the option (the names of the outer list) and the other the type of the option (call or put).

How can this be accomplished in R?

For a reproducible example:

library(quantmod)
AAPL.2015 <- getOptionChain("AAPL", "2019/2021")

And, if possible, what should I do to get the dates of the options in English?

Also, for converting a list of lists into data frames, see this: stackoverflow.com/questions/58019884/…
– Vitali Avagyan, commented Sep 21, 2019 at 19:54

Pedro Cavalcante · Accepted Answer · 2019-09-21 20:01:17Z

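I couldn't reproduce your example, but what you're trying to do is simple. You can use do.call to call rbind on a list of data frames, and what you get back is a single data frame. Because the option chain is nested (one list per date, each holding calls and puts), flatten it by one level first so that rbind receives the data frames themselves; the date and option type then end up as prefixes of the row names rather than in columns.

library(quantmod)
chain <- getOptionChain("AAPL", "2019/2021")

# Flatten one level, then row-bind the resulting list of data frames
chain_df <- do.call(rbind, unlist(chain, recursive = FALSE))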
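
Since the example could not be run against live data, here is a minimal sketch on a made-up nested list. The toy_chain object, its dates, and its numbers are invented for illustration and are not quantmod output. It suggests that do.call(rbind, ...) returns a data frame when given a flat list of data frames, while a two-level list needs flattening by one level first.

```r
# Toy stand-in for an option chain: dates -> calls/puts -> data frame.
# All names and values here are hypothetical.
toy_chain <- list(
  "Sep.27.2019" = list(
    calls = data.frame(Strike = c(140, 145), Last = c(79.50, 75.85)),
    puts  = data.frame(Strike = c(140, 145), Last = c(0.01, 0.02))
  ),
  "Oct.04.2019" = list(
    calls = data.frame(Strike = 150, Last = 72.22),
    puts  = data.frame(Strike = 150, Last = 0.03)
  )
)

# On the nested list, rbind sees plain lists, so the result is a
# list-matrix (one row per date), not a data frame.
nested_result <- do.call(rbind, toy_chain)
is.data.frame(nested_result)

# Flattening one level gives a flat, named list of data frames ...
flat <- unlist(toy_chain, recursive = FALSE)
names(flat)

# ... which rbind can stack into a single data frame.
toy_df <- do.call(rbind, flat)
toy_df
```

The date and option type survive only inside the row names of toy_df, so this route fits best when separate indicator columns are not required.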

1 Comment

Parfait Over a year ago

Interesting that this works with nested lists unless there are no calls and puts!

camille · Accepted Answer · 2019-09-21 22:50:53Z

It might be more flexible to work with a couple of functions from dplyr and purrr. dplyr::bind_rows can take a list of data frames directly, and their column names can differ (missing columns are filled with NA), whereas base rbind needs do.call to work on a list and requires the column names to match. bind_rows also has an argument .id that creates a column of list item names. purrr::map_dfr calls a function over a list and returns a data frame of all the results row-bound together; because it wraps around bind_rows, it also has an .id argument.

Being able to set those IDs is helpful because you have 2 sets of IDs: one of dates, and one of calls vs. puts. Setting one ID within the inner bind_rows and one within map_dfr gets both.

Written out with a function, to make it a little easier to see:

library(quantmod)
AAPL.2015 <- getOptionChain("AAPL", "2019/2021")

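aapl_df <- purrr::map_dfr(AAPL.2015, function(d) {
  # d is one date's list of calls and puts; .id stores those names in "type"
  dplyr::bind_rows(d, .id = "type")
}, .id = "date")  # the outer .id stores the date names in "date"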

head(aapl_df)
#>          date  type Strike  Last   Chg   Bid   Ask Vol OI
#> 1 Sep.27.2019 calls    140 79.50  9.50 77.60 77.90  10 30
#> 2 Sep.27.2019 calls    145 75.85  0.00 72.70 73.30  NA 28
#> 3 Sep.27.2019 calls    150 72.22  0.00 67.85 67.90  10 91
#> 4 Sep.27.2019 calls    155 52.53  0.00 65.80 69.90  NA 10
#> 5 Sep.27.2019 calls    160 60.10  0.00 57.85 58.15   2 11
#> 6 Sep.27.2019 calls    165 54.40 15.95 52.65 52.90   9 16

Or in more common dplyr piping with function shorthand notation:

library(dplyr)
aapl_df <- AAPL.2015 %>%
  purrr::map_dfr(~bind_rows(., .id = "type"), .id = "date")
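
To try the two .id arguments without a network call, here is a minimal sketch on a made-up nested list. The toy_chain object and its values are invented for illustration; the puts tables deliberately lack the Vol column to show how bind_rows fills the gap with NA.

```r
library(dplyr)
library(purrr)

# Hypothetical stand-in for the option chain: dates -> calls/puts -> data frame
toy_chain <- list(
  "Sep.27.2019" = list(
    calls = data.frame(Strike = c(140, 145), Last = c(79.50, 75.85), Vol = c(10, 3)),
    puts  = data.frame(Strike = c(140, 145), Last = c(0.01, 0.02))  # no Vol column
  ),
  "Oct.04.2019" = list(
    calls = data.frame(Strike = 150, Last = 72.22, Vol = 7),
    puts  = data.frame(Strike = 150, Last = 0.03)                   # no Vol column
  )
)

# Inner bind_rows records calls/puts in "type"; outer map_dfr records the date
toy_df <- map_dfr(toy_chain, function(d) bind_rows(d, .id = "type"), .id = "date")
toy_df
```

The date column comes back as character. In an English locale, as.Date(toy_df$date, format = "%b.%d.%Y") should parse it, since %b depends on the locale's month abbreviations. In purrr 1.0.0 and later, map_dfr is marked as superseded; map() followed by list_rbind(names_to = "date") is the suggested replacement.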

Parfait · Accepted Answer · 2019-09-22 14:09:48Z

Consider Map to rbind the individual calls and puts data frames, adding the needed indicator columns. Because the individual data frames reside within a nested, named list, the extract function [[ is used.

Then, run a final do.call + rbind across the resulting list of data frames. NOTE: rbind assumes the call and put data frames have exactly the same column names and number of columns.

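# Stack one date's calls and puts, tagging each row with type and date
call_put_func <- function(nm, call_df, put_df) {
  cbind(rbind(transform(call_df, option_type = "call"),
              transform(put_df, option_type = "put")),
        date_of_strike = nm)
}

# Apply across dates, extracting each date's calls and puts with [[
AAPL_flat_df_list <- Map(call_put_func,
                         nm = names(AAPL.2015),
                         call_df = lapply(AAPL.2015, "[[", "calls"),
                         put_df = lapply(AAPL.2015, "[[", "puts"))

# unname() keeps the date names out of the row names
AAPL_df <- do.call(rbind, unname(AAPL_flat_df_list))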
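
Because rbind stops when column names differ, it may help to check them before stacking. The following is a minimal base R sketch on made-up data; toy_chain, same_cols, and stack_one are hypothetical names introduced here, not part of quantmod.

```r
# Hypothetical stand-in for the option chain: dates -> calls/puts -> data frame
toy_chain <- list(
  "Sep.27.2019" = list(
    calls = data.frame(Strike = c(140, 145), Last = c(79.50, 75.85)),
    puts  = data.frame(Strike = c(140, 145), Last = c(0.01, 0.02))
  ),
  "Oct.04.2019" = list(
    calls = data.frame(Strike = 150, Last = 72.22),
    puts  = data.frame(Strike = 150, Last = 0.03)
  )
)

# TRUE for each date whose calls and puts share identical column names
same_cols <- function(x) identical(names(x$calls), names(x$puts))
col_check <- vapply(toy_chain, same_cols, logical(1))
col_check

# Stack one date's calls and puts and tag the rows
stack_one <- function(nm, x) {
  cbind(rbind(transform(x$calls, option_type = "call"),
              transform(x$puts,  option_type = "put")),
        date_of_strike = nm)
}
toy_df <- do.call(rbind, unname(Map(stack_one, names(toy_chain), toy_chain)))
toy_df

# A mismatch (here only in letter case) makes rbind fail; capture the message
mismatch <- tryCatch(
  rbind(data.frame(Strike = 1), data.frame(strike = 1)),
  error = function(e) conditionMessage(e)
)
mismatch
```

If col_check contains FALSE for any date, comparing names(x$calls) and names(x$puts) with setdiff for that date should show which columns differ.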

edited Sep 22, 2019 at 14:09

answered Sep 21, 2019 at 20:52

4 Comments

user8270077 Over a year ago

I am getting an error:
Error in (function (nm, call_df, put_df) : unused arguments (calls = dots[[2]][[1]], puts = dots[[3]][[1]])
Traceback:
1. Map(call_put_func, nm = names(AAPL.2015), calls = lapply(AAPL.2015, [, "calls"), puts = lapply(AAPL.2015, [, "puts"))
2. mapply(FUN = f, ..., SIMPLIFY = FALSE)

user8270077 Over a year ago

I still get an error message:
Error in match.names(clabs, names(xi)): names do not match previous names
Traceback:
1. Map(call_put_func, nm = names(AAPL.2015), call_df = lapply(AAPL.2015, [, "calls"), put_df = lapply(AAPL.2015, [, "puts"))
2. mapply(FUN = f, ..., SIMPLIFY = FALSE)
3. (function (nm, call_df, put_df) { cbind(rbind(transform(call_df, option_type = "call"), transform(put_df, option_type = "put")), date_of_strike = nm) })(nm = dots[[1L]][[1L]], call_df = dots[[2L]][[1L]], put_df = dots[[3L]][[1L]])

user8270077 Over a year ago

4. cbind(rbind(transform(call_df, option_type = "call"), transform(put_df, option_type = "put")), date_of_strike = nm) # at line 2-4 of file <text>
5. rbind(transform(call_df, option_type = "call"), transform(put_df, option_type = "put"))
6. rbind(deparse.level, ...)
7. match.names(clabs, names(xi))
8. stop("names do not match previous names")

Parfait Over a year ago

Check colnames of both calls and puts. Do they match exactly in spelling and number? Also, try doubling [[ instead of single.
