I am struggling to write efficient code for importing SAS data files.

My code is as follows:

library(foreign)
library(haven)
f <- file.path(path = "E:/Cohortdata/Raw cohort/Nationalscreeningcohort/01.jk", 
               c("nhis_heals_jk_2002.sas7bdat", "nhis_heals_jk_2003.sas7bdat", "nhis_heals_jk_2004.sas7bdat",
                 "nhis_heals_jk_2005.sas7bdat", "nhis_heals_jk_2006.sas7bdat", "nhis_heals_jk_2007.sas7bdat",
                 "nhis_heals_jk_2008.sas7bdat", "nhis_heals_jk_2009.sas7bdat", "nhis_heals_jk_2010.sas7bdat",
                 "nhis_heals_jk_2011.sas7bdat", "nhis_heals_jk_2012.sas7bdat", "nhis_heals_jk_2013.sas7bdat"))
d <- lapply (f, read_sas)

I know rewriting it with a for loop would be much more efficient, but I don't know what the code should look like.

I would be very thankful if you could help me.

  • Are you trying to read every sas7bdat file in the folder? That would make it easier to simplify the code. Commented Mar 18, 2019 at 6:05
  • Yes, that is exactly what I am trying to do. My sas7bdat files are stored in a single folder. What I don't like about my code is that I have to write out the name of every SAS file; I would prefer a version with a for loop. Commented Mar 18, 2019 at 6:10

1 Answer

This is a variation of code that I posted here, but it works for SAS files too.

Please note that instead of file.path() I used list.files(). That lets me read all the files under the path "E:/Cohortdata/Raw cohort/Nationalscreeningcohort", which is where I assumed your files are. Because recursive = TRUE, files in subfolders such as 01.jk are included as well. In addition, I used the pattern argument to pick up only sas7bdat files.
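To see what list.files() picks up before reading anything, here is a minimal sketch that uses a throwaway temporary directory. The directory name and file names are made up for illustration and the files are empty placeholders, not real SAS data. Note that pattern is a regular expression rather than a shell glob, so the sketch escapes the dot and anchors the match at the end of the file name.

```r
# Hypothetical demo directory with empty placeholder files (not real SAS data)
demo_dir <- file.path(tempdir(), "sas_demo")
dir.create(demo_dir, showWarnings = FALSE)
file.create(file.path(demo_dir,
                      c("a_2002.sas7bdat", "a_2003.sas7bdat", "notes.txt")))

# pattern is a regex: "\\." matches a literal dot, "$" anchors at the end
files <- list.files(path = demo_dir,
                    pattern = "\\.sas7bdat$",
                    full.names = TRUE)

# Show just the file names that matched
basename(files)
```

Only the two .sas7bdat placeholders should come back; notes.txt is filtered out by the pattern.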

list.files() returns a character vector of file paths, so from here you can use whichever *apply function you like. However, I prefer converting the vector to a tibble (tbl_df) and using the tidyverse approach. That means reading all the files with purrr::map() (part of the tidyverse) and combining them into one big tibble that holds the data from every file.

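library(tidyverse)  # attaches purrr, dplyr, tidyr and tibble
library(haven)      # provides read_sas()

df <- list.files(path = "E:/Cohortdata/Raw cohort/Nationalscreeningcohort",
                 full.names = TRUE,
                 recursive = TRUE,
                 pattern = "\\.sas7bdat$") %>%  # pattern is a regex, not a glob
  enframe(name = NULL) %>%                 # one-column tibble named `value`
  mutate(data = map(value, read_sas)) %>%  # read each file into a list-column
  unnest(data)                             # stack all files into one tibble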
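Since the question asks specifically for a for loop, here is a minimal sketch of the base R alternatives mentioned above, assuming the files sit directly in the 01.jk folder from the question. The variable names sas_dir, files, d and d2 are my own. A for loop is unlikely to be faster than lapply() here, because the time is dominated by reading the files, but both versions avoid typing every file name.

```r
library(haven)

# Assumed location, taken from the question; adjust to your own folder
sas_dir <- "E:/Cohortdata/Raw cohort/Nationalscreeningcohort/01.jk"
files <- list.files(sas_dir, pattern = "\\.sas7bdat$", full.names = TRUE)

# Option 1: lapply() returns a list with one data frame per file
d <- lapply(files, read_sas)
names(d) <- basename(files)

# Option 2: the same result with an explicit for loop,
# filling a list that is preallocated to the right length
d2 <- vector("list", length(files))
names(d2) <- basename(files)
for (i in seq_along(files)) {
  d2[[i]] <- read_sas(files[i])
}
```

Either list can then be accessed by file name, for example d[["nhis_heals_jk_2002.sas7bdat"]], or combined into a single data frame with dplyr::bind_rows(d, .id = "file") if the files share the same columns.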