0

I have 49 .db files. I want to open them in R and then store its content in a dataframe for further use. I am able to do it for one file but I want to modify the code to be able to do it for all the 49 .db file in one go. This is the code that I am trying to do for one file:

sqlite <- dbDriver("SQLite")
dbname <- "en_Whole_Blood.db" 
db = dbConnect(sqlite,dbname)
wholeblood_df <- dbGetQuery(db,"SELECT * FROM weights") 
View(wholeblood_df)

I tried to use the list.files function to do it for all the .db file but its not happening.Its only creating a dataframe for the last object This is the code for it:

library("RSQLite")
sqlite <- dbDriver("SQLite")
sqlite <- dbDriver("SQLite")
dbname <- data_files
dbname
for (i in length(dbname){
db=dbConnect(sqlite,dbname[i])
df <- dbGetQuery(db,"SELECT * FROM weights")
}

##This only gives me last .db file as a dataframe. Does anyone know how can I edit this code to get 49 dataframe for each sql file. Thank you.

2 Answers 2

1

Try replacing the for loop with lapply:

list_of_df <- lapply(dbname, function(x) {
db <- dbConnect(sqlite, x)
df <- dbGetQuery(db, "SELECT * FROM weights")
})

I'm not experience in handling SQL and / or connections, but I think it might work.

Edit

Second alternative maintaining the for loop:

df <- list()
for (i in 1:length(dbname)) {
db <- dbConnect(sqlite,dbname[i])
df <- c(df, dbGetQuery(db,"SELECT * FROM weights"))
}

Hope it helps

Sign up to request clarification or add additional context in comments.

4 Comments

I tried to this. but its giving an error to me. x <- function(df){ sqlite <- dbDriver("SQLite") dbname <- datafiles db=dbConnect(sqlite,dbname) df <- dbGetQuery(db,"SELECT * FROM weights") } list_of <- lapply(datafiles,x) Error in .local(drv, ...) : length(dbname) == 1 is not TRUE
Ok, can you share a dput() of dbname? Also, I will ad a second alternative maintaining the loop.
yes. this one worked for me. I was able to do it.
You should really dbDisconnect each connection after you get the data.
0

Another suggestion:

files <- list.files(pattern = "\\.db$")
list_of_frames <- lapply(files, function(fn) {
  db <- dbConnect(RSQLite::SQLite(), fn)
  on.exit(dbDisconnect(db))
  dbGetQuery(db, "select * from weights")
})
oneframe <- do.call(rbind, list_of_frames)

Reproducible example

Create data (you don't need this):

for (i in 1:3) {
  db <- DBI::dbConnect(RSQLite::SQLite(), sprintf("mtcars%i.db", i))
  DBI::dbWriteTable(db, "weights", mtcars[i * 5 + 1:3,], append = FALSE, create = TRUE)
  DBI::dbDisconnect(db)
}

Working solution:

files <- list.files(pattern = "\\.db$")
files
# [1] "mtcars1.db" "mtcars2.db" "mtcars3.db"
list_of_frames <- lapply(files, function(fn) {
  db <- dbConnect(RSQLite::SQLite(), fn)
  on.exit(dbDisconnect(db))
  dbGetQuery(db, "select * from mt")
})
list_of_frames
# [[1]]
#    mpg cyl  disp  hp drat   wt  qsec vs am gear carb
# 1 18.1   6 225.0 105 2.76 3.46 20.22  1  0    3    1
# 2 14.3   8 360.0 245 3.21 3.57 15.84  0  0    3    4
# 3 24.4   4 146.7  62 3.69 3.19 20.00  1  0    4    2
# [[2]]
#    mpg cyl  disp  hp drat   wt qsec vs am gear carb
# 1 17.8   6 167.6 123 3.92 3.44 18.9  1  0    4    4
# 2 16.4   8 275.8 180 3.07 4.07 17.4  0  0    3    3
# 3 17.3   8 275.8 180 3.07 3.73 17.6  0  0    3    3
# [[3]]
#    mpg cyl  disp  hp drat    wt  qsec vs am gear carb
# 1 10.4   8 460.0 215 3.00 5.424 17.82  0  0    3    4
# 2 14.7   8 440.0 230 3.23 5.345 17.42  0  0    3    4
# 3 32.4   4  78.7  66 4.08 2.200 19.47  1  1    4    1

oneframe <- do.call(rbind, list_of_frames)
oneframe
#    mpg cyl  disp  hp drat    wt  qsec vs am gear carb
# 1 18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1
# 2 14.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4
# 3 24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2
# 4 17.8   6 167.6 123 3.92 3.440 18.90  1  0    4    4
# 5 16.4   8 275.8 180 3.07 4.070 17.40  0  0    3    3
# 6 17.3   8 275.8 180 3.07 3.730 17.60  0  0    3    3
# 7 10.4   8 460.0 215 3.00 5.424 17.82  0  0    3    4
# 8 14.7   8 440.0 230 3.23 5.345 17.42  0  0    3    4
# 9 32.4   4  78.7  66 4.08 2.200 19.47  1  1    4    1

Tidyverse alternative:

library(dplyr) # just for %>%, could use magrittr as well
library(purrr) # map_dfr
oneframe <- files %>%
  map_dfr(~ {
    db <- DBI::dbConnect(RSQLite::SQLite(), .)
    on.exit(DBI::dbDisconnect(db))
    DBI::dbGetQuery(db, "select * from mt")
  })
### same result

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.