
I'm trying to merge two data frames into one. The first, acutedm11, has 4682 columns and the second, gwlfullflattened22, has 4903 columns. I can't post the data here because it's too big and contains sensitive information. I want to join the two data frames where mrn = mrn_G and the date difference is <= 30.

Code:

library(dplyr)  # provides %>% and select()
library(sqldf)

# Left join on mrn, keeping only encounters inside the date window
acutedm3 <- sqldf::sqldf("
    select acutedm11.*, gwlfullflattened22.*
    from acutedm11
    left join gwlfullflattened22
        on acutedm11.mrn = gwlfullflattened22.mrn_G
        and gwlfullflattened22.EncounterDate_G between acutedm11.Date_m30 and acutedm11.Date_p30") %>%
  select(-Date_m30, -Date_p30)

Error: Error: too many columns on acutedm11

Is there a better way to merge/join the data frames?


1 Answer

  1. The maximum number of columns is a compile-time parameter in SQLite (which is bundled with the RSQLite package). You can raise the limit in the source and rebuild that package. For more information, see: Maximum number of columns in a table for sqlite

  2. sqldf also supports four different back ends: SQLite, H2, MySQL and PostgreSQL. Try one of the others.

For example, using H2, this runs without any errors for me:

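library(RH2)    # load RH2 so that sqldf uses H2 instead of SQLite
library(sqldf)

# Dummy data frames with the same column counts as in the question
nr1 <- 100
nc1 <- 4682
df1 <- as.data.frame(matrix(seq_len(nr1 * nc1), nr1))

nr2 <- 100
nc2 <- 4903
df2 <- as.data.frame(matrix(seq_len(nr2 * nc2), nr2))

# Left join on a key column plus a range condition, as in the question
res <- sqldf("select * from df1 a
       left join df2 b on a.V1 = b.V1
        and a.V2 between b.V3 and b.V4")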
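
If the SQLite error still appears, one option is to name the back end explicitly instead of relying on which packages happen to be loaded. The following is a minimal sketch, assuming Java and the RH2 package are installed. The small data frames dfa and dfb and their column names are made up for illustration and stand in for the real tables. It relies on sqldf's drv and verbose arguments.

```r
library(RH2)
library(sqldf)

# Hypothetical stand-ins for the two wide tables (illustration only).
dfa <- data.frame(MRN = c(1, 2, 3), DAY = c(5, 15, 25))
dfb <- data.frame(MRN_G = c(1, 2, 3), LO = c(0, 20, 20), HI = c(10, 30, 30))

# drv = "H2" requests the H2 back end explicitly; verbose = TRUE prints
# diagnostic output so the database actually in use can be checked.
res <- sqldf("select * from dfa a
       left join dfb b on a.MRN = b.MRN_G
        and a.DAY between b.LO and b.HI",
  drv = "H2", verbose = TRUE)

# A left join keeps every row of dfa, whether or not it found a match.
nrow(res)
```

The same choice can be made once per session with options(sqldf.driver = "H2"), which avoids repeating the drv argument in every call.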

10 Comments

I'm using RStudio. I looked at the guide you sent me but there aren't any instructions on how to increase the maximum number of columns in R.
I tried to increase the max columns in R by typing DSQLITE_MAX_COLUMN=1234567 and RSQLITE_MAX_COLUMN=1234567. I still got the same error. How can I use H2, MySQL, or PostgreSQL without downloading anything?
Those parameters must be changed in the C source code, and RSQLite must then be rebuilt. To use H2, make sure Java is installed and then install the RH2 package from CRAN. Be sure to read ?sqldf.
Thank you for providing the sample code. I tried it and got the same error. Error: too many columns on acutedm11
If you got the same error you are likely still using SQLite and didn't load RH2. Add the verbose=TRUE argument to sqldf to check which database is being used.