Every day, I connect to a corporate database from within R. Currently, I have an R script that sets up the connection and does all pre-analysis data preparation (mutates and joins). I would like to put this into an R package, DWH, so I can share this process with colleagues.
Here is some sample code:
library(dplyr)   # for %>% and transmute()

# read credentials from a local file
creds <- read.csv2('C:/creds.csv')

con <- ROracle::dbConnect(
  drv = DBI::dbDriver("Oracle"),
  username = creds$user, password = creds$pwd,
  dbname = 'DWH',
  prefetch = FALSE, bulk_read = 1000L, stmt_cache = 0L,
  external_credentials = FALSE, sysdba = FALSE
)

# lazy table reference with the pre-analysis column selection
PREST <- dplyr::tbl(con, dbplyr::in_schema('ADWH', 'PREST')) %>%
  transmute(date   = A_DWH_DATE_TODAY,
            amount = A_DWH_POL_IND_COLL_AMOUNT)
Ideally, colleagues would load the package with library(DWH), and the PREST object would then be available to everyone for further use, e.g.
library(DWH)
PREST %>% filter(date > 201912)
This works perfectly well when sourcing the R script. How can I implement this inside an R package? I am using ROracle, but the process should be the same with any other ODBC database connection.
1. ROracle is not using ODBC. You can connect to Oracle with odbc, but that requires the Oracle ODBC driver, so to integrate that solution your users will need to have the additional driver installed.
2. Consider the config package, or perhaps environment variables, to pass user/password; credentials are much better provided/controlled by the user(s).
3. To provide PREST in your package, you would likely need to make it a function that pulls the data dynamically instead, so your users would need to use something like PREST() %>% transmute(...).
4. You'll need to do connection management within your package, as in: (1) library(DWH) does not initiate a connection; (2) functions either need to open, query, and close, or all of those functions reference an internal variable that stores the connection, tests if it is still open (timeout?), reconnects if needed, etc.
5. Consider pool in addition to your DBI packages, since it does auto-reconnect fairly robustly (in my experience).
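Putting those points together, here is a minimal sketch of what the package code might look like. It is one possible layout, not a definitive implementation: it assumes credentials arrive via hypothetical environment variables DWH_USER and DWH_PWD (swap in the config package or your own convention), and it uses pool so the connection is created lazily and reconnects are handled for you.

# R/connection.R -- internal environment that caches the pool between calls
.dwh <- new.env(parent = emptyenv())

# Create the pool lazily on first use, so library(DWH) itself does not connect.
# DWH_USER / DWH_PWD are hypothetical env-var names chosen for this sketch.
dwh_pool <- function() {
  if (is.null(.dwh$pool)) {
    .dwh$pool <- pool::dbPool(
      drv = DBI::dbDriver("Oracle"),
      username = Sys.getenv("DWH_USER"),
      password = Sys.getenv("DWH_PWD"),
      dbname = "DWH"
    )
  }
  .dwh$pool
}

# R/prest.R -- exported accessor; note it is a function, not a data object.
# The %>% must be imported into the package (e.g. via usethis::use_pipe()).
PREST <- function() {
  dplyr::tbl(dwh_pool(), dbplyr::in_schema("ADWH", "PREST")) %>%
    dplyr::transmute(date   = A_DWH_DATE_TODAY,
                     amount = A_DWH_POL_IND_COLL_AMOUNT)
}

# R/zzz.R -- close the pool when the package is unloaded
.onUnload <- function(libpath) {
  if (!is.null(.dwh$pool)) pool::poolClose(.dwh$pool)
}

With that in place, your colleagues get essentially the workflow you described, just with parentheses:

library(DWH)
PREST() %>% dplyr::filter(date > 201912)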