I asked a similar question before, but I still need some help or a pointer in the right direction.

I am trying to locate certain keywords within a data frame column in which every row holds a SQL statement, and to extract the word that follows each keyword, in RStudio.

Example: let's call this data frame "SQL".

      |    **UserID**    |      **SQL Statement** 

 1   |    N781          |   "SELECT A, B FROM Table.1 p JOIN Table.2 pv ON 
                            p.ProdID.1ProdID.1 JOIN Table.3 v ON pv.BusID.1 = 
                            v.BusID WHERE SubID = 1 ORDER BY v.Name;"

2      |  N283          |   "SELECT D, E FROM Table.11 p JOIN Table.2 pv ON 
                           p.ProdID.1ProdID.1 JOIN Table.3 v ON pv.BusID.1 = 
                           v.BusID WHERE SubID = 1 ORDER BY v.Name;"

So I am trying to pull out the table names: find the keywords "FROM" and "JOIN" and extract the word that follows each one.

I have been using some code with help from earlier:

I turned the column "SQL Statement" into a list of 2 named "b".

I use the code:

z <- mapply(grepl,"(FROM|JOIN)",b)

which gives me a TRUE or FALSE for each word in each list.

z <- mapply(grep,"(FROM|JOIN)",b)

The above is close. It gives me the position of every match in each of the lists.

But I am just trying to find the word JOIN or FROM and take out the word that follows it. I was hoping for output something like this:

      |    **UserID**    |      **SQL Statement**                                | Tables 

 1   |    N781          |   "SELECT A, B FROM Table.1 p JOIN Table.2 pv ON       | Table.1, Table.2          
                            p.ProdID.1ProdID.1 JOIN Table.3 v ON pv.BusID.1 =       
                            v.BusID WHERE SubID = 1 ORDER BY v.Name;"

2      |  N283          |   "SELECT D, E FROM Table.11 p JOIN Table.2 pv ON 
                           p.ProdID.1ProdID.1 JOIN Table.3 v ON pv.BusID.1 =    | Table.11, Table.31 
                           v.BusID WHERE SubID = 1 ORDER BY v.Name;"

1 Answer

Here is a working script which uses only base R. The idea is to use strsplit to split the query string on the keywords FROM or JOIN. Then, the first word of each resulting piece (except for the first piece, which holds the text before the first FROM) should be a table name.

sql <- "SELECT A, B FROM Table.1 p JOIN Table.2 pv ON 
        p.ProdID.1ProdID.1 JOIN Table.3 v ON pv.BusID.1 = 
        v.BusID WHERE SubID = 1 ORDER BY v.Name;"

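# Split the query on each FROM/JOIN keyword (plus the whitespace after it)
terms <- strsplit(sql, "(FROM|JOIN)\\s+")

# Keep only the first whitespace-delimited word of every piece
out <- unlist(lapply(terms, function(x) gsub("^([^[:space:]]+).*", "\\1", x)))

# Drop the first piece, which is the text before the first FROM
out <- out[-1]
out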

[1] "Table.1" "Table.2" "Table.3"

Demo

To understand better what I did, follow the demo and have a look at the terms list which resulted from splitting.
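
For readers working without the demo, the intermediate values can be inspected along these lines. This is a minimal sketch: the one-line query below is a made-up, shortened stand-in for the one in the question, and the names pieces, first_words and tables are only illustrative.

```r
# Made-up, single-line query so the pieces are easy to read
sql <- "SELECT A, B FROM Table.1 p JOIN Table.2 pv ON p.Id = pv.Id;"

# strsplit returns a list with one element per input string;
# [[1]] takes the character vector of pieces for our single query
pieces <- strsplit(sql, "(FROM|JOIN)\\s+")[[1]]
print(pieces)

# The first whitespace-delimited word of each piece
first_words <- sub("^([^[:space:]]+).*", "\\1", pieces)
print(first_words)

# The first piece is the SELECT list, so everything after it is a table name
tables <- first_words[-1]
print(tables)
```

Here pieces holds three strings: the SELECT list, then the text following FROM, then the text following JOIN. Only the last two start with a table name, which is why the first element is dropped.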

Edit:

Here is a link to another demo which shows how you might use the above logic on a vector of query strings, to generate a list containing one vector of table names per query.

Demo
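
The per-query idea could be sketched roughly as follows. This is not necessarily what the linked demo contains; extract_tables is a hypothetical helper name, the column name Statement and the two shortened queries are invented for illustration, and the Tables column mirrors the output format asked for in the question.

```r
# Hypothetical helper: table names that follow FROM/JOIN in ONE query string
extract_tables <- function(query) {
  pieces <- strsplit(query, "(FROM|JOIN)\\s+")[[1]]
  # No FROM/JOIN found: nothing to report
  if (length(pieces) < 2) return(character(0))
  # First word of every piece after the SELECT list
  sub("^([^[:space:]]+).*", "\\1", pieces[-1])
}

# Made-up data frame shaped like the one in the question
SQL <- data.frame(
  UserID = c("N781", "N283"),
  Statement = c(
    "SELECT A, B FROM Table.1 p JOIN Table.2 pv ON p.Id = pv.Id;",
    "SELECT D, E FROM Table.11 p JOIN Table.3 v ON p.Id = v.Id;"
  ),
  stringsAsFactors = FALSE
)

# One character vector of table names per row
tables_per_row <- lapply(SQL$Statement, extract_tables)

# Collapse each vector into a single comma-separated string per row
SQL$Tables <- vapply(tables_per_row, paste, character(1), collapse = ", ")
print(SQL[, c("UserID", "Tables")])
```

Calling lapply on the column keeps the results grouped by row, which avoids the single long list of table names you get from unlist. Note that the pattern is case-sensitive and would also split on FROM or JOIN appearing inside a longer word, so real queries may need a stricter pattern.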

6 Comments

This seems to get the tables, and I tried to use some of this logic on a column and on a list, but it just gives me one long list of table names, even when they come from different rows. Could you give me a pointer on what to use to get the format I mentioned in the question?
@Jim.W Given that I have a bit of logic here, I might just throw everything into a function. Then, you could apply over a vector of query strings, and generate the output you expect.
@Jim.W I added a second demo link to my answer which shows what you might do to handle a vector of query strings.
Thanks for the help! I am still getting familiar with the functions you have used. Do you have any resources that break it down logically? I have visited many websites, but none seems to break it down in a nice way. In any case, I appreciate the quick answer and demo.
I don't think I can break it down any more than I already have. Try to go line by line until it makes sense. Stack Overflow is not really about giving full tutorials.