As the title says, I would like to concatenate several columns from a table in SQL Server. I tried to use the paste function as below, but it gives the following error:

> tbl(channel,'##iris') %>% 
+   mutate(string=paste(Species,'-',
+                       Sepal.Length,'-',
+                       Sepal.Width,'-',
+                       Petal.Length,'-',
+                       Petal.Width,sep=''))
Error: PASTE() is not available in this SQL variant

3 Answers

I found a solution provided by Ben Baumer and want to share it here.

The approach is to call CONCAT instead of paste. dbplyr has no translation for CONCAT, so it passes the call through to SQL Server unchanged.

> tbl(channel,'##iris') %>% 
+   group_by(Species,Sepal.Length,Sepal.Width,Petal.Length,Petal.Width) %>%
+   summarise(string=MAX(CONCAT(Species,'-',
+                               Sepal.Length,'-',
+                               Sepal.Width,'-',
+                               Petal.Length,'-',
+                               Petal.Width))) %>%
+   head(.,1)
# Source:   lazy query [?? x 6]
# Database: Microsoft SQL Server 11.00.6251[dbo@WCDCHCMSAH01\CMSAH_DC7_MP1/data_ha_amr]
# Groups:   Species, Sepal.Length, Sepal.Width, Petal.Length
  Species Sepal.Length Sepal.Width Petal.Length Petal.Width string              
  <chr>          <dbl>       <dbl>        <dbl>       <dbl> <chr>               
1 setosa          4.30        3.00         1.10       0.100 setosa-4.3-3-1.1-0.1
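
The grouping and MAX in the query above are probably not needed for a row-wise concatenation. A minimal sketch of the simpler form follows, assuming dbplyr's simulated SQL Server backend (simulate_mssql) is an acceptable stand-in for the real channel connection. The two-column lazy_frame is a made-up substitute for ##iris, and the exact identifier quoting in the generated SQL depends on the dbplyr version.

```r
library(dplyr)
library(dbplyr)

# Stand-in for tbl(channel, '##iris'): a lazy table that only generates SQL
# and never touches a real database (assumption: simulated MSSQL backend).
iris_lazy <- lazy_frame(
  Species = "setosa",
  Sepal.Length = 4.3,
  con = simulate_mssql()
)

# CONCAT has no dbplyr translation, so it is passed through to SQL as-is.
query <- iris_lazy %>%
  mutate(string = CONCAT(Species, "-", Sepal.Length))

# Inspect the SQL that would be sent to the server.
rendered <- as.character(sql_render(query))
cat(rendered, "\n")
```

Against a real connection, the same mutate call would replace the group_by / summarise / MAX chain.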

1 Comment

Why are you using summarise and MAX? CONCAT alone seems to do what you need.

Using the tidyverse on R data.frames, tidyr::unite would be the idiomatic way to go.

Because it is not a dplyr verb, however, it has not been translated for use through dbplyr / SQL.

You can define your own unite this way for SQL Server (unfortunately I couldn't test it, but it should work):

unite.tbl <- function (data, col, ..., sep = "_", remove = TRUE) 
{
  dot_names <- sapply(substitute(list(...))[-1], deparse)
  shown_cols <- if (remove) 
    setdiff(data$ops$vars, dot_names)
  else data$ops$vars
  shown_col_str <- paste(shown_cols, collapse = ", ")
  concat_str <- paste0("CONCAT(",paste(dot_names, collapse = paste0(",'",sep,"',")),")")
  col <- deparse(substitute(col))
  subquery <- capture.output(show_query(data), type = "message")[-1] %>% paste(collapse = " ")
  # the trailing "q" aliases the derived table, which SQL Server requires
  query    <- paste("SELECT",shown_col_str,",",concat_str,"AS",col,"FROM (",subquery,") q")
  tbl(data$src$con, sql(query))
}

and then:

tbl(channel,'##iris') %>%
  unite(string,
        Species, Sepal.Length, Sepal.Width, Petal.Length, Petal.Width,
        sep = '',remove=FALSE)

For a DBMS that supports the || concatenation operator (e.g. Oracle), just replace the concat_str definition with:

concat_str <- paste(dot_names, collapse = paste0(" || '", sep, "' || "))
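
To make the difference between the two concat_str definitions concrete, here is a small sketch that builds both strings in plain R, with no database needed. The three column names and the '-' separator are illustrative values only.

```r
# Illustrative inputs (hypothetical; any column names would do).
dot_names <- c("Species", "Sepal.Length", "Sepal.Width")
sep <- "-"

# CONCAT() form, as used in the SQL Server version above.
concat_fn <- paste0(
  "CONCAT(",
  paste(dot_names, collapse = paste0(",'", sep, "',")),
  ")"
)

# || operator form, for DBMSs such as Oracle.
concat_op <- paste(dot_names, collapse = paste0(" || '", sep, "' || "))

print(concat_fn)
# [1] "CONCAT(Species,'-',Sepal.Length,'-',Sepal.Width)"
print(concat_op)
# [1] "Species || '-' || Sepal.Length || '-' || Sepal.Width"
```

Both strings are spliced into the SELECT list as-is, so column names are not quoted here.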

1 Comment

This should be the answer, and dbplyr should adopt this.

I had to make a couple of small modifications to @moodymudskipper's solution to get it to work. This was using Oracle.

unite.tbl <- function (data, col, ..., sep = "_", remove = TRUE) 
{
  # remove the list call
  dot_names <- sapply(substitute(...)[-1], deparse)
  shown_cols <- if (remove) 
    # replace $ops$vars with colnames
    setdiff(data %>% colnames(), dot_names)
  else data %>% colnames()
  shown_col_str <- paste(shown_cols, collapse = ", ")
  concat_str <- paste(dot_names, collapse = paste0(" || '", sep, "' || "))
  col <- deparse(substitute(col))
  # remove type arg
  subquery <- capture.output(show_query(data))[-1] %>% paste(collapse = " ")
  query    <- paste(
    "SELECT", shown_col_str, ",",
    concat_str, "AS", col,
    "FROM (",
    subquery,
    ")"
  )
  tbl(data$src$con, sql(query))
}

What I still need for my use case is a way to ignore NA and/or NULL values, as the na.rm argument of tidyr::unite does.

EDIT: And here is the version with an na.rm argument. I did need the list wrapper around `...` after all. This may need to be adapted for RDBMSs other than Oracle.

unite.tbl <- function (data, col, ..., sep = "_", remove = TRUE, na.rm = FALSE) 
{
  dot_names <- sapply(substitute(list(...))[-1], deparse)
  shown_cols <- data %>% colnames()
  shown_cols <- `if`(
    remove,
    setdiff(shown_cols, dot_names),
    shown_cols
  )
  shown_col_str <- paste(shown_cols, collapse = ", ")
  concat_str <- ifelse(
    na.rm,
    paste0(
      paste0(
        "NVL2(",
        dot_names%>% head(-1), ", ",
        dot_names%>% head(-1), " || '", sep, "'", 
        ", '')", 
        collapse = " || "
      ),
      " || NVL(", dot_names%>% tail(1), ", '')"
    ),
    paste0(dot_names, collapse = paste0(" || '", sep, "' || "))
  )
  col <- deparse(substitute(col))
  subquery <- capture.output(show_query(data))[-1] %>% paste(collapse = " ")
  query    <- paste(
    "SELECT", shown_col_str, ",",
    concat_str, "AS", col,
    "FROM (",
    subquery,
    ")"
  )
  tbl(data$src$con, sql(query))
}
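
To see what the na.rm = TRUE branch produces, the string-building step can be run on its own in plain R. This is only a sketch of that step: the column names a, b, c are placeholders, and base head / tail calls replace the pipes.

```r
# Placeholder column names and the default separator.
dot_names <- c("a", "b", "c")
sep <- "_"

init <- head(dot_names, -1)  # every column except the last
last <- tail(dot_names, 1)   # the last column

# Each leading column contributes "value || sep" only when it is not NULL;
# the last column contributes its value or an empty string.
concat_str <- paste0(
  paste0(
    "NVL2(", init, ", ", init, " || '", sep, "'", ", '')",
    collapse = " || "
  ),
  " || NVL(", last, ", '')"
)

print(concat_str)
# [1] "NVL2(a, a || '_', '') || NVL2(b, b || '_', '') || NVL(c, '')"
```

One consequence of this shape is that the separator is attached to each leading column rather than to the one after it, so a NULL in the last column appears to leave a trailing separator (for example x_y_), unlike tidyr::unite with na.rm = TRUE.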
