What is the idiomatic way to do the following string concatenation in R?

Given two vectors of strings, such as the following,

titles <- c("A", "B")
sub.titles <- c("x", "y", "z")

I want to produce the vector

full.titles <- c("A_x", "A_y", "A_z", "B_x", "B_y", "B_z")

Obviously, this could be done with two for-loops. However, I would like to know what an “idiomatic” (i.e., elegant and natural) solution would be in R.

In Python, an idiomatic solution might look like this:

titles = ['A', 'B']
subtitles = ['x', 'y', 'z']
full_titles = ['_'.join([title, subtitle])
               for title in titles for subtitle in subtitles]

Does R allow for a similar degree of expressiveness?

Remark

The consensus among the solutions proposed thus far is that the idiomatic way to do this in R is, basically,

full.titles <- c(t(outer(titles, sub.titles, paste, sep = "_")))

Interestingly, this has an (almost) literal translation in Python:

full_titles = map('_'.join, product(titles, subtitles))

where product is the cartesian-product function from the itertools module. However, in Python, such a use of map is considered more convoluted—that is, less expressive—than the equivalent use of list comprehension, as above.

  • @brittenb that does not produce the vector as required in the question. Commented Apr 19, 2016 at 11:23
  • @zacdav Yeah, I'm looking at the output now and am confused as to why it's not producing the expected output. I'll delete the comment. Commented Apr 19, 2016 at 11:24
  • Somewhat direct translation: mapply(function(x, y) sprintf("%s_%s", x, y), rep(titles, each = length(sub.titles)), sub.titles) Commented Apr 19, 2016 at 11:43
  • R is more "colorful" than Python in this example... I wonder how many different ways everyone can think of... Commented Apr 19, 2016 at 11:45
  • stackoverflow.com/questions/16143700/… Commented Apr 19, 2016 at 11:46

6 Answers

There are a couple of ways to go about this. One is to use outer() to apply paste to every pairing of elements from the two vectors (a generalized outer product), along the lines of:

outer(titles, sub.titles, paste, sep='_')

and then flatten the resulting matrix into a vector. The other is to build a data frame of all combinations with expand.grid() and paste its columns together:

do.call(paste, c(expand.grid(titles, sub.titles, stringsAsFactors=FALSE), sep='_'))
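
To make the "flatten the matrix" step concrete, here is a minimal sketch of how the element order depends on whether the matrix is transposed first. The names m, by.column and by.row are only illustrative and do not come from the answer.

```r
titles <- c("A", "B")
sub.titles <- c("x", "y", "z")

# outer() returns a 2 x 3 character matrix:
# rows follow titles, columns follow sub.titles
m <- outer(titles, sub.titles, paste, sep = "_")

# c() flattens column by column, so titles vary fastest
by.column <- c(m)

# transposing first makes sub.titles vary fastest,
# which is the order the question asks for
by.row <- c(t(m))

print(by.column)
print(by.row)
```

Only by.row matches the full.titles vector from the question, because R stores matrices in column-major order.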

2 Comments

You can probably wrap it into c as in c(outer(titles, sub.titles, paste, sep='_'))
Elegant. This almost produces the correct output. Unfortunately, the resulting components are in the wrong order. (I've updated the question to clarify this.) A transpose fixes this: c(t(outer(...)))
Using do.call combined with paste and expand.grid

sort(do.call(paste, c(sep='_', expand.grid(titles, sub.titles))))
#[1] "A_x" "A_y" "A_z" "B_x" "B_y" "B_z"

Or using tidyr::unite combined with expand.grid

unite(expand.grid(titles, sub.titles), Res, everything()) %>% .$Res

2 Comments

Sure it's base R ;) But I tend to use Curry quite a lot. I'm also looking for an elegant one-liner using tidyr, but unite_(expand.grid(titles, sub.titles), everything()) does not seem to work.
you can post it as an answer since you solved my question.
apply(expand.grid(titles, sub.titles), 1, paste, collapse = "_")

expand.grid creates a data frame containing every combination of titles and sub.titles.
apply goes through that data frame row by row and pastes each row's elements together.
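
As a sketch of a variation on the same idea: expand.grid varies its first argument fastest, so listing sub.titles first gives the order requested in the question, and because paste is vectorised the row-wise apply can be skipped. The column names sub and title are hypothetical labels chosen for this example.

```r
titles <- c("A", "B")
sub.titles <- c("x", "y", "z")

# expand.grid() varies its first argument fastest, so sub.titles goes first
grid <- expand.grid(sub = sub.titles, title = titles, stringsAsFactors = FALSE)

# paste() works on whole columns at once, so no apply() over rows is needed
full.titles <- paste(grid$title, grid$sub, sep = "_")
print(full.titles)
```

Swapping the two arguments back gives the same set of strings in the order "A_x", "B_x", "A_y", and so on.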

Try this code:

unlist(lapply(seq_along(titles), function(i) paste(titles[i], sub.titles, sep = "_")))

This code also works: as.vector(t(outer(titles, sub.titles, FUN=paste, sep="_")))

outer applies a function to every pairing of an element from the first vector with an element from the second, and returns the results as a matrix. The default function is multiplication, but we override that default by passing paste to the FUN parameter. Any further arguments, such as sep here, are passed on to that function. So we're telling R to paste each element of titles together with each element of sub.titles, separated by "_". Because as.vector reads the matrix column by column, t() transposes it first so that all the "A" titles come before the "B" titles.
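
The following is a rough sketch of what outer does with FUN. It is an illustration of the idea rather than the actual source of outer, and the names x, y, manual and via.outer are made up for this example. Both inputs are expanded to the full set of pairings, the function is called once on the expanded vectors, and the result is reshaped into a matrix.

```r
titles <- c("A", "B")
sub.titles <- c("x", "y", "z")

# expand both inputs so that position i holds the i-th pairing
x <- rep(titles, times = length(sub.titles))  # "A" "B" "A" "B" "A" "B"
y <- rep(sub.titles, each = length(titles))   # "x" "x" "y" "y" "z" "z"

# one vectorised call to paste(), then reshape to length(titles) rows
manual <- matrix(paste(x, y, sep = "_"), nrow = length(titles))

via.outer <- outer(titles, sub.titles, FUN = paste, sep = "_")
print(all(manual == via.outer))
```

This is also why the function passed as FUN has to be vectorised: it receives whole vectors, not one pair at a time.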

full.titles <- paste0(expand.grid(titles, sub.titles)$Var1, '_',
                      expand.grid(titles, sub.titles)$Var2)
> full.titles
[1] "A_x" "B_x" "A_y" "B_y" "A_z" "B_z"
