I have a dataset that looks like this:

DT <- data.frame( rnorm(5),
              rnorm(5),
              rnorm(5),
              rnorm(5),
              rnorm(5),
              rnorm(5))
names(DT) = c('a1[1]','a1[2]','a1[3]','a2[1]','a2[2]','a2[3]')
str(DT)

I would like to create new columns like:

diffa1 = a1[1] - a2[1] 
diffa2 = a1[2] - a2[2]
diffa3 = a1[3] - a2[3]

Is there any way to do this without manually writing out a mutate for each index in the brackets? I have a1[1] up to a1[100], a2[1] up to a2[100], etc. Thanks!

  • I'd avoid square brackets in column names. Use something like a1_1. And you can assign names in data.frame, no need to add them later: data.frame(a1_1 = rnorm(5)). For adding columns look at dplyr::mutate. Commented Aug 15, 2018 at 0:08
  • 1. Thank you for pointing that out. The data was imported from elsewhere and the brackets were already in the column names, so I had to deal with them. 2. To replicate what the column names actually looked like, I had to assign the bracketed names afterwards, which cannot be done in the data.frame command by default (they would be coerced to a1.1.). Commented Aug 15, 2018 at 0:20
  • dplyr::rename is another friend for those issues. Basically, if you're dealing with data frames, it's worth getting to know dplyr. Commented Aug 15, 2018 at 0:22

2 Answers

1

We can use lapply to loop over the bracketed indices in the column names.

diffa <- as.data.frame(lapply(1:3, function(x){
  DT[paste0("a1[", x, "]")] - DT[paste0("a2[", x, "]")] 
}))
diffa
#        a1.1.      a1.2.       a1.3.
# 1  0.9160836 -0.3508354  0.04981186
# 2  0.7397111  1.9147110 -1.47307780
# 3  0.6889159 -0.7672135 -4.24234927
# 4 -0.2701030 -1.3199004  2.55248732
# 5  1.2267170 -2.0815192 -1.97941609
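
The result above keeps names derived from the a1 columns. To get the names the question asks for (diffa1, diffa2, ...) and attach them to DT, the same loop can be adapted. This is only a minimal sketch: idx and DT_with_diffs are hypothetical names introduced here, and it assumes every index in idx exists for both a1 and a2.

```r
set.seed(158)
DT <- data.frame(rnorm(5), rnorm(5), rnorm(5), rnorm(5), rnorm(5), rnorm(5))
names(DT) <- c("a1[1]", "a1[2]", "a1[3]", "a2[1]", "a2[2]", "a2[3]")

# Indices inside the brackets; 1:100 for the full data (assumption)
idx <- 1:3

# DT[[name]] extracts a plain numeric vector, so each list element is a vector
diffs <- lapply(idx, function(i) {
  DT[[paste0("a1[", i, "]")]] - DT[[paste0("a2[", i, "]")]]
})
names(diffs) <- paste0("diffa", idx)

# cbind() on data frames leaves the bracketed names of DT untouched
DT_with_diffs <- cbind(DT, as.data.frame(diffs))
names(DT_with_diffs)
```

Using `[[` instead of `[` avoids carrying the original column name into each element, which is why the new names can be set directly on the list.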

Or use grep to select the columns, create two data frames, and then subtract one from the other.

DT1 <- DT[grep("^a1", names(DT))]
DT2 <- DT[grep("^a2", names(DT))]
diffa <- DT1 - DT2
diffa
#        a1[1]      a1[2]       a1[3]
# 1  0.9160836 -0.3508354  0.04981186
# 2  0.7397111  1.9147110 -1.47307780
# 3  0.6889159 -0.7672135 -4.24234927
# 4 -0.2701030 -1.3199004  2.55248732
# 5  1.2267170 -2.0815192 -1.97941609
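
Subtracting two data frames works position by position, so the approach above relies on the a1 and a2 columns appearing in the same order. Also, the pattern "^a1" would match a column such as a10[1] if one existed. A hedged sketch that pairs columns by their bracketed index instead is shown below; the shuffled column order and the names a1_cols, a2_cols and idx are made up for illustration.

```r
set.seed(158)
DT <- data.frame(matrix(rnorm(30), nrow = 5))
# Deliberately shuffled column order (illustrative assumption)
names(DT) <- c("a2[3]", "a1[1]", "a2[1]", "a1[3]", "a1[2]", "a2[2]")

# startsWith() compares literally, so "[" needs no escaping
a1_cols <- names(DT)[startsWith(names(DT), "a1[")]

# Pull the index out of each a1 name and build the matching a2 name
idx <- sub("^a1\\[([0-9]+)\\]$", "\\1", a1_cols)
a2_cols <- paste0("a2[", idx, "]")
stopifnot(all(a2_cols %in% names(DT)))

# Both selections are now in the same index order
diffa <- DT[a1_cols] - DT[a2_cols]
names(diffa) <- paste0("diffa", idx)
diffa
```

Because a2_cols is built from a1_cols, each difference column pairs the same index regardless of how the columns were ordered on import.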

DATA

set.seed(158)

DT <- data.frame( rnorm(5),
                  rnorm(5),
                  rnorm(5),
                  rnorm(5),
                  rnorm(5),
                  rnorm(5))
names(DT) = c('a1[1]','a1[2]','a1[3]','a2[1]','a2[2]','a2[3]')

1

Here is another option, using map2 to subtract the corresponding columns:

library(tidyverse)
map2_df(DT %>% 
          select(matches("a1")),
        DT %>% 
          select(matches("a2")), `-`)
