
I have, for example, these three datasets (in my real case there are many more, with a lot of variables):

data_frame1 <- data.frame(a=c(1,5,3,3,2), b=c(3,6,1,5,5), c=c(4,4,1,9,2))
data_frame2 <- data.frame(a=c(6,0,9,1,2), b=c(2,7,2,2,1), c=c(8,4,1,9,2))
data_frame3 <- data.frame(a=c(0,0,1,5,1), b=c(4,1,9,2,3), c=c(2,9,7,1,1))

To each data frame I want to add a variable that results from transforming an existing variable in that data frame. I would like to do this with a loop. For example:

datasets <- c("data_frame1","data_frame2","data_frame3")
vars <- c("a","b","c")
for (i in datasets){
    for (j in vars){
        # here I need code that creates a new variable with the transformed values
        # I thought this would work, but it didn't...
        get(i)$new_var <- log(get(i)[,j])
    }
}

Do you have any suggestions for how to do this?

It would also be great if the new column names (in this case new_var) could be assigned from a character string, so that I could create the new variables with another for loop nested inside the other two.

I hope my explanation of the problem was not too confusing.

Thanks in advance.

  • Thanks. Could you also explain the other method? Commented Jan 20, 2013 at 21:52
  • I read your deleted comment. You said that there is a less complicated method to do this. Commented Jan 20, 2013 at 21:57
  • No, a log for some columns, and other transformations for other columns... Commented Jan 20, 2013 at 21:58

2 Answers

You can put your data frames in a list and use lapply to process them one by one, so there is no need for an explicit loop in this case.

For example, you can do this:

data_frame1 <- data.frame(a=c(1,5,3,3,2), b=c(3,6,1,5,5), c=c(4,4,1,9,2))
data_frame2 <- data.frame(a=c(6,0,9,1,2), b=c(2,7,2,2,1), c=c(8,4,1,9,2))
data_frame3 <- data.frame(a=c(0,0,1,5,1), b=c(4,1,9,2,3), c=c(2,9,7,1,1))

ll <- list(data_frame1,data_frame2,data_frame3)
lapply(ll,function(df){
  df$log_a <- log(df$a)          ## new column with the log of a
  df$tans_col <- df$a+df$b+df$c  ## new column with the sum of some columns,
                                 ##   or any other transformation
  ###  .....
  df                             ## return the modified data frame
})

The first element of the returned list (data_frame1 with the new columns) is:

[[1]]
  a b c     log_a tans_col
1 1 3 4 0.0000000        8
2 5 6 4 1.6094379       15
3 3 1 1 1.0986123        5
4 3 5 9 1.0986123       17
5 2 5 2 0.6931472        9
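
The question also asks for new column names built from character strings. A minimal sketch of one way to do that, assuming the same transformation (log) is wanted for every column listed in `vars`; the `log_` prefix is a hypothetical naming choice, not something from the question. The point is that `[[` accepts a string where `$` needs a literal name:

```r
data_frame1 <- data.frame(a=c(1,5,3,3,2), b=c(3,6,1,5,5), c=c(4,4,1,9,2))
data_frame2 <- data.frame(a=c(6,0,9,1,2), b=c(2,7,2,2,1), c=c(8,4,1,9,2))

vars <- c("a", "b", "c")
ll <- list(data_frame1, data_frame2)

ll <- lapply(ll, function(df) {
  for (j in vars) {
    new_name <- paste0("log_", j)    # hypothetical naming scheme
    df[[new_name]] <- log(df[[j]])   # [[ ]] takes the column name as a string
  }
  df                                 # return the modified copy
})

names(ll[[1]])
```

Different transformations per column could be handled the same way, for instance by looping over a named list of functions instead of a plain vector of names.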

5 Comments

@agstudy I tried your solution on my data. I notice that it doesn't actually write the new variables. In your example, log_a and tans_col are not inserted into the data frames. I'm certainly getting something wrong...
@this.is.not.a.nick In this case, using $ creates the new variable: df$log_a creates a variable named log_a. What have you tried with your data?
@agstudy Yes, but if, following your example, I then type ll[[1]]$log_a, R returns NULL.
@this.is.not.a.nick That is expected: R does the transformation on a copy of ll, so you need to do something like ll <- lapply(ll,function(df)... to change the value of your list.
...as I imagined, my problem was a silly one. :) Thanks a lot, @agstudy.
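
The point made in the comments above can be checked with a short sketch. `add_log` is a hypothetical helper name used only for this illustration; it assumes a single data frame with a column `a`:

```r
data_frame1 <- data.frame(a=c(1,5,3,3,2), b=c(3,6,1,5,5), c=c(4,4,1,9,2))
ll <- list(data_frame1)

# Hypothetical helper: adds a log_a column and returns the modified copy
add_log <- function(df) {
  df$log_a <- log(df$a)
  df
}

# Calling lapply without assigning the result leaves ll untouched
invisible(lapply(ll, add_log))
missing_before <- is.null(ll[[1]]$log_a)

# Assigning the result back replaces the list with the modified copies
ll <- lapply(ll, add_log)
missing_after <- is.null(ll[[1]]$log_a)
```

Here `missing_before` is TRUE and `missing_after` is FALSE, which matches the NULL reported in the comment thread.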

I had the same need and also wanted to change the columns in my actual list of dataframes.

I found a great method here (the purrr::map2 method in the question works for dataframes with different columns), followed by

list2env(list_of_dataframes, .GlobalEnv)
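
A minimal sketch of how this could fit together with base R only (it does not use the purrr approach mentioned above). `list2env` needs a named list, because the element names become the variable names; the list contents and the `log_a` transformation here are assumptions for illustration:

```r
data_frame1 <- data.frame(a=c(1,5,3,3,2), b=c(3,6,1,5,5), c=c(4,4,1,9,2))
data_frame2 <- data.frame(a=c(6,0,9,1,2), b=c(2,7,2,2,1), c=c(8,4,1,9,2))

# The names of the list elements decide which variables get overwritten
list_of_dataframes <- list(data_frame1 = data_frame1,
                           data_frame2 = data_frame2)

# lapply keeps the element names on its result
list_of_dataframes <- lapply(list_of_dataframes, function(df) {
  df$log_a <- log(df$a)
  df
})

# Write each element back to the global environment under its name
list2env(list_of_dataframes, envir = .GlobalEnv)
```

After this, `data_frame1` and `data_frame2` in the global environment carry the new `log_a` column. Overwriting globals this way is convenient interactively, but keeping the data frames in the list is usually easier to reason about.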
