Add a column in a list of data frames

Question

I want to add a column to each of the data frames in my list, table, after running this code:

#list of my dataframes
df <- list(df1,df2,df3,df4)

#compute stats
stats <- function(d) {
    do.call(rbind, lapply(split(d, d[, 2]), function(x) {
        data.frame(Nb = length(x$Year), Mean = mean(x$A), SD = sd(x$A))
    }))
}

#Apply to list of dataframes
table <- lapply(df, stats)

This column, which I'll call Source, should hold the name of the data frame that each row was computed from, alongside the Nb, Mean and SD variables. So Source should contain df1, df1, df1, ... for table[[1]], and so on.

Is there any way I can add it to my code above?

Thomas · Accepted Answer · 2014-06-23 13:38:16Z

Here's a different way of doing things:

First, let's start with some reproducible data:

set.seed(1)
n = 10
dat <- list(data.frame(a=rnorm(n), b=sample(1:3,n,TRUE)),
            data.frame(a=rnorm(n), b=sample(1:3,n,TRUE)),
            data.frame(a=rnorm(n), b=sample(1:3,n,TRUE)),
            data.frame(a=rnorm(n), b=sample(1:3,n,TRUE)))

Then, you want a function that adds columns to a data.frame. The obvious candidate is within. The particular things you want to calculate are constant values for each observation within a particular category. To do that, use ave for each of the columns you want to add. Here's your new function:

stat <- function(d){
    within(d, {
        Nb = ave(a, b, FUN=length)
        Mean = ave(a, b, FUN=mean)
        SD = ave(a, b, FUN=sd)
    })
}
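
To see what ave is doing inside that function, here is a small standalone sketch on made-up vectors (the names a, b, grp_mean and grp_n are only for illustration). It returns a vector as long as its input, with every element replaced by the statistic of the group it belongs to:

```r
# Toy data: three observations in group 1, two in group 2
a <- c(1, 2, 3, 10, 20)
b <- c(1, 1, 1, 2, 2)

# ave() computes FUN per group of b and repeats the result
# for every observation in that group
grp_mean <- ave(a, b, FUN = mean)
grp_n    <- ave(a, b, FUN = length)

grp_mean  # 2 2 2 15 15
grp_n     # 3 3 3 2 2
```

Note that FUN has to be passed by name, because every unnamed argument after the first is treated as a grouping variable.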

Then just lapply it to your list of data.frames:

lapply(dat, stat)

As you can see, columns are added as appropriate:

> str(lapply(dat, stat))
List of 4
 $ :'data.frame':       10 obs. of  5 variables:
  ..$ a   : num [1:10] -0.626 0.184 -0.836 1.595 0.33 ...
  ..$ b   : int [1:10] 3 1 2 1 1 2 1 2 3 2
  ..$ SD  : num [1:10] 0.85 0.643 0.738 0.643 0.643 ...
  ..$ Mean: num [1:10] -0.0253 0.649 -0.3058 0.649 0.649 ...
  ..$ Nb  : num [1:10] 2 4 4 4 4 4 4 4 2 4
 $ :'data.frame':       10 obs. of  5 variables:
  ..$ a   : num [1:10] -0.0449 -0.0162 0.9438 0.8212 0.5939 ...
  ..$ b   : int [1:10] 2 3 2 1 1 1 1 2 2 2
  ..$ SD  : num [1:10] 1.141 NA 1.141 0.136 0.136 ...
  ..$ Mean: num [1:10] -0.0792 -0.0162 -0.0792 0.7791 0.7791 ...
  ..$ Nb  : num [1:10] 5 1 5 4 4 4 4 5 5 5
 $ :'data.frame':       10 obs. of  5 variables:
  ..$ a   : num [1:10] 1.3587 -0.1028 0.3877 -0.0538 -1.3771 ...
  ..$ b   : int [1:10] 2 3 2 1 3 1 3 1 1 1
  ..$ SD  : num [1:10] 0.687 0.668 0.687 0.635 0.668 ...
  ..$ Mean: num [1:10] 0.873 -0.625 0.873 0.267 -0.625 ...
  ..$ Nb  : num [1:10] 2 3 2 5 3 5 3 5 5 5
 $ :'data.frame':       10 obs. of  5 variables:
  ..$ a   : num [1:10] -0.707 0.365 0.769 -0.112 0.881 ...
  ..$ b   : int [1:10] 3 3 2 2 1 1 3 1 2 2
  ..$ SD  : num [1:10] 0.593 0.593 1.111 1.111 0.297 ...
  ..$ Mean: num [1:10] -0.318 -0.318 0.24 0.24 0.54 ...
  ..$ Nb  : num [1:10] 3 3 4 4 3 3 3 3 4 4

That's a clever and easy way to do it! Actually, the results I got are OK; I just wanted to add another column to my final list. All I want is another column containing the name of the data frame the statistics were calculated from. Can you please show me how to do that with your example?
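
One possible way to get that Source column, sketched here under the assumption that the list elements are given names (df1 and df2 below are placeholders taken from the question, and only two data frames are used to keep it short): name the list, then use Map to walk over the data frames and their names together.

```r
set.seed(1)
n <- 10
# Assumption: the list is named, so each data frame carries its own label
dat <- list(df1 = data.frame(a = rnorm(n), b = sample(1:3, n, TRUE)),
            df2 = data.frame(a = rnorm(n), b = sample(1:3, n, TRUE)))

# Same per-group statistics as in the answer above
stat <- function(d) {
    within(d, {
        Nb   <- ave(a, b, FUN = length)
        Mean <- ave(a, b, FUN = mean)
        SD   <- ave(a, b, FUN = sd)
    })
}

# Map() pairs each data frame with its name; the name is
# recycled down the rows as a constant Source column
res <- Map(function(d, nm) {
    out <- stat(d)
    out$Source <- nm
    out
}, dat, names(dat))

str(res)
```

The same Map pattern should also work with the stats function from the question, since it only relies on the list having names. If a single table is wanted afterwards, do.call(rbind, res) would stack the pieces while keeping Source to tell them apart.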
