
I have data frames A, B, C, ... and want to modify each of them in the same way, e.g. by re-ordering the levels of a factor that is present in all of the data frames:

A = data.frame( x=c('x','x','y','y','z','z') )
B = data.frame( x=c('x','y','z') )
C = data.frame( x=c('x','x','x','y','y','y','z','z','z') )

A$x = factor( A$x, levels=c('z','y','x') )
B$x = factor( B$x, levels=c('z','y','x') )
C$x = factor( C$x, levels=c('z','y','x') )

This gets laborious if there are lots of data frames and/or lots of modifications to be done. How can I do it concisely, using a loop or something better? A straightforward approach like

for ( D in list( A, B, C ) ) {
    D$x = factor( D$x, levels=c('z','y','x') )
}

does not work, because D is only a copy of each list element, so the original data frames are never modified.

EDIT: added definitions of A, B, and C to make it reproducible.

  • Could you provide a reproducible example? Commented Nov 2, 2013 at 3:14
  • Definitions of A, B, and C have been added so that you can run the code. Commented Nov 2, 2013 at 3:26
  • Thanks. I know it can be annoying, especially when the situation is obvious, but it is good practice and makes our lives easier :) Commented Nov 2, 2013 at 6:31

2 Answers


One thing to note about R is that assignment with <- can be chained: it is right-associative and returns the assigned value, so the rightmost value is assigned to each target in turn. Thus, if your data frames are all the same in this respect, you should be able to do something like this:

A$x <- B$x <- C$x <- factor( C$x, levels=c('z','y','x') )
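To illustrate how a chained assignment is evaluated, here is a minimal sketch with two data frames of equal length. The names P and Q are made up for this example. Note that every target receives the same value, so the column in P is overwritten with the values from Q:

```r
P <- data.frame(x = c('x', 'y', 'z'))
Q <- data.frame(x = c('z', 'y', 'x'))

# Evaluated right to left: Q$x is re-levelled first,
# then that same factor is assigned to P$x
P$x <- Q$x <- factor(Q$x, levels = c('z', 'y', 'x'))

identical(P$x, Q$x)   # both columns now hold Q's values
```

This is why the approach only fits when the column is meant to be identical in all data frames.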

2 Comments

Thanks for the reply. Unfortunately it will not work for my case (I think) because the data frames all have different numbers of rows. I'll modify my example to make this clear.
This is why the reproducible example is necessary.

If you don't need an explicit loop, you can use lapply:

ll <- lapply(
    list(A, B, C),
    function(df) {
        df$x <- factor(df$x, levels=c('z', 'y', 'x'))
        return(df)
    }
)

Since lapply works on copies of the data frames, the originals are left untouched, so you'll have to use the list returned by lapply.
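One way to make the returned list easier to work with is to name its elements, so each modified data frame can still be reached under its original name. A minimal sketch, assuming the data frames from the question; the helper reorder_x and the list name dfs are illustrative choices, not part of the original answer:

```r
A <- data.frame(x = c('x', 'x', 'y', 'y', 'z', 'z'))
B <- data.frame(x = c('x', 'y', 'z'))
C <- data.frame(x = c('x', 'x', 'x', 'y', 'y', 'y', 'z', 'z', 'z'))

# Hypothetical helper: re-level column x and return the modified copy
reorder_x <- function(df) {
    df$x <- factor(df$x, levels = c('z', 'y', 'x'))
    df
}

# lapply keeps the names of a named list, so the results stay labelled
dfs <- lapply(list(A = A, B = B, C = C), reorder_x)

levels(dfs$A$x)   # levels of the modified copy
levels(A$x)       # the original A is unchanged
```

Subsequent code would then refer to dfs$A, dfs$B and dfs$C instead of A, B and C.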

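Edit: to keep referring to the data frames by their original names, you can look them up by name and assign the modified copies back: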

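dfs <- list('A', 'B', 'C')
# Named new_levels to avoid masking the base function levels()
new_levels <- c('z', 'y', 'x')

l <- lapply(
    dfs,
    function(name) {
        # Get data frame by name
        df <- get(name)
        df$x <- factor(df$x, levels=new_levels)
        return(df)
    }
)

# Assign each modified copy back to its original name
for (i in seq_along(dfs)) {
    assign(dfs[[i]], l[[i]])
}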
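A possible alternative to the explicit assign loop is to collect the data frames with mget, which returns a named list, and to write the results back with list2env. This is only a sketch, assuming the data frames live in the current environment; env, new_levels and modified are names chosen for this example:

```r
A <- data.frame(x = c('x', 'x', 'y', 'y', 'z', 'z'))
B <- data.frame(x = c('x', 'y', 'z'))
C <- data.frame(x = c('x', 'x', 'x', 'y', 'y', 'y', 'z', 'z', 'z'))

env <- environment()   # at top level this is the global environment
new_levels <- c('z', 'y', 'x')

# mget returns a named list, so the names A, B, C travel with the data
modified <- lapply(mget(c('A', 'B', 'C'), envir = env), function(df) {
    df$x <- factor(df$x, levels = new_levels)
    df
})

# Overwrite A, B and C in env with the modified copies
invisible(list2env(modified, envir = env))

levels(A$x)   # A itself now carries the new level order
```

Like the assign loop, this overwrites the variables in place, so it is worth keeping the list of names short and explicit.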

3 Comments

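If you leave out return(df), the function returns the value of the assignment (the factor) instead of the data frame, so you will not get data frames back as list elements.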
This is OK, but I would like a way to modify the original data frames, or more precisely, I want to continue referring to them by their original names. Is there an easy way to get that result using the output of this solution?
I've posted an edit with an example solution, but I cannot say I like it.
