I'm trying the following code:

stest <- data.frame(group=c("John", "Jane", "James"), mean=c(3, 5, 1))
transform(stest, group = reorder(group, mean))

I expect the output to be ordered by mean. Instead, I get:

  group mean
1  John    3
2  Jane    5
3 James    1

That is, the same order as in the original data frame.

Am I missing something? How do I correctly order a data frame by one of its numeric variables?

The recommendations I've found suggest using reorder, but I can't make it work as expected. Could any loaded packages interfere?

  • Maybe I don't get what you're after, but wouldn't stest[order(stest$mean),] be sufficient? Commented Dec 2, 2013 at 14:28
  • @Chargaff Yep, it returns the right order, but when I'm trying to use this dataframe in ggplot, ggplot still plots it in the previous order. Commented Dec 2, 2013 at 14:38
  • @BlueMagister From the OP's last comment, it looks like it may actually be a dupe of stackoverflow.com/q/5208679/1317221 Commented Dec 2, 2013 at 14:44
  • @user1317221_G Agreed. I cannot change my close vote to that question, however - only retract the close vote altogether. At the least, the title is ambiguous enough to point to both questions. Commented Dec 2, 2013 at 14:47

3 Answers

From the documentation:

reorder is a generic function. The "default" method treats its first argument as a categorical variable, and reorders its levels based on the values of a second variable, usually numeric.

Note: reorder changes the order of the levels, not the values of the factor variable (group in your case).

Compare:

levels(stest$group)
[1] "James" "Jane"  "John" 

with

>  reorder(stest$group, c(1,2,3))
[1] John  Jane  James
attr(,"scores")
James  Jane  John 
    3     2     1 
Levels: John Jane James

EDIT 1

From your comment:

"@Chargaff Yep, it returns the right order, but when I'm trying to use this dataframe in ggplot, ggplot still plots it in the previous order."

it seems you do actually want to reorder levels for a ggplot. I suggest you do:

stest$group <- reorder(stest$group, stest$mean)

EDIT 2

RE your last comment that the above line of code has "no effect". Clearly it does:

> stest$group
[1] John  Jane  James
Levels: James Jane John         # <-------------------------------
> stest$group <- reorder(stest$group, stest$mean)              # |
> stest$group                                                  # |
[1] John  Jane  James                                          # |
attr(,"scores")                                                # | DIFFERENT :)
James  Jane  John                                              # |
    1     5     3                                              # | 
Levels: James John Jane        # <--------------------------------
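
For the ggplot case raised in the comments, here is a minimal sketch of why the level order matters: a discrete axis is laid out in level order, not row order. It assumes group is stored as a factor (the data.frame default before R 4.0.0, forced here with stringsAsFactors = TRUE) and that ggplot2 is installed; the plotting step is skipped otherwise. The names sorted_rows and p are only illustrative.

```r
stest <- data.frame(group = c("John", "Jane", "James"),
                    mean = c(3, 5, 1),
                    stringsAsFactors = TRUE)  # assumption: factor column

# Sorting the rows does not touch the factor levels,
# so a plot of sorted_rows would still use the alphabetical level order.
sorted_rows <- stest[order(stest$mean), ]
levels(sorted_rows$group)

# Reordering the levels does not touch the row order.
stest$group <- reorder(stest$group, stest$mean)
levels(stest$group)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  # The x axis follows the level order set by reorder() above.
  p <- ggplot2::ggplot(stest, ggplot2::aes(x = group, y = mean)) +
    ggplot2::geom_col()
  # print(p) would draw the plot.
}
```

The same effect can usually be had inline with aes(x = reorder(group, mean), y = mean), which leaves the data frame itself unchanged.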

8 Comments

Sorry, I didn't understand the difference. The docs say that it reorders the levels, so why isn't Jane, with 5, at the top or bottom?
Look at my example. Your original levels are in the order "James" "Jane" "John". I reordered them using the scores 1, 2, 3, so now the levels, not the data in the column, are John Jane James. Perhaps you should read ?levels
I've tried levels(stest$group) <- reorder(stest$group, stest$mean) on the initial data, and levels(stest$group) returned the same result: "John" "Jane" "James". Can you help me understand why this happens?
Use stest$group <- reorder(stest$group, stest$mean)
Only paste the code that follows the > prompt. At this point I give up.

I think you want the order function, which returns an index, not reorder, which is used to change the order of factor levels. This would do it:

> stest[order(stest$mean),]
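
As a rough sketch of what that line does: order returns the row positions that would sort the column, and indexing the data frame with those positions rearranges the rows. The names idx, ascending and descending are only illustrative.

```r
stest <- data.frame(group = c("John", "Jane", "James"), mean = c(3, 5, 1))

# Row positions that sort the mean column in increasing order.
idx <- order(stest$mean)

# Index the rows with those positions to get a sorted data frame.
ascending <- stest[idx, ]

# decreasing = TRUE sorts from largest to smallest instead.
descending <- stest[order(stest$mean, decreasing = TRUE), ]
```

Note that this changes only the row order; if group is a factor, its levels stay as they were.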

I've found my mistake thanks to user1317221_G and others.

The correct code that would order my dataset is:

stest$group <- reorder(stest$group, stest$mean, FUN=identity)

While

stest$group <- reorder(stest$group, stest$mean)

didn't order my dataframe. Not sure why FUN = mean didn't work, but I had to specify identity.

A possible reason is described in this question: Reordering factor gives different results, depending on which packages are loaded
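
One way to check whether another package is masking reorder is to call the stats version explicitly. This is only a sketch, assuming a factor column and the stats package that ships with R; with a single value per group, the default FUN = mean and FUN = identity should then produce the same level order. The names by_mean and by_identity are illustrative.

```r
stest <- data.frame(group = c("John", "Jane", "James"),
                    mean = c(3, 5, 1),
                    stringsAsFactors = TRUE)  # assumption: factor column

# Call the stats implementation directly, bypassing any masking package.
by_mean <- stats::reorder(stest$group, stest$mean)
by_identity <- stats::reorder(stest$group, stest$mean, FUN = identity)

# With one row per group, the mean of a single value is the value itself,
# so both calls sort the levels the same way.
identical(levels(by_mean), levels(by_identity))
```

If plain reorder(...) gives a different result from stats::reorder(...) in the same session, that would point to a loaded package supplying its own reorder method.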

UPDATE

The first line of code alone is not enough. reorder does not coerce its second argument to a factor, so the final ordering may be incomplete (e.g., higher values below lower values in descending order).

Therefore, to be sure you have the right order:

stest$group <- reorder(stest$group, as.factor(stest$mean), FUN=identity)
