I am writing a for loop with an if statement in R. My table looks like this:

ID  category
1   a
1   b
1   c
2   a
2   b
3   a
3   b
4   a
5   a

I want to use a for loop with an if statement to add another column that numbers the rows within each ID group, like the Count column below:

ID  category  Count
1   a   1
1   b   2
1   c   3
2   a   1
2   b   2
3   a   1
3   b   2
4   a   1
5   a   1

My code is (output is the table name):

for (i in 2:nrow(output1)){
  if(output1[i,1] == output[i-1,1]){
    output1[i,"rn"]<- output1[i-1,"rn"]+1
  } 

  else{
     output1[i,"rn"]<-1
   } 

}

But in the result, every value in the Count column is 1:

ID  category    Count
1   a   1
1   b   1
1   c   1
2   a   1
2   b   1
3   a   1
3   b   1
4   a   1
5   a   1

Please help me out... Thanks

3
  • There are functions that can do this operation quickly, but it is always good to practice logical control flow with a loop. Try adding output1$rn <- 1 before the loop. Commented Sep 17, 2015 at 17:23
  • Try grouping by ID and counting rows: library(dplyr); dat %>% group_by(ID) %>% mutate(Count = 1:n()) Commented Sep 17, 2015 at 17:24
  • You need just base R to do this; see my answer. Commented Sep 17, 2015 at 17:42

3 Answers

3

There are packages and vectorized ways to do this task, but if you are practicing with loops try:

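# Create the counter column first so that row i - 1 always has a value to read
output1$rn <- 1
for (i in 2:nrow(output1)) {
  if (output1[i, 1] == output1[i - 1, 1]) {
    # Same ID as the previous row: continue the running count
    output1[i, "rn"] <- output1[i - 1, "rn"] + 1
  } else {
    # New ID: restart the count
    output1[i, "rn"] <- 1
  }
}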

In your original code, the expression output1[i-1,"rn"]+1 in the third line of the loop referenced a column that did not exist yet on the first pass. Creating the column first and filling it with the value 1 gives the loop something explicit to refer to.

output1
#   ID category rn
# 1  1        a  1
# 2  1        b  2
# 3  1        c  3
# 4  2        a  1
# 5  2        b  2
# 6  3        a  1
# 7  3        b  2
# 8  4        a  1
# 9  5        a  1
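
For anyone who wants to run the loop end to end, here is a minimal self-contained sketch. The data.frame() call is only my assumption about how the table in the question was built, and seq_len(nrow(output1))[-1] is swapped in for 2:nrow(output1) so that a table with a single row does not make the loop run backwards.

```r
# Sample data rebuilt from the question (assumed construction)
output1 <- data.frame(
  ID = c(1, 1, 1, 2, 2, 3, 3, 4, 5),
  category = c("a", "b", "c", "a", "b", "a", "b", "a", "a")
)

# Initialise the counter so every row, including row 1, has a value
output1$rn <- 1L

# seq_len(n)[-1] is 2, 3, ..., n and is empty when n is 0 or 1,
# whereas 2:n would count down (2, 1) for a one-row table
for (i in seq_len(nrow(output1))[-1]) {
  if (output1$ID[i] == output1$ID[i - 1]) {
    # Same ID as the previous row: continue the running count
    output1$rn[i] <- output1$rn[i - 1] + 1L
  } else {
    # New ID: restart the count
    output1$rn[i] <- 1L
  }
}

print(output1)
```

Note that this loop, like the one above, only compares each row with the row directly before it, so it assumes rows with the same ID sit next to each other.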

With the package dplyr you can accomplish it quickly with:

library(dplyr)
output1 %>% group_by(ID) %>% mutate(rn = 1:n())

Or with data.table:

library(data.table)
setDT(output1)[,rn := 1:.N, by=ID]

With base R you can also use:

# ave() writes its results back into the first argument, so pass integers to get integer counts
output1$rn <- with(output1, ave(seq_along(ID), ID, FUN = seq_along))
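
Two details of the ave approach are easy to miss, so here is a small sketch on a hypothetical data frame df (shuffled rows, not the data from the question). First, ave groups by value, so the rows do not need to be sorted by ID. Second, ave writes the results back into its first argument, so a character first argument gives the counts back as text.

```r
# Hypothetical data: rows deliberately not sorted by ID
df <- data.frame(
  ID = c(2, 1, 3, 1, 2, 1),
  category = c("a", "a", "a", "b", "b", "c")
)

# Integer first argument: the within-ID counter comes back as integer
df$rn <- ave(seq_along(df$ID), df$ID, FUN = seq_along)

# Character first argument: same numbers, but stored as character strings
rn_chr <- ave(as.character(df$category), df$ID, FUN = seq_along)

print(df)
str(rn_chr)
```

If the counter is going to be used in arithmetic or comparisons later, the integer version is probably the safer choice.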

Both packages have vignettes and tutorials, and you can read about the last approach by running ?ave in the R console.

1 Comment

From the next version on (v1.9.8), we'd be able to do this simply as: dt[, rn := rowid(ID)]
1

A looping solution will be painfully slow on bigger data. Here is a one-line solution using data.table:

require(data.table)
a<-data.table(ID=c(1,1,1,2,2,3,3,4,5),category=c('a','b','c','a','b','a','b','a','a'))
a[,':='(category_count = 1:.N),by=.(ID)]
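
If adding a package is not an option, a vectorized base R sketch is also possible. This is not from the answers in this thread: it combines rle() and sequence(), and it makes the same assumption as the loop in the question, namely that rows with the same ID are adjacent. The vector names ID, runs and Count are just illustrative.

```r
# IDs from the question; equal IDs are assumed to be adjacent
ID <- c(1, 1, 1, 2, 2, 3, 3, 4, 5)

# Length of each run of identical IDs: 3 2 2 1 1
runs <- rle(ID)$lengths

# sequence() concatenates 1:3, 1:2, 1:2, 1:1, 1:1 into one vector
Count <- sequence(runs)

print(data.frame(ID = ID, Count = Count))
```

Because no R-level loop runs over the rows, this should scale much better than the for loop, although I have not benchmarked it against data.table.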

1

What you want is actually a column of factor levels. Do this:

df$count=as.numeric(df$category)

This gives the output:

  ID category count
1  1        a     1
2  1        b     2
3  1        c     3
4  2        a     1
5  2        b     2
6  3        a     1
7  3        b     2
8  4        a     1
9  5        a     1

provided category is already a factor. If it is not, first convert it to a factor:

df$category=as.factor(df$category)
df$count=as.numeric(df$category)

5 Comments

This will only work for this specific example. If category holds real names this could go wrong, for example as.numeric(factor(c("shoe","bag","tie")))
Can't understand why. It will work for any factor, even if the factor levels are of any length.
Did you try the code I posted in my previous comment?
Yes. By "mess up", did you mean it will generate factor levels in alphabetical order and assign shoe=2, bag=1, tie=3?
Yes, so I'm guessing that the OP could have different categories, which could produce the wrong result.
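
To make the point of this comment exchange concrete, here is a short sketch. The first part is the example from the comments; the second part uses a hypothetical table df in which ID 2 starts at category "b", which is enough to make the factor codes differ from a per-ID row count.

```r
# Example from the comments: factor levels are sorted alphabetically
x <- c("shoe", "bag", "tie")
codes <- as.numeric(factor(x))
print(levels(factor(x)))  # "bag" "shoe" "tie"
print(codes)              # 2 1 3

# Hypothetical table where the second ID does not start at "a"
df <- data.frame(ID = c(1, 1, 2, 2), category = c("a", "b", "b", "c"))

# Factor codes depend only on the category value: 1 2 2 3
df$by_factor <- as.numeric(factor(df$category))

# A within-ID counter depends on position inside each ID: 1 2 1 2
df$by_group <- ave(seq_along(df$ID), df$ID, FUN = seq_along)

print(df)
```

The two columns agree on the data in the question only because every ID there happens to run through a, b, c in order.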
