I'm just getting started with R and need some help understanding how to apply for/nested loops.

StudyID<-c(1:5)
SubjectID<-c(1:5)

df<-data.frame(StudyID=rep(StudyID, each=5), SubjectID=rep(SubjectID, each=1))

How can I create a new column called ID that uses the combination of StudyID and SubjectID to form a unique ID?

For this data, the unique IDs should run from 1 to 25.

So the final data looks like this:

UniqueID<- c(1:25)

df<-cbind(df,UniqueID)

View(df)

Is there another way that is faster and more efficient than looping?

3 Answers

Using the dplyr package, you could do:

library(dplyr)
df$Id = group_indices(df,StudyID,SubjectID)

This returns:

#StudyID   SubjectID   Id
#   1         1        1
#   1         2        2
#   1         3        3
#   1         4        4
#   1         5        5
#   2         1        6
#   2         2        7
#   2         3        8
#   2         4        9
#   2         5       10
#   3         1       11
#   3         2       12
#   3         3       13
#   3         4       14
#   3         5       15
#   4         1       16
#   4         2       17
#   4         3       18
#   4         4       19
#   4         5       20
#   5         1       21
#   5         2       22
#   5         3       23
#   5         4       24
#   5         5       25
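
As a side note, in dplyr 1.0.0 and later, passing the grouping columns directly to group_indices() is deprecated. A minimal sketch of the same idea, assuming a recent dplyr, uses group_by() with cur_group_id(). The data frame below is rebuilt so the snippet is self-contained:

```r
library(dplyr)

# Same 5 x 5 design as in the question
df <- data.frame(StudyID = rep(1:5, each = 5), SubjectID = rep(1:5, times = 5))

# Number each (StudyID, SubjectID) group, then drop the grouping again
df <- df %>%
  group_by(StudyID, SubjectID) %>%
  mutate(Id = cur_group_id()) %>%
  ungroup()
```

Groups are numbered in sorted key order, so for this data Id runs from 1 to 25.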

Another way to do this in base R, without loading any package (assuming the data frame is sorted by the two columns):

StudyID<-c(1:5)
SubjectID<-c(1:5)
df<-data.frame(StudyID=rep(StudyID, each=5), SubjectID=rep(SubjectID, each=1))

df$uniqueID <- cumsum(!duplicated(df[1:2]))

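Alternatively, you can use this solution, mentioned in the comments (I prefer it over the first one). It is restricted to the first two columns here so that an ID column added earlier is not pasted into the key:

df$uniqueID <- as.numeric(factor(do.call(paste, df[1:2])))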

The output would be:

> print(df, row.names = FALSE)
#StudyID  SubjectID  uniqueID
#   1         1          1
#   1         2          2
#   1         3          3
#   1         4          4
#   1         5          5
#   2         1          6
#   2         2          7
#   2         3          8
#   2         4          9
#   2         5         10
#   3         1         11
#   3         2         12
#   3         3         13
#   3         4         14
#   3         5         15
#   4         1         16
#   4         2         17
#   4         3         18
#   4         4         19
#   4         5         20
#   5         1         21
#   5         2         22
#   5         3         23
#   5         4         24
#   5         5         25

3 Comments

Just imagine the last observation were not (5, 5) but (1, 1). Your code will not recognize that it's about the first observation. It will just stop counting. Compare with the above solution to see the difference!
Another base R iteration: as.numeric(factor(do.call(paste, df)))
@And that can be solved by sorting. Note that sorting does not have to change the data frame: apply the code to the sorted df and map the result back to the original order. user20650's answer is much better and easier in that case, which is why I included it. Thank you both.
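
To make the point in these comments concrete, here is a small sketch on hypothetical data (not the question's df) in which the last row repeats the first pair. The match() variant at the end is an extra illustration, not one of the posted answers:

```r
# Hypothetical data: the last row repeats the (1, 1) pair from the first row
df <- data.frame(StudyID = c(1, 1, 2, 1), SubjectID = c(1, 2, 1, 1))

# Counting first occurrences: the repeated pair just inherits the running count
id_cumsum <- cumsum(!duplicated(df[1:2]))
# 1 2 3 3  (row 4 wrongly shares an ID with the (2, 1) pair)

# Factor over the pasted key: identical pairs share one level
key <- do.call(paste, df[1:2])
id_factor <- as.integer(factor(key))
# 1 2 3 1

# Matching each key to its first occurrence also works on unsorted input
# and numbers the pairs in order of first appearance
id_match <- match(key, unique(key))
# 1 2 3 1
```

Note that factor() orders its levels as strings, so with multi-digit values the IDs are still unique but need not follow numeric order.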

You could go for interaction in base R:

df$uniqueID <- with(df, as.integer(interaction(StudyID,SubjectID)))

For example (this example expresses better what you are after):

set.seed(10)
df <- data.frame(StudyID=sample(5,10,replace = TRUE), SubjectID=rep(1:5,times=2))
df$uniqueID <- with(df, as.integer(interaction(StudyID,SubjectID)))

#    StudyID SubjectID uniqueID
# 1        3         1        3
# 2        2         2        6
# 3        3         3       11
# 4        4         4       16
# 5        1         5       17
# 6        2         1        2
# 7        2         2        6
# 8        2         3       10
# 9        4         4       16
# 10       3         5       19
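
The IDs above have gaps (for example 17 and 19 with only ten rows) because interaction() creates a level for every possible combination, whether or not it occurs. If consecutive IDs are wanted, one option is the drop = TRUE argument, sketched here on small hypothetical data:

```r
# Hypothetical data: only 3 of the 6 possible combinations occur
df <- data.frame(StudyID = c(2, 1, 2, 1), SubjectID = c(1, 2, 1, 3))

# Default: levels for all 2 x 3 combinations, so the codes can have gaps
id_full <- with(df, as.integer(interaction(StudyID, SubjectID)))
# 2 3 2 5

# drop = TRUE keeps only the combinations that occur, giving consecutive codes
id_compact <- with(df, as.integer(interaction(StudyID, SubjectID, drop = TRUE)))
# 1 2 1 3
```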

1 Comment

This gives the solution; however, sorting is needed if you want the IDs in order.
