I have data on tourism interactions with individually identified whales. For each encounter I have the whale ID, the date, and the time:

Id    Date     Time  
A   20110527    10:42
A   20110527    11:24
A   20110527    11:52
A   20110603    10:29
A   20110603    10:59
B   20110503    11:23
B   20110503    11:45
B   20110503    12:05
B   20110503    12:17

I would now like to add two additional columns: one that numbers the days on which each individual was encountered, and one that numbers the encounters within each day, as follows:

Id     Date     Time  Day   Encounter
A   20110527    10:42   1   1
A   20110527    11:24   1   2
A   20110527    11:52   1   3
A   20110603    10:29   2   1
A   20110603    10:59   2   2
B   20110503    11:23   1   1
B   20110503    11:45   1   2
B   20110503    12:05   1   3
B   20110503    12:17   1   4

Is this possible? Any help would be greatly appreciated!

3 Answers

We could use data.table. Convert the 'data.frame' to a 'data.table' with setDT(df1). Then, grouped by 'Id', match each 'Date' against the unique values of 'Date' to create the 'Day' column. Finally, group by 'Id' and 'Date' and assign (:=) the sequence of row numbers within each group to 'Encounter'.

library(data.table)
setDT(df1)[, Day:= match(Date, unique(Date)), by = Id
         ][, Encounter := seq_len(.N), by = .(Id, Date)]
df1
#    Id     Date  Time Day Encounter
#1:  A 20110527 10:42   1         1
#2:  A 20110527 11:24   1         2
#3:  A 20110527 11:52   1         3
#4:  A 20110603 10:29   2         1
#5:  A 20110603 10:59   2         2
#6:  B 20110503 11:23   1         1
#7:  B 20110503 11:45   1         2
#8:  B 20110503 12:05   1         3
#9:  B 20110503 12:17   1         4

data

df1 <- structure(list(Id = c("A", "A", "A", "A", "A", 
 "B", "B", "B", 
"B"), Date = c(20110527L, 20110527L, 20110527L, 
 20110603L, 20110603L, 
 20110503L, 20110503L, 20110503L, 20110503L), 
 Time = c("10:42", 
 "11:24", "11:52", "10:29", "10:59", "11:23", "11:45", "12:05", 
 "12:17")), .Names = c("Id", "Date", "Time"),
  class = "data.frame", row.names = c(NA, -9L))
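
To see why match(Date, unique(Date)) produces a day index, it may help to run it on the dates of a single whale. This is only a small illustration using whale A's dates from the question; the variable names dates and day are made up for this sketch.

```r
# Dates for whale A, in encounter order (values taken from the question)
dates <- c(20110527L, 20110527L, 20110527L, 20110603L, 20110603L)

# unique() keeps the distinct dates in order of first appearance
unique(dates)

# match() returns, for each date, its position among the distinct dates,
# which is the running day number as long as the rows are in date order
day <- match(dates, unique(dates))
day
# [1] 1 1 1 2 2
```

Because the numbering follows order of first appearance, the rows need to be sorted by date within each Id for 'Day' to be chronological.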

Here is a reproducible example:

df <- structure(list(
  Id = structure(c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L),
                 .Label = c("A", "B"), class = "factor"),
  Date = c(20110527L, 20110527L, 20110527L, 20110603L,
           20110603L, 20110503L, 20110503L, 
           20110503L, 20110503L),
  Time = structure(c(2L, 5L, 7L, 1L, 3L, 4L, 6L, 8L, 9L),
                   .Label = c("10:29", "10:42", "10:59", "11:23", "11:24", "11:45", "11:52", "12:05", "12:17"), class = "factor")),
  .Names = c("Id",  "Date", "Time"), class = "data.frame", row.names = c(NA, -9L))

Then one can use dplyr:

library(dplyr)
group_by(df, Id, Date) %>% mutate(Encounter=1:n()) %>% ungroup()

Source: local data frame [9 x 4]

Id     Date   Time Encounter
(fctr)    (int) (fctr)     (int)
1      A 20110527  10:42         1
2      A 20110527  11:24         2
3      A 20110527  11:52         3
4      A 20110603  10:29         1
5      A 20110603  10:59         2
6      B 20110503  11:23         1
7      B 20110503  11:45         2
8      B 20110503  12:05         3
9      B 20110503  12:17         4
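
The call above adds only the 'Encounter' column. A minimal sketch of how 'Day' could be added in the same pipeline follows, assuming the rows are already sorted by date within each Id. It borrows the match(Date, unique(Date)) idea from the data.table answer; the name res and the plain data.frame() construction are choices made for this sketch.

```r
library(dplyr)

# Same values as in the question, built with character columns for brevity
df <- data.frame(
  Id   = c("A", "A", "A", "A", "A", "B", "B", "B", "B"),
  Date = c(20110527L, 20110527L, 20110527L, 20110603L, 20110603L,
           20110503L, 20110503L, 20110503L, 20110503L),
  Time = c("10:42", "11:24", "11:52", "10:29", "10:59",
           "11:23", "11:45", "12:05", "12:17"),
  stringsAsFactors = FALSE
)

res <- df %>%
  group_by(Id) %>%                                # one group per whale
  mutate(Day = match(Date, unique(Date))) %>%     # running day number within each whale
  group_by(Id, Date) %>%                          # regroup per whale and day
  mutate(Encounter = row_number()) %>%            # running encounter number within each day
  ungroup()

res
```

The second group_by() replaces the first grouping rather than adding to it, so 'Encounter' restarts at 1 for every Id and Date combination.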

Or Base R using ave and by:

I used the data posted by Vincent Bonhomme. Note that the data must be sorted by Id and Date, because the result of by() is assigned back to the rows in group order:

# Function to number the days per individual using factor levels
# (levels are sorted, so day 1 is the earliest date)
foo <- function(x) {
  as.numeric(as.character(factor(x, labels = 1:nlevels(factor(x)))))
}

# Add the columns Day & Encounter
df$Day <- unlist(by(df$Date, list(df$Id), FUN = foo))
df$Encounter <- ave(1:nrow(df), list(df$Id, df$Date), FUN = seq_along)
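
As a possible variation, ave() can compute 'Day' as well, which avoids relying on the row order that unlist(by(...)) returns. The following is a sketch under the assumption that rows are sorted by date within each Id; it swaps the factor-based helper for match(x, unique(x)), and it rebuilds the data with data.frame() so that the block runs on its own.

```r
# Same values as in the question
df <- data.frame(
  Id   = c("A", "A", "A", "A", "A", "B", "B", "B", "B"),
  Date = c(20110527L, 20110527L, 20110527L, 20110603L, 20110603L,
           20110503L, 20110503L, 20110503L, 20110503L),
  Time = c("10:42", "11:24", "11:52", "10:29", "10:59",
           "11:23", "11:45", "12:05", "12:17"),
  stringsAsFactors = FALSE
)

# ave() applies FUN within each group and puts the results back in the
# original row positions, so the rows need not be sorted by Id
df$Day <- ave(df$Date, df$Id,
              FUN = function(x) match(x, unique(x)))

# Running encounter number within each Id and Date combination
df$Encounter <- ave(seq_len(nrow(df)), df$Id, df$Date, FUN = seq_along)

df
```

Note that FUN has to be passed by name in ave(), since every unnamed argument after the first is treated as a grouping variable.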
