I have a data frame with IDs for web page ('Webpage'), department ('Dept') and employee ('Emp_ID'):

df <- data.frame(Webpage = c(111, 111, 111, 111, 222, 222), 
                 Dept = c(101, 101, 101, 102, 102, 103), 
                 Emp_ID = c(1, 1, 2, 3, 4, 4)) 

#   Webpage Dept Emp_ID
# 1     111  101      1
# 2     111  101      1
# 3     111  101      2
# 4     111  102      3
# 5     222  102      4
# 6     222  103      4

I want to know how many unique individuals have seen each web page.

For example, in the dataset above, web page 111 has been seen by three individuals, where an individual is a unique combination of Dept and Emp_ID: employees 1 and 2 in Dept 101 and employee 3 in Dept 102. Similarly, web page 222 has been seen by two different individuals (employee 4 in Dept 102 and employee 4 in Dept 103).

My first attempt is:

nrow(unique(df[ , c("Dept", "Emp_ID")]))

Using unique I can do this for one web page, but can someone please suggest how I can calculate it for all web pages?
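One way to extend the attempt above to every web page, sketched here in base R only, is to split the Dept/Emp_ID columns by Webpage and apply the same nrow(unique(...)) to each piece. The name `viewers` is just an illustrative choice, not something from the original post.

```r
# Example data from the question
df <- data.frame(Webpage = c(111, 111, 111, 111, 222, 222),
                 Dept    = c(101, 101, 101, 102, 102, 103),
                 Emp_ID  = c(1, 1, 2, 3, 4, 4))

# Split the identifying columns by web page, then count the
# distinct (Dept, Emp_ID) rows within each group
viewers <- sapply(split(df[c("Dept", "Emp_ID")], df$Webpage),
                  function(d) nrow(unique(d)))

print(viewers)
```

The result is a named vector with one count per web page, so it can be indexed by page ID as a character string, e.g. viewers["111"].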

3 Answers

df <- data.frame(Webpage = c(111, 111, 111, 111, 222, 222), 
                 Dept = c(101, 101, 101, 102, 102, 103), 
                 Emp_Id = c(1, 1, 2, 3, 4, 4))
library(dplyr)

df %>% 
  group_by(Webpage) %>% 
  summarise(n = n_distinct(Dept, Emp_Id))
#> # A tibble: 2 x 2
#>   Webpage     n
#>     <dbl> <int>
#> 1     111     3
#> 2     222     2

library(data.table)
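# Paste with an explicit separator so that, e.g., Dept 10 / Emp 11 and
# Dept 101 / Emp 1 are not merged into the same key
setDT(df)[, list(n = uniqueN(paste(Dept, Emp_Id, sep = "_"))), by = Webpage]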
#>    Webpage n
#> 1:     111 3
#> 2:     222 2

Created on 2021-03-30 by the reprex package (v1.0.0)

2 Comments

An alternative data.table solution would be df[ , .(n = uniqueN(.SD)), by = Webpage]
Or, to be more explicit about which columns to include in .SD: df[ , .(n = uniqueN(.SD)), by = Webpage, .SDcols = c("Dept", "Emp_Id")]. This helps if there are additional columns which should not be considered in the calculation.
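To illustrate why .SDcols can matter, here is a small sketch that assumes the data.table package is installed and adds a hypothetical extra column, Visit, which is not part of the original data. Without .SDcols, every row would count as distinct because Visit differs on each row.

```r
library(data.table)

# Same data as in the answer, plus a hypothetical Visit column
dt <- data.table(Webpage = c(111, 111, 111, 111, 222, 222),
                 Dept    = c(101, 101, 101, 102, 102, 103),
                 Emp_Id  = c(1, 1, 2, 3, 4, 4),
                 Visit   = 1:6)

# Count distinct (Dept, Emp_Id) rows per web page, ignoring Visit
res <- dt[, .(n = uniqueN(.SD)), by = Webpage,
          .SDcols = c("Dept", "Emp_Id")]

print(res)
```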
For each Webpage, count the unique Dept/Emp_Id combinations by summing the rows that duplicated() does not flag.

library(dplyr)

df %>%
  group_by(Webpage) %>%
  summarise(n_viewers = sum(!duplicated(cur_data())))

#  Webpage n_viewers
#    <dbl>     <int>
#1     111         3
#2     222         2

data

Please provide data in a reproducible format, which is easier to copy than an image.

df <- data.frame(Webpage = c(111, 111, 111, 111, 222, 222), 
                 Dept = c(101, 101, 101, 102, 102, 103), 
                 Emp_Id = c(1, 1, 2, 3, 4, 4))

Hopefully aggregate can help:

> aggregate(cbind(n_viewer = Emp_Id) ~ Webpage, unique(df), length)
  Webpage n_viewer
1     111        3
2     222        2
