I have the following sample data:

structure(list(Conversation = c(1L, 1L, 2L, 2L, 3L, 3L, 1L, 1L, 
2L, 2L, 3L, 3L), ID.Number = c("ID 11", "ID 11", "ID 11", "ID 11", 
"ID 11", "ID 11", "ID 14", "ID 14", "ID 14", "ID 14", "ID 14", 
"ID 14"), Swear.word = c(0L, 2L, 4L, 3L, 0L, 0L, 1L, 0L, 3L, 
1L, 0L, 4L)), class = "data.frame", row.names = c(NA, -12L))

I am trying to get a result that looks like this:

structure(list(IDNumber = c(11L, 14L), Convo1 = 2:1, Convo2 = c(7L, 4L), Convo3 = c(0L, 4L)), class = "data.frame", row.names = c(NA, -2L))

Basically, I want to see swear-word usage (the sum of Swear.word) per conversation (Convo1, Convo2, Convo3) for each participant.

How can I do this using R?

Thanks!

4 Answers

2

Try this tidyverse approach, with the shared data stored as A. pivot_wider() can produce the desired result in a single step: values_fn = sum adds up the rows that share the same ID.Number and Conversation. Here is the code:

library(tidyverse)
#Code
New <- A %>% mutate(Conversation=paste0('Conv.',Conversation)) %>%
  pivot_wider(names_from = Conversation,values_from=Swear.word,values_fn = sum)

Output:

# A tibble: 2 x 4
  ID.Number Conv.1 Conv.2 Conv.3
  <chr>      <int>  <int>  <int>
1 ID 11          2      7      0
2 ID 14          1      4      4
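
To see what values_fn = sum is collapsing, the same per-cell sums can be reproduced in base R with tapply(). This is only a minimal sketch for checking the numbers, assuming A holds the data from the question (rebuilt here so the snippet runs on its own):

```r
# Rebuild the question's sample data as A (the name assumed in this answer)
A <- data.frame(
  Conversation = rep(rep(1:3, each = 2), times = 2),
  ID.Number = rep(c("ID 11", "ID 14"), each = 6),
  Swear.word = c(0L, 2L, 4L, 3L, 0L, 0L, 1L, 0L, 3L, 1L, 0L, 4L)
)

# Sum Swear.word within each ID.Number x Conversation cell;
# rows are participants, columns are conversations
sums <- tapply(A$Swear.word, list(A$ID.Number, A$Conversation), sum)
sums
```

Each cell of sums corresponds to one value in the wide tibble above.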

A more compact version drops the mutate() step and uses names_prefix instead (many thanks and credit to @starja):

#Code 2
Newdf <- A %>% pivot_wider(names_from = Conversation,
                  values_from=Swear.word,
                  values_fn = sum,names_prefix='Conv.')

Output:

# A tibble: 2 x 4
  ID.Number Conv.1 Conv.2 Conv.3
  <chr>      <int>  <int>  <int>
1 ID 11          2      7      0
2 ID 14          1      4      4

2 Comments

Smart use of values_fn! To only use pivot_wider, you could use names_prefix
@starja Great advice, let me add that piece with credit to you!
1

This should work, assuming the data is stored as x:

library(tidyverse)


df <- x %>%
    group_by(ID.Number, Conversation) %>%
    summarize(
        total = sum(Swear.word, na.rm = TRUE)
    ) %>%
    spread(Conversation, total) %>%
    magrittr::set_colnames(c("IDNumber","Convo1","Convo2", "Convo3"))
df

1

Here is an approach with dplyr, tidyr and stringr:

library(dplyr)
library(tidyr)
library(stringr)

data %>% 
  mutate(ID.Number = as.integer(str_extract(ID.Number, "\\d+"))) %>% 
  group_by(ID.Number, Conversation) %>% 
  summarise(count = sum(Swear.word)) %>% 
  pivot_wider(
    id_cols = ID.Number,
    names_from = Conversation,
    values_from = count,
    names_prefix = "Convo"
  ) %>% 
  rename(IDNumber = ID.Number)
# A tibble: 2 x 4
# Groups:   IDNumber [2]
  IDNumber Convo1 Convo2 Convo3
     <int>  <int>  <int>  <int>
1       11      2      7      0
2       14      1      4      4

1

We can use xtabs from base R

xtabs(Swear.word ~ ID.Number + Conversation, df1)
#        Conversation
#ID.Number 1 2 3
#    ID 11 2 7 0
#    ID 14 1 4 4

Or using dcast from data.table

library(data.table)
dcast(setDT(df1), ID.Number ~ paste0('Conv.', Conversation), 
     value.var = 'Swear.word', sum)
#   ID.Number Conv.1 Conv.2 Conv.3
#1:     ID 11      2      7      0
#2:     ID 14      1      4      4
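
If the exact layout from the question is needed (an integer IDNumber column plus Convo1 to Convo3), the xtabs result can be converted to a plain data frame. A minimal base R sketch, assuming the data is stored as df1 as above (rebuilt here so it runs on its own); the names tab, wide and result are only illustrative:

```r
# Rebuild the question's sample data as df1
df1 <- data.frame(
  Conversation = rep(rep(1:3, each = 2), times = 2),
  ID.Number = rep(c("ID 11", "ID 14"), each = 6),
  Swear.word = c(0L, 2L, 4L, 3L, 0L, 0L, 1L, 0L, 3L, 1L, 0L, 4L)
)

# Cross-tabulate the summed swear words by participant and conversation
tab <- xtabs(Swear.word ~ ID.Number + Conversation, df1)

# Turn the 2-D table into a data frame with one column per conversation
wide <- as.data.frame.matrix(tab)
names(wide) <- paste0("Convo", names(wide))

# Move the row names ("ID 11", "ID 14") into an integer IDNumber column
result <- cbind(IDNumber = as.integer(sub("\\D+", "", rownames(wide))), wide)
rownames(result) <- NULL
result
```

The sub() call strips the leading non-digit text, so it relies on every ID having the form "ID <number>".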
