I have a large data set with text comments and their ratings on different variables, like so:
df <- data.frame(
  comment = c("commentA","commentB","commentB","commentA","commentA","commentC"),
  sentiment = c(1,2,1,4,1,2),
  tone = c(1,5,3,2,6,1)
)
Every comment appears between one and three times, since the same comment is sometimes rated by several people.
I'm looking to create a data frame where the "comment" column has only unique values and the other columns are spread out side by side, so each comment gets as many "sentiment" and "tone" columns as it has ratings. This will produce NAs for comments that were rated fewer times, but that's okay:
desired <- data.frame(
  comment = c("commentA","commentB","commentC"),
  sentiment.1 = c(1,2,2),
  sentiment.2 = c(4,1,NA),
  sentiment.3 = c(1,NA,NA),
  tone.1 = c(1,5,1),
  tone.2 = c(2,3,NA),
  tone.3 = c(6,NA,NA)
)
I've been trying to do this with reshape, going from long to wide:
reshape(df,
  idvar = "comment",
  timevar = c("sentiment","tone"),
  direction = "wide"
)
But that results in all possible combinations of sentiment and tone, rather than simply repeating sentiment and tone independently.
I also tried gather (from tidyr), like so: df %>% gather(key, value, -comment), but that only gets me halfway there.
Could anyone please point me in the right direction?
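For reference, here is a minimal sketch of one direction that appears to give the layout I'm after in base R: number the ratings within each comment and pass that counter to reshape as the single timevar. The names ratings and wide and the helper column rating are hypothetical, chosen only for this illustration, and the sketch assumes that the row order within a comment is an acceptable order for the ratings.

```r
# Self-contained copy of the example data from the question.
ratings <- data.frame(
  comment   = c("commentA", "commentB", "commentB", "commentA", "commentA", "commentC"),
  sentiment = c(1, 2, 1, 4, 1, 2),
  tone      = c(1, 5, 3, 2, 6, 1)
)

# Hypothetical helper column: running count of ratings within each comment
# (1 for a comment's first rating, 2 for its second, and so on).
ratings$rating <- ave(seq_along(ratings$comment), ratings$comment, FUN = seq_along)

# Use the counter as the only timevar, so sentiment and tone widen independently.
wide <- reshape(
  ratings,
  idvar     = "comment",
  timevar   = "rating",
  direction = "wide"
)

# reshape() interleaves the columns (sentiment.1, tone.1, sentiment.2, ...);
# reorder so all sentiment columns come before all tone columns.
# Plain sort() is enough here because there are at most three ratings per comment.
wide <- wide[, c("comment", sort(setdiff(names(wide), "comment")))]
rownames(wide) <- NULL
wide
```

If that holds up on the full data set, the remaining question is whether there is a tidier or more idiomatic way to do the same thing.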