I have a relational dataset, where I'm looking for dyadic information.

I have four columns: sender, receiver, attribute, and edge.

I'm looking to count the repeated sender-receiver pairs and fold those counts into the edge value.

df <- data.frame(sender = c(1,1,1,1,3,5), receiver = c(1,2,2,2,4,5), 
                attribute = c(12,12,12,12,13,13), edge = c(0,1,1,1,1,0))

  sender receiver attribute edge
1      1        1        12    0
2      1        2        12    1
3      1        2        12    1
4      1        2        12    1
5      3        4        13    1
6      5        5        13    0

I want the end result to look like this:

  sender receiver attribute edge
1      1        1        12    0
2      1        2        12    3
3      3        4        13    1

Here the duplicate sender-receiver pairs have been combined, and the number of duplicates is reflected in the edge count.

Any input would be really appreciated.

Thanks!

2 Answers

For fun, here are two other options: the first uses the base function aggregate(), and the second uses the data.table package:

> aggregate(edge ~ sender + receiver + attribute, FUN = "sum", data = df)
  sender receiver attribute edge
1      1        1        12    0
2      1        2        12    3
3      3        4        13    1
4      5        5        13    0
> require(data.table)
> dt <- data.table(df)
> dt[, list(sumedge = sum(edge)), by = "sender, receiver, attribute"]
     sender receiver attribute sumedge
[1,]      1        1        12       0
[2,]      1        2        12       3
[3,]      3        4        13       1
[4,]      5        5        13       0

For the record, this question has been asked many, many times. Perusing my own answers turns up several that would point you down the right path.

1 Comment

Any answer using only base functions always gets +1 from me.

plyr is your friend - although I think your end result is not quite correct given the input data.

library(plyr)

ddply(df, .(sender, receiver, attribute), summarize, edge = sum(edge))

Returns

  sender receiver attribute edge
1      1        1        12    0
2      1        2        12    3
3      3        4        13    1
4      5        5        13    0

1 Comment

I think the OP did not intend to group by sender + receiver + attribute, but just by sender + receiver, with attribute going along for the ride. In the example, attribute just happens to be unique for each sender + receiver pairing, but I think that was accidental.
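
If that reading is right, one way to group on sender + receiver only is sketched below in base R. It assumes that, when a pair has more than one attribute value, keeping the first one seen is acceptable; the question does not say, so treat that rule (and the names edges, attrs, and result) as placeholders rather than the OP's intent.

```r
df <- data.frame(sender = c(1,1,1,1,3,5), receiver = c(1,2,2,2,4,5),
                 attribute = c(12,12,12,12,13,13), edge = c(0,1,1,1,1,0))

# Sum the edges for each sender-receiver pair, ignoring attribute
edges <- aggregate(edge ~ sender + receiver, data = df, FUN = sum)

# Carry attribute along: keep the first value seen for each pair
# (assumption: the question does not say how to resolve conflicts)
attrs <- aggregate(attribute ~ sender + receiver, data = df,
                   FUN = function(x) x[1])

# Join the two summaries back together on the pair
result <- merge(attrs, edges, by = c("sender", "receiver"))
result
```

On the example data this gives the same four rows as the answers above, because attribute is constant within each pair; the two approaches only differ when a pair appears with several attribute values.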
