2

I have 2 dataframes:

df1: (3 rows, 3 columns)

Type     CA     AR     OR   Total
alpha    2      3      1      6
beta     1      5      2      8
delta    8      1      1      10

df2: (4 rows, 2 columns)

Type     CA     AR     Total
alpha    3      4        7
beta     2      6        8
gamma    9      1        10
delta    4      1        5

I want to add the values in the two dataframes. The result should be as below: I tried df1 + df2, but the problem I face is that the two dataframes are different dimensions / sizes. Is there any sort of function that will let me add the rows/columns that have the same row/column name (some sort of match function to match the names.) In the event that there are no occurences of the value (in this case --> gamma, OR, I want it to print 0/NA --> I don't care what actually.) And I need to be some sort of match and not hard-coded because I will be doing this for several tables.

Type     CA     AR       OR   Total
alpha    5      7        1     13
beta     3      11       2     16
gamma    9      1        0     10
delta    12     2        1     15

Thanks!!

1
  • will df1 always have more columns than df2? Commented Jul 16, 2015 at 17:18

3 Answers 3

3

I personally would use a melt/cast reshaping strategy

library(reshape2)
melted <- rbind(melt(df1), melt(df2))
dcast(melted, Type~variable, sum)

#    Type CA AR OR Total
# 1 alpha  5  7  1    13
# 2  beta  3 11  2    16
# 3 delta 12  2  1    15
# 4 gamma  9  1  0    10
Sign up to request clarification or add additional context in comments.

1 Comment

Nice. And the extension to more dfs is pretty clear: melted <- do.call(rbind,lapply(dflist,melt))
1

Using sqldf:

library(sqldf)
sqldf('SELECT Type, sum(CA) CA, sum(AR) AR, sum ([OR]) [OR], sum(Total) Total 
      FROM (SELECT Type, CA, AR, [OR], Total FROM df1
      UNION ALL
      SELECT Type, CA, AR, Null as [OR], Total FROM df2) 
      GROUP by Type')

Output:

   Type CA AR OR Total
1 alpha  5  7  1    13
2  beta  3 11  2    16
3 delta 12  2  1    15
4 gamma  9  1 NA    10

Comments

0

Another way to do it is to install package plyr, this will allow you to use the rbind.fill function which does not require the same columns. The the aggregate function will sum your values.

install.packages("plyr")
data<-rbind.fill(df1,df2)
data[is.na(data)]<-0
aggregate(data[c("CA","AR","OR","Total")],by=data["Type"], FUN = sum) 


   Type CA AR OR Total
1 alpha  5  7  1    13
2  beta  3 11  2    16
3 delta 12  2  1    15
4 gamma  9  1  0    10

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.