I've looked through several other related questions but haven't really found something that meets my case.

I have a column that dictates how many columns will be summed into a new column.

  • If DEPENDENCY == INDEP then NET_AGI = IND_AGI
  • If DEPENDENCY == DEP then NET_AGI = PRO_AGI + IND_AGI
  • Otherwise NET_AGI = PRO_AGI

DEPENDENCY IND_AGI PRO_AGI  NET_AGI <- NET_AGI will be the summed column
INDEP      0049995    -     0049995
DEP        0000500 0090500  0091000
DEP        0009000 0121095  0130950
DEP           -    0375001  0375001
INDEP      0123456    -     0123456
DEP        0012070 1023030  1035100
...

What's the best way to do that?

  • What did you try? How didn't it work? Seems like a case for ifelse() or dplyr::case_when() Commented Oct 24, 2019 at 19:26
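
The ifelse() route suggested in this comment could look roughly like the sketch below. It is only an illustration: the small data frame is hypothetical (it mirrors the question's table with the dashes already turned into NA), and treating a missing amount as 0 in the DEP sum is an assumption based on the fourth row of the question's table.

```r
# Hypothetical numeric version of the question's table ("-" entered as NA)
df <- data.frame(
  DEPENDENCY = c("INDEP", "DEP", "DEP", "DEP", "INDEP", "DEP"),
  IND_AGI    = c(49995, 500, 9000, NA, 123456, 12070),
  PRO_AGI    = c(NA, 90500, 121095, 375001, NA, 1023030)
)

# Assumption: a missing amount counts as 0 when both columns are summed
ind <- ifelse(is.na(df$IND_AGI), 0, df$IND_AGI)
pro <- ifelse(is.na(df$PRO_AGI), 0, df$PRO_AGI)

# Nested ifelse(): INDEP -> IND_AGI, DEP -> sum, anything else -> PRO_AGI
df$NET_AGI <- ifelse(df$DEPENDENCY == "INDEP", df$IND_AGI,
              ifelse(df$DEPENDENCY == "DEP", pro + ind, df$PRO_AGI))
```

Nested ifelse() calls get hard to read once there are more than two or three conditions, which is where dplyr::case_when() tends to be the tidier choice.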

2 Answers

library(dplyr)

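df1 %>%
  mutate(NET_AGI_2 = case_when(
    # DEP: sum both columns ("-" becomes NA and is dropped), then zero-pad to 7 digits
    DEPENDENCY == "DEP" ~ as.character(sprintf('%07d', rowSums(
      cbind(as.numeric(IND_AGI), as.numeric(PRO_AGI)),
      na.rm = TRUE))),
    # INDEP: keep IND_AGI as it is
    DEPENDENCY == "INDEP" ~ IND_AGI,
    # anything else: fall back to PRO_AGI
    TRUE ~ PRO_AGI))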

#>   DEPENDENCY IND_AGI PRO_AGI NET_AGI NET_AGI_2
#> 1      INDEP 0049995       -   49995   0049995
#> 2        DEP 0000500 0090500   91000   0091000
#> 3        DEP 0009000 0121095  130950   0130095
#> 4        DEP       - 0375001  375001   0375001
#> 5      INDEP 0123456       -  123456   0123456
#> 6        DEP 0012070 1023030 1035100   1035100

Data:

df1 <- read.table(text="DEPENDENCY IND_AGI PRO_AGI  NET_AGI
INDEP      0049995    -     0049995
DEP        0000500 0090500  0091000
DEP        0009000 0121095  0130950
DEP           -    0375001  0375001
INDEP      0123456    -     0123456
DEP        0012070 1023030  1035100", stringsAsFactors = FALSE, header = TRUE)

2 Comments

One thing to note is that the type of every column in this answer is character. This means that if you want to summarize or visualize this data (e.g., mean, summary, hist), you'll have to convert it back to a numeric class (which this answer does, but then reverses to preserve the seven-character formatting).
Extra steps can easily be dropped from a solution, while invalid assumptions would invalidate the solution altogether. If you want a numeric column, you can drop as.character and sprintf from my solution. Cheers.
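
A minimal sketch of what a numeric variant might look like, assuming dplyr is available. The three-row df1 and the helper columns IND_NUM, PRO_NUM and NET_AGI_NUM are made up for illustration and are not part of the answer. Since every branch of case_when() should return the same type, the sketch converts the INDEP and fallback branches to numeric as well, rather than only removing as.character and sprintf from the DEP branch.

```r
library(dplyr)

# Hypothetical three-row excerpt of the question's table; "-" keeps both columns character
df1 <- read.table(text = "DEPENDENCY IND_AGI PRO_AGI
INDEP 0049995 -
DEP 0000500 0090500
DEP - 0375001", stringsAsFactors = FALSE, header = TRUE)

res <- df1 %>%
  mutate(
    # "-" turns into NA; suppressWarnings() hides the expected coercion warning
    IND_NUM = suppressWarnings(as.numeric(IND_AGI)),
    PRO_NUM = suppressWarnings(as.numeric(PRO_AGI)),
    # every branch now returns a numeric vector
    NET_AGI_NUM = case_when(
      DEPENDENCY == "DEP"   ~ rowSums(cbind(IND_NUM, PRO_NUM), na.rm = TRUE),
      DEPENDENCY == "INDEP" ~ IND_NUM,
      TRUE                  ~ PRO_NUM
    )
  )
```

With a numeric NET_AGI_NUM, functions such as mean(), summary() or hist() can be applied directly without converting back from character.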

Probably the fastest (and one of the simplest) ways to do this would be

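# Default case: NET_AGI is PRO_AGI
df$NET_AGI <- df$PRO_AGI
# INDEP rows take IND_AGI instead
df[df$DEPENDENCY == 'INDEP', 'NET_AGI'] <- df[df$DEPENDENCY == 'INDEP', 'IND_AGI']
# DEP rows take the sum of both columns, ignoring missing values
df[df$DEPENDENCY == 'DEP', 'NET_AGI'] <- rowSums(df[df$DEPENDENCY == 'DEP', c('PRO_AGI', 'IND_AGI')], na.rm = TRUE)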

If you want to read in the data set as is and have the code above work unchanged, use the following. Note that this assumes that the seven-character formatting is not necessary.

df <- read.table(text="DEPENDENCY IND_AGI PRO_AGI  NET_AGI
INDEP      0049995    -     0049995
DEP        0000500 0090500  0091000
DEP        0009000 0121095  0130950
DEP           -    0375001  0375001
INDEP      0123456    -     0123456
DEP        0012070 1023030  1035100",
  stringsAsFactors = FALSE, header = TRUE, na.strings = c('NA', '-'))
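
If the seven-character formatting does turn out to matter, one possible way to get it back is to keep the numeric column for calculations and add a zero-padded copy for display. The sketch below is only an illustration: the three-row data frame and the NET_AGI_CHR column name are hypothetical, and it assumes NET_AGI contains no missing values once the three assignments have run.

```r
# Hypothetical three-row excerpt, already numeric with NA for "-"
df <- data.frame(
  DEPENDENCY = c("INDEP", "DEP", "DEP"),
  IND_AGI    = c(49995L, 500L, NA),
  PRO_AGI    = c(NA, 90500L, 375001L)
)

# Same three steps as in the answer
df$NET_AGI <- df$PRO_AGI
df[df$DEPENDENCY == "INDEP", "NET_AGI"] <- df[df$DEPENDENCY == "INDEP", "IND_AGI"]
df[df$DEPENDENCY == "DEP", "NET_AGI"] <- rowSums(df[df$DEPENDENCY == "DEP", c("PRO_AGI", "IND_AGI")], na.rm = TRUE)

# Zero-padded character copy for display; NET_AGI itself stays numeric
df$NET_AGI_CHR <- sprintf("%07d", as.integer(df$NET_AGI))
```

Keeping both columns avoids having to convert back and forth between character and numeric.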

2 Comments

If the - are NAs, then IND_AGI and PRO_AGI are of class integer. I'd prefer not to assume that the makeshift table above should be taken literally when the rest of the question is not in a reproducible format.
And I suppose it's a coincidence that yesterday I received downvotes without comment on answers from years ago? The result of your answer doesn't reproduce the formatting above (the dashes are right-aligned instead of centered). The data set is in a non-standard R format. If the user had taken the time to make R produce that format, they'd know enough about R not to have to ask this question. To make the original data work literally, all you have to do is tell R that - should be treated as NA. I added that as a section in my answer.
