Conditional replacement of values in data frame column based off multiple other columns - R

Question

My data frame looks like this:

> tornado_frame
         tornado_names Level      value
1     node per cluster   low  -34.72222
2          TB per node   low  -52.08333
3  expense per cluster   low -104.16667
4             Total TB   low  -62.50000
5  revenue per cluster   low  -52.08333
6     node per cluster  high   20.83333
7          TB per node  high   41.66667
8  expense per cluster  high   52.08333
9             Total TB  high  145.83333
10 revenue per cluster  high  156.25000

I want to transform the table into this:

> tornado_frame
         tornado_names Level      value
1     node per cluster   low   34.72222
2          TB per node   low   52.08333
3  expense per cluster   low  104.16667
4             Total TB   low  -62.50000
5  revenue per cluster   low  -52.08333
6     node per cluster  high  -20.83333
7          TB per node  high  -41.66667
8  expense per cluster  high  -52.08333
9             Total TB  high  145.83333
10 revenue per cluster  high  156.25000

The sign of "value" should flip for both rows of a tornado_names entry whenever the absolute value of its "low" row is greater than the absolute value of its "high" row.

I tried a few nested if statements, but that got messy. Any help would be appreciated!

Here is my data:

> dput(tornado_frame)
structure(list(tornado_names = structure(c(2L, 4L, 1L, 5L, 3L, 
2L, 4L, 1L, 5L, 3L), .Label = c("expense per cluster", "node per cluster", 
"revenue per cluster", "TB per node", "Total TB"), class = "factor"), 
    Level = structure(c(2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L
    ), .Label = c("high", "low"), class = "factor"), value = c(34.72222, 
    52.08333, 104.16667, -62.5, -52.08333, -20.83333, -41.66667, 
    -52.08333, 145.83333, 156.25)), .Names = c("tornado_names", 
"Level", "value"), class = "data.frame", row.names = c(NA, -10L
))

David Arenburg · Accepted Answer · answered 2016-01-27 16:31, edited by akrun 2016-01-27 18:35:16Z

Here's a possible data.table solution:

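library(data.table)
# df here stands for the question's tornado_frame; setDT() converts it by reference.
# Rows within each tornado_names group are ordered low, high, so
# diff(abs(value)) is |high| - |low|; flip both signs when it is negative.
setDT(df)[, value := if (diff(abs(value)) < 0) value * -1,
          by = tornado_names]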
df
#           tornado_names Level     value
#  1:    node per cluster   low  34.72222
#  2:         TB per node   low  52.08333
#  3: expense per cluster   low 104.16667
#  4:            Total TB   low -62.50000
#  5: revenue per cluster   low -52.08333
#  6:    node per cluster  high -20.83333
#  7:         TB per node  high -41.66667
#  8: expense per cluster  high -52.08333
#  9:            Total TB  high 145.83333
# 10: revenue per cluster  high 156.25000

This will check your condition per tornado_names and only change the sign for the values within the groups where the condition is satisfied.
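For readers who would rather not load a package, the same per-group check can be sketched in base R with ave(). This is an alternative to the data.table code above, not part of the original answer. The input is rebuilt from the first table in the question, the helper name flip_if_low_dominates is made up for illustration, and the sketch assumes each name has exactly two rows ordered low, then high.

```r
# Input rebuilt from the question's first table (values as printed there).
tornado_frame <- data.frame(
  tornado_names = rep(c("node per cluster", "TB per node",
                        "expense per cluster", "Total TB",
                        "revenue per cluster"), times = 2),
  Level = rep(c("low", "high"), each = 5),
  value = c(-34.72222, -52.08333, -104.16667, -62.5, -52.08333,
            20.83333, 41.66667, 52.08333, 145.83333, 156.25)
)

# Hypothetical helper: flip both signs when |low| exceeds |high|.
# Assumes exactly two values per group, ordered low then high.
flip_if_low_dominates <- function(v) {
  if (diff(abs(v)) < 0) -v else v
}

# ave() applies the helper within each tornado_names group and
# returns the results in the original row order.
tornado_frame$value <- ave(tornado_frame$value,
                           tornado_frame$tornado_names,
                           FUN = flip_if_low_dominates)
tornado_frame
```

Because the helper has an explicit else branch, groups that fail the condition simply get their values back unchanged.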

edited Jan 27, 2016 at 18:35

akrun

answered Jan 27, 2016 at 16:31

David Arenburg

4 Comments

Dom Over a year ago

What if I wanted to add a second condition within that if statement, to check a condition in another data frame's column? @david

David Arenburg Over a year ago

So just add it with an & like if(cond1 & cond2)

Dom Over a year ago

Is there a way to force the if statement to look at all elements of cond2's data frame? I'm getting the warning "the condition has length > 1 and only the first element will be used", and I'd prefer not to have to switch to ifelse.

David Arenburg Over a year ago

Yes, you can do something like if(value[1] > 1) or similar. It's hard to tell exactly what your situation is.
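One way to read the exchange above: if() wants a single TRUE or FALSE, so a condition built from a whole column has to be collapsed first, for example with any() or all(). (In R 4.2.0 and later a length > 1 condition is an error rather than a warning.) The sketch below is only a guess at the commenter's situation. The toy groups "a" and "b", the second data frame flags, and its column ok are all hypothetical, and an explicit else is used so every group returns a value.

```r
library(data.table)

# Hypothetical toy data: two groups, each ordered low then high.
dt <- data.table(
  tornado_names = rep(c("a", "b"), each = 2),
  value = c(-3, 1, -1, 4)
)

# Hypothetical stand-in for the "other data frame" from the comment.
flags <- data.frame(ok = c(TRUE, TRUE, FALSE))

# any() collapses flags$ok to one logical, so both sides of && are scalars.
# Swap in all() if every element has to be TRUE.
dt[, value := if (diff(abs(value)) < 0 && any(flags$ok)) -value else value,
   by = tornado_names]
dt
```

Using && rather than & keeps the test scalar and short-circuits, which is the usual choice inside if().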
