I'm trying to visualize health insurance benefit options for my company to help others make a decision. I have a table like so:

| plan |        ded |  oop | exp_oop |
|------+------------+------+---------|
|    a |        400 | 2100 | 17400   |
|    b |       1300 | 2600 | 14300   |
|    c |       2600 | 5200 | 28600   |
  • ded = deductible; expense level where 90% co-insurance kicks in
  • oop = out of pocket maximum
  • exp_oop = amount of medical expense at which oop is reached
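
As a quick consistency check on the table (a sketch, assuming the employee pays 10% of expenses above the deductible; implied_exp_oop is an illustrative name, not part of the data), exp_oop should be the expense level at which ded + 0.1 * (exp_oop - ded) equals oop:

```r
# Plan parameters copied from the table above.
dat <- data.frame(plan = c("a", "b", "c"),
                  ded = c(400, 1300, 2600),
                  oop = c(2100, 2600, 5200),
                  exp_oop = c(17400, 14300, 28600))

# Solve ded + 0.1 * (x - ded) = oop for x.
# 'implied_exp_oop' is an illustrative name, not a column of the original data.
implied_exp_oop <- dat$ded + (dat$oop - dat$ded) / 0.1

# TRUE when the table's exp_oop matches the implied value for every plan.
all.equal(implied_exp_oop, dat$exp_oop)
```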

I want to plot cost to the employee vs. medical expenses incurred. Health insurance works in ranges...

cost = expenses for 0 <= expenses < ded
cost = ded + (0.10 x (expenses - ded)) for ded <= expenses < exp_oop
cost = oop for exp_oop <= expenses

How might I plot each of these ranges? Basically, one gets a line of slope = 1 from 0 to each plan's deductible, then a line of slope = 0.1 from x = ded to x = exp_oop, and then a line of slope = 0 from exp_oop upward.

I'm not sure how to conditionally plot with ggplot2. If you'd like to use the above, here's reproducible code for these cutoffs:

dat <- data.frame(plan = c("a", "b", "c"), ded = c(400, 1300, 2600), oop = c(2100, 2600, 5200), exp_oop = c(17400, 14300, 28600))

Do I have to create the x/y values myself? In other words an intermediate table like so?

| plan |     x |    y |
|------+-------+------|
|    1 |     0 |    0 |
|    1 |   400 |  400 |
|    1 | 17400 | 2100 |
|    2 |     0 |    0 |
|    2 |  1300 | 1300 |
|    2 | 14300 | 2600 |
|    3 |     0 |    0 |
|    3 |  2600 | 2600 |
|    3 | 28600 | 5200 |

I'm doing this for several variants (employee only, employee + spouse, etc.) so it would be great if I didn't need separate data tables for each plan but could just work with the already defined deductibles and out of pocket max values I already have in a data frame...

Thanks for any suggestions!

  • Sounds like you want to write a custom function to calculate expenses based on the rules you describe, and then use that function to calculate cost on a range of x<-0:some high value for each plan, probably using apply or similar. Commented Oct 25, 2012 at 16:29
  • From your description, it seems that the boundary between the second and third tier should be exp_oop, not oop. Commented Oct 25, 2012 at 19:01
  • @BrianDiggs Fixed -- good catch! Commented Oct 25, 2012 at 21:15

2 Answers

My approach basically follows Drew's, but does the steps differently. I start with a function which takes ded, oop, and exp_oop (any other columns, such as plan, are absorbed by ...) and returns a function which gives the cost for a given expense, based on those parameters. [Note: I've assumed the break between the second and third tier is exp_oop, not oop as originally stated in the question.]

cost_generator <- function(ded, oop, exp_oop, ...) {
  function(expenses) {
    ifelse(expenses < ded, 
           expenses, 
           ifelse(expenses < exp_oop, 
                  ded + (0.1 * (expenses-ded)),
                  oop))
  }
}
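
For example, a quick check of the generated function for plan a might look like this (a sketch; cost_a is just an illustrative name, and the function definition is repeated so the snippet runs on its own):

```r
cost_generator <- function(ded, oop, exp_oop, ...) {
  function(expenses) {
    ifelse(expenses < ded,
           expenses,
           ifelse(expenses < exp_oop,
                  ded + (0.1 * (expenses - ded)),
                  oop))
  }
}

# Plan a from the question: ded = 400, oop = 2100, exp_oop = 17400
cost_a <- cost_generator(ded = 400, oop = 2100, exp_oop = 17400)

# One expense level inside each tier, plus the two break points
cost_a(c(200, 400, 1000, 17400, 20000))
# expected: 200 400 460 2100 2100
```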

Now, using plyr, I can create a list of functions which map expenses to cost, one for each plan:

library("plyr")
funs <- mlply(dat, cost_generator)

For each function, determine the cost for a given range of expenses. Here, I've picked a range from 0 to $50,000 in increments of $100.

pts <- ldply(funs, function(f) {
  expenses <- seq(0, 50000, 100)
  data.frame(expenses=expenses, cost=f(expenses))
})

This gives a data frame in long form which is easy to plot.

library("ggplot2")
ggplot(pts, aes(expenses, cost, colour=plan)) +
  geom_line()

[Plot: cost vs. expenses, one line per plan]

Of course, this is not really cost, but amount paid out of pocket for a given level of expense. Total cost will include additional things (premiums, at least).

EDIT:

If you want to make sure every change point is included (not relying on rounding to the nearest $100), you can extract the points from dat and use those:

library("reshape2")
exps <- melt(dat, id.vars="plan")$value
exps <- c(0, exps, 1.1*max(exps))

pts <- ldply(funs, function(f) {
  data.frame(expenses=exps, cost=f(exps))
})

I added 0 and something larger than the largest value in the table to make the ends reasonable.
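
Because the cost curve is piecewise linear, only the corner points of each plan are strictly needed. As an alternative sketch in base R (not what the code above does; corner_points and x_max are illustrative names), the intermediate table described in the question could be generated directly from dat:

```r
dat <- data.frame(plan = c("a", "b", "c"),
                  ded = c(400, 1300, 2600),
                  oop = c(2100, 2600, 5200),
                  exp_oop = c(17400, 14300, 28600))

# Right-hand edge of the plot; an arbitrary choice for illustration.
x_max <- 1.1 * max(dat$exp_oop)

# Four corner points per plan: origin, deductible, exp_oop, right edge.
# rbind() stacks the values as rows (recycling the scalars), and as.vector()
# reads the matrix column by column, i.e. plan by plan.
corner_points <- data.frame(
  plan     = rep(dat$plan, each = 4),
  expenses = as.vector(rbind(0, dat$ded, dat$exp_oop, x_max)),
  cost     = as.vector(rbind(0, dat$ded, dat$oop, dat$oop))
)

# Plotting would then be the same call as before, e.g.:
# ggplot(corner_points, aes(expenses, cost, colour = plan)) + geom_line()
```

Each plan gets its own x values here, so no plan is evaluated at another plan's change points.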

[Plot: cost vs. expenses per plan, evaluated only at the change points]


9 Comments

I like it. Thanks. Can you explain what function(f) does? I don't see any references to f... Also, this will be slightly inaccurate to the nearest 100, correct? There's no way to target slope changes at the actual values for those changes without making my sequence incorporate said x values, right?
function(f) declares an anonymous function which takes as its sole argument a function (named f inside the anonymous function). That function is called in the creation of the data.frame (cost=f(expenses)).
Thanks for clarification on the function (haven't gotten into this at all in my learning of R). Re. the second part of my question, I hunted around and believe I can generate my sequence from the data itself: sort(plot[!duplicated(plot$x), "x"]). Those are the only x values that matter for the plot I'm trying to make. Neat stuff.
By the way, looks like you're missing a closing ) after pts <- ldply ... cost = f(exps)) }. Thanks for the update; I missed that bit as the answer. My comment above won't actually work; I was working with the intermediate table above, which was the whole point in asking this question (not having to manually do that), so thanks for the added solution.
@Hendy Thanks for catching the missing parenthesis; must have been a cut-and-paste error. I've fixed the answer now.
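
The pattern discussed in the comments above, an anonymous function whose single argument f is itself a function, can be sketched in isolation with base R's lapply (the two toy functions are made up for illustration):

```r
# A named list of functions, analogous to 'funs' in the answer.
toy_funs <- list(double = function(x) 2 * x,
                 square = function(x) x^2)

# lapply() hands each list element to the anonymous function as 'f',
# which then calls it with the argument 3.
results <- lapply(toy_funs, function(f) f(3))

results$double  # 6
results$square  # 9
```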
Write a vectorized function to calculate the cost to the employee as a function of expenses incurred. It must be vectorized so that you can feed it to ddply.

costFinder <- function(df, oopActual) {
  #df is your 'dat'; we will throw away exp_oop
  #oopActual should be a vector; it is the x axis of your plot
  ded <- df$ded
  oopMax <- df$oop
  cost <- rep(NA, length(oopActual)) #preallocating with NAs will help ID mistakes
  cost[oopActual<ded] <- oopActual[oopActual<ded]
  cost[ded <= oopActual & oopActual < oopMax] <- 0.1 * (oopActual[ded <= oopActual & oopActual < oopMax] - ded) + ded
  cost[oopMax <= oopActual] <- oopMax
  return(cost)
}

Then define an expense sequence (not too many data points, or it becomes computationally expensive) and calculate the actual out-of-pocket cost for each value of expense, for each plan:

library("plyr")
expense <- seq(0, 50000, by=200)
allCosts <- ddply(dat, .(plan), costFinder, expense)
#Rename the cost columns after the expense values they correspond to
names(allCosts)[2:ncol(allCosts)] <- expense

Now melt the data frame so you can use it with ggplot. Above, I employed the shady trick of renaming the columns of the allCosts data frame with numerical values. This is probably a bad idea, and I'd love to see a better way to do it.

library("reshape2")
costsM <- melt(allCosts, id.vars="plan")
names(costsM)[2:3] <- c("expense", "actualOOP")
#melt() turns the column names into a factor. Convert them back to numeric
#    by going through character first.
costsM$expense <- as.numeric(as.character(costsM$expense))
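
One possible way to avoid renaming columns and converting factor labels back to numbers is to build the long data frame directly, one plan at a time. This is a base R sketch, not the method used above: tiered_cost is a hypothetical stand-in for costFinder with the same (df, oopActual) arguments, written as a pmin() over the three tier lines, which assumes exp_oop is consistent with ded and oop.

```r
dat <- data.frame(plan = c("a", "b", "c"),
                  ded = c(400, 1300, 2600),
                  oop = c(2100, 2600, 5200),
                  exp_oop = c(17400, 14300, 28600))
expense <- seq(0, 50000, by = 200)

# Hypothetical stand-in for costFinder(): the lowest of the three tier lines
# (slope 1, slope 0.1 above the deductible, flat at the out-of-pocket maximum).
tiered_cost <- function(df, oopActual) {
  pmin(oopActual, df$ded + 0.1 * (oopActual - df$ded), df$oop)
}

# One long-format block per plan, stacked with rbind(); 'expense' stays numeric.
costsLong <- do.call(rbind, lapply(split(dat, dat$plan), function(df) {
  data.frame(plan = df$plan,
             expense = expense,
             actualOOP = tiered_cost(df, expense))
}))
rownames(costsLong) <- NULL
```

The result has the same three columns as costsM (plan, expense, actualOOP), so it could be passed to the same ggplot call.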

library("ggplot2")
#Plot the data
p <- ggplot() + geom_line(data=costsM, aes(x=expense, y=actualOOP, colour=plan))
print(p)

[Plot: actualOOP vs. expense, one line per plan]

#Add vertical lines for the expected OOP, if you like - arguably it makes things more confusing.
p + geom_vline(data=dat, aes(xintercept=exp_oop, colour=plan))

[Plot: actualOOP vs. expense per plan, with a vertical line at each plan's exp_oop]

2 Comments

Thanks for the answer. More complicated than I thought. I might just be better off calculating with a spreadsheet and using csv? I created one manually, and I believe something is off with your graph. The three segments should be slope = 1, slope = 0.1, slope = 0 with no big jumps. The blue line jumps drastically for some reason. Take a look at my version. Is this just an image scale thing, or is something not quite right with yours? (Disregard the dotted lines; those just show adjusted costs based on premium savings and company HSA contribution.)
There may be a mistake in the way I encoded the costFinder function.
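
The jump described in the comment above is what one would expect if the flat tier starts at expense = oop instead of expense = exp_oop: for plan c, costFinder returns 2840 at an expense of 5000 but 5200 at an expense of 5200. A sketch of a variant that switches tiers at exp_oop, as the first answer does (costFinder2 and plan_c are illustrative names):

```r
costFinder2 <- function(df, oopActual) {
  # df is a single row of 'dat'; oopActual is the vector of expense levels
  ded <- df$ded
  oopMax <- df$oop
  expOop <- df$exp_oop  # expense level at which oopMax is reached
  cost <- rep(NA_real_, length(oopActual))  # NAs help identify missed cases
  below <- oopActual < ded
  middle <- ded <= oopActual & oopActual < expOop
  cost[below] <- oopActual[below]
  cost[middle] <- ded + 0.1 * (oopActual[middle] - ded)
  cost[oopActual >= expOop] <- oopMax
  cost
}

# Plan c from the question, around the point where the original curve jumped
plan_c <- data.frame(plan = "c", ded = 2600, oop = 5200, exp_oop = 28600)
costFinder2(plan_c, c(5000, 5200, 28600, 40000))
# expected: 2840 2860 5200 5200
```

With this boundary the three segments join without a jump, since ded + 0.1 * (exp_oop - ded) equals oop for each plan in the table.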
