Excuse me if this is a stupid question. I have a data frame like this:

Year Location Value SE.Value 
2010   USA     10      1
2010   USA     11      1
2011   USA     12      2
2011   USA     20      4
2012   USA     13      1

I want a bar chart that, for each year, plots the average of Value and uses the aggregate of SE.Value (the standard error) to determine the error bars.

What is the correct way to do this?

In my approach (below), I keep getting multiple error bars on each bar. I assume this is because it is not computing aggregate values, but is instead using Value - SE.Value and Value + SE.Value for each row.

err_bar_limits <- aes(ymin = (df$Value - df$SE.Value), ymax = (df$Value + df$SE.Value))

ggplot(data=df, aes(x=df$Year, y=df$Value)) + geom_bar(position="dodge", stat="identity") + geom_errorbar(err_bar_limits, width=0.2, position="dodge")

For a sample of the error described above, this is what I mean:

[Image: bar chart showing multiple error bars on each bar]

Because of the issue above, I tried aggregating first:

avg_vals <- aggregate(df$Value, list(df$Year), mean)
avg_se_vals <- aggregate(df$SE.Value, list(df$Year), mean)

I believe that should give me a dataframe that has the average of either value or SE.Value grouped by "Year", right?

Then from there I tried...:

err_bar_limits <- aes(ymin = (avg_vals$Value - avg_se_vals$SE.Value), ymax = (avg_vals$Value + avg_se_vals$SE.Value))

ggplot(data=df, aes(x=df$Year, y=df$Value)) + geom_bar(position="dodge", stat="identity") + geom_errorbar(err_bar_limits, width=0.2, position="dodge")

But I get an error

Aesthetics must be either length 1 or the same as the data 

I know this is probably a dumb mistake but I've never really used ggplot that much before so I'm a little stuck here.

I know my original method was totally wrong and I need to group the error bar min/max by the year, but I'm not sure how to get over the bug when trying it that way.

Hope that made sense...

  • Please include the output of dput(df) in your question to make this code reproducible. Commented Jul 24, 2017 at 2:05
  • I added a picture of the error in that scenario. The image I included uses my working data, not the sample I posted above, but it should give a sense of the issue (the multiple error bars on each bar). Commented Jul 24, 2017 at 2:09
  • Take a look at the ggplot cookbook if you haven't already: cookbook-r.com/Graphs/Plotting_means_and_error_bars_(ggplot2) Commented Jul 24, 2017 at 2:36

1 Answer

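Printing err_bar_limits shows the mapping it holds:

* ymax -> avg_vals$Value + avg_se_vals$SE.Value
* ymin -> avg_vals$Value - avg_se_vals$SE.Value

geom_errorbar() cannot use this mapping with data = df, for two reasons. First, aggregate() called this way names its output columns Group.1 and x, so avg_vals$Value and avg_se_vals$SE.Value are both NULL. Second, even with the right column names, the aggregated vectors would have one element per year while df has one row per observation, and that length mismatch is what "Aesthetics must be either length 1 or the same as the data" is complaining about. Moving the aes() call inline into geom_errorbar() does not change this.

Instead, merge the aggregates into a single summary data frame and use that as the data for both layers: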

library(ggplot2)

# Average Value and SE.Value within each year
avg_vals <- aggregate(df$Value, list(df$Year), mean)
avg_se_vals <- aggregate(df$SE.Value, list(df$Year), mean)

# aggregate() names the grouping column Group.1; merge on it, then rename
ndf <- merge(avg_vals, avg_se_vals, by = "Group.1")
names(ndf) <- c("Year", "Avg", "SE")
ndf

# Use bare column names inside aes() so they are looked up in ndf
ggplot(data = ndf, aes(x = Year, y = Avg)) +
  geom_bar(position = "dodge", stat = "identity") +
  geom_errorbar(aes(ymax = Avg + SE, ymin = Avg - SE),
                width = 0.2, position = "dodge")
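
If you want something you can paste and run, here is a minimal sketch of the same idea using the sample data from the question. It assumes the per-year mean of SE.Value is the aggregate you want for the error bars, and it swaps in a few things that are my own choices rather than part of the code above: the formula interface of aggregate() to summarise both columns in one call, geom_col() as shorthand for geom_bar(stat = "identity"), and factor(Year) so each year gets its own discrete bar.

```r
library(ggplot2)

# Sample data from the question
df <- data.frame(
  Year     = c(2010, 2010, 2011, 2011, 2012),
  Location = "USA",
  Value    = c(10, 11, 12, 20, 13),
  SE.Value = c(1, 1, 2, 4, 1)
)

# One call averages both columns per year and keeps their original names
ndf <- aggregate(cbind(Value, SE.Value) ~ Year, data = df, FUN = mean)
ndf

# One bar and one error bar per year, both drawn from the summary data
p <- ggplot(ndf, aes(x = factor(Year), y = Value)) +
  geom_col() +
  geom_errorbar(aes(ymin = Value - SE.Value, ymax = Value + SE.Value),
                width = 0.2) +
  labs(x = "Year")
p
```

With this data, ndf has three rows: means of 10.5, 16 and 13, with averaged standard errors of 1, 3 and 1. Because both layers read from ndf, every aesthetic has the same length as the data and the error goes away.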