Overlaying line graph with barplot in ggplot2

Question

Given the data frame below, taken from a questionnaire that asked people from different neighborhoods about perceived security, I have managed to create a bar plot that shows perceived security, grouped by neighborhood:

questionnaire_raw = read.csv("https://www.dropbox.com/s/l647q2omffnwyrg/local.data.csv?dl=0")

ggplot(data = questionnaire_raw, 
       aes(x = factor(Seguridad.de.tu.barrio..de.día.), # We have to convert x values to categorical data
           y = (..count..)/sum(..count..)*100,
           fill = neighborhoods)) + 
  geom_bar(position="dodge") + 
  ggtitle("Seguridad de día") + 
  labs(x="Grado de seguridad", y="% encuestados", fill="Barrios")


I would like to overlay these results with a line graph showing the mean of each security category (1, 2, 3 or 4) across all neighborhoods (that is, without grouping the results), so that it is easy to see whether a specific neighborhood is above or below the overall average. However, since this is my first project in R, I do not know how to calculate that mean from a data frame and then overlay it on the bar plot above.

What about adding something like + stat_summary(fun.data="mean_cl_normal", geom = "line", mapping = aes(group = 1)) (untested)?
– lukeA, Commented Feb 12, 2015 at 11:56
Results in Error: stat_summary requires the following missing aesthetics: y
– Rentrop, Commented Feb 12, 2015 at 12:00

Rentrop · Accepted Answer · 2015-02-12 12:10:07Z

4

Using data.table for data manipulation, together with lukeA's comment:

library(ggplot2)
library(data.table)

# Convert to a data.table in place, then rename its three columns
setDT(questionnaire_raw)
setnames(questionnaire_raw, c("Timestamp", "Barrios", "Grado"))

# Count respondents per neighbourhood and security grade
plot_data <- questionnaire_raw[, .N, by = .(Barrios, Grado)]

ggplot(plot_data, aes(x = factor(Grado), y = N, fill = Barrios)) +
  geom_bar(position = "dodge", stat = "identity") +
  # Line through the mean count per grade; `fun` replaces `fun.y` since ggplot2 3.3.0
  stat_summary(fun = mean, geom = "line", mapping = aes(group = 1)) +
  ggtitle("Seguridad de día") +
  labs(x = "Grado de seguridad", y = "% encuestados", fill = "Barrios")

Result: the dodged bars with a line through the mean count of each grade (image not reproduced).
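Note that N in the code above is a raw count, while the axis label reads "% encuestados". If percentages are what is wanted, one possible variant (a sketch, not part of the original answer) computes each grade's share within every neighbourhood and overlays the share over all respondents, ignoring neighbourhood. The toy data and the names toy, by_barrio, overall and pct are made up for illustration.

```r
library(data.table)
library(ggplot2)

# Hypothetical toy data standing in for the questionnaire (two columns only)
toy <- data.table(
  Barrios = c("A", "A", "A", "A", "B", "B", "B", "B", "B", "B"),
  Grado   = c(1, 2, 2, 3, 2, 3, 3, 3, 4, 4)
)

# Share of each grade within each neighbourhood, in percent
by_barrio <- toy[, .N, by = .(Barrios, Grado)]
by_barrio[, pct := 100 * N / sum(N), by = Barrios]

# Share of each grade over all respondents, ignoring neighbourhood
overall <- toy[, .N, by = Grado][, pct := 100 * N / sum(N)][order(Grado)]

# Bars use the per-neighbourhood shares; line and points use the overall shares
p <- ggplot(by_barrio, aes(x = factor(Grado), y = pct)) +
  geom_col(aes(fill = Barrios), position = "dodge") +
  geom_line(data = overall, aes(group = 1)) +
  geom_point(data = overall) +
  labs(x = "Grado de seguridad", y = "% encuestados", fill = "Barrios")

print(p)
```

Keeping fill inside geom_col() rather than in the top-level aes() means the line layer does not need a Barrios column. With real data, toy would be replaced by the two relevant questionnaire columns.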

edited Feb 12, 2015 at 12:10

answered Feb 12, 2015 at 12:04


4 Comments

ccamara Over a year ago

Thank you very much for your answer. It works fine, although I still need to understand what you are doing: the original data frame is far bigger (72 variables, not 3), so it seems I can't reproduce the setnames line. I think I need to create a vector with all 72 variable names, but since I had never heard of that function I am not sure. I will try creating a new data frame with just the variables I need.
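For a wide table, one way to follow that last idea is to pick the needed columns by name and rename only those, so no 72-element vector is required. This is a minimal sketch: the table wide and the column names seguridad_dia and other_question are hypothetical stand-ins for the real questionnaire columns.

```r
library(data.table)

# Hypothetical wide table: only two of its columns matter for the plot
wide <- data.table(
  Timestamp      = c("t1", "t2", "t3"),
  neighborhoods  = c("A", "B", "A"),
  seguridad_dia  = c(1, 3, 2),
  other_question = c("x", "y", "z")
)

# Select the two columns of interest by name into a new data.table
keep  <- c("neighborhoods", "seguridad_dia")
small <- wide[, ..keep]

# Rename by name (old -> new) instead of supplying every column name
setnames(small, old = keep, new = c("Barrios", "Grado"))

print(names(small))
```

The old/new form of setnames() touches only the listed columns, so it also works directly on the full table if a separate copy is not wanted.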

Rentrop Over a year ago

The setnames line just alters the column names of the data. Have a look at the data before and after. It is not difficult.
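A small before-and-after check can make this concrete. The sketch below uses a made-up three-column table dt; the positional form of setnames() used in the answer needs exactly one new name per column, which is why it does not carry over unchanged to a 72-column table.

```r
library(data.table)

# Hypothetical three-column table shaped like the answer's input
dt <- data.table(
  V1            = c("t1", "t2"),
  neighborhoods = c("A", "B"),
  security      = c(1, 4)
)

# copy() is needed: setnames() works by reference, so a plain
# names(dt) result could change along with the table
before <- copy(names(dt))

# Positional rename: one new name for every column, in order
setnames(dt, c("Timestamp", "Barrios", "Grado"))
after <- names(dt)

print(before)
print(after)
```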

ccamara Over a year ago

I am re-reading your code and, honestly (and shamefully), I understand almost nothing of what it does. I still have a lot to learn about R...

Rentrop Over a year ago

And the line with by= counts the occurrences of each (Barrios, Grado) combination.
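As a minimal illustration of that counting step, the sketch below runs the same .N / by idiom on a tiny made-up table (toy and its values are hypothetical). In data.table, .N is the number of rows in each group defined by by.

```r
library(data.table)

# Hypothetical answers: neighbourhood and security grade per respondent
toy <- data.table(
  Barrios = c("A", "A", "B", "B", "B"),
  Grado   = c(1, 1, 2, 2, 3)
)

# One row per (Barrios, Grado) pair, with N = number of respondents in it
counts <- toy[, .N, by = .(Barrios, Grado)]

print(counts)
```

The result has one row per combination that actually occurs in the data; combinations with no respondents do not appear.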
