I am using the palmerpenguins dataset in R and am having trouble plotting the output of summarise(). The code is below; any tips would be helpful. I've also attached a picture of the expected graph output.

[Image: expected graph output]

ex1 <- penguins %>%
  group_by(sex,species) %>%
  summarise(fmean_body = mean(body_mass_g)) %>%
  

ggplot(ex1, aes(x = year))+
  geom_line(aes(x = year-2000, y = body_mass_g/1000, linetype = sex))+
  theme_bw()+
  facet_wrap(~species)+
  labs(title="Average Body Mass Over Time", 
       y="Body Mass (in kg)")

error: 
`summarise()` has grouped output by 'sex'. You can override using the `.groups` argument.
Error: Mapping should be created with `aes()` or `aes_()`.
  • You have a pipe before the ggplot2 code that needs to be removed. Also, if you want your analysis by year, it needs to be included in the variables you are grouping by prior to summarizing. Commented Apr 4, 2021 at 15:05

1 Answer

I believe the main issue is a pipe operator (%>%) left at the end of the line containing the summarise() call.

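The pipe operator tells R to expect another expression and passes the object built so far as that expression's first argument. As a result, ggplot() receives the summarised data as its first argument, and ex1 is pushed into the position of the mapping argument, which is why the error complains that the mapping was not created with aes().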
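The argument shift can be sketched without ggplot2. The helper show_args() below is hypothetical (it is not part of any package) and only reports the class of whatever lands in each position:

```r
library(magrittr)  # provides %>%; also attached by library(tidyverse)

# Hypothetical stand-in for ggplot(data, mapping): it only reports the
# class of the value that arrives in each argument position.
show_args <- function(data, mapping = NULL) {
  c(data = class(data)[1], mapping = class(mapping)[1])
}

df <- data.frame(x = 1:3)

# Called directly: df is the data, the string is the mapping.
direct <- show_args(df, "a mapping")

# Called after a stray pipe: the piped value takes the first slot,
# so df is pushed into the mapping slot.
piped <- 1:3 %>% show_args(df)

print(direct)
print(piped)
```

In the piped call, df ends up in the mapping slot, which mirrors what happens to ex1 in the question's code.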

This should make it easier to spot the other errors in the code.

The year column is currently dropped by the summarise function. We are taking the mean over all values within each sex and species group, so year no longer exists in ex1. We can fix this by adding year to the group_by function.

Additionally, body_mass_g has been replaced by fmean_body, so we have to use the new name when we set the y axis in geom_line.
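To see both effects in isolation, here is a minimal sketch on a made-up toy tibble (the column names grp, value and mean_value are assumptions for illustration, not taken from penguins):

```r
library(dplyr)

# Toy data standing in for penguins (values are made up).
toy <- tibble(
  grp   = c("a", "a", "b", "b"),
  year  = c(2007, 2008, 2007, 2008),
  value = c(1, 3, 5, 7)
)

# Grouping by grp only: year and value are dropped, leaving grp and
# the newly named summary column.
by_grp <- toy %>%
  group_by(grp) %>%
  summarise(mean_value = mean(value))

# Grouping by grp and year keeps year available for plotting.
by_grp_year <- toy %>%
  group_by(grp, year) %>%
  summarise(mean_value = mean(value), .groups = "drop")

print(names(by_grp))
print(names(by_grp_year))
```

Only the grouping columns and the columns created inside summarise() survive, so anything the plot needs has to be one or the other.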

library(palmerpenguins)
library(tidyverse)

ex1 <- penguins %>%
  # Added group_by year
  group_by(sex,species, year) %>%
  # Removed unused pipe operator
  summarise(fmean_body = mean(body_mass_g))
#> `summarise()` has grouped output by 'sex', 'species'. You can override using the `.groups` argument.


ggplot(ex1, aes(x = year))+
  # Replaced the body_mass_g variable with fmean_body
  geom_line(aes(x = year-2000, y = fmean_body/1000, linetype = sex))+
  theme_bw()+
  facet_wrap(~species)+
  labs(title="Average Body Mass Over Time", 
       y="Body Mass (in kg)")
#> Warning: Removed 2 row(s) containing missing values (geom_path).

Created on 2021-04-04 by the reprex package (v2.0.0)
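The grouping message in the output above points at the .groups argument. As an optional variation (not part of the fix itself), the sketch below sets .groups = "drop" to return an ungrouped result and silence the message. It also passes na.rm = TRUE to mean(), on the assumption that missing body masses should be skipped rather than turning a whole group's mean into NA:

```r
library(palmerpenguins)
library(dplyr)

ex1 <- penguins %>%
  group_by(sex, species, year) %>%
  # Assumption: skipping missing body masses is acceptable here.
  # .groups = "drop" returns an ungrouped tibble and silences the message.
  summarise(fmean_body = mean(body_mass_g, na.rm = TRUE), .groups = "drop")

print(is_grouped_df(ex1))
print(names(ex1))
```

Rows where sex itself is NA are still present in this version and would need to be filtered separately.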

To align this more closely with the intended image, we have to make a few more changes: adding points with a shape aesthetic to the plot, removing NA values, and using the scales package to modify the axis labels.

library(palmerpenguins)
library(tidyverse)
library(scales)

ex1 <- penguins %>%
  group_by(sex,species, year) %>%
  summarise(fmean_body = mean(body_mass_g)) %>% 
  # Removed/Ignored NA values
  filter(!is.na(sex))
#> `summarise()` has grouped output by 'sex', 'species'. You can override using the `.groups` argument.
  

ggplot(ex1, aes(x = year))+
  geom_line(aes(x = year-2000, y = fmean_body/1000, linetype = sex)) +
  # Added points to the plot
  geom_point(aes(x = year-2000, y = fmean_body/1000, shape = sex)) +
  # Modified x-axis ticks
  scale_x_continuous(breaks = c(7, 8, 9), labels = label_comma(accuracy = 1)) +
  theme_bw()+
  facet_wrap(~species)+
  labs(title="Average Body Mass Over Time", 
       y="Body Mass (in kg)", shape = "Sex", linetype = "Sex")

Created on 2021-04-04 by the reprex package (v2.0.0)
