266

With this data frame ("df"):

  year pollution
1 1999 346.82000
2 2002 134.30882
3 2005 130.43038
4 2008  88.27546

I try to create a line chart like this:

  plot5 <- ggplot(df, aes(year, pollution)) +
           geom_point() +
           geom_line() +
           labs(x = "Year", y = "Particulate matter emissions (tons)", title = "Motor vehicle emissions in Baltimore")

The error I get is:

geom_path: Each group consist of only one observation. Do you need to adjust the group aesthetic?

The chart appears as a scatter plot even though I want a line chart. I tried to replace geom_line() with geom_line(aes(group = year)) but that didn't work.

In an answer I was told to convert year to a factor variable. I did and the problem persists. This is the output of str(df) and dput(df):

'data.frame':   4 obs. of  2 variables:
 $ year     : num  1 2 3 4
 $ pollution: num [1:4(1d)] 346.8 134.3 130.4 88.3
  ..- attr(*, "dimnames")=List of 1
  .. ..$ : chr  "1999" "2002" "2005" "2008"

structure(list(year = c(1, 2, 3, 4), pollution = structure(c(346.82, 
134.308821199349, 130.430379885892, 88.275457392443), .Dim = 4L, .Dimnames = list(
    c("1999", "2002", "2005", "2008")))), .Names = c("year", 
"pollution"), row.names = c(NA, -4L), class = "data.frame")
5
  • 1
    It gives no error when I run it. It's likely that df is not what you think it is. Please state your question in reproducible form, i.e. show the output of dput(df). Commented Nov 22, 2014 at 21:27
  • could be that your variables are factors, then you'd need to convert them to numeric Commented Nov 22, 2014 at 21:36
  • @G.Grothendieck I posted what you said. I also converted to numeric and still have the problem. Commented Nov 22, 2014 at 21:44
  • 1
    You really should state questions in reproducible form. It's hard to help you if we can't recreate the error. Commented Apr 24, 2018 at 20:37
  • is it possible to rank the line point in descending order of "pollution"? Commented Mar 29, 2021 at 7:12

7 Answers

547

You only have to add group = 1 to the aes() of either ggplot() or geom_line().

For line graphs, the data points must be grouped so that it knows which points to connect. In this case, it is simple -- all points should be connected, so group=1. When more variables are used and multiple lines are drawn, the grouping for lines is usually done by variable.

Reference: Cookbook for R, chapter "Graphs", section "Bar and line graphs (ggplot2)", subsection "Line graphs".

Try this:

plot5 <- ggplot(df, aes(year, pollution, group = 1)) +
         geom_point() +
         geom_line() +
         labs(x = "Year", y = "Particulate matter emissions (tons)", 
              title = "Motor vehicle emissions in Baltimore")
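
As a hedged sketch of the multi-line case described above: when x is a factor or character column, ggplot2 builds its default grouping from all discrete variables in the plot, including x, so each point can end up in its own group. Mapping group to the variable that defines a line avoids that. The data frame `wl` and its column names are made up for illustration and are not from the question.

```r
library(ggplot2)

# Hypothetical data: two treatments measured at three time points.
wl <- data.frame(
  treatment = rep(c("Control", "Diet"), times = 3),
  time      = factor(rep(c("t1", "t2", "t3"), each = 2)),
  loss      = c(4.5, 5.3, 3.3, 3.9, 2.1, 2.3)
)

# One line per treatment: group explicitly by the variable that defines a line.
p <- ggplot(wl, aes(time, loss, group = treatment, colour = treatment)) +
  geom_point() +
  geom_line()

p
```

With a single series, as in the question, there is no such variable, which is why the constant group = 1 is used instead.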

6 Comments

Of note, grouping has to be done with the group argument. Grouping only by, e.g., color would not be sufficient. I just had this trouble and hope this helps someone running into the same problem.
is this answer still valid? Adding group=1 in the aesthetics doesn't seem to be working anymore.
@Giacomo -- works for me, on 3.6.2 on a Mac. Was getting the dreaded warning, but adding group=1 fixed the problem. ggplot(lakemeta, mapping=aes(x=Lake, y=Area, group=1)) + geom_line(size=2, color="blue")
is it possible to rank the point in descending order of "pollution"?
Is there any reason geom_line() couldn't assume group = 1 if it's omitted?
51

You get this error because one of your variables is actually a factor. Run

str(df) 

to check this. Then convert in two steps, via character, so that you keep the actual year values instead of the internal level codes (1, 2, 3, 4):

df$year <- as.numeric(as.character(df$year))
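
A small base R illustration of why the intermediate as.character() matters. The `year` vector here is a stand-in built for the example, not the asker's actual column.

```r
# A factor stores integer level codes plus labels.
year <- factor(c("1999", "2002", "2005", "2008"))

# Converting directly returns the level codes, not the years.
codes <- as.numeric(year)

# Going through character first returns the labels as numbers.
values <- as.numeric(as.character(year))

print(codes)
print(values)
```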

EDIT: it appears that your data.frame has a variable of class "array", which might cause the problem. In that case, try:

df <- data.frame(apply(df, 2, unclass))

and plot again?
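
A hedged sketch of how to inspect and flatten the array column by hand, using the dput() output from the question. Whether the 1-d array is really what triggers the message is not confirmed in this thread. Recovering the years from the dimnames is an assumption about what the asker intended, based on the str() output.

```r
# The question's data, pasted from its dput() output.
df <- structure(list(year = c(1, 2, 3, 4), pollution = structure(c(346.82,
134.308821199349, 130.430379885892, 88.275457392443), .Dim = 4L, .Dimnames = list(
    c("1999", "2002", "2005", "2008")))), .Names = c("year",
"pollution"), row.names = c(NA, -4L), class = "data.frame")

# pollution carries dim and dimnames, so it is a 1-d array, not a plain vector.
was_array <- is.array(df$pollution)

# Assumption: the real years are the dimnames, so copy them out first.
df$year <- as.numeric(dimnames(df$pollution)[[1]])

# as.numeric() drops the dim and dimnames attributes.
df$pollution <- as.numeric(df$pollution)

str(df)
```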

1 Comment

For me this is a convenient answer because it fixes the issue at the root.
9

I had a similar problem with this data frame:

    group time weight.loss
1 Control  wl1    4.500000
2    Diet  wl1    5.333333
3  DietEx  wl1    6.200000
4 Control  wl2    3.333333
5    Diet  wl2    3.916667
6  DietEx  wl2    6.100000
7 Control  wl3    2.083333
8    Diet  wl3    2.250000
9  DietEx  wl3    2.200000

I think the variable for the x axis should be numeric, so that geom_line knows how to connect the points to draw the line.

After I changed the 2nd column to numeric:

    group time weight.loss
1 Control    1    4.500000
2    Diet    1    5.333333
3  DietEx    1    6.200000
4 Control    2    3.333333
5    Diet    2    3.916667
6  DietEx    2    6.100000
7 Control    3    2.083333
8    Diet    3    2.250000
9  DietEx    3    2.200000

then it works.
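
The answer does not show the conversion itself. One way it could be done, as a sketch: the data frame name `wl` is made up here, and the values are copied from the table above.

```r
# The data from this answer, with time stored as the labels "wl1".."wl3".
wl <- data.frame(
  group = rep(c("Control", "Diet", "DietEx"), times = 3),
  time  = rep(c("wl1", "wl2", "wl3"), each = 3),
  weight.loss = c(4.500000, 5.333333, 6.200000,
                  3.333333, 3.916667, 6.100000,
                  2.083333, 2.250000, 2.200000)
)

# Strip the "wl" prefix and convert what is left to a number.
wl$time <- as.numeric(sub("^wl", "", wl$time))

str(wl)
```

If time has to stay categorical, adding group = group to aes() should also let geom_line connect the points within each group.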


1

Start up R in a fresh session and paste this in:

library(ggplot2)

df <- structure(list(year = c(1, 2, 3, 4), pollution = structure(c(346.82, 
134.308821199349, 130.430379885892, 88.275457392443), .Dim = 4L, .Dimnames = list(
    c("1999", "2002", "2005", "2008")))), .Names = c("year", 
"pollution"), row.names = c(NA, -4L), class = "data.frame")

df[] <- lapply(df, as.numeric) # make all columns numeric

ggplot(df, aes(year, pollution)) +
           geom_point() +
           geom_line() +
           labs(x = "Year", 
                y = "Particulate matter emissions (tons)", 
                title = "Motor vehicle emissions in Baltimore")

4 Comments

Start up R in a fresh session and paste the code in my post into it.
Have you figured out this problem? I have the same problem as you: I have only one value for each x value. Waiting for your response. Thanks.
Can you explain why converting everything to numeric fixes the issue? My ordered factor variable is a character one, so I can't use numerics in its stead.
pollution is a 1d array rather than a plain vector. Look at str(df)
1

I got a similar message. It was because I had specified the x-axis values as percentage labels (for example: 10%A, 20%B, ...). So an alternative approach could be to multiply these values out and write them in their simplest form.


1

I found this can also occur if most of the plotted data is outside the axis limits. In that case, adjust the axis scales accordingly.


0

In case someone has the same headscratch as I did:

If you're making a geom_line plot with facet_wrap and just one of the groups that you're faceting by has too few observations, you're going to get this message (ggplot2 3.5.2):

`geom_line()`: Each group consists of only one observation.
ℹ Do you need to adjust the group aesthetic?

It took me a while to realize what was going on since I had a lot of groups and the plot looked mostly fine. The message concerns the panel which has only one observation. Simple example:

library(ggplot2)
library(tibble)

# G3 has a single observation, so its panel cannot draw a line.
foo <- tibble(
  value = c(1:4,1:4,1),
  year = c(2001L:2004L, 2001L:2004L, 2002L),
  g = c(rep("G1", 4), rep("G2", 4), "G3")
)

ggplot(foo, aes(year, value)) +
  geom_line() +
  facet_wrap(~g)

A graph with three panels, of which the first two show a simple line plot and the third has nothing.
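
With many panels it can be quicker to count observations per facet than to look for the empty one. A base R sketch, rebuilding the example data as a plain data.frame so it runs without extra packages:

```r
# Same values as the example above, as a base data.frame.
foo <- data.frame(
  value = c(1:4, 1:4, 1),
  year  = c(2001:2004, 2001:2004, 2002),
  g     = c(rep("G1", 4), rep("G2", 4), "G3")
)

# Count rows per facet and keep the facets that cannot form a line.
n_per_group <- table(foo$g)
single <- names(n_per_group)[n_per_group < 2]

print(single)
```

Adding geom_point() to the plot would also make a lone observation visible in its panel.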
