
I need to plot several lines in one connected scatterplot, but I am running into issues. I can draw the first series, but not a second one (or more).

Code

library(dplyr)
library(ggplot2)
library(ggrepel)
library(tidyr)

year<-c(2010,2011,2012,2013,2014,2015,2016,2017,2018,2019)

variableA1<-c(56,169,313,595,797,989,934,869,824,662)
variableB1<-c(0,0,5,12,23,44,73,71,78,103)

variableA2<-c(22,58,159,342,603,1021,1589,2071,2268,2044)
variableB2<-c(1,1,0,3,7,9,33,59,84,98)


data<-data.frame(year,variableA1,variableB1,variableA2,variableB2)

data %>% 
  ggplot(aes(x=variableA1, y=variableB1, label=year)) +
     geom_point(color="#333333") + 
     geom_text_repel() +
     geom_segment(color="#333333", 
                aes(
                    xend=c(tail(variableA1, n=-1), NA), 
                    yend=c(tail(variableB1, n=-1), NA)
                ),
                arrow=arrow(length=unit(0.3,"cm"))
                ) +
     geom_point(color="#a8a8a8") + 
     geom_text_repel() +
     geom_segment(color="#a8a8a8", 
                aes(
                    xend=c(tail(variableA2, n=-1), NA), 
                    yend=c(tail(variableB2, n=-1), NA)
                ),
                arrow=arrow(length=unit(0.3,"cm"))
                )

ggplot Chart

Connected Scatterplot

1 Answer

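This is not really how ggplot2 is meant to be used. Every layer inherits the x and y mapping from ggplot(), so the second geom_point() and geom_text_repel() simply redraw series 1, and the second geom_segment() starts its arrows at the series 1 points.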

Instead, you define the x and y coordinates for both series in the same columns and add another column that specifies which data series the values belong to.
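As a rough sketch of that reshaping step, the wide data frame from the question can be turned into one pair of coordinate columns plus a series column with tidyr's pivot_longer(). This assumes the column names keep the pattern variableA1, variableB1, variableA2, ...; the name data_long is only illustrative and does not appear in the original post.

```r
library(tidyr)

# Data from the question
year <- c(2010,2011,2012,2013,2014,2015,2016,2017,2018,2019)
variableA1 <- c(56,169,313,595,797,989,934,869,824,662)
variableB1 <- c(0,0,5,12,23,44,73,71,78,103)
variableA2 <- c(22,58,159,342,603,1021,1589,2071,2268,2044)
variableB2 <- c(1,1,0,3,7,9,33,59,84,98)
data <- data.frame(year, variableA1, variableB1, variableA2, variableB2)

# Split each column name into a value part ("variableA"/"variableB")
# and a series part ("1"/"2"). ".value" tells pivot_longer() to keep the
# first part as output columns, so the result has the columns
# year, series, variableA and variableB.
data_long <- pivot_longer(
  data,
  cols = -year,
  names_to = c(".value", "series"),
  names_pattern = "(variable[AB])(\\d)"
)
```

With the data in this shape, mapping color = series in aes() is enough for ggplot2 to treat the two series separately.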

EDIT

The segment coordinates (for the arrows) should also be set explicitly for each data series:

# Long format: one row per point, plus a column naming the series it belongs to
data_points <- data.frame(year = year, varA = variableA1, varB = variableB1, series = "series1") %>%
  bind_rows(data.frame(year = year, varA = variableA2, varB = variableB2, series = "series2"))

# One row per arrow: each point (x, y) is joined to the next point
# (xend, yend) of the same series, so no arrow crosses between series
data_lines <- data.frame(
    x = head(variableA1, n = -1),
    y = head(variableB1, n = -1),
    xend = tail(variableA1, n = -1),
    yend = tail(variableB1, n = -1),
    series = "series1") %>%
  bind_rows(
    data.frame(
      x = head(variableA2, n = -1),
      y = head(variableB2, n = -1),
      xend = tail(variableA2, n = -1),
      yend = tail(variableB2, n = -1),
      series = "series2")
  )

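data_points %>% 
  ggplot(aes(x = varA, y = varB, label = year, color = series)) +
  # no fixed color here, so the points follow the series color scale
  geom_point() + 
  geom_text_repel() +
  # the segments use their own data and mapping; inherit.aes = FALSE
  # keeps the label aesthetic from the main plot out of this layer
  geom_segment(data = data_lines,
               aes(x = x, y = y, xend = xend, yend = yend, color = series),
               inherit.aes = FALSE,
               arrow = arrow(length = unit(0.3, "cm"))
  ) +
  scale_color_manual(values = c(series1 = '#333333', series2 = '#a8a8a8'))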

enter image description here
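As a possible variation on the code above (not part of the original answer), the segment end points can be derived from the long-format data instead of being built by hand for each series. The sketch below assumes dplyr's group_by() and lead(): within each series, the end of an arrow is the next year's point, so the last year of one series is never joined to the first year of the other. It uses plain geom_text() for the labels to keep the dependencies small; geom_text_repel() could be swapped in.

```r
library(dplyr)
library(ggplot2)

# Data from the question
year <- 2010:2019
variableA1 <- c(56,169,313,595,797,989,934,869,824,662)
variableB1 <- c(0,0,5,12,23,44,73,71,78,103)
variableA2 <- c(22,58,159,342,603,1021,1589,2071,2268,2044)
variableB2 <- c(1,1,0,3,7,9,33,59,84,98)

data_points <- bind_rows(
  data.frame(year = year, varA = variableA1, varB = variableB1, series = "series1"),
  data.frame(year = year, varA = variableA2, varB = variableB2, series = "series2")
)

# lead() is evaluated per group, so xend/yend are NA for the last year
# of each series; those rows are dropped because they start no arrow.
data_lines <- data_points %>%
  group_by(series) %>%
  arrange(year, .by_group = TRUE) %>%
  mutate(xend = lead(varA), yend = lead(varB)) %>%
  ungroup() %>%
  filter(!is.na(xend))

# x, y and color are inherited from ggplot(); the segment layer only
# adds the end coordinates.
p <- ggplot(data_points, aes(x = varA, y = varB, color = series)) +
  geom_point() +
  geom_text(aes(label = year), vjust = -0.8, show.legend = FALSE) +
  geom_segment(data = data_lines,
               aes(xend = xend, yend = yend),
               arrow = arrow(length = unit(0.3, "cm"))) +
  scale_color_manual(values = c(series1 = "#333333", series2 = "#a8a8a8"))
```

Printing p draws the plot. Adding a third series then only requires another block of rows in data_points and another entry in scale_color_manual().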


3 Comments

Thanks. That seems closer to what I was looking for, but how do I avoid that this is one continuous graph? In the figure above, 2019 in the first series is connected to 2010 of the second series. I need them to be two "separate" lines.
You're right, @Rahul: the start and end coordinates of the line segments are still off. They need to be specified differently. I'll have a look a bit later.
Have a look now. I edited the answer to include the correct coordinates for the arrow segments.
