I have the following data frames:

df1<-data.frame(id_1=c(1,2,3,4,5),
                value1=c(0,0.2,0.5,0.8,0),
                value2=c(0.1,0.3,0.5,0.7,0.8),
                value3=c(0.5,0.6,0.3,0.2,0.1))

df2<-data.frame(id_2=c(1,2,3,4,5),
                value1=c(0,0.2,0.5,0.8,0),
                value2=c(0.1,0.1,0.5,0.6,0.7),
                value3=c(0.4,0.4,0.8,0.9,0.2))

I want to make the following plot:

library(ggplot2)

ggplot(data.frame(x=df1$value1, y=df2$value1), aes(x=x, y=y)) + 
       geom_point() + 
       geom_point(data=data.frame(x=df1$value2, y=df2$value2), aes(x=x, y=y)) + 
       geom_point(data=data.frame(x=df1$value3, y=df2$value3), aes(x=x, y=y))

How can I make that plot without copy-pasting geom_point() for each value column? And afterwards, how can I find the correlation coefficient for the variables in the final overlaid plot?

Any help would be much appreciated, thanks!

  • Like this? Or this or this? Once you've bound the data frames, just use cor to get the coefficient. (Commented Jan 18, 2022 at 15:50)

1 Answer

You need to combine your data into one data frame. Here's one way:

## make column names the same
## and add columns indicating the data frame source
df1$var = "x"
df2$var = "y"
names(df1)[1] = "id"
names(df2)[1] = "id"

## put the data together
df = rbind(df1, df2)

## reshape the data: long first, then wide, so each row holds one (x, y) pair
library(tidyr) 
df = pivot_longer(df, starts_with("value"))
df = pivot_wider(df, names_from = "var", values_from = "value")
df
# # A tibble: 15 × 4
#       id name       x     y
#    <dbl> <chr>  <dbl> <dbl>
#  1     1 value1   0     0  
#  2     1 value2   0.1   0.1
#  3     1 value3   0.5   0.4
#  4     2 value1   0.2   0.2
#  5     2 value2   0.3   0.1
#  6     2 value3   0.6   0.4
#  7     3 value1   0.5   0.5
#  8     3 value2   0.5   0.5
#  9     3 value3   0.3   0.8
# 10     4 value1   0.8   0.8
# 11     4 value2   0.7   0.6
# 12     4 value3   0.2   0.9
# 13     5 value1   0     0  
# 14     5 value2   0.8   0.7
# 15     5 value3   0.1   0.2
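
If you would rather not depend on tidyr, roughly the same table can be built in base R. This is only a sketch, and it assumes that df1 and df2 list their ids in the same row order and share the same value columns. The names value_cols and df_base are made up for illustration.

```r
# Original data from the question
df1 <- data.frame(id_1 = c(1, 2, 3, 4, 5),
                  value1 = c(0, 0.2, 0.5, 0.8, 0),
                  value2 = c(0.1, 0.3, 0.5, 0.7, 0.8),
                  value3 = c(0.5, 0.6, 0.3, 0.2, 0.1))
df2 <- data.frame(id_2 = c(1, 2, 3, 4, 5),
                  value1 = c(0, 0.2, 0.5, 0.8, 0),
                  value2 = c(0.1, 0.1, 0.5, 0.6, 0.7),
                  value3 = c(0.4, 0.4, 0.8, 0.9, 0.2))

# Assumption: rows of df1 and df2 are aligned by id
stopifnot(identical(df1$id_1, df2$id_2))

# Hypothetical helper name: the value columns to stack
value_cols <- c("value1", "value2", "value3")

# unlist() concatenates the columns one after another,
# so x comes from df1 and y from df2, column by column
df_base <- data.frame(
  id   = rep(df1$id_1, times = length(value_cols)),
  name = rep(value_cols, each = nrow(df1)),
  x    = unlist(df1[value_cols], use.names = FALSE),
  y    = unlist(df2[value_cols], use.names = FALSE)
)
```

The rows come out grouped by value column rather than by id, but the (x, y) pairs are the same, so the plot is unaffected.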

Once your data is in this tidy format, plotting is simple. You could further customize the plot by mapping the name column to a shape or color aesthetic, to show which value column each point came from.

library(ggplot2)

ggplot(df, aes(x = x, y = y)) +
  geom_point()
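
For the second part of the question (the correlation coefficient), something along these lines should work once the data is reshaped. It is a self-contained sketch: it rebuilds the data so it runs on its own, colors the points by the name column, and uses base R's cor(), which computes Pearson's r by default. The names r_all and r_by_col are just illustrative.

```r
library(tidyr)
library(ggplot2)

# Same data as in the question, with the id column already renamed
df1 <- data.frame(id = 1:5,
                  value1 = c(0, 0.2, 0.5, 0.8, 0),
                  value2 = c(0.1, 0.3, 0.5, 0.7, 0.8),
                  value3 = c(0.5, 0.6, 0.3, 0.2, 0.1))
df2 <- data.frame(id = 1:5,
                  value1 = c(0, 0.2, 0.5, 0.8, 0),
                  value2 = c(0.1, 0.1, 0.5, 0.6, 0.7),
                  value3 = c(0.4, 0.4, 0.8, 0.9, 0.2))

# Reshape as above: one row per (id, value column) with x and y side by side
df1$var <- "x"
df2$var <- "y"
df <- rbind(df1, df2)
df <- pivot_longer(df, starts_with("value"))
df <- pivot_wider(df, names_from = "var", values_from = "value")

# Color the points by the value column they came from
p <- ggplot(df, aes(x = x, y = y, color = name)) +
  geom_point()

# Pearson correlation across all 15 plotted points
r_all <- cor(df$x, df$y)

# Correlation within each value column separately
r_by_col <- sapply(split(df, df$name), function(d) cor(d$x, d$y))
```

Printing p draws the plot. Whether you want r_all (all points pooled) or r_by_col (one coefficient per value column) depends on what you mean by the correlation of the overlaid plot. Pass method = "spearman" to cor() if a rank correlation is more appropriate.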
