Plotting multi-dimensions of data using ggplot2

Question

I have the following directory structure:

Folder named A contains txt files named 1, 2, 3, .., 5
Folder named B contains txt files named 1, 2, 3, .., 5
|
--A (Folder)
  |---1.txt
  |---2.txt
  ....
  |---5.txt

--B (Folder)
  |---1.txt
  |---2.txt
  ....
  |---5.txt

I am reading these text files into data frames using two nested for loops. A single data frame looks like this:

df <- data.frame(Comp.1 = c(0.3, -0.2, -1, NA, 1),
         Comp.2 = c(-0.4, -0.1, NA, 0, 0.6),
         Comp.3 = c(0.2, NA, -0.4, 0.3, NA))
row.names(df) <- c("Param1", "Param2", "Param3", "Param4", "Param5")

Values always lie between -1 and +1. The number of rows (parameters) and the number of columns (components) are not the same across these data frames. For example, the data frame above is 3x5 (3 components by 5 parameters); others can be 5x15, 4x10, 5x40, etc.

I want a plot that has:

1. parameters on x-axis
2. components on y-axis
3. values as points in the above graph 
4. shape of point representing folder name (A = square, B = triangle, C = circle, .., E)
5. color inside the point shape representing file name (1, 2, 3, .., 5)
6. color intensity describing value (for example, light red [almost white] representing values close to -1, like -0.98, and dark red representing values close to 1, like 0.98)

I have this code:

alphabets = c("A", "B", "C", "D", "E", "F")
numbers = c(1, 2, 3, 4, 5)

pca.plot <- ggplot(data = NULL, aes(xlab="Principal Components",ylab="Parameters"))

for (alphabet in alphabets){
   for(number in numbers){

   filename=paste("/filepath/",alphabet,"/",number,".txt", sep="")

   df <- read.table(filename)

   #Making all row dimensions = 62. Adding rows with NAs
   if(length(row.names.data.frame(df))<62){
      row_length = length(row.names.data.frame(df))
      for(i in row_length:61){
          new_row = c(NA, NA, NA, NA, NA, NA)
          df<-rbind(df, new_row)  
      }
   }

   df$row.names<-rownames(df)
   long.df<-melt(df,id=c("row.names"), na.rm = TRUE)
   pca.plot<-pca.plot+geom_point(data=long.df,aes(x=variable,y=row.names, shape = number, color=alphabet, size = value))
   }
}

Output of this code: [image not reproduced here]

EDIT: After following @Gregor's steps mentioned in the comments, I have a big_data data frame like this:

head(big_data, 3)

  Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 params alphabet number
1     NA     NA     NA     NA     NA param1        A      1
2     NA     NA     NA   0.89     NA param2        A      1
3     NA  -0.95     NA     NA     NA param3        A      1

Combine your data into one data frame - one tidy data frame - and this will be trivial. I would recommend reading your data into a list of data frames and then combining them all at once.
– Gregor Thomas, Commented Feb 14, 2017 at 22:48
I have a list of data frames ready, filled with NAs wherever rows/columns weren't there. How should I plot now? How do we access attribute names of a data frame list?
– Globox, Commented Feb 14, 2017 at 23:45
Please notice the first sentence of my comment: Combine your data into one data frame. If you need help with this, see the section called Combining a list of data frames into a single data frame in the answer I linked above. Make sure that the attributes you want to plot, including the file name and folder name, are columns in your data frame. If the file names are the names of your list, then, as stated in the link, dplyr::bind_rows or data.table::rbindlist will automatically add them as columns.
– Gregor Thomas, Commented Feb 15, 2017 at 0:16
Great. Can you show it in your question? If you post dput(droplevels(head(your_data, 10))) we will get a copy/pasteable version of the first 10 rows of your data.
– Gregor Thomas, Commented Feb 15, 2017 at 21:57
When I try to melt this big_data frame with big_data.long <- melt(big_data,id=c("params"), na.rm = TRUE) and then plot using final.plot<-ggplot(data=big_data.long, aes(xlab = "COMPONENTS", ylab = "PARAMETERS"))+geom_point( aes(x=(variable),y=(params))) I don't get what I want. Tried a lot!
– Globox, Commented Feb 15, 2017 at 22:18
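The "combine into one data frame" step suggested in the comments could look roughly like the sketch below. It is only an illustration under assumptions: the two small data frames stand in for files that would really come from read.table(), the list names "A/1" and "B/2" are a hypothetical naming scheme encoding folder and file, and the values are made up. dplyr::bind_rows() with .id copies the list names into a column and fills missing columns with NA.

```r
library(dplyr)

# Toy stand-ins for the text files (values are invented for illustration).
# In practice each element would come from read.table(), with the row names
# copied into a `params` column first, since bind_rows() does not keep
# data frame row names.
df_list <- list(
  "A/1" = data.frame(params = c("param1", "param2"),
                     Comp.1 = c(0.3, NA),
                     Comp.2 = c(-0.4, 0.89)),
  "B/2" = data.frame(params = c("param1", "param2", "param3"),
                     Comp.1 = c(NA, -0.95, 0.1))
)

# Stack everything; the list names land in the `source` column and
# columns missing from a data frame (Comp.2 in "B/2") are filled with NA.
big_data <- bind_rows(df_list, .id = "source")

# Split the hypothetical "folder/file" label into two plotting columns.
parts <- strsplit(big_data$source, "/", fixed = TRUE)
big_data$alphabet <- vapply(parts, `[`, character(1), 1)
big_data$number <- as.integer(vapply(parts, `[`, character(1), 2))
big_data$source <- NULL

print(big_data)
```

With the folder and file stored as ordinary columns, no padding loop with NA rows is needed before plotting.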

Community · Accepted Answer · 2017-05-23 11:53:24Z


You need to melt the data frame to collapse all the Comp columns. The other columns should stay the same:

long_data = reshape2::melt(
    big_data,
    id.vars = c("params", "alphabet", "number"),
    variable.name = "comp",
    value.name = "value",
    na.rm = TRUE
)
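To see what this melt does, here is a small worked example. It rebuilds the three rows shown in the question's head(big_data, 3) output by hand (the data frame construction is an assumption made only so the example runs on its own) and applies the same call.

```r
library(reshape2)

# The three rows from head(big_data, 3) in the question, typed in by hand.
big_data <- data.frame(
  Comp.1   = c(NA_real_, NA_real_, NA_real_),
  Comp.2   = c(NA_real_, NA_real_, -0.95),
  Comp.3   = c(NA_real_, NA_real_, NA_real_),
  Comp.4   = c(NA_real_, 0.89, NA_real_),
  Comp.5   = c(NA_real_, NA_real_, NA_real_),
  params   = c("param1", "param2", "param3"),
  alphabet = "A",
  number   = 1
)

# Collapse the Comp.* columns into a (comp, value) pair per row;
# the id columns are repeated and NA values are dropped.
long_data <- melt(
  big_data,
  id.vars = c("params", "alphabet", "number"),
  variable.name = "comp",
  value.name = "value",
  na.rm = TRUE
)

print(long_data)
```

Only the two non-missing values survive: param3 with Comp.2 (-0.95) and param2 with Comp.4 (0.89), each still carrying its alphabet and number.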

Now, most of your requirements are easy:

1. parameters on x-axis
2. components on y-axis
3. values as points in the above graph
4. shape of point representing folder name (A = square, B = triangle, C = circle, .., E)
5. color inside the point shape representing file name (1, 2, 3, .., 5)
6. color intensity describing value (For eg: light red [almost white] color representing closer to -1 like -0.98, dark red representing closer to 1 like 0.98)

ggplot(long_data, aes(
    x = params, y = comp, size = value,
    # the folder name is stored in the alphabet column of big_data
    shape = alphabet, color = factor(number), alpha = value
)) +
    geom_point()

The tricky part is the requirement for both color intensity and overall color. The only way I know to approximate this using standard ggplot is to use transparency, as I did above. This is the approach taken in, e.g., this question.

Note this is untested as your data isn't shared reproducibly. Share data with dput as suggested in the comments if there are issues that need testing.
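As one possible variation on the plot above, the "color inside the point shape" requirement can be approximated with the fillable point shapes (21 = circle, 22 = square, 24 = triangle), mapping the file number to fill instead of color. This is a sketch on made-up data, not a tested solution for the real files: the long_data values, the fixed point size, and the alpha range are all assumptions chosen for illustration.

```r
library(ggplot2)

# Made-up long-format data in the shape produced by the melt() step.
long_data <- data.frame(
  params   = c("param1", "param2", "param3", "param1", "param2", "param3"),
  comp     = c("Comp.1", "Comp.2", "Comp.3", "Comp.2", "Comp.1", "Comp.3"),
  alphabet = c("A", "A", "A", "B", "B", "C"),
  number   = c(1, 2, 3, 1, 2, 3),
  value    = c(-0.98, -0.2, 0.5, 0.98, 0.1, -0.6)
)

p <- ggplot(long_data, aes(
  x = params, y = comp,
  shape = alphabet, fill = factor(number), alpha = value
)) +
  geom_point(size = 5, colour = "black") +
  # Shapes 21-25 have a separate fill: square, triangle, circle per folder.
  scale_shape_manual(values = c(A = 22, B = 24, C = 21)) +
  # Fix the limits so -1 is always the faintest and +1 the most opaque.
  scale_alpha_continuous(limits = c(-1, 1), range = c(0.1, 1)) +
  # Draw the fill legend with a fillable shape so the colors are visible.
  guides(fill = guide_legend(override.aes = list(shape = 21))) +
  labs(x = "Parameters", y = "Components", fill = "File", shape = "Folder")

print(p)
```

Fixing the alpha limits to c(-1, 1) keeps the intensity comparable between plots even when a given subset of files does not span the full range of values.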

edited May 23, 2017 at 11:53 by Community Bot

answered Feb 15, 2017 at 23:22 by Gregor Thomas

3 Comments

Globox Over a year ago

Worked for me. Thanks Gregor. I can tweak the fancy part. Liked your way of leading me to solution. :)

Gregor Thomas Over a year ago

Thanks! Glad it worked out - and glad you appreciated the approach. Not everyone loves it but I'm convinced you learn more from it :D

Gregor Thomas Over a year ago

Next time you share data though, do it with dput. Makes it so much easier to reproduce for the people trying to help you.
