
I have a data frame called mydf. I am trying to plot it as shown below, but I want to label only the samples whose contamination is greater than 1.2, rather than every point. I also want to add a horizontal line at the 1.2 contamination threshold. How do I do this in R?

 mydf <- structure(list(sample.names = structure(c(2L, 3L, 4L, 5L, 6L, 
    1L, 7L, 8L, 9L, 10L), .Label = c("LPH-001-1", "LPH-001-10", "LPH-001-10_AK1", 
    "LPH-001-10_AK2", "LPH-001-10_PD", "LPH-001-10_SCC", "LPH-001-13", 
    "LPH-001-13_AK1", "LPH-001-13_AK2", "LPH-001-13_PD"), class = "factor"), 
        contamination = structure(c(5L, 1L, 4L, 2L, 2L, 4L, 3L, 8L, 
        7L, 6L), .Label = c("0.7", "1.0", "1.1", "1.2", "1.3", "1.4", 
        "1.7", "2.0"), class = "factor")), .Names = c("sample.names", 
    "contamination"), row.names = c(NA, -10L), class = "data.frame")

library(ggplot2)

cc <- ggplot(mydf, aes(x = sample.names, y = contamination, label = sample.names)) +
  geom_point()

cc + geom_text()
  • Clean up the data before plotting. Why are the numbers stored as factors? Commented Sep 6, 2016 at 8:04

1 Answer


I would convert sample.names and contamination to character and numeric vectors respectively, then create a new label column that holds the sample name where contamination > 1.2 and an empty string otherwise. geom_hline() adds the horizontal line.
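The conversion goes through as.character() because calling as.numeric() directly on a factor returns its internal level codes, not the printed values. A minimal base-R sketch of the difference, where the factor f is a made-up example and not part of mydf:

```r
# Hypothetical factor whose labels look like numbers (not taken from mydf)
f <- factor(c("1.3", "0.7", "2.0"))

# Direct conversion returns the internal level codes (levels sort as 0.7, 1.3, 2.0)
codes <- as.numeric(f)

# Converting via character recovers the values that are printed
values <- as.numeric(as.character(f))

print(codes)   # 2 1 3
print(values)  # 1.3 0.7 2.0
```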

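library(ggplot2)

# Factor -> character -> numeric, so the printed values are kept rather than the level codes
mydf$contamination <- as.numeric(as.character(mydf$contamination))
mydf$sample.names <- as.character(mydf$sample.names)

# Label column: sample name above the threshold, empty string otherwise
mydf$sample.names1.2 <- ifelse(mydf$contamination > 1.2, mydf$sample.names, "")

ggplot(mydf, aes(x = sample.names, y = contamination, label = sample.names1.2)) +
  geom_point() +
  geom_text() +
  geom_hline(yintercept = 1.2)  # horizontal line at the 1.2 threshold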
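As a possible alternative to the extra label column, geom_text() can be given only the rows that should be labelled through its data argument. The sketch below is one way to do that, not the only one. It assumes the data are already character/numeric. The names mydf2, threshold and flagged are introduced here for illustration, and the vjust, linetype and axis-text settings are cosmetic choices rather than part of the original answer:

```r
library(ggplot2)

# Illustrative copy of the question's data, already stored as character/numeric
mydf2 <- data.frame(
  sample.names = c("LPH-001-10", "LPH-001-10_AK1", "LPH-001-10_AK2",
                   "LPH-001-10_PD", "LPH-001-10_SCC", "LPH-001-1",
                   "LPH-001-13", "LPH-001-13_AK1", "LPH-001-13_AK2",
                   "LPH-001-13_PD"),
  contamination = c(1.3, 0.7, 1.2, 1.0, 1.0, 1.2, 1.1, 2.0, 1.7, 1.4),
  stringsAsFactors = FALSE
)

threshold <- 1.2

# Only the rows strictly above the threshold get a label
flagged <- subset(mydf2, contamination > threshold)

p <- ggplot(mydf2, aes(x = sample.names, y = contamination)) +
  geom_point() +
  # The text layer uses the filtered data, so unlabelled points draw no text at all
  geom_text(data = flagged, aes(label = sample.names), vjust = -0.8) +
  geom_hline(yintercept = threshold, linetype = "dashed") +
  # Rotate the x-axis labels so long sample names do not overlap
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

print(p)
```

Keeping the threshold in one variable means the filter and the horizontal line cannot drift apart if the cutoff changes.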