I'm using ggplot2 to plot different time series (one for Alice, one for Bob, one for Eve), which have a different number of missing values.

require('ggplot2')
df3 <-  data.frame(name=c(rep("Alice",10),rep("Bob",10),rep("Eve",10)),value=c(seq(1,10), seq(4,13), seq(5,14)), time=rep(seq(1,10),3))
df3$value[c(3,4,15,16,17,22,23,24,25)]<- NA
ggplot(data=df3, aes(time, value)) + 
  geom_line() + 
  geom_point() + facet_wrap(~ name, nrow=1)

I'd like to have the count of NAs displayed in each of the plots, e.g. as an overlay of a number (2 for Alice, 3 for Bob, 4 for Eve). Is there an elegant way to do this?

  • Create a dataset with the number of NAs for each name and then use geom_text() to add the values to the plot. Commented Jan 8, 2016 at 20:24
  • I asked a similar question and got help here. stackoverflow.com/questions/25081619/… Commented Jan 8, 2016 at 20:43

2 Answers

As @MLavoie suggested in the comments, generate a new dataframe for the text labels then work with that. This should work for your purposes:

require('ggplot2')
require('dplyr')

df3 <- data.frame(name=c(rep("Alice",10),rep("Bob",10),rep("Eve",10)),value=c(seq(1,10), seq(4,13), seq(5,14)), time=rep(seq(1,10),3))
df3$value[c(3,4,15,16,17,22,23,24,25)] <- NA

NAdf<-df3 %>%
  group_by(name) %>%
  summarise(ycoor=mean(value, na.rm=TRUE),
            xcoor=mean(time, na.rm=TRUE),
            num_NA=sum(is.na(value)))  


ggplot(data=df3, aes(time, value)) + 
 geom_line() + 
 geom_point() + 
 geom_text(data=NAdf, aes(x=xcoor, y=ycoor, label=paste(num_NA,"for",name))) +
 facet_wrap(~ name, nrow=1) 

HTH

Updated

In response to the comment below: in general, I find placing text labels in a faceted plot fairly finicky. In your example you could simply set the x and y coordinates to 5, 5 for all panels, like this:

NAdf<-df3 %>%
  group_by(name) %>%
  summarise(ycoor=5,
            xcoor=5,
            num_NA=sum(is.na(value)))  

Then you could plot using the same code as before:

ggplot(data=df3, aes(time, value)) + 
 geom_line() + 
 geom_point() + 
 geom_text(data=NAdf, aes(x=xcoor, y=ycoor, label=paste(num_NA,"for",name))) +
 facet_wrap(~ name, nrow=1)

The issue with this is that it isn't a generalized solution: the coordinates are hard-coded for this data. In practice, though, I find you need to fiddle with your geom_text plotting coordinates each and every time to get it just right. Truth be told, @Sam Dickson's solution is very elegant for this particular problem.
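A slightly less hard-coded variant, as a minimal sketch only: derive one shared label position from the overall data range rather than from each group's mean. This assumes the default fixed scales of facet_wrap, so the midpoint of the overall range lands near the middle of every panel, and it still gives a usable position when one series consists only of NAs (where a per-group mean would be NaN). The counts are computed with base R's tapply here, so dplyr is not needed; the column names num_NA, xcoor and ycoor simply follow the code above.

```r
library(ggplot2)

# Same example data as in the question
df3 <- data.frame(
  name  = rep(c("Alice", "Bob", "Eve"), each = 10),
  value = c(seq(1, 10), seq(4, 13), seq(5, 14)),
  time  = rep(seq(1, 10), 3)
)
df3$value[c(3, 4, 15, 16, 17, 22, 23, 24, 25)] <- NA

# Number of NAs per name (base R)
counts <- tapply(is.na(df3$value), df3$name, sum)
NAdf <- data.frame(name = names(counts), num_NA = as.vector(counts))

# One shared position: the midpoint of the overall x and y ranges
NAdf$xcoor <- mean(range(df3$time))
NAdf$ycoor <- mean(range(df3$value, na.rm = TRUE))

p <- ggplot(df3, aes(time, value)) +
  geom_line() +
  geom_point() +
  geom_text(data = NAdf,
            aes(x = xcoor, y = ycoor, label = paste(num_NA, "for", name))) +
  facet_wrap(~ name, nrow = 1)
p
```

With free scales (scales = "free") the midpoint would no longer be centered in each panel, so this is only a starting point.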

3 Comments

What's the %>%? I'm getting an error: Error: could not find function "%>%"
You need to load dplyr. Also it is %>% not %<%. Think of %>% (called a pipe) as the word "then". It is automatic for me to use pipes these days.
What if the time series consists entirely of NAs? Is there some way to place the text in the middle of each of the boxes?
One option is to add the count to the variable used in the faceting:

df3$NAs <- ave(df3$value, df3$name, FUN=function(x) sum(is.na(x)))
df3$name1 <- paste0(df3$name,' (NA = ',df3$NAs,')')
ggplot(data=df3, aes(time, value)) + 
  geom_line() + 
  geom_point() + facet_wrap(~ name1, nrow=1)

1 Comment

Your solution is very elegant and nice, but unfortunately, it doesn't work for facet_grids, where I'd like to use this information as well.
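
For facet_grid, one possible workaround (a sketch, not tested beyond this toy data) is to draw the count as a text layer instead of putting it in the strip label. The label data frame only needs to carry the faceting column; using -Inf and Inf as coordinates pins the text to the top-left corner of each panel regardless of the data, and hjust/vjust nudge it inward. The specific hjust and vjust values below are arbitrary choices and will likely need tuning.

```r
library(ggplot2)

# Same example data as in the question
df3 <- data.frame(
  name  = rep(c("Alice", "Bob", "Eve"), each = 10),
  value = c(seq(1, 10), seq(4, 13), seq(5, 14)),
  time  = rep(seq(1, 10), 3)
)
df3$value[c(3, 4, 15, 16, 17, 22, 23, 24, 25)] <- NA

# One label per facet; 'name' must match the faceting variable
counts <- tapply(is.na(df3$value), df3$name, sum)
NAdf <- data.frame(name  = names(counts),
                   label = paste("NA =", as.vector(counts)))

p_grid <- ggplot(df3, aes(time, value)) +
  geom_line() +
  geom_point() +
  # -Inf / Inf place the text at the panel's top-left corner
  geom_text(data = NAdf,
            aes(x = -Inf, y = Inf, label = label),
            hjust = -0.2, vjust = 1.5,
            inherit.aes = FALSE) +
  facet_grid(name ~ .)
p_grid
```

If the grid has a second faceting variable, the label data frame would need one row per combination of both variables.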
