
How do I display the total number of observations (n) in a geom_point plot? I know how to include the number by manually adding (e.g.) "n = 1000", but I want to be able to have the number of observations counted automatically for each figure and then displayed somewhere on the figure.

Most of the code I've seen online is for adding n to boxplots (see example below). They don't seem to work for scatter plots (geom_point):

geom_text(aes(label=paste0("N = ", length(disabled)), 
x=length(unique(disabled)), y=max(table(disabled)))) +

This is the code for my figure:

ggplot(scs, aes(x=year, y=disabled, color=unemployed, size=pop)) + 
geom_point(aes(size=pop), alpha = 0.3) +
labs(x = "Year",
    y = "Disabled",
    color = "Unemployed") +
scale_size_continuous("Population size") +
theme(
    axis.title.x = element_text(margin=margin(t=10)),
    panel.background = element_rect(fill=NA),
    legend.title = element_text(size=10),
    legend.key = element_blank())

When I add the geom_text code, it oddly changes the labeling of my size legend.

EDITED:

Thanks for the replies so far. Just to be clear, I don't want n broken down by groups. I want the total number of observations used in the figure.

I don't know how to share my data but this is the output of dput(head(scs, 20)):

> dput(head(scs, 20))
structure(list(
year = c(2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 
    2016, 2017, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013), 
county_name = c("autauga", "autauga", "autauga", "autauga", "autauga", 
    "autauga", "autauga", "autauga", "autauga", "autauga", "autauga", 
    "autauga", "barbour", "barbour", "barbour", "barbour", "barbour", 
    "barbour", "barbour", "barbour"), 
disabled = c(3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 5, 5, 5, 5, 5, 5, 6, 
    6), 
unemployed = c(4, 3, 3, 5, 10, 9, 8, 7, 6, 6, 5, 5, 6, 6, 6, 9, 
    14, 12, 12, 12), 
pop = c(55036, 55036, 55036, 55036, 55036, 55036, 55036, 55036, 55036, 
    55036, 55036, 55036, 26201, 26201, 26201, 26201, 26201, 26201, 
    26201, 26201)), 
row.names = c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", 
    "12", "25", "26", "27", "28", "29", "30", "31", "32"), 
class = "data.frame")
  • Can you post sample data? Please edit the question with the output of dput(scs) or, if that is too big, with the output of dput(head(scs, 20)). Commented Jul 21, 2019 at 20:01
  • Yes, in a legend. Just look for posts on legends in ggplot. As to the counts: without having your dataset scs, do you want the (summary) counts broken out by (year, disabled, unemployed)? If so, manually do scs %>% group_by(year, disabled, unemployed) %>% summarize(n = n()). We're really going to need a dataset to post code solutions. Can you please edit your question to use e.g. one of the R builtin datasets (diamonds, baseball, mtcars or whatever)? Commented Jul 21, 2019 at 20:20
  • Your desired output is unclear. What is manual about your current solution, given that you are using variables in the dataset and not hard-coding? Please explain "but I want to be able to have the number of observations counted automatically for each figure". Each figure or each point? Commented Jul 21, 2019 at 22:35

1 Answer

Well, assuming you mean what you say and all you want is an overall count of the number of rows in scs, that is nrow(scs). You can use paste to add context and turn it into a string.

I would personally put it in the title, the subtitle, or the caption, since scatterplots don't have a natural place for it the way boxplots do. But if you want it on the plot itself, figure out the x and y coordinates and add it using annotate.

An example using your data and all of those...

library(tidyverse)
scs <- structure(list(
  year = c(2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015,
           2016, 2017, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013),
  county_name = c("autauga", "autauga", "autauga", "autauga", "autauga",
                  "autauga", "autauga", "autauga", "autauga", "autauga", "autauga",
                  "autauga", "barbour", "barbour", "barbour", "barbour", "barbour",
                  "barbour", "barbour", "barbour"),
  disabled = c(3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 5, 5, 5, 5, 5, 5, 6,
               6),
  unemployed = c(4, 3, 3, 5, 10, 9, 8, 7, 6, 6, 5, 5, 6, 6, 6, 9,
                 14, 12, 12, 12),
  pop = c(55036, 55036, 55036, 55036, 55036, 55036, 55036, 55036, 55036,
          55036, 55036, 55036, 26201, 26201, 26201, 26201, 26201, 26201,
          26201, 26201)),
  row.names = c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11",
                "12", "25", "26", "27", "28", "29", "30", "31", "32"),
  class = "data.frame")

# Build the label once so every placement shows the same count.
# paste() already separates its arguments with a single space.
n_label <- paste("Number of observations:", nrow(scs))

ggplot(scs, aes(x = year, y = disabled, color = unemployed, size = pop)) +
  geom_point(alpha = 0.3) +
  labs(title = n_label,
       subtitle = n_label,
       caption = n_label,
       x = "Year",
       y = "Disabled",
       color = "Unemployed") +
  scale_size_continuous("Population size") +
  theme(
    axis.title.x = element_text(margin = margin(t = 10)),
    panel.background = element_rect(fill = NA),
    legend.title = element_text(size = 10),
    legend.key = element_blank()) +
  # Place the same label inside the panel at hand-picked coordinates.
  annotate("text",
           x = 2012.25,
           y = 4.5,
           label = n_label)

Created on 2019-07-22 by the reprex package (v0.3.0)
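
If picking x and y by hand for every figure gets tedious, one possible alternative is to pin the label to a corner of the panel. This is only a sketch and not part of the reprex above: the data frame toy and the name n_label are made up for illustration, and the hjust/vjust values are a guess that may need adjusting for your font size.

```r
library(ggplot2)

# Hypothetical toy data standing in for scs.
toy <- data.frame(
  year     = c(2006, 2007, 2008, 2009, 2010),
  disabled = c(3, 3, 5, 5, 6)
)

# Count the rows once and reuse the string.
n_label <- paste("n =", nrow(toy))

p <- ggplot(toy, aes(x = year, y = disabled)) +
  geom_point(alpha = 0.3) +
  # x = Inf and y = Inf anchor the text at the top-right corner of the panel;
  # hjust and vjust above 1 nudge it inward so it is not clipped.
  annotate("text", x = Inf, y = Inf, label = n_label,
           hjust = 1.1, vjust = 1.5)

print(p)
```

Because the position no longer depends on the data range, the same annotate call should carry over to other figures without changes.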


3 Comments

This is exactly what I was looking for--thank you!!
Do you know how I can omit rows with missing values (NAs) from the number of observations?
@trinitysara Please consider marking the answer correct. The details depend on what you want. The safe option is to eliminate every row that has any missing value: tempdf <- scs %>% drop_na(). To drop rows only where one or more specific variables are missing, use for example tempdf <- scs %>% filter(!is.na(disabled)). Then use tempdf in the ggplot call instead of scs. Make sure you replace every scs with tempdf and it will do the rest.
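
A minimal sketch of the two options from the comment above, using a small made-up data frame (toy, complete_rows, and disabled_known are hypothetical names, not the asker's data). The point is that the plot and the count should come from the same filtered data frame.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Hypothetical toy data: 6 rows, one with a missing `disabled`
# and one with a missing `pop`.
toy <- data.frame(
  year     = c(2006, 2007, 2008, 2009, 2010, 2011),
  disabled = c(3, NA, 3, 5, 5, 6),
  pop      = c(55036, 55036, NA, 26201, 26201, 26201)
)

# Option 1: drop every row that has an NA in any column.
complete_rows <- toy %>% drop_na()

# Option 2: drop rows only where `disabled` is missing.
disabled_known <- toy %>% filter(!is.na(disabled))

nrow(toy)             # 6
nrow(complete_rows)   # 4
nrow(disabled_known)  # 5

# Plot and count from the same filtered data frame so the label
# matches the points that are actually drawn.
p <- ggplot(complete_rows, aes(x = year, y = disabled, size = pop)) +
  geom_point(alpha = 0.3) +
  labs(caption = paste("Number of observations:", nrow(complete_rows)))

print(p)
```

Which option is right depends on which columns the figure actually maps: a row with a missing pop but a known disabled value would still be dropped by drop_na().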
