How do I display the total number of observations (n) in a geom_point plot? I know how to include the number by manually adding (e.g.) "n = 1000", but I want to be able to have the number of observations counted automatically for each figure and then displayed somewhere on the figure.
Most of the code I've seen online is for adding n to boxplots (see the example below), and it doesn't seem to work for scatter plots (geom_point):
geom_text(aes(label=paste0("N = ", length(disabled)),
x=length(unique(disabled)), y=max(table(disabled)))) +
This is the code for my figure:
ggplot(scs, aes(x=year, y=disabled, color=unemployed, size=pop)) +
geom_point(aes(size=pop), alpha = 0.3) +
labs(x = "Year",
y = "Disabled",
color = "Unemployed") +
scale_size_continuous("Population size") +
theme(
axis.title.x = element_text(margin=margin(t=10)),
panel.background = element_rect(fill=NA),
legend.title = element_text(size=10),
legend.key = element_blank())
When I add the geom_text code above to my figure, it oddly changes the labeling of my size legend.
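A possible reason for the legend change, as far as I can tell: a geom_text() layer inherits the plot-level aes() (including size = pop and color = unemployed), so it takes part in those legends too. The sketch below sidesteps that by using annotate(), which draws a fixed label and takes no aesthetic mappings from the plot. It uses mtcars as a stand-in for scs so it runs on its own; the column choices, the n_total name, and the corner placement via Inf with hjust/vjust are illustrative assumptions, not part of the original figure.

```r
library(ggplot2)

# Stand-in data: mtcars replaces scs so the example is self-contained.
n_total <- nrow(mtcars)

p <- ggplot(mtcars, aes(x = wt, y = mpg, color = hp, size = disp)) +
  geom_point(alpha = 0.3) +
  # annotate() adds a single fixed label; it does not inherit the
  # plot-level aes(), so it adds nothing to the color or size legends.
  annotate("text", x = Inf, y = Inf,
           label = paste0("n = ", n_total),
           hjust = 1.1, vjust = 1.5) +
  scale_size_continuous("Displacement")

p
```

Because the label is built from nrow(), it updates automatically whenever the data frame passed to ggplot() changes.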
EDITED:
Thanks for the replies so far. Just to be clear, I don't want n broken down by groups. I want the total number of observations used in the figure.
I don't know how to share my data, but this is the output of dput(head(scs, 20)):
> dput(head(scs, 20))
structure(list(
year = c(2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015,
2016, 2017, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013),
county_name = c("autauga", "autauga", "autauga", "autauga", "autauga",
"autauga", "autauga", "autauga", "autauga", "autauga", "autauga",
"autauga", "barbour", "barbour", "barbour", "barbour", "barbour",
"barbour", "barbour", "barbour"),
disabled = c(3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 5, 5, 5, 5, 5, 5, 6,
6),
unemployed = c(4, 3, 3, 5, 10, 9, 8, 7, 6, 6, 5, 5, 6, 6, 6, 9,
14, 12, 12, 12),
pop = c(55036, 55036, 55036, 55036, 55036, 55036, 55036, 55036, 55036,
55036, 55036, 55036, 26201, 26201, 26201, 26201, 26201, 26201,
26201, 26201)),
row.names = c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11",
"12", "25", "26", "27", "28", "29", "30", "31", "32"),
class = "data.frame")
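For the total n on this sample, here is a minimal sketch that rebuilds the 20 rows above (county_name left out for brevity) and puts the count in the plot caption via labs(caption = ...). It assumes a row should count only if none of the plotted columns is missing; that is an assumption about what "used in the figure" means, and ggplot2 may still draw a point whose color value is missing. The names plot_vars and n_used are made up for the example.

```r
library(ggplot2)

# The 20 sample rows from the dput() output (county_name omitted).
scs <- data.frame(
  year       = c(2006:2017, 2006:2013),
  disabled   = c(rep(3, 12), rep(5, 6), 6, 6),
  unemployed = c(4, 3, 3, 5, 10, 9, 8, 7, 6, 6, 5, 5,
                 6, 6, 6, 9, 14, 12, 12, 12),
  pop        = c(rep(55036, 12), rep(26201, 8))
)

# Assumption: count only rows with no missing values in the plotted columns.
plot_vars <- c("year", "disabled", "unemployed", "pop")
n_used <- sum(complete.cases(scs[, plot_vars]))

p <- ggplot(scs, aes(x = year, y = disabled, color = unemployed, size = pop)) +
  geom_point(alpha = 0.3) +
  labs(x = "Year", y = "Disabled", color = "Unemployed",
       caption = paste0("n = ", n_used)) +  # total n, computed from the data
  scale_size_continuous("Population size")

p
```

A caption sits outside the plotting area, so it cannot overlap points and does not touch the legends; swapping it for labs(subtitle = ...) would put the count above the panel instead.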

Please post the output of dput(scs). Or, if it is too big, the output of dput(head(scs, 20)).
For scs, do you want the (summary) counts broken out by (year, disabled, unemployed)? If so, manually do scs %>% group_by(year, disabled, unemployed) %>% summarize(n = n()). We're really going to need a dataset to post code solutions; can you please edit your question to use e.g. one of the R built-in datasets (diamonds, baseball, mtcars or whatever)?