I have searched Stack Overflow at length for an answer to my question; this one comes close, but I have not managed to adapt its code to fix my graph.
I have data, reshaped in long format, that looks like this:
ID  Var1     GenePosition  ContinuousOutcomeVar
1   control  X20068492     0.092813611
2   control  X20068492     0.001746708
3   case     X20068492     0.069251157
4   case     X20068492     0.003639304
Each ID has one value of ContinuousOutcomeVar per position, and there are 86 positions and 10 IDs. I want to plot a line graph with position on the x axis and the continuous outcome variable on the y axis. There should be two groups, cases and controls, so each position gets exactly two dots: one at the mean value for cases and one at the mean value for controls. Then I want one line connecting the case means and another connecting the control means.
I know this should be easy, but I'm new to R. I've been working on it for 8 hours and can't quite get it right. Below is what I have; I'd really appreciate some insight. If this has already been answered somewhere, I apologize. I honestly looked all over and tried modifying a lot of code.
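My code (in my actual data frame, seq.long, the columns shown above are named CACO, Position, and PMethyl). It plots the values for all IDs at each position and connects them within each of the two groups. It also gives me a black dot at what I think is the mean of all 10 values per position: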
lineplot <- ggplot(data = seq.long, aes(x = Position, y = PMethyl,
                                        group = CACO, colour = CACO)) +
  stat_summary(fun.y = mean, geom = "point", aes(group = 1), color = "black") +
  geom_line() + geom_point()
What I can't work out is how to stop R from plotting all 10 points at each position. I want only two means per position (one for cases, one for controls), with the case means connected by one line and the control means by another across the x axis.
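To make the target concrete, here is a minimal sketch of the kind of plot described above. It is not a tested solution for the real data. The data frame is synthetic (random values, 5 positions instead of 86, and an arbitrary split of the 10 IDs into 5 controls and 5 cases), and it assumes ggplot2 3.3.0 or later, where the summary argument is `fun` rather than the older `fun.y`. The idea is to let `stat_summary()` compute the per-group mean for both the line and the point layers, and to drop the bare `geom_line()` and `geom_point()` layers that draw every raw value.

```r
library(ggplot2)

# Synthetic stand-in for seq.long: 10 IDs x 5 positions (assumption, not real data)
set.seed(1)
seq.long <- expand.grid(ID = 1:10,
                        Position = paste0("X", 20068492 + 0:4),
                        stringsAsFactors = FALSE)
# Arbitrary group assignment for illustration: IDs 1-5 control, 6-10 case
seq.long$CACO <- ifelse(seq.long$ID <= 5, "control", "case")
seq.long$PMethyl <- runif(nrow(seq.long), min = 0, max = 0.1)

# group = CACO makes each summary run once per group at each position,
# so each layer draws one mean per group per position
lineplot <- ggplot(seq.long, aes(x = Position, y = PMethyl,
                                 group = CACO, colour = CACO)) +
  stat_summary(fun = mean, geom = "line") +   # one line per group through the means
  stat_summary(fun = mean, geom = "point")    # one dot per group per position

print(lineplot)
```

With `aes(group = 1)` removed, the grouping from the top-level `aes()` applies to both summary layers, so the means are computed separately for cases and controls rather than over all 10 IDs.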
