I have an R script that generates plots based on the run time data from a simulation. However, sometimes there are errors during the runs which result in null run time values and lead to graphics that make it seem like the run time is smaller than it really was.
Here's an example of what the data in the "data" data frame might look like:
| Version | TotalMean | TestNum | Case |
|:-------:|:---------:|:-------:|:-----:|
| 1.0.1 | 350 | 1 | Case1 |
| 1.0.2 | 430 | 2 | Case1 |
| 1.0.4 | 470 | 3 | Case1 |
| 1.0.7 | 445 | 4 | Case1 |
| 1.0.1 | 320 | 1 | Case2 |
| 1.0.2 | 280 | 2 | Case2 |
| 1.0.4 | 450 | 3 | Case2 |
| 1.0.7 | 420 | 4 | Case2 |
| 1.0.1 | 335 | 1 | Case3 |
| 1.0.2 | 415 | 2 | Case3 |
| 1.0.4 | 465 | 3 | Case3 |
| 1.0.7 | 430 | 4 | Case3 |
| 1.0.1 | 310 | 1 | Case4 |
| 1.0.2 | 375 | 2 | Case4 |
| 1.0.4 | 425 | 3 | Case4 |
| 1.0.7 | 410 | 4 | Case4 |
Note that there are no null values listed in that table. That's because the way that the TotalMean column is calculated will never reflect that. However, there are nulls found in the data frame that TotalMean is calculated from. Is there any way that I could make geom_point dependent on whether there are null values in a certain table? Maybe change the shape and size?
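As a minimal sketch of how such a flag could be derived alongside the mean: the `raw` table, its `RunTime` column, and the `HasNull` name below are all hypothetical, since the real calculation of TotalMean isn't shown here. It also assumes the missing run times appear as `NA`, which is how missing entries in a data frame column are normally represented in R, so it uses `is.na()` rather than `is.null()`.

```r
# Hypothetical raw run times (minutes); NA stands in for a failed run.
raw <- data.frame(
  Version = c("1.0.1", "1.0.1", "1.0.2", "1.0.2"),
  Case    = "Case2",
  RunTime = c(318, 322, 280, NA)
)

# The formula interface drops NA rows by default (na.action = na.omit),
# so the failed run silently disappears from the mean.
means <- aggregate(RunTime ~ Version + Case, data = raw, FUN = mean)
names(means)[names(means) == "RunTime"] <- "TotalMean"

# Keep the NA rows this time and record whether each group had any.
flags <- aggregate(RunTime ~ Version + Case, data = raw,
                   FUN = function(x) any(is.na(x)),
                   na.action = na.pass)
names(flags)[names(flags) == "RunTime"] <- "HasNull"

# One row per Version/Case with both the mean and the flag.
summary_df <- merge(means, flags, by = c("Version", "Case"))
summary_df
```

With a column like `HasNull` in the plotted data frame, the point layer has something to key its shape or size on.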
Use the code below to create a working example. Version 1.0.2 in Case2 has an anomalous value because it had null values in the original table.
library(ggplot2)
Version <- c("1.0.1","1.0.2","1.0.4","1.0.7","1.0.1","1.0.2","1.0.4","1.0.7","1.0.1","1.0.2","1.0.4","1.0.7","1.0.1","1.0.2","1.0.4","1.0.7")
TotalMean <- c(350,430,470,445,320,280,450,420,335,415,465,430,310,375,425,410)
TestNum <- c(1,2,3,4,1,2,3,4,1,2,3,4,1,2,3,4)
Case <- c("Case1","Case1","Case1","Case1","Case2","Case2","Case2","Case2","Case3","Case3","Case3","Case3","Case4","Case4","Case4","Case4")
data <- data.frame(Version,TotalMean,TestNum,Case)
versions <- unique(data[order(data$TestNum), ][,1])
data$Version <- factor(data$Version, levels = versions)
Here's the ggplot2 code I use to create the chart:
g<-ggplot(data, aes(color = Case, x = Version, y = TotalMean, group = Case)) +
geom_line() + geom_point(shape = 16, size = 2) + coord_cartesian(ylim=c(0,550)) +
labs(x="Version", y="Run Time (minutes)") +
stat_summary(fun = sum, geom = "line") +
theme(plot.title = element_text(face = "bold", size = 16, vjust = 1.5)) +
theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1)) +
theme(axis.title.y = element_text(vjust = 1))
g
[...] TotalMean. [...] any(is.null(x)) and set the shape in ggplot according to that column.

[...] any(is.null(x)) but how would I set the shape in ggplot according to the column that results from that?

[...] geom_point(shape = 16, size = 2) to geom_point(shape = IsNullColumn, size = 2). Let's make that column a numeric one instead of Boolean.

[...] geom_point if shape is set to IsNullColumn? How does all of this exactly communicate together? If you're changing the column to a numeric (I assume 1 for T and 0 for F), how does shape=IsNullColumn work anymore?
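The exchange above can be sketched as follows. This is only one possible reading, not a definitive answer: the `HasNull` column is hypothetical and is hard-coded here to flag Version 1.0.2 in Case2, and the data is a cut-down stand-in for the example in the question. Two things are worth noting. A column is only looked up in the data frame when it is mapped inside `aes()`; written as `geom_point(shape = IsNullColumn)`, R would look for a variable of that name in the calling environment instead. And ggplot2 refuses to map a continuous (numeric) variable to shape, so a logical or factor column with a manual scale is the simpler route.

```r
library(ggplot2)

# Cut-down stand-in for the question's data, with a hypothetical HasNull flag.
data <- data.frame(
  Version   = factor(rep(c("1.0.1", "1.0.2", "1.0.4", "1.0.7"), times = 2)),
  TotalMean = c(350, 430, 470, 445, 320, 280, 450, 420),
  Case      = rep(c("Case1", "Case2"), each = 4)
)
data$HasNull <- data$Case == "Case2" & data$Version == "1.0.2"

g <- ggplot(data, aes(x = Version, y = TotalMean, color = Case, group = Case)) +
  geom_line() +
  # Mapping inside aes() ties shape and size to the HasNull column.
  geom_point(aes(shape = HasNull, size = HasNull)) +
  # Manual scales translate FALSE/TRUE into concrete shapes and sizes.
  scale_shape_manual(values = c(`FALSE` = 16, `TRUE` = 4),
                     name = "Null run times") +
  scale_size_manual(values = c(`FALSE` = 2, `TRUE` = 4),
                    name = "Null run times") +
  coord_cartesian(ylim = c(0, 550)) +
  labs(x = "Version", y = "Run Time (minutes)")
g
```

The flagged point is drawn as a larger cross (shape 4) while every other point keeps the original filled circle (shape 16, size 2).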