How to plot a Log Pearson III distribution using `ggplot2` in R?

Question

I am fitting a Log Pearson III distribution to my streamflow data. After the fitting, I'd like to plot the observed values and the fitted distribution.

However, the figure is not what I expect:

1. The x axis label does not show 0.99 0.02, and 0.01, although I have set the breaks to be c(0.99, 0.9, 0.5, 0.2, 0.1, 0.02, 0.01).

2. How to plot the fitted Log Pearson III distribution on the figure? I have provided an example figure for reference, thanks for any help.

library(fitdistrplus)
library(PearsonDS)
library(ggplot2)

low_flows <- data.frame(Year = 1981:2010, Flow = c(0.03357143, 0.01328571, 0.02285714, 0.02657143, 0.04957143, 0.04085714, 0.16900000, 0.01057143, 
                                                   0.04128571, 0.10871429, 0.08771429, 0.09585714, 0.22057143, 0.11571428, 0.08300000, 0.11257143, 
                                                   0.13614286, 0.07742857, 0.09785714, 0.04728571, 0.04300000, 0.08385714, 0.02828571, 0.07271429, 
                                                   0.21428571, 0.10142857, 0.04400000, 0.10928571, 0.12471429, 0.31500000))


Log_mydata <- log10(low_flows$Flow)

m <- mean(Log_mydata)
v <- stats::var(Log_mydata)
g <- e1071::skewness(Log_mydata, type = 2)
my_shape <- (2/g)^2
my_scale <- sqrt(v)/sqrt(my_shape) * sign(g)
my_location <- m - my_scale * my_shape
start <- list(shape = my_shape, location = my_location, scale = my_scale)

dPIII <<- function(x, shape, location, scale) PearsonDS::dpearsonIII(x, 
                                                                     shape, location, scale, log = FALSE)
pPIII <<- function(q, shape, location, scale) PearsonDS::ppearsonIII(q, 
                                                                     shape, location, scale, lower.tail = TRUE, log.p = FALSE)
qPIII <<- function(p, shape, location, scale) PearsonDS::qpearsonIII(p, 
                                                                     shape, location, scale, lower.tail = TRUE, log.p = FALSE)

fit_lp3 <- fitdist(Log_mydata, distr = "PIII", start = start)

plotdata = low_flows
plotdata <- plotdata[order(plotdata$Flow), ]
plotdata$prob <- seq_len(length(plotdata$Flow))/(length(plotdata$Flow) + 1)

prob_scale_points = c(0.99, 0.9, 0.5, 0.2, 0.1, 0.02, 0.01)

ggplot(data = plotdata, aes(x = prob, y = Flow)) + 
  geom_point(size = 2) + 
  xlab("Probability") + 
  scale_x_continuous(transform = scales::probability_trans("norm", lower.tail = FALSE), 
                     breaks = prob_scale_points, 
                     sec.axis = dup_axis(name = "Return Period", 
                                         labels = function(x) {ifelse(1/x < 2, round(1/x, 2), round(1/x, 0))})) +
  theme_bw()  + 
  theme(panel.border = element_rect(colour = "black", fill = NA, linewidth = 1), 
        panel.grid = element_line(linewidth = 0.2), 
        axis.title = element_text(size = 12), axis.text = element_text(size = 10), 
        axis.title.x.top = element_text(size = 12), 
        legend.text = element_text(size = 10), legend.title = element_text(size = 10))

Here is what I get from the above code:

Here is my objective:

Following Allan's suggestion, the Log Pearson III can be correctly plotted! To simplify my original code, I have tried to use scale_x_log10instead of scale_x_continuous. However, the fitted line of Log Pearson III is not correctly plotted. I wonder how to fix this?

ggplot(data = plotdata, aes(x = prob, y = Flow)) + 
  geom_point(size = 2, color="blue") + 
  geom_function(fun = ~ 10^(qpearsonIII(.x, params = fit_lp3$estimate)),
                color = "black", linewidth = 1.5, alpha = 0.7) + 
  scale_x_log10(name = "Probability", 
                trans = "reverse", 
                breaks = prob_scale_points, 
                limits = rev(range(prob_scale_points)), 
                sec.axis = dup_axis(name = "Return Period", 
                                    labels = function(x) {ifelse(1/x < 2, round(1/x, 2), round(1/x, 0))})) + 
  scale_y_log10(breaks = seq(0.05, 0.4, 0.05)) + 
  theme_bw()  + 
  theme(panel.border = element_rect(colour = "black", fill = NA, linewidth = 1), 
        panel.grid = element_line(linewidth = 0.2), 
        axis.title = element_text(size = 12), axis.text = element_text(size = 10), 
        axis.title.x.top = element_text(size = 12), 
        legend.text = element_text(size = 10), legend.title = element_text(size = 10))

Here is the figure. We can see that the fitted line is not correct.

Allan Cameron · Accepted Answer · 2024-11-20 22:01:15Z

3

You can use geom_function to draw the output of qPearsonIII with your fit parameters, though you should plot the logged data to match its output. This would mean relabelling the y axis to make it appear as a log scale.

To get the x axis the way you want, include limits as well as breaks.

plotdata = low_flows
plotdata$logflow <- Log_mydata
plotdata <- plotdata[order(plotdata$Flow), ]
plotdata$prob <- seq_len(length(plotdata$Flow))/(length(plotdata$Flow) + 1)

ggplot(data = plotdata, aes(x = prob, y = logflow)) + 
  geom_point(size = 2, aes(color = "7-Day")) + 
  geom_function(fun = ~ qpearsonIII(.x, params = fit_lp3$estimate),
                aes(color = "7-Day"), linewidth = 1, alpha = 0.8) +
  scale_x_continuous("Probability",
                     transform = scales::probability_trans("norm", 
                                                           lower.tail = FALSE), 
                     breaks = prob_scale_points, 
                     limits = rev(range(prob_scale_points)),
                     sec.axis = dup_axis(name = "Return Period", 
                                         labels = function(x) {
                                           ifelse(1/x < 2, round(1/x, 2), 
                                                  round(1/x, 0))})) +
  scale_y_continuous("Discharge (cms)",
                     breaks = log10(seq(0.05, 0.4, 0.05)),
                     labels = ~10^.x,
                     limits = log10(c(0.01, 0.4))) +
  scale_color_manual("Events and\nComputed Curve", values = "#440154") +
  theme_bw(20)  + 
  theme(panel.border = element_rect(colour = "black", fill = NA, linewidth = 1), 
        panel.grid = element_line(linewidth = 0.4), 
        axis.title = element_text(size = 12), 
        axis.text = element_text(size = 10), 
        axis.title.x.top = element_text(size = 12), 
        legend.text = element_text(size = 10), 
        legend.title = element_text(size = 10),
        legend.position = "inside",
        legend.position.inside = c(0.9, 0.9))

edited Nov 20, 2024 at 22:01

answered Nov 20, 2024 at 21:24

Allan Cameron

178k7 gold badges70 silver badges118 bronze badges

Sign up to request clarification or add additional context in comments.

8 Comments

Yang Yang Over a year ago

Thanks a lot for your help. In you code geom_function(fun = ~ exp(qpearsonIII(.x, params = fit_lp3$estimate)), color = "purple3"), I wonder why you use "exp"? Should we need to use "10^" as the original data transformation is "log10"? Thanks.

Yang Yang Over a year ago

Also, is it possible to use scale_x_log10() to simplify thescale_x_continuous() in my code? Thank you.

Allan Cameron Over a year ago

@YangYang actually, probably best to just plot the logged data and relabel the y axis - that way, you can be confident the fit matches your logged data. (see my update). You can't use a simple logged axis for x because the scale is reversed.

Yang Yang Over a year ago

Hi Allan, I notice that scale_x_log10() has a trans option of reverse. However, when I use it, the fitted distribution seems to be incorrect. I have updated my question. Could you please take another look? Thank you.

Allan Cameron Over a year ago

That doesn't make it a log scale. If it were a log scale, there would be a much smaller spacing between 0.99 and 0.9 too (as in your actual log version). You have a normal probability transform, such that 0 is infinitely far to the right and 1 is infinitely far to the left. The distance each probability is plotted at is proportional to qnorm(x), not log10(x)

|

Collectives™ on Stack Overflow

How to plot a Log Pearson III distribution using `ggplot2` in R?

1 Answer 1

8 Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

8 Comments

Your Answer

Sign up or log in

Post as a guest

Related