I have a data frame that contains the yields of bonds of different durations at different points in time.

For example, my data frame looks like this:

bond_duration <- c("three_mth", "one_yr", "two_yr", "five_yr", "seven_yr", "ten_yr")
Jan_2007 <- c(3.12, 2.98, 3.01, 3.07, 3.11, 3.18)
Feb_2007 <- c(2.93, 2.89, 2.91, 2.99, 3.02, 3.08)
Mar_2007 <- c(2.62, 2.53, 2.51, 2.70, 2.79, 2.91)
# data.frame() keeps the yield columns numeric, so no conversion step is needed
df <- data.frame(bond_duration, Jan_2007, Feb_2007, Mar_2007)

The first column contains the bond durations. The next three columns (columns 2 to 4) show the yield of each bond at a particular point in time (e.g. January 2007).

What I want to achieve is to use the apply function to create multiple line graphs, one for each time point (e.g. a line graph of the yields of all bond durations for January 2007, another for February 2007, etc.).

My x-axis will be the different bond durations while my y-axis will be the yield.

I can successfully plot the yield curve for each time point individually with the following code:

ggplot(df, aes(x = bond_duration, y = Jan_2007, group = 1)) + geom_point() + geom_line() + 
  scale_x_discrete(limits = c("three_mth", "one_yr", "two_yr", "five_yr", "seven_yr", 
                              "ten_yr")) + 
  ggtitle(paste(colnames(df)[2], " Yield Curve", sep = "")) + ylab("Yield (%)")

However, when I use the apply function to loop over the time points, the script runs and creates a line graph for each time point, but the title of every line graph is the same. I used the following code:

apply(df, 2, function(x) ggplot(df, aes(x = bond_duration, y = x, group = 1)) + geom_point() + geom_line() + 
      scale_x_discrete(limits = c("three_mth", "one_yr", "two_yr", "five_yr", "seven_yr", 
                                  "ten_yr")) + 
      ggtitle(paste(colnames(df)[x], " Yield Curve", sep = "")) + ylab("Yield (%)"))

I suspect something is wrong with the ggtitle section of my code. I want each line graph to be named (particular_timepoint)_yield curve.

Any help is appreciated. Thanks!

1 Answer


Using your dataframe df as above, this will create a list p containing your 3 plots.

library(ggplot2)

# Build one plot per month column; x is the column name as a string
p <- lapply(names(df)[2:4], function(x) {
  ggplot(df, aes_string(x = "bond_duration", y = x, group = 1)) + 
    geom_point() + 
    geom_line() + 
    scale_x_discrete(limits = c("three_mth", "one_yr", "two_yr", "five_yr", 
                                "seven_yr", "ten_yr")) + 
    ggtitle(paste0(x, " Yield Curve")) + ylab("Yield (%)")
})
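Note that aes_string() is soft-deprecated as of ggplot2 3.0.0. As a hedged alternative, the sketch below does the same loop with the .data pronoun inside aes(). It assumes ggplot2 3.0.0 or later; the sample values are copied from the question, and naming the list elements by month is an addition for convenience rather than part of the code above.

```r
library(ggplot2)

# Sample data copied from the question
df <- data.frame(
  bond_duration = c("three_mth", "one_yr", "two_yr", "five_yr", "seven_yr", "ten_yr"),
  Jan_2007 = c(3.12, 2.98, 3.01, 3.07, 3.11, 3.18),
  Feb_2007 = c(2.93, 2.89, 2.91, 2.99, 3.02, 3.08),
  Mar_2007 = c(2.62, 2.53, 2.51, 2.70, 2.79, 2.91)
)
months <- names(df)[2:4]

# x is a column name (a string); .data[[x]] looks that column up inside aes()
p <- lapply(months, function(x) {
  ggplot(df, aes(x = bond_duration, y = .data[[x]], group = 1)) +
    geom_point() +
    geom_line() +
    scale_x_discrete(limits = as.character(df$bond_duration)) +
    ggtitle(paste0(x, " Yield Curve")) +
    ylab("Yield (%)")
})

# Assumption: name each element so a plot can also be fetched by month
names(p) <- months
```

With the names set, p[["Feb_2007"]] returns the February plot, and positional access still works as before.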

You can access each plot with the double bracket syntax p[[i]].

The lapply function passes the column name of each of the 3 months as a string, so you need to use aes_string, the string-based variant of aes, in the ggplot call for it to recognise what you are passing to it.

You may want to consider reshaping the data to a tidy format (gathering the month variables into one column) and using ggplot2's facet_wrap function to produce a single plot with each month in its own facet, like so:
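To see why the string matters, it can help to look at what each function hands to x. The base-R sketch below (sample values copied from the question) illustrates the likely reason the original apply call could not produce per-month titles: apply passes each column's values, so colnames(df)[x] is not indexed by a column position or name. The variable names titles_apply and titles_names are illustrative only.

```r
# Sample data copied from the question
df <- data.frame(
  bond_duration = c("three_mth", "one_yr", "two_yr", "five_yr", "seven_yr", "ten_yr"),
  Jan_2007 = c(3.12, 2.98, 3.01, 3.07, 3.11, 3.18),
  Feb_2007 = c(2.93, 2.89, 2.91, 2.99, 3.02, 3.08),
  Mar_2007 = c(2.62, 2.53, 2.51, 2.70, 2.79, 2.91)
)

# apply() passes each column's *values* as x, so colnames(df)[x] is indexed
# by bond labels and yields; only the first looked-up value is kept here
titles_apply <- apply(df, 2, function(x) {
  paste0(colnames(df)[x][1], " Yield Curve")
})

# Iterating over names(df)[2:4] passes each column *name* as x instead
titles_names <- vapply(names(df)[2:4], function(x) {
  paste0(x, " Yield Curve")
}, character(1))

print(titles_apply)
print(titles_names)
```

Only the second version carries the month into the title, which is why the answer iterates over the column names.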

library(dplyr)
library(tidyr)

# Gather the month columns into Month/Yield pairs and fix the display order
tidy_df <- df %>% 
  gather(Month, Yield, 2:4) %>% 
  mutate(bond_duration = factor(bond_duration, levels = c("three_mth", "one_yr", "two_yr", "five_yr", "seven_yr", "ten_yr")),
         Month = factor(Month, levels = c("Jan_2007", "Feb_2007", "Mar_2007")))

ggplot(tidy_df, aes(bond_duration, Yield, group = Month)) +
  facet_wrap(~ Month, ncol = 1) +
  geom_point() +
  geom_line() +
  labs(title = "Bond Duration Yield Curve by Month", x = "Bond Duration", y = "Yield (%)")
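gather() still works, but in tidyr 1.0.0 and later it is superseded by pivot_longer(). A minimal sketch of the same reshaping step follows, assuming tidyr 1.0.0 or later and the sample data from the question. The row order differs from the gather() result, but the factor levels control the plotting order.

```r
library(tidyr)

# Sample data copied from the question
df <- data.frame(
  bond_duration = c("three_mth", "one_yr", "two_yr", "five_yr", "seven_yr", "ten_yr"),
  Jan_2007 = c(3.12, 2.98, 3.01, 3.07, 3.11, 3.18),
  Feb_2007 = c(2.93, 2.89, 2.91, 2.99, 3.02, 3.08),
  Mar_2007 = c(2.62, 2.53, 2.51, 2.70, 2.79, 2.91)
)

tidy_df <- pivot_longer(
  df,
  cols = -bond_duration,  # every column except the duration labels
  names_to = "Month",
  values_to = "Yield"
)

# Fix the display order of durations and months, as in the gather() version
tidy_df$bond_duration <- factor(tidy_df$bond_duration,
                                levels = as.character(df$bond_duration))
tidy_df$Month <- factor(tidy_df$Month, levels = names(df)[2:4])
```

The resulting tidy_df has the same three columns and can be passed to the same facet_wrap plot.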

4 Comments

Thanks! It works. Just curious, why does lapply take in names(df) instead of df? Do you pass the vector of names into it instead of the entire dataset? In addition, why is it x = "bond_duration" instead of x = bond_duration? Thanks again!
The apply family of functions applies a function over a list or vector, so you need to supply the elements you want the function applied to: in your case, the 3 months you want to create a plot for. These are supplied as a character vector, names(df)[2:4]. aes_string is useful when writing functions that create plots because it lets you define the aesthetic mappings with strings or quoted names/calls (here, the month strings passed by lapply). With aes_string, all variables must be quoted, which is why x = "bond_duration" is quoted too.
I've added the code to produce a facet plot, which might be more useful if you want to display all plots together.
Also, when you have a list of plots, you can display them all with cowplot::plot_grid(plotlist = p), where p is your list.
