2

I am trying to add legend names to a ggplot, but I always get an "object not found" error.

  mycol = pft$colour
  myname = pft$name
  #mycol "#E5E503" "#9FFF8C" "#44CC29" "#137300" "#B2B224" "#0066CC" "#99CCFF" "#00407F" "#FF999B" "#E5171A" "#990003" "#A38FCC" "#7F40FF" "#A1E5CF" "#6B998A" "#F2F291" "#BF60A7" "#404040"
  #myname "C4 grass" "Early tropical" "Mid tropical" "Late tropical" "Temperate C3 Grass" "North Pine" "South Pine" "Late conifer" "Early hardwood" "Mid hardwood" "Late hardwood" "C3 crop" "C3 pasture" "C4 crop" "C4 pasture" "C3 grass" "Lianas" "Total"             
  test_data_long = melt(szpft[[vnam]][,ndbh+1,],
                        varnames = c("year","mpft"), na.rm = T)
  ggplot(data=test_data_long,aes(x=year,y=value, colour = mycol[mpft])) +
    geom_line(aes(group = mpft)) +
    scale_colour_identity(guide = "legend") +
    scale_fill_continuous(name="PFT", labels = myname[mpft])

test_data_long is a data.frame that looks like this.

    year mpft     value
20  2004    2  2.294562
21  2005    2  2.415901
22  2006    2  2.532214
23  2007    2  2.649968
24  2008    2  2.760934
25  2009    2  2.849097
26  2010    2  2.967846
27  2011    2  3.102287
28  2012    2  3.244338
29  2013    2  3.386014
30  2014    2  3.528662
31  2015    2  3.675095
32  2016    2  3.828054
33  2017    2  3.976928
34  2018    2  4.133859
35  2019    2  4.305039
36  2020    2  4.488999
37  2021    2  4.673952
38  2022    2  4.861845
39  2004    3  4.518262
40  2005    3  4.668800
41  2006    3  4.821924
42  2007    3  4.973597

I would like to use the mpft column as an index to define grouping, color, title, etc.

mycol and myname are vectors that contain colours (hex) and names corresponding to the different mpft lines to plot. The exact error I get is

Error in check_breaks_labels(breaks, labels) : object 'mpft' not found

Removing the last line of the script produces the figure below, so the problem lies in the last line. Why is mpft recognized before and not after?

EDIT

To be clearer,

  mycol = pft$colour
  myname = pft$name
  test_data_long = melt(szpft[[vnam]][,ndbh+1,],
                        varnames = c("year","mpft"), na.rm = T)
  ggplot(data=test_data_long,aes(x=year,y=value, colour = mycol[mpft])) +
    geom_line(aes(group = mpft)) +
    scale_colour_identity(guide = "legend") 

produces the graph in the figure and gives no error.

EDIT

I'll provide a simplified, reproducible example of what I want to achieve here.

mycol = c("#A38FCC","#7F40FF")
mynam = c("random_line1", "random_line2")
set.seed(123)
df=data.frame(month = month.abb, 
          mpft  = c(rep(1,6),rep(2,6)), 
          ran = runif(12,0.,10.))

That produces a dataframe with month, mpft, and ran value. I want the ggplot to have month on the x-axis and ran on the y-axis. Furthermore I want the points to be plotted with mycol[1] colour ("#A38FCC" color) and the legend to display mynam[1] as title (random_line1 as title) if mpft = 1.

  • 1
    You don't define mypft in your example code. Commented Sep 14, 2016 at 10:46
  • 1
    szpft isn't there as well, but your main problem is not doing what ggplot2 likes best which is to pass it a data.frame with everything it needs in it vs referencing external vectors like you are. That is doable, but it's fraught with peril for inexperienced ggplot2 users. It would help SO folks answer if the question was reproducible. Commented Sep 14, 2016 at 10:49
  • 1
    No, you probably mean myname["mypft"] Commented Sep 14, 2016 at 11:41
  • 1
    Why are you arguing with me? You are trying to call an object mypft and R tells you it can't find it, which can have two reasons: (i) the object does not exist, i.e., has not been defined, or (ii) it's not on the search path, i.e., a scoping issue. Based on the code you have shared, option (i) seems to apply. Commented Sep 14, 2016 at 11:47
  • 1
    @manfredo the column gets created anyway inside ggplot2. You're just postponing the inevitable. If you really want a custom computed color scale, write a scale_ function. Commented Sep 14, 2016 at 12:22
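
On the scoping point raised in the comments, here is a minimal base-R sketch. It assumes the usual behaviour that aes() looks names up in the plot data, while arguments to scale_*() functions are evaluated as ordinary R expressions where the call is written. The tiny df below is illustrative only, not the asker's data.

```r
mynam <- c("random_line1", "random_line2")
df <- data.frame(mpft = c(1, 1, 2, 2))

# Evaluated "inside" the data frame, the column is visible
# (roughly what aes() arranges for its arguments).
inside <- with(df, mynam[mpft])

# Evaluated as a plain expression, `mpft` is looked up as an ordinary
# variable; none exists here, so R raises "object not found".
outside <- tryCatch(mynam[mpft], error = function(e) conditionMessage(e))

# Referring to the column explicitly works in either place.
explicit <- mynam[unique(df$mpft)]
```

This would explain why colour = mycol[mpft] inside aes() runs while labels = myname[mpft] inside a scale call does not.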

2 Answers

5

This is similar to the other answer but without needless dplyr mumbo-jumbo and pointing out some important details. The point is that if you want manual colors, you should use a manual color scale.

mycol = c("#A38FCC","#7F40FF")
mynam = c("random_line1", "random_line2")
set.seed(123)
df=data.frame(
              #month needs to be an ordered factor to get correct order in the plot
              #an unordered factor would be ordered alphabetically by ggplot2
              month = ordered(month.abb, levels = month.abb), 
              mpft  = c(rep(1,6),rep(2,6)), 
              ran = runif(12,0.,10.))

library(ggplot2)

ggplot(df, aes(x=month,y=ran, 
               colour = factor(mpft) #you want a discrete color scale
               )) +
  geom_line(aes(group = mpft), size = 1) +
  scale_colour_manual(name = "mynam",
                      #always pass named character vectors here to ensure correct mapping
                      values = setNames(mycol, unique(df$mpft)), 
                      labels = setNames(mynam, unique(df$mpft))) 



4 Comments

This works well for both the simplified example and the general case. Thanks for your help. Regards
Sorry, I talked a bit too fast. There's still a little problem but I think it's almost done. The problem lies in the fact that if you substitute mpft = c(rep(1,6),rep(2,6)) with mpft = c(rep(2,6),rep(1,6)) the graph stays the same (while names and color should be inverted). This is because setNames(mycol, unique(df$mpft)) and setNames(mynam, unique(df$mpft)) make a sequential assignment rather than using mpft as the index for the mycol and mynam vectors. Let me know if this makes sense.
Substituting values = setNames(mycol[mpft], unique(test_data_long$mpft)) with values = setNames(mycol[unique(test_data_long$mpft)], unique(test_data_long$mpft)) did the trick. Thanks again.
The "problem" is just sorting (and not related to ggplot2). How you provide the mapping is up to you. In this example, you could also do setNames(mycol, sort(unique(df$mpft))). Btw., normally you would just do df$mpft <- factor(df$mpft, levels = 1:2, labels = mynam). Then you could use the factor levels for color mapping.
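
The mapping discussed in these comments can be sketched as follows. This is only an illustration on the simplified example, with mpft deliberately given in reverse order; ids, col_map, lab_map and pft_name are names made up for the sketch.

```r
library(ggplot2)

mycol <- c("#A38FCC", "#7F40FF")
mynam <- c("random_line1", "random_line2")
set.seed(123)
df <- data.frame(
  month = ordered(month.abb, levels = month.abb),
  mpft  = c(rep(2, 6), rep(1, 6)),  # reversed on purpose
  ran   = runif(12, 0, 10)
)

# Use the mpft values themselves as the index into mycol / mynam,
# so the result does not depend on the row order of df.
ids     <- sort(unique(df$mpft))
col_map <- setNames(mycol[ids], ids)
lab_map <- setNames(mynam[ids], ids)

p <- ggplot(df, aes(x = month, y = ran, colour = factor(mpft))) +
  geom_line(aes(group = mpft)) +
  scale_colour_manual(name = "PFT", values = col_map, labels = lab_map)

# Alternative from the last comment: store the names as factor labels.
df$pft_name <- factor(df$mpft, levels = seq_along(mynam), labels = mynam)
```

With the factor variant, colour = pft_name could be mapped directly and the colours supplied as setNames(mycol, mynam).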
0

EDIT: Using the reproducible example given in an edit to the question, the following code is provided.

library(dplyr)    # provides %>%
library(ggplot2)
df %>%  ggplot(.,aes(x=month,y=ran, colour= as.factor(mpft))) +
    geom_line(aes(group = mpft)) +
  scale_color_manual(guide = guide_legend(title = "Legend"),
                     values = mycol,
                     labels = mynam)

This code will only give the right colours if all the lines are in the figure. If some are missing, the colours will shift. To prevent this, use the following code.

  #This example uses three types, of which we only want 2
mycol = c("#A38FCC","#8f9e00","#7F40FF")
myname = c("random_line1","random_line2","random_line3")
set.seed(123)
df=data.frame(month = rep(month.abb,3), 
              mpft  = c(rep(1,12),rep(2,12),rep(3,12)), 
              ran = runif(36,0.,10.))

#this creates a dataframe that maps mpft to its name and hex code
details <- data.frame(mpft = 1:length(mycol), 
                      mycol=(mycol), 
                      myname= myname, 
                      stringsAsFactors = FALSE)

#merge the two data frames
df2 <- df %>% left_join(., details, by="mpft") 

#We experiment to see if the colour scheme is maintained by removing one of the types
df3 <- df2 %>% filter(mpft !=2)

#Arrange the newly formed dataframe in order of mpft. Although it makes no difference in this example, it is crucial if the observations of mpft are not in numerical order.
df3 <- df3  %>% arrange(mpft)

 #Colours wrong!
 df2%>%  ggplot(.,aes(x=month,y=ran, colour= myname)) +
    geom_line(aes(group = mpft)) + 
  scale_color_manual(guide = guide_legend(title = "Legend"),
                     values = unique(df2$mycol) )
#Colours Correct!
df3%>%  ggplot(.,aes(x=month,y=ran, colour= myname)) +
    geom_line(aes(group = mpft)) + 
  scale_color_manual(guide = guide_legend(title = "Legend"),
                     values = unique(df3$mycol) )

The above code chunk will maintain your colour scheme even if, for some reason, not all types are in the plot.
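
As a possible variant (a sketch, not the code above), the colours can be keyed by name rather than by position, which does not rely on the rows being arranged. It assumes scale_colour_manual() matches a named values vector to the levels by name; pal and df_sub are names invented for this sketch.

```r
library(ggplot2)

mycol  <- c("#A38FCC", "#8f9e00", "#7F40FF")
myname <- c("random_line1", "random_line2", "random_line3")
set.seed(123)
df <- data.frame(
  month = rep(ordered(month.abb, levels = month.abb), 3),
  mpft  = rep(1:3, each = 12),
  ran   = runif(36, 0, 10)
)
df$myname <- myname[df$mpft]  # look the name up by mpft index

# Named palette: each name is tied to its own hex code.
pal <- setNames(mycol, myname)

# Drop one type; the remaining names still point at the same colours.
df_sub <- df[df$mpft != 2, ]

p <- ggplot(df_sub, aes(x = month, y = ran, colour = myname, group = mpft)) +
  geom_line() +
  scale_colour_manual(name = "Legend", values = pal)
```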

3 Comments

Thanks for the answer, I have made a second edit to reduce the problem to its minimal terms. Can you give it look, it might be easier to discuss on that
Could you say what was wrong with the first attempt I gave? Not sure if I understand the question correctly.
I am not sure about the second, I get some errors (%>% operator and others). The first actually works for the basic example but in fact just for that specific case. When you say values = mycol and labels = mynam you don't specify the index of that vector (which should be mpft in this case).
