I'm working on an R Shiny program that can take any csv file and output graphs of it. The user who uploads the csv has some guidelines on how the data should look, but I don't want them to be too strict.

I'm currently trying to use ggplot2 to graph multiple lines of the same dataset on one plot for comparison.

The data I am currently uploading looks like this (simplified, as the data has over 1000 rows):

Date   Hamburgers  Salads  Sodas  Fries
12-01  4           4       3      2
12-02  1           7       3      9
12-03  22          24      45     34
12-04  23          44      46     22

I'm trying to output a graph that has time on the X-axis (the user chooses this via a sidebar and can pick any column, but time makes the most sense here). For the Y-axis, I want 4 lines, colored differently, each plotting one variable over time.

I have all of the 'user taking in input and choosing which columns to graph' implemented, but for simplicity's sake, we can assume that for the most part, this has been hard coded (so Y variable will actually be input$y, etc in my implementation)

The portion of my code where I try to graph the data is:

output$plotLine <- renderPlot({
  p <- ggplot(data, aes_string(x=X, y=Y), environment = environment())
  p <- p + geom_point(size = 3)
  p <- p + geom_line(aes(group=1))

  print(p)
})

This plots one of the lines, but I have no idea how to plot the others on the same plot. I've read about using 'group' in the aes function, but this depends on having a classifier in the dataset, which this one currently does not have.

I have also looked into the melt() function from the reshape2 package but am not sure how it would help me (both for the multiple line problem and the greater sense of this project, so that the user doesn't have to abide by strict rules for upload format of the csv).

Any help would be much appreciated!

  • What do you mean by "we can assume that for the most part, this has been hard coded (so Y variable will actually be input$y, etc in my implementation)"? How is this consistent with there being multiple possible columns as your "Y" variable? I think you should take a step back and think about how best to reshape your input data into a form that you can easily plot. As for "no strict rules for the csv", what do you mean? There must be some definitions. How are you going to parse your date column? ('12-01' is not a Date, and it is not unambiguously %m-%d or %y-%m; or is it just an id?) Commented Aug 1, 2013 at 1:50
  • The fact that you say to assume that Y will in fact be input$y, coupled with the suspicious presence of environment(), tells me that you've already gone quite a ways down the wrong path. The use of melt is (essentially) unavoidable here, and I would assume that is where you'd pass your user's selection of specific columns. Commented Aug 1, 2013 at 2:16
  • @joran: Hello, I've largely followed this person in implementing user input: github.com/jcheng5/seattle-meetup/blob/master/diamonds3/…. How is the use of melt unavoidable? I'm pretty sure I could take environment() out, I only included it because another user on SO helped me out on another question with environment(). Commented Aug 1, 2013 at 17:44
  • @mnel: You are correct, I do need to take a step back =/. Basically my code right now (modeled on jcheng's diamonds3 project in the comment above) depends on the user uploading a csv file in a specific format like the one in my post, which is unfortunate. I was hoping melt could relax these restrictions, but I'm not sure how. It's a good thing you mention the Date, as one of the requirements for my program to work is that the date is in year-month format, or else ggplot2 plots the months out of order (alphabetically). Commented Aug 1, 2013 at 17:47

1 Answer

Assuming the x-axis variable (Date) is stored in selectedxaxis, the chosen products are in selectedproducts, and data holds the loaded csv:

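library(ggplot2)
library(reshape2)

selectedxaxis = "Date"
selectedproducts = c("Sodas", "Salads")
# Keep only the x-axis column and the selected product columns
widedata = subset(data, select = c(selectedxaxis, selectedproducts))

# Wide to long: one row per (Date, Product) pair
longdata = melt(widedata, id.vars=selectedxaxis, variable.name='Product', value.name='Count')
# group=Product keeps each line connected even when Date is a factor or character column
ggplot(longdata) + geom_line(aes(Date, Count, color=Product, group=Product))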
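
To see why melt gives you the "classifier" the question says is missing, here is a small sketch that runs the same reshape on the four sample rows from the question. The data frame is typed in by hand for illustration; in the app it would come from the uploaded csv.

```r
library(reshape2)

# Sample rows copied from the question (typed in by hand for illustration)
data <- data.frame(
  Date       = c("12-01", "12-02", "12-03", "12-04"),
  Hamburgers = c(4, 1, 22, 23),
  Salads     = c(4, 7, 24, 44),
  Sodas      = c(3, 3, 45, 46),
  Fries      = c(2, 9, 34, 22)
)

# Keep Date plus two product columns, then go from wide to long
longdata <- melt(data[, c("Date", "Sodas", "Salads")], id.vars = "Date",
                 variable.name = "Product", value.name = "Count")

# One row per (Date, Product) pair: 4 dates x 2 products = 8 rows.
# The new Product column is what color= and group= can map to.
str(longdata)
```

Each original product column becomes a level of Product, so adding or dropping a line on the plot is just a matter of which columns are kept before the melt.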

2 Comments

Is there a way with melt to allow the user to choose which inputs to be on the graph? For example, allowing for the input$x and input$y variables, can something like longdata = melt(widedata, id.vars=input$x, variable.name=input$y) work?
Yes, that's very easy: all it takes is to keep only the columns that you want to use before melting. I updated the answer to show you. You can't simply use variable.name=input$y though, as that just changes the name of the new column.
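
A minimal sketch of how the user's choices could drive the reshape, assuming input$x is a single column name and input$y is a character vector from a multi-select control. The helpers reshape_for_plot and plot_lines are hypothetical names made up for this example, and the .data[[x]] form assumes ggplot2 3.0.0 or later.

```r
library(ggplot2)
library(reshape2)

# Hypothetical helper: keep the chosen x column and y columns, then melt to long form.
reshape_for_plot <- function(data, x, ys) {
  ys <- setdiff(ys, x)  # the x column cannot also be plotted as a series
  wide <- data[, c(x, ys), drop = FALSE]
  melt(wide, id.vars = x, variable.name = "Series", value.name = "Value")
}

# Hypothetical helper: one coloured line per selected column.
plot_lines <- function(data, x, ys) {
  long <- reshape_for_plot(data, x, ys)
  # .data[[x]] looks the x column up by name, so any column name works
  ggplot(long, aes(x = .data[[x]], y = Value, color = Series, group = Series)) +
    geom_line() +
    geom_point(size = 3) +
    labs(x = x)
}

# Inside the Shiny server this would be called roughly as follows
# (assuming input$y comes from something like selectInput(..., multiple = TRUE)):
# output$plotLine <- renderPlot({
#   plot_lines(data, input$x, input$y)
# })
```

The user's selection only decides which columns survive into the wide table; melt then stacks whatever is left, so the csv does not need a fixed set of column names.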
