The end goal of this question is to plot X and Y for a graph using a dataframe.

I have a dataframe like so:

             Open    High     Low   Close     Volume       stock symbol
Date                                                              
2000-10-19    1.37    1.42    1.24    1.35  373590000         AAPL
2000-10-20    1.36    1.46    1.35    1.39  195836200         AAPL
2000-10-23    1.39    1.49    1.39    1.46  129851400         AAPL
2000-10-24    1.48    1.49    1.34    1.35  192711400         AAPL
2000-10-25    1.36    1.37    1.30    1.32  163448600         AAPL
2000-10-26    1.34    1.42    1.25    1.32  178110800         AAPL
2000-10-27    1.35    1.37    1.28    1.33  181242600         AAPL
2000-10-30    1.37    1.42    1.34    1.38  152558000         AAPL

And I am trying to plot Date vs. Open. I know there is a way to simply plot, but I will be applying this concept to larger dataframes and would like to know how to do it "long-hand".

What I've tried:

print(some_DF['Open'])

Result:

Date
2000-10-19      1.37
2000-10-20      1.36
2000-10-23      1.39
2000-10-24      1.48
2000-10-25      1.36
2000-10-26      1.34

Problem:

Date seems to be my index, but the column header 'Open' does not appear.

Question:

How do I print the above dataframe with 'Open' as the header, and then assign the dates to some value x and the 'Open' values to some value y?

"Expected Code to work":

I'm thinking something like

print([some_DF['Open'] headers = 'date','open')
x = some_DF['Date'] #So that this becomes first column of dataframe
y = some_DF['Open'] #So that this becomes second column of dataframe
  • what's the data file like? the first 10 lines will do. I'm curious how you read that data file, and what the raw file looks like Commented Sep 16, 2016 at 16:15
  • Have you tried print(some_DF[['Open']])? Commented Sep 16, 2016 at 16:15
  • @M.Klugerford this is very close to what I'd like. However, the 'Date' and 'Open' headers look to be on different rows. Could you explain what the double [[ ]] is doing? Commented Sep 16, 2016 at 16:20
  • @Tuan333 it's from pandas' DataReader from Yahoo Finance. Commented Sep 16, 2016 at 16:20
  • It's funny, it's the third time in two days I see someone worried about how the pandas output looks. Do you have a specific reason for that? What matters is the graph you want... Commented Sep 16, 2016 at 16:24

1 Answer

You can call reset_index on the dataframe, which turns the Date index into a regular column, and then print the subset consisting of the two columns you want:

>>> df
            a  b
Date            
2000-10-19  1  3
2000-10-20  2  4
2000-10-21  3  5
2000-10-22  4  6
2000-10-23  5  7
>>> print(df.reset_index()[['Date', 'a']])
        Date  a
0 2000-10-19  1
1 2000-10-20  2
2 2000-10-21  3
3 2000-10-22  4
4 2000-10-23  5

Like IanS mentioned, you shouldn't worry about how the output looks in pandas. Date was an index and Open a column. The difference in the print statement illustrates that distinction.
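
For the plotting goal in the question, the index can also be used directly as x without resetting it. The following is only a minimal sketch: the frame is built by hand to mimic the first rows of the question's data (the name some_DF and the values come from the question; the construction itself is an assumption, not how DataReader returns it).

```python
import pandas as pd

# Hand-built stand-in for the question's data (first three rows only).
some_DF = pd.DataFrame(
    {"Open": [1.37, 1.36, 1.39], "Close": [1.35, 1.39, 1.46]},
    index=pd.to_datetime(["2000-10-19", "2000-10-20", "2000-10-23"]),
)
some_DF.index.name = "Date"

# Date is the index, not a column, so some_DF['Date'] would raise KeyError.
x = some_DF.index      # the dates
y = some_DF["Open"]    # the opening prices, aligned with x

# Alternative: turn the index into a regular column first.
flat = some_DF.reset_index()
x2 = flat["Date"]
y2 = flat["Open"]
```

Either pair can then be handed to a plotting call, for example matplotlib's plt.plot(x, y).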

Edit:

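df[['a', 'b']] is the same as df.loc[:, ['a', 'b']]. The outer brackets are the indexing operator and the inner brackets are an ordinary Python list of column names, so the result is a dataframe containing just those columns, in the order listed. With a single string, as in df['a'], you get a Series instead, which is why its printout shows no column header.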
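
A small sketch of the difference between single and double brackets, using a hand-built frame shaped like the example df above (the values are illustrative):

```python
import pandas as pd

df = pd.DataFrame(
    {"a": [1, 2, 3], "b": [3, 4, 5]},
    index=pd.to_datetime(["2000-10-19", "2000-10-20", "2000-10-21"]),
)
df.index.name = "Date"

single = df["a"]    # string key: a Series; the column name is kept in .name
double = df[["a"]]  # list key: a one-column DataFrame with a printed header

print(type(single).__name__)  # Series
print(type(double).__name__)  # DataFrame

# The list form is equivalent to selecting all rows and the listed columns.
same = double.equals(df.loc[:, ["a"]])
```

This is why print(some_DF[['Open']]) shows the 'Open' header while print(some_DF['Open']) does not.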

3 Comments

reset_index... why didn't I think of that. Great! Now how would I make x the first column and y the second? Any tips?
@MattR with your original dataframe you could do print(some_DF.reset_index()[['Date', 'Open']]). The order of the output depends on the order in which you provide the list of column names.
I'm still confused on what the [[ ]] is doing. Would it be possible to explain that? I really appreciate the help
