I have the data frame:

dataframe

I want to plot the date column vs the close column.

Here is the code to do that:

import pandas as pd
import matplotlib.pyplot as plt



df=pd.read_csv('C:/Users/admin/Desktop/ohlcv.csv')

df['date']=pd.to_datetime(df['date'])
df=df.set_index('date')


plt.figure(figsize=(12.5,4.5))
plt.title('Close Price Variation')
plt.xlabel('date')
plt.ylabel('close')
plt.plot(df.close)
plt.show()

This gives the plot as shown:

Plot

The issue is that the timestamps on the plot's x-axis do not match those in the dataframe: the data in the dataframe is spaced at 15-minute intervals, while the plot's axis is spaced at 1-minute intervals.

How can I get the same timestamps in the plot as in the dataframe?

  • Can you show the entire date column in your table? It is crucial to see the formatting. Commented Mar 9, 2021 at 7:21
  • @Shir Added the required column. Commented Mar 9, 2021 at 7:24
  • I think the problem is with the +05:30 suffix in every date. Does it have any significance to you? I'd try removing it and plotting without it. I'll make the code snippet. Commented Mar 9, 2021 at 7:26
  • You can. I just need the time, like 9, 9:15 and so on. Commented Mar 9, 2021 at 7:26
  • @Huzefa So you need to rotate them. Commented Mar 9, 2021 at 8:11

2 Answers

If the time series in your data carries a time zone, you can use tz_localize(None) to remove it and keep only the local timestamps. Next, use MinuteLocator to place a tick every 15 minutes. Finally, rotate the tick labels to avoid overlap.

import yfinance as yf
import pandas as pd

df = yf.download("AAPL", interval='15m', start="2021-03-03", end="2021-03-04")

df.head(5)

    Open    High    Low     Close   Adj Close   Volume
Datetime                        
2021-03-02 10:00:00-05:00   126.735001  127.198402  126.541000  126.654999  126.654999  3163537
2021-03-02 10:15:00-05:00   126.650101  126.940002  125.980003  126.069901  126.069901  4519324
2021-03-02 10:30:00-05:00   126.059998  126.430000  125.779999  125.894997  125.894997  4357847
2021-03-02 10:45:00-05:00   125.879997  126.349998  125.730003  126.129997  126.129997  3172543
2021-03-02 11:00:00-05:00   126.129997  126.449997  125.849998  126.114998  126.114998  2971019

df.index = pd.to_datetime(df.index.tz_localize(None))

df.head(5)
    Open    High    Low     Close   Adj Close   Volume
Datetime                        
2021-03-02 10:00:00     126.735001  127.198402  126.541000  126.654999  126.654999  3163537
2021-03-02 10:15:00     126.650101  126.940002  125.980003  126.069901  126.069901  4519324
2021-03-02 10:30:00     126.059998  126.430000  125.779999  125.894997  125.894997  4357847
2021-03-02 10:45:00     125.879997  126.349998  125.730003  126.129997  126.129997  3172543
2021-03-02 11:00:00     126.129997  126.449997  125.849998  126.114998  126.114998  2971019
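
To see what tz_localize(None) does in isolation, here is a minimal sketch on synthetic data. The fixed +05:30 offset and the close values are made up for illustration and are not taken from the asker's file; the point is only that the wall-clock times and the 15-minute spacing survive when the offset is dropped.

```python
from datetime import timedelta, timezone

import pandas as pd

# Assumption: a fixed +05:30 offset, mirroring the suffix mentioned in the comments.
ist = timezone(timedelta(hours=5, minutes=30))

# Synthetic 15-minute bars (made-up prices, not real market data).
idx = pd.date_range("2021-03-09 09:15", periods=4, freq="15min", tz=ist)
close = pd.Series([615.8, 615.5, 616.1, 615.9], index=idx)

# Drop the offset but keep the local wall-clock time.
naive = close.copy()
naive.index = naive.index.tz_localize(None)

print(close.index[0])                   # offset-aware timestamp
print(naive.index[0])                   # same wall-clock time, no offset
print(naive.index[1] - naive.index[0])  # spacing between consecutive bars
```

By contrast, tz_convert(None) would first convert to UTC and then drop the offset, so every timestamp would shift by 5 hours 30 minutes. That is usually not what you want when the axis should show local market hours.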

import matplotlib.pyplot as plt
import matplotlib.dates as mdates

fig = plt.figure(figsize=(12, 9))
ax = fig.add_subplot(111)
ax.set_title('Close Price Variation')
ax.set_xlabel('Date')
ax.set_ylabel('Close')
ax.plot(df.Close[:-2])

mins = mdates.MinuteLocator(byminute=[0,15,30,45])
ax.xaxis.set_major_locator(mins)

ax.tick_params(axis='x', labelrotation=45)
plt.show()
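
The asker mentioned in the comments that only the time of day (9:00, 9:15 and so on) is needed on the axis. One possible way to get that, sketched here as an addition rather than as part of the original answer, is to pair the locator with matplotlib's DateFormatter. The data is synthetic and the Agg backend is chosen only so that the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this to get a window

import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

# Synthetic 15-minute closes (made-up values, not the asker's data).
idx = pd.date_range("2021-03-09 09:15", periods=8, freq="15min")
close = pd.Series([615.8, 615.5, 616.1, 615.9, 616.4, 616.0, 615.7, 616.2],
                  index=idx)

fig, ax = plt.subplots(figsize=(12.5, 4.5))
ax.set_title("Close Price Variation")
ax.set_xlabel("time")
ax.set_ylabel("close")
ax.plot(close.index, close.values)

# One tick per quarter hour, labelled with hours and minutes only.
ax.xaxis.set_major_locator(mdates.MinuteLocator(byminute=[0, 15, 30, 45]))
ax.xaxis.set_major_formatter(mdates.DateFormatter("%H:%M"))
ax.tick_params(axis="x", labelrotation=45)

# Render once so the tick labels are populated and can be inspected.
fig.canvas.draw()
labels = [tick.get_text() for tick in ax.get_xticklabels()]
print(labels)
```

With several trading days in one frame, a time-only label becomes ambiguous, so a format such as "%m-%d %H:%M" may be a better fit in that case.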

You can correctly format the date column and this solves the problem:

import pandas as pd
import matplotlib.pyplot as plt

#df=pd.read_csv('C:/Users/admin/Desktop/ohlcv.csv')
df = pd.DataFrame({"date":["2021-03-09 09:15:00+05:30", 
                           "2021-03-09 09:30:00+05:30"],
                   "close":[615.8, 615.5]})
print(df.head())
df['date']=pd.to_datetime(df['date'], format="%Y-%m-%d %H:%M:%S%z")
df=df.set_index('date')

plt.figure(figsize=(12.5,4.5))
plt.title('Close Price Variation')
plt.xlabel('date')
plt.ylabel('close')
plt.plot(df.close)
plt.show()

Note that the %z in the format string handles the suffix discussed in the comments, that is, the UTC offset. For more information, see the strftime section of the pandas documentation.
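
As a quick check of what %z does, the sketch below parses one of the sample strings with Python's standard datetime module, which uses the same strftime-style codes for the format shown above. The sample string comes from the dummy frame; nothing else is assumed.

```python
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S%z"

# %z consumes the trailing "+05:30" and stores it as the UTC offset
# (the colon form is accepted by strptime on Python 3.7 and later).
stamp = datetime.strptime("2021-03-09 09:15:00+05:30", FMT)

print(stamp.utcoffset())         # the parsed offset as a timedelta
print(stamp.strftime("%H:%M"))   # the local wall-clock time is unchanged
```

The parsed value is offset-aware, so the local time stays 09:15 and the offset is kept alongside it rather than being discarded.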

2 Comments

Your answer is correct but does not work with large data. I have updated the dataframe to match the original length.
@Huzefa I updated the solution so that the apply function is redundant. This should work for data frames of any size. Simply load your own instead of the dummy frame I created.
