
I have a large CSV file with many dates, so when I plot it the x axis shows every one of them, e.g. from 2000-12-24 to 2017-12-24, and the same happens on the y axis.

I tried using a set, but a set has to be sorted, and once I sort it the y values no longer correspond to the sorted dates.

import matplotlib.pyplot as plt
import urllib as u
import numpy as np
import csv

stock_price_url = 'https://pythonprogramming.net/yahoo_finance_replacement'

date = []
openp = []
high = []
low = []
close = []
adjclose = []
volume = []

text = u.request.urlopen(stock_price_url).read().decode()
with open('nw.csv', 'w') as fw:
    fw.write(text)
    fw.close()

with open('nw.csv', 'r') as csvf:
    f = csv.reader(csvf, delimiter=',')
    for row in f:
        if 'Date' not in row:
            date.append(row[0])
            openp.append(row[1])
            high.append(row[2])
            low.append(row[3])
            close.append(row[4])
            adjclose.append(row[5])
            volume.append(row[6])

dateset = set([])            
for z in date:
   dateset.add(z[:4])

highset = []
for z in high:
    highset.append(z[:3])


plt.plot(set(dateset), set(highset), linewidth=0.5)
plt.show()
  • So, what is your question? Commented Jan 4, 2018 at 6:49
  • I think pandas might help you with your problems. See for instance this, this and this post for more information. Commented Jan 4, 2018 at 10:24

1 Answer

You first need to convert the date strings into Python datetime objects. These can then be converted into matplotlib date numbers. With that done, you can tell matplotlib to place ticks at year or month changes:

from datetime import datetime
import csv
import urllib.request

import matplotlib.dates
import matplotlib.pyplot as plt

stock_price_url = 'https://pythonprogramming.net/yahoo_finance_replacement'

date = []
high = []

# Download the CSV text and keep a local copy
text = urllib.request.urlopen(stock_price_url).read().decode()

with open('nw.csv', 'w') as f_nw:
    f_nw.write(text)

with open('nw.csv', 'r', newline='') as f_nw:
    csv_nw = csv.reader(f_nw)
    header = next(csv_nw)  # skip the header row

    for row in csv_nw:
        # Parse the date string, then convert it to a matplotlib date number
        date.append(matplotlib.dates.date2num(datetime.strptime(row[0], '%Y-%m-%d')))
        # Convert the price to a float so the y axis is numeric, not categorical
        high.append(float(row[2]))

ax = plt.gca()
#ax.xaxis.set_minor_locator(matplotlib.dates.MonthLocator([1, 7]))
#ax.xaxis.set_minor_formatter(matplotlib.dates.DateFormatter('%b'))
ax.xaxis.set_major_locator(matplotlib.dates.YearLocator())
ax.xaxis.set_major_formatter(matplotlib.dates.DateFormatter('%Y'))
#ax.tick_params(pad=20)

plt.plot(date, high, linewidth=0.5)
plt.show()

This would give you ticks for just the years. (Figure: year locators)

Or, if you uncomment the minor locator/formatter lines, you would get both years and months. (Figure: year and month locators)
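As a rough illustration of the year-plus-month variant, here is a minimal sketch that runs without downloading anything. The daily values are made up purely for demonstration (they are not the real stock prices), the `Agg` backend is an assumption so the script works without a display, and the `datetime.date` objects are passed to `plot` directly instead of going through `date2num`.

```python
from datetime import date, timedelta

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
import matplotlib.dates as mdates
import matplotlib.pyplot as plt

# Made-up daily series standing in for the CSV columns (not real prices).
start = date(2015, 1, 1)
days = [start + timedelta(days=i) for i in range(3 * 365)]
values = [100 + (i % 90) * 0.5 for i in range(len(days))]

fig, ax = plt.subplots()
ax.plot(days, values, linewidth=0.5)

# Major ticks: one per year, labelled with the four-digit year.
ax.xaxis.set_major_locator(mdates.YearLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter('%Y'))

# Minor ticks: January and July, labelled with the abbreviated month name.
ax.xaxis.set_minor_locator(mdates.MonthLocator(bymonth=[1, 7]))
ax.xaxis.set_minor_formatter(mdates.DateFormatter('%b'))

# Push the year labels down so they do not overlap the month labels.
ax.tick_params(axis='x', which='major', pad=20)

fig.canvas.draw()  # render once so the tick positions are computed
```

With an interactive backend you would call `plt.show()` instead of `fig.canvas.draw()`.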

Note:

  1. You do not need to close a file if you are opening it with a with block.

  2. The script assumes you are using Python 3.x.

  3. To skip the header just read it in using next() before iterating over the rows in your for loop.
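The notes on `with` and `next()` can be sketched in isolation as follows. This is only an illustration: the two sample rows and the header names other than `Date` are invented, and an in-memory `io.StringIO` stands in for the `nw.csv` file.

```python
import csv
import io
from datetime import datetime

# Hypothetical sample data; the column names and values are made up.
sample = (
    "Date,Open,High,Low,Close,AdjClose,Volume\n"
    "2017-07-26,153.35,153.93,153.06,153.50,153.50,12778195\n"
    "2017-07-25,151.80,153.84,151.80,152.74,152.74,18853932\n"
)

with io.StringIO(sample) as f:
    reader = csv.reader(f)
    header = next(reader)  # consume the header so the loop only sees data rows
    rows = [(datetime.strptime(r[0], '%Y-%m-%d'), float(r[2])) for r in reader]

# The with block has already closed the file-like object at this point.
print(header[0], len(rows), f.closed)
```

Because `next()` advances the reader by one row, there is no need to test each row for `'Date'` inside the loop.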


1 Comment

You are welcome! Don't forget to click on the grey tick under the up/down buttons to accept the answer as the accepted solution.
