Getting NaN when reading data from website Python

Question

I am trying to read data from a website and I am fairly new to this. I have looked at some examples but cannot get it running. The website is:

http://www.ariva.de/adidas-aktie/historische_kurse

There is a download button for a CSV file, marked with a red box at the bottom right of the attached picture:

Image

It is not clear to me why I receive NaN values. My code is below:

import pandas as pd
import io
import requests
url="http://www.ariva.de/A1EWWW/historische_kurse?boerse_id=6&month=2006-01-31&currency=&clean_split=1&clean_split=0&clean_payout=1&clean_payout=0&clean_bezug=1&clean_bezug=0/wkn_A1EWWW_historic.csv"
s=requests.get(url).content
c=pd.read_csv(io.StringIO(s.decode('utf-8')), error_bad_lines=False)

print(c)

I suggest printing a few lines of the file's actual content rather than a screenshot of where you downloaded it.
– MB-F, Commented Feb 8, 2017 at 10:50
Perhaps because the URL doesn't return CSV data? I.e. fetch that URL with wget or curl: it is not CSV.
– Zaur Nasibov, Commented Feb 8, 2017 at 10:50
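
Following the two comments above, one way to check what a URL really returns is to look at the response's Content-Type header and the first few hundred characters of the body. The sketch below is only an illustration: the helper names `looks_like_html` and `inspect_url` are made up for this example, and the HTML check is a rough heuristic, not a robust detector.

```python
def looks_like_html(text):
    """Rough heuristic: True if the payload starts like an HTML page, not CSV."""
    head = text.lstrip()[:200].lower()
    return (
        head.startswith("<!doctype html")
        or head.startswith("<html")
        or "<head" in head
    )


def inspect_url(url):
    """Print the content type and the start of the body, then apply the heuristic."""
    # Imported here so looks_like_html() works even without requests installed.
    import requests

    response = requests.get(url, timeout=10)
    print(response.headers.get("Content-Type"))
    print(response.text[:300])
    return looks_like_html(response.text)
```

If `inspect_url(url)` returns True, the response is an HTML page, and `pd.read_csv` will not find the columns it expects.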

MaxU - stand with Ukraine · Accepted Answer · 2017-02-08 11:08:53Z

4

The URL returns an HTML page, not CSV data, so you can parse its tables with the pandas.read_html() method:

In [261]: df = pd.read_html(url, thousands='.', decimal=',')[3]

In [262]: df
Out[262]:
         0        1        2        3        4    5        6        7
0    Datum   Erster     Hoch     Tief  Schluss  NaN   Stücke  Volumen
1   310106  37.1782  37.5537  36.6383  36.7215    €  2324069   85,3 M
2   300106  36.3204  37.3745  36.2798   37.191    €  2553488   95,0 M
3   270106  35.6077  36.3887    35.58  36.2414    €  2950272    107 M
4   260106  35.2877   35.548  35.0594  35.4605    €  2147777   76,2 M
5   250106  35.6077  35.6077  35.1255  35.3133    €  1985601   70,1 M
6   240106  35.5266   35.612  35.2813    35.42    €  1435138   50,8 M
7   230106  35.2279  35.6931  35.0145  35.4584    €  1506623   53,4 M
8   200106   35.516   35.516  35.2514  35.3879    €  2251534   79,7 M
9   190106  35.0999    35.58  35.0999  35.4157    €  1425647   50,5 M
10  180106  34.8695  35.2343  34.5707  35.0871    €  2812569   98,7 M
11  170106  35.0145   35.565  35.0145  35.3623    €  2431866   86,0 M
12  160106   35.149  35.4584  34.9783  35.3751    €   747868   26,5 M
13  130106  35.5245  35.5266  35.0295  35.0786    €  2016092   70,7 M
14  120106  35.1383  35.5608  35.0145   35.452    €   941786   33,4 M
15  110106  35.3133  35.4882  34.8396  34.9527    €  1341719   46,9 M
16  100106   35.102  35.3346  35.0359  35.1127    €  1673729   58,8 M
17  090106  35.7976  35.7976  35.0359  35.3005    €  2055502   72,6 M
18  060106  35.7507  35.8467  35.5266  35.6909    €  1532681   54,7 M
19  050106  35.8254  35.8787  35.4989    35.74    €  1653103   59,1 M
20  040106    35.74  35.9534  35.5693  35.8467    €  2760820   99,0 M
21  030106  35.2386  35.8168  35.1618  35.3303    €  2885207    102 M
22  020106  34.4598  35.1084  34.4598  35.0359    €  1254853   44,0 M

In [263]: df.columns = df.iloc[0]

In [264]: df.drop(0, inplace=True)

In [265]: df
Out[265]:
0    Datum   Erster     Hoch     Tief  Schluss NaN   Stücke Volumen
1   310106  37.1782  37.5537  36.6383  36.7215   €  2324069  85,3 M
2   300106  36.3204  37.3745  36.2798   37.191   €  2553488  95,0 M
3   270106  35.6077  36.3887    35.58  36.2414   €  2950272   107 M
4   260106  35.2877   35.548  35.0594  35.4605   €  2147777  76,2 M
5   250106  35.6077  35.6077  35.1255  35.3133   €  1985601  70,1 M
6   240106  35.5266   35.612  35.2813    35.42   €  1435138  50,8 M
7   230106  35.2279  35.6931  35.0145  35.4584   €  1506623  53,4 M
8   200106   35.516   35.516  35.2514  35.3879   €  2251534  79,7 M
9   190106  35.0999    35.58  35.0999  35.4157   €  1425647  50,5 M
10  180106  34.8695  35.2343  34.5707  35.0871   €  2812569  98,7 M
11  170106  35.0145   35.565  35.0145  35.3623   €  2431866  86,0 M
12  160106   35.149  35.4584  34.9783  35.3751   €   747868  26,5 M
13  130106  35.5245  35.5266  35.0295  35.0786   €  2016092  70,7 M
14  120106  35.1383  35.5608  35.0145   35.452   €   941786  33,4 M
15  110106  35.3133  35.4882  34.8396  34.9527   €  1341719  46,9 M
16  100106   35.102  35.3346  35.0359  35.1127   €  1673729  58,8 M
17  090106  35.7976  35.7976  35.0359  35.3005   €  2055502  72,6 M
18  060106  35.7507  35.8467  35.5266  35.6909   €  1532681  54,7 M
19  050106  35.8254  35.8787  35.4989    35.74   €  1653103  59,1 M
20  040106    35.74  35.9534  35.5693  35.8467   €  2760820  99,0 M
21  030106  35.2386  35.8168  35.1618  35.3303   €  2885207   102 M
22  020106  34.4598  35.1084  34.4598  35.0359   €  1254853  44,0 M
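
As a possible follow-up, the `Datum` and `Volumen` columns above are still strings. The sketch below converts them on a small hand-made frame that copies three rows from the output. It assumes that `Datum` is in DDMMYY form (probably because `thousands='.'` removed the dots from the dates) and that the `M` suffix in `Volumen` means millions; neither assumption is confirmed by the site.

```python
import pandas as pd

# Three rows copied from the output above, built by hand for illustration.
df = pd.DataFrame({
    "Datum": ["310106", "300106", "270106"],
    "Schluss": [36.7215, 37.191, 36.2414],
    "Volumen": ["85,3 M", "95,0 M", "107 M"],
})

# Assumption: dates are DDMMYY strings.
df["Datum"] = pd.to_datetime(df["Datum"], format="%d%m%y")

# Assumption: "M" stands for millions; the decimal separator is a comma.
df["Volumen"] = (
    df["Volumen"]
    .str.replace(" M", "", regex=False)
    .str.replace(",", ".", regex=False)
    .astype(float)
    * 1_000_000
)

print(df.dtypes)
```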

edited Feb 8, 2017 at 11:08

answered Feb 8, 2017 at 10:54


6 Comments

MCM Over a year ago

Perfect, thanks a lot! I noticed that the values use a comma instead of a point as the decimal separator. How can I change that in the reader?

MCM Over a year ago

Thanks a lot. I wish I could give you more credit for that answer. I was going crazy :)

MaxU - stand with Ukraine Over a year ago

@MCM, glad i could help :-)

MCM Over a year ago

Sorry to bother you, but I have one more question. Is it possible to download the file directly via the download button shown in the picture above?

MaxU - stand with Ukraine Over a year ago

@MCM, sorry, I don't know how to do that. It also has nothing to do with Pandas/NumPy/etc., so I'd recommend opening a new question and tagging it accordingly; that should attract the right people.


Nikhil Rupanawar · Accepted Answer · 2017-02-08 12:23:43Z

0

To download the file with requests, first locate the URL that the download button requests (possibly via JavaScript). You can use the browser's inspector or an equivalent tool to find it. In your case I found this:

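import requests

# URL requested by the download button (found with the browser's network inspector)
url = "http://www.ariva.de/quote/historic/historic.csv?secu=291&boerse_id=6&clean_split=1&clean_payout=0&clean_bezug=1&min_time=8.2.2016&max_time=8.2.2017&trenner=%3B&go=Download"
r = requests.get(url, stream=True)
r.raise_for_status()  # fail early instead of saving an error page as CSV
with open('out.csv', 'wb') as fd:
    for chunk in r.iter_content(chunk_size=8192):  # write the body in 8 KiB chunks
        fd.write(chunk)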
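
Once the file is saved, it can be loaded with `pd.read_csv`. The `trenner=%3B` parameter in the URL is a URL-encoded semicolon, which suggests a semicolon-separated file. The sketch below uses an in-memory sample instead of the real download: the column names, the ISO date format and the comma decimal separator are assumptions about the file layout, and the numbers are borrowed from the table in the other answer.

```python
import io

import pandas as pd

# Hypothetical sample in the assumed layout; the real file may differ.
sample = (
    "Datum;Erster;Hoch;Tief;Schluss\n"
    "2006-01-31;37,1782;37,5537;36,6383;36,7215\n"
    "2006-01-30;36,3204;37,3745;36,2798;37,191\n"
)

prices = pd.read_csv(
    io.StringIO(sample),   # replace with 'out.csv' for the downloaded file
    sep=";",               # matches trenner=%3B in the download URL
    decimal=",",           # assumed German number format
    parse_dates=["Datum"],
)

print(prices)
```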

answered Feb 8, 2017 at 12:23

