0

I am trying to read data from a website with Python and I am fairly new to this. I have looked at some examples but cannot get it to work. The website is:

http://www.ariva.de/adidas-aktie/historische_kurse

There is a download button for a CSV file, marked with a red box at the bottom right of the attached picture:

Image

When I run the code below, I get NaN values and I do not understand why:

import pandas as pd
import io
import requests
url="http://www.ariva.de/A1EWWW/historische_kurse?boerse_id=6&month=2006-01-31&currency=&clean_split=1&clean_split=0&clean_payout=1&clean_payout=0&clean_bezug=1&clean_bezug=0/wkn_A1EWWW_historic.csv"
s=requests.get(url).content
c=pd.read_csv(io.StringIO(s.decode('utf-8')), error_bad_lines=False)

print(c)
3 Comments

  • Do you want to download the file? Commented Feb 8, 2017 at 10:49
  • I suggest printing a few lines of the file's actual content rather than a screenshot of the page you downloaded it from. Commented Feb 8, 2017 at 10:50
  • Perhaps the URL doesn't return CSV data? Fetch that URL with wget or curl: the response is not CSV. Commented Feb 8, 2017 at 10:50

2 Answers

4

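You can use the pandas.read_html() method. It returns a list of DataFrames, one per table found on the page; here the price table is at index 3. The thousands='.' and decimal=',' arguments tell the parser to treat numbers as German-formatted: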

In [261]: df = pd.read_html(url, thousands='.', decimal=',')[3]

In [262]: df
Out[262]:
         0        1        2        3        4    5        6        7
0    Datum   Erster     Hoch     Tief  Schluss  NaN   Stücke  Volumen
1   310106  37.1782  37.5537  36.6383  36.7215    €  2324069   85,3 M
2   300106  36.3204  37.3745  36.2798   37.191    €  2553488   95,0 M
3   270106  35.6077  36.3887    35.58  36.2414    €  2950272    107 M
4   260106  35.2877   35.548  35.0594  35.4605    €  2147777   76,2 M
5   250106  35.6077  35.6077  35.1255  35.3133    €  1985601   70,1 M
6   240106  35.5266   35.612  35.2813    35.42    €  1435138   50,8 M
7   230106  35.2279  35.6931  35.0145  35.4584    €  1506623   53,4 M
8   200106   35.516   35.516  35.2514  35.3879    €  2251534   79,7 M
9   190106  35.0999    35.58  35.0999  35.4157    €  1425647   50,5 M
10  180106  34.8695  35.2343  34.5707  35.0871    €  2812569   98,7 M
11  170106  35.0145   35.565  35.0145  35.3623    €  2431866   86,0 M
12  160106   35.149  35.4584  34.9783  35.3751    €   747868   26,5 M
13  130106  35.5245  35.5266  35.0295  35.0786    €  2016092   70,7 M
14  120106  35.1383  35.5608  35.0145   35.452    €   941786   33,4 M
15  110106  35.3133  35.4882  34.8396  34.9527    €  1341719   46,9 M
16  100106   35.102  35.3346  35.0359  35.1127    €  1673729   58,8 M
17  090106  35.7976  35.7976  35.0359  35.3005    €  2055502   72,6 M
18  060106  35.7507  35.8467  35.5266  35.6909    €  1532681   54,7 M
19  050106  35.8254  35.8787  35.4989    35.74    €  1653103   59,1 M
20  040106    35.74  35.9534  35.5693  35.8467    €  2760820   99,0 M
21  030106  35.2386  35.8168  35.1618  35.3303    €  2885207    102 M
22  020106  34.4598  35.1084  34.4598  35.0359    €  1254853   44,0 M

In [263]: df.columns = df.iloc[0]

In [264]: df.drop(0, inplace=True)

In [265]: df
Out[265]:
0    Datum   Erster     Hoch     Tief  Schluss NaN   Stücke Volumen
1   310106  37.1782  37.5537  36.6383  36.7215   €  2324069  85,3 M
2   300106  36.3204  37.3745  36.2798   37.191   €  2553488  95,0 M
3   270106  35.6077  36.3887    35.58  36.2414   €  2950272   107 M
4   260106  35.2877   35.548  35.0594  35.4605   €  2147777  76,2 M
5   250106  35.6077  35.6077  35.1255  35.3133   €  1985601  70,1 M
6   240106  35.5266   35.612  35.2813    35.42   €  1435138  50,8 M
7   230106  35.2279  35.6931  35.0145  35.4584   €  1506623  53,4 M
8   200106   35.516   35.516  35.2514  35.3879   €  2251534  79,7 M
9   190106  35.0999    35.58  35.0999  35.4157   €  1425647  50,5 M
10  180106  34.8695  35.2343  34.5707  35.0871   €  2812569  98,7 M
11  170106  35.0145   35.565  35.0145  35.3623   €  2431866  86,0 M
12  160106   35.149  35.4584  34.9783  35.3751   €   747868  26,5 M
13  130106  35.5245  35.5266  35.0295  35.0786   €  2016092  70,7 M
14  120106  35.1383  35.5608  35.0145   35.452   €   941786  33,4 M
15  110106  35.3133  35.4882  34.8396  34.9527   €  1341719  46,9 M
16  100106   35.102  35.3346  35.0359  35.1127   €  1673729  58,8 M
17  090106  35.7976  35.7976  35.0359  35.3005   €  2055502  72,6 M
18  060106  35.7507  35.8467  35.5266  35.6909   €  1532681  54,7 M
19  050106  35.8254  35.8787  35.4989    35.74   €  1653103  59,1 M
20  040106    35.74  35.9534  35.5693  35.8467   €  2760820  99,0 M
21  030106  35.2386  35.8168  35.1618  35.3303   €  2885207   102 M
22  020106  34.4598  35.1084  34.4598  35.0359   €  1254853  44,0 M
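As a possible follow-up, the frame above can be tidied so the dates and prices have proper dtypes. The sketch below is only an illustration: `tidy_quotes` is a hypothetical helper name, the sample rows are copied from the output above, and it assumes the cells arrive as strings and that `Datum` is in `ddmmyy` form. The `Volumen` column (for example `85,3 M`) is left as text.

```python
import pandas as pd

# A few rows copied from the output above, in the shape read_html returned them:
# the real header sits in row 0 and (by assumption) every cell is a string.
raw = pd.DataFrame([
    ["Datum", "Erster", "Hoch", "Tief", "Schluss", None, "Stücke", "Volumen"],
    ["310106", "37.1782", "37.5537", "36.6383", "36.7215", "€", "2324069", "85,3 M"],
    ["300106", "36.3204", "37.3745", "36.2798", "37.191", "€", "2553488", "95,0 M"],
    ["090106", "35.7976", "35.7976", "35.0359", "35.3005", "€", "2055502", "72,6 M"],
])


def tidy_quotes(df):
    """Hypothetical helper: promote row 0 to the header and fix the dtypes."""
    out = df.copy()
    out.columns = out.iloc[0]
    out = out.drop(index=0)
    # Drop the unnamed column that only holds the currency sign.
    out = out.loc[:, out.columns.notna()].copy()
    out.columns.name = None
    # Assumption: Datum is day, month, two-digit year (310106 -> 2006-01-31).
    out["Datum"] = pd.to_datetime(out["Datum"], format="%d%m%y")
    for col in ["Erster", "Hoch", "Tief", "Schluss", "Stücke"]:
        out[col] = pd.to_numeric(out[col])
    return out.set_index("Datum").sort_index()


quotes = tidy_quotes(raw)
print(quotes.dtypes)
```

With the real page you would pass the `df` returned by `read_html` (before the header is promoted) instead of `raw`.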

6 Comments

Perfect, thanks a lot! I noticed that the values use a comma instead of a point as the decimal separator. How can I change that in the reader?
Thanks a lot. I wish I could give you more honor for that answer. Was getting crazy :)
@MCM, glad i could help :-)
Sorry to bother you, but I have one more question. Is it possible to download the file directly via the download button shown in the picture above?
@MCM, sorry, I don't know how to do that. It also has nothing to do with pandas/NumPy, so I'd recommend opening a new question and tagging it accordingly; that should attract the right people.
0

To download the file with requests, first find the URL that the download button requests (possibly via JavaScript). You can use the browser's inspector or an equivalent tool to do so. In your case I found this one:

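import requests

# URL requested by the download button (found with the browser inspector)
url = "http://www.ariva.de/quote/historic/historic.csv?secu=291&boerse_id=6&clean_split=1&clean_payout=0&clean_bezug=1&min_time=8.2.2016&max_time=8.2.2017&trenner=%3B&go=Download"
r = requests.get(url, stream=True)
r.raise_for_status()  # fail on an HTTP error instead of saving an error page
with open('out.csv', 'wb') as fd:
    for chunk in r.iter_content(chunk_size=8192):
        fd.write(chunk)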
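Once out.csv is saved, it can probably be loaded with pandas. The URL above asks for `;` as the separator (`trenner=%3B`), so a sketch might look like the following. The column names, the date format and the comma decimals in `sample` are assumptions made for illustration (the values are taken from the table in the other answer), and `read_semicolon_csv` is a hypothetical helper name; check the real file's first lines before relying on this.

```python
import io

import pandas as pd

# Illustrative stand-in for out.csv. Only the ';' separator comes from the
# URL; the header names, date format and comma decimals are assumptions.
sample = (
    "Datum;Erster;Hoch;Tief;Schluss\n"
    "2006-01-31;37,1782;37,5537;36,6383;36,7215\n"
    "2006-01-30;36,3204;37,3745;36,2798;37,191\n"
)


def read_semicolon_csv(source):
    """Hypothetical helper: read a ';'-separated file with comma decimals."""
    return pd.read_csv(source, sep=";", decimal=",")


df = read_semicolon_csv(io.StringIO(sample))
print(df)
```

For the downloaded file, the call would be `read_semicolon_csv("out.csv")`.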
