I am using Python in Google Colab and trying to read the CSV file offered on this page: https://www.macrotrends.net/stocks/charts/AAPL/apple/stock-price-history

If you scroll down a little, you will see a download button. I'd like to get the link behind that button using Selenium or BeautifulSoup and then read the CSV file. This is what I have tried so far:

# install packages
!pip install selenium
!apt-get update # to update ubuntu to correctly run apt install
!apt install chromium-chromedriver
!cp /usr/lib/chromium-browser/chromedriver /usr/bin

# load packages
import pandas as pd
from selenium import webdriver
import sys

# run selenium and read the csv file
sys.path.insert(0,'/usr/lib/chromium-browser/chromedriver')
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--headless')
chrome_options.add_argument('--no-sandbox')
chrome_options.add_argument('--disable-dev-shm-usage')
driver = webdriver.Chrome('chromedriver',chrome_options=chrome_options)
driver.get('https://www.macrotrends.net/stocks/charts/AAPL/apple/stock-price-history')  # put the address of your page here
btn = driver.find_element_by_tag_name('button')
btn.click()
df = pd.read_csv('##.csv')

The code seems to work up to the btn.click() call, but I get an error afterwards, because the click gives me neither the link behind the download button nor the name of the file. Could you please assist? That would be much appreciated.

  • What's the error you're getting? Please add the stack traceback. Commented Mar 4, 2021 at 9:41
  • @PatrickKlein The btn.click() call was not doing anything. I just checked, and chitown88's method works perfectly. Commented Mar 4, 2021 at 10:16

1 Answer

No need for selenium. The data is embedded in the <script> tags.

import requests
from bs4 import BeautifulSoup
import json
import pandas as pd

t = 'AAPL'
url = 'https://www.macrotrends.net/assets/php/stock_price_history.php?t={}'.format(t)

response = requests.get(url)
soup = BeautifulSoup(response.text, 'html.parser')

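# The price history is embedded in the page as a JavaScript array: var dataDaily = [...];
scripts = soup.find_all('script',{'type':'text/javascript'})
for script in scripts:
    if 'var dataDaily' in str(script):
        # Keep only the text between the first '[' after 'var dataDaily' and the closing '];'
        jsonStr = '[' + str(script).split('var dataDaily',1)[-1].split('[',1)[-1].split('];')[0] + ']'
        jsonData = json.loads(jsonStr)
        break
else:
    # No script contained the variable, so fail loudly instead of hitting a NameError below
    raise ValueError('var dataDaily not found in the page source')

df = pd.DataFrame(jsonData)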
df = df.rename(columns={'o':'open','h':'high','l':'low','c':'close','d':'date','v':'volume'})
df.to_csv('MacroTrends_Data_Download_{}.csv'.format(t), index=False)

Output:

print(df)
             date      open      high  ...   volume     ma50    ma200
0      1980-12-12    0.1012    0.1016  ...  469.034      NaN      NaN
1      1980-12-15    0.0964    0.0964  ...  175.885      NaN      NaN
2      1980-12-16    0.0893    0.0893  ...  105.728      NaN      NaN
3      1980-12-17    0.0910    0.0915  ...   86.442      NaN      NaN
4      1980-12-18    0.0937    0.0941  ...   73.450      NaN      NaN
          ...       ...       ...  ...      ...      ...      ...
10135  2021-02-25  124.6800  126.4585  ...  148.200  131.845  112.241
10136  2021-02-26  122.5900  124.8500  ...  164.560  131.838  112.460
10137  2021-03-01  123.7500  127.9300  ...  116.308  131.840  112.716
10138  2021-03-02  128.4100  128.7200  ...  102.261  131.790  112.957
10139  2021-03-03  124.8100  125.7100  ...  111.514  131.661  113.184

[10140 rows x 8 columns]

4 Comments

Is this a typo in your answer? for script in scripts: script=scripts[5] ...
Ah, shoot. It wasn't really a typo; it was in there while I was testing/debugging and I forgot to take it out. I'll fix that. Thanks for catching it.
Thanks very much for the answer. I am trying to apply this to a different website. Would you be able to explain what the lines inside the 'for script in scripts' loop do? I tried searching the source HTML of the example URL for 'var dataDaily' but could not find it, so I cannot figure out what the if statement is doing. Thanks!
On this site the data sits inside the <script> tags in JSON format. It is assigned to the JavaScript variable var dataDaily, so I look for the script containing that string and pull the array out of it. This is specific to this particular site. Other sites may not keep their data within <script> tags, and if they do, it is likely stored under a different variable name.
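
To make the extraction step easier to follow without hitting the site, here is a minimal sketch that runs on a small inline string. SAMPLE_HTML and its values are made up for illustration (the real page is far larger and its exact layout is an assumption here), and extract_js_array is a hypothetical helper name, not part of the answer above. It uses a regular expression in place of the chained split calls, but the idea is the same: find `var dataDaily = [ ... ];` and hand the bracketed part to json.loads.

```python
import json
import re

# Hypothetical, trimmed-down stand-in for the page source.
# The field names mirror the answer (d, o, h, l, c, v); the values are made up.
SAMPLE_HTML = """
<html><body>
<script type="text/javascript">
var ticker = "AAPL";
var dataDaily = [{"d":"2021-03-01","o":1.0,"h":2.0,"l":0.5,"c":1.5,"v":10.0},
                 {"d":"2021-03-02","o":1.5,"h":2.5,"l":1.0,"c":2.0,"v":12.0}];
</script>
</body></html>
"""


def extract_js_array(page_text, var_name):
    """Return the JSON array assigned as `var <var_name> = [...];`, or None if absent."""
    # Non-greedy match up to the first '];' after the assignment, like split('];')[0].
    pattern = r"var\s+" + re.escape(var_name) + r"\s*=\s*(\[.*?\]);"
    match = re.search(pattern, page_text, re.DOTALL)
    if match is None:
        return None
    return json.loads(match.group(1))


rows = extract_js_array(SAMPLE_HTML, "dataDaily")
print(len(rows))       # number of records found
print(rows[0]["d"])    # date of the first record
```

For another site, you would first look through its page source for a similar assignment and pass that variable name instead. Like the split-based version, this sketch assumes the array is valid JSON and that the text `];` does not occur inside any of its string values.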
