When I run the code below, a Chrome window opens, the page loads, and a CSV file downloads into my documents.

However, I want to download the CSV file into a python list.

When I print the value returned by the click it shows 'None', and passing the button to csv.reader raises the error shown below the code:

import csv
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

def getData():
    driver = webdriver.Chrome()
    driver.get(f"http://financials.morningstar.com/balance-sheet/bs.html?t=AAPL")
    button = WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.CLASS_NAME, "rf_export")))
    data = button.click()
    print(data)
    data = csv.reader(button)
    for row in data:
        print(row)
    return data

getData()


-------------


None
Traceback (most recent call last):
  File "helpers.py", line 403, in <module>
    newData2("AAPL")
  File "helpers.py", line 397, in newData2
    data = csv.reader(button)
TypeError: argument 1 must be an iterator
  • What do you mean by "download the CSV file into a python list" ? Commented Mar 29, 2019 at 1:34
  • I mean downloading into a python variable rather than to the local device. Commented Mar 29, 2019 at 23:25

1 Answer

If you take a look at the download button on that page, it is a link to the following JavaScript function: SRT_stocFund.Export()

Looking at this function (at http://financials.morningstar.com/finan/static/script/SRT_stockFund.js), it calls SRT_stocFund.GetPara(), uses the returned data to create a link, and changes your browser's location to it:

document.location = hostPath+"/ajax/ReportProcess4CSV.html?" + params+"&denominatorView="+denominatorView+"&number="+number;
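As a rough Python sketch of that concatenation (not the site's own code): `build_export_url` is a hypothetical helper, `host_path` is assumed to be http://financials.morningstar.com, and the defaults for `denominator_view` and `number` are simply the values visible in the example URL quoted in this answer. The parameter string is assumed to be whatever GetPara() returns.

```python
def build_export_url(params, denominator_view="raw", number="3",
                     host_path="http://financials.morningstar.com"):
    """Mirror the string concatenation done by the page's Export() function.

    All defaults are assumptions taken from the example URL in this answer.
    """
    return (
        f"{host_path}/ajax/ReportProcess4CSV.html?"
        f"{params}&denominatorView={denominator_view}&number={number}"
    )


# Parameter string copied from the example URL in this answer; in practice
# it would come from the page's GetPara() function.
params = (
    "&t=XNAS:AAPL&region=usa&culture=en-US&cur=&reportType=bs&period=12"
    "&dataType=A&order=asc&columnYear=5&curYearPart=1st5year&rounding=3"
    "&view=raw&r=13805"
)
url = build_export_url(params)
print(url)
```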

In my case, the url looked like this:

"//financials.morningstar.com/ajax/ReportProcess4CSV.html?&t=XNAS:AAPL&region=usa&culture=en-US&cur=&reportType=bs&period=12&dataType=A&order=asc&columnYear=5&curYearPart=1st5year&rounding=3&view=raw&r=13805&denominatorView=raw&number=3"

What you could easily do in selenium is:

  • call the GetPara() function
  • create the download url yourself.

You can call JavaScript from within Selenium with something like: driver.execute_script('return SRT_stocFund.GetPara()') (the return is needed, otherwise execute_script gives back None to Python) - and then build your string to create the download link, and retrieve it.

Your browser's dev tools are your friend here.
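Putting the steps above together, here is a minimal sketch of getting the CSV into a Python list instead of onto disk. It rests on several assumptions that are not confirmed by the site: that GetPara() returns the query string as text, that `denominatorView=raw` and `number=3` (copied from the example URL) are acceptable, and that the server honours a request carrying the browser's cookies and a Referer header. The function names are hypothetical.

```python
import csv
import io


def csv_text_to_rows(text):
    """Parse CSV text held in memory into a list of rows (lists of strings)."""
    # csv.reader needs an iterable of strings, so wrap the text in StringIO.
    return list(csv.reader(io.StringIO(text, newline="")))


def fetch_balance_sheet_rows(page_url):
    """Load the page in Chrome, build the export URL, and return parsed rows."""
    # Third-party imports are local so csv_text_to_rows() works without them.
    import requests
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Chrome()
    try:
        driver.get(page_url)
        # Wait for the export button, as a sign that the page scripts loaded.
        WebDriverWait(driver, 10).until(
            EC.element_to_be_clickable((By.CLASS_NAME, "rf_export"))
        )
        # "return" is required; without it execute_script() returns None.
        params = driver.execute_script("return SRT_stocFund.GetPara()")

        # Assumption: host and trailing parameters match the example URL.
        url = (
            "http://financials.morningstar.com/ajax/ReportProcess4CSV.html?"
            f"{params}&denominatorView=raw&number=3"
        )

        # Assumption: reusing the browser's cookies and Referer is enough
        # for the server to accept the request.
        session = requests.Session()
        for cookie in driver.get_cookies():
            session.cookies.set(cookie["name"], cookie["value"])
        response = session.get(url, headers={"Referer": page_url})
        response.raise_for_status()
        return csv_text_to_rows(response.text)
    finally:
        driver.quit()
```

Calling `fetch_balance_sheet_rows("http://financials.morningstar.com/balance-sheet/bs.html?t=AAPL")` would then give a list of rows if the server cooperates. If the response body comes back empty, compare the request the browser itself sends (in the dev tools network tab) with the one sent here.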


5 Comments

Thank you for taking the time Danielle, I had explored the JS in an attempt to pull from the url alone but for some reason this doesn't work. Running driver.execute_script('SRT_stocFund.Export()') alone downloads the file. However, as with the above method, it downloads to my local device rather than into a python variable which is the requirement. Any ideas?
As per my answer, construct the URL, and then use Python to download it, e.g.: requests.get(the_url)
That is what my original program did. However, the site seems to have restricted use of the url. Hence my search for another option. Entering your link no longer initiates a download :(
No, it needs to be generated every time, using selenium to grab the page and run the javascript to get the vars, and only then to build the ephemeral URL.
OK I get you! Code now includes driver.execute_script(f'SRT_stocFund.LoadAComponent("sfcontent", "XNAS:AAPL", "bs", "", "SB", 10)') followed by driver.execute_script('SRT_stocFund.GetPara()'), then I try requests.get(my_url), but the value has gone from returning None to returning an empty array.
