0

I tried to scrape this page using beautifulsoup find method but I could not find the table value in the HTML page. I found out that the website is generating the data instantly when I load the page through an internal API. Any help??

Thanks in advance.

7
  • 1
    I've not looked at the page, but if you've found the Ajax request to the API then you should be able to make your own request to it? Commented Mar 5, 2019 at 15:46
  • One of these should do it: github.com/dhamaniasad/HeadlessBrowsers Commented Mar 5, 2019 at 15:49
  • @RobinZigmond thanks for your reply. Do you have any resources on how to establish an Ajax request from python? Commented Mar 5, 2019 at 15:51
  • @AhmedSabry You should try to use Selenium in addition to beautifulsoup, it allows you to recover the data load from an ajax request because you can wait for the load Commented Mar 5, 2019 at 15:51
  • 2
    Possible duplicate of python - web scraping an ajax website using BeautifulSoup Commented Mar 5, 2019 at 16:20

1 Answer 1

2

This works for me. I had to dig around in the dev tools but found it

import requests

geturl=r'https://www.barchart.com/futures/quotes/CLJ19/all-futures'
apiurl=r'https://www.barchart.com/proxies/core-api/v1/quotes/get'


getheaders={

    'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8',
    'accept-encoding': 'gzip, deflate, br',
    'accept-language': 'en-US,en;q=0.9',
    'cache-control': 'max-age=0',
    'upgrade-insecure-requests': '1',
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.119 Safari/537.36'
    }

getpay={
    'page': 'all'
}

s=requests.Session()
r=s.get(geturl,params=getpay, headers=getheaders)



headers={
    'accept': 'application/json',
    'accept-encoding': 'gzip, deflate, br',
    'accept-language': 'en-US,en;q=0.9',
    'referer': 'https://www.barchart.com/futures/quotes/CLJ19/all-futures?page=all',
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.119 Safari/537.36',
    'x-xsrf-token': s.cookies.get_dict()['XSRF-TOKEN']

}


payload={
    'fields': 'symbol,contractSymbol,lastPrice,priceChange,openPrice,highPrice,lowPrice,previousPrice,volume,openInterest,tradeTime,symbolCode,symbolType,hasOptions',
    'list': 'futures.contractInRoot',
    'root': 'CL',
    'meta': 'field.shortName,field.type,field.description',
    'hasOptions': 'true',
    'raw': '1'

}


r=s.get(apiurl,params=payload,headers=headers)
j=r.json()
print(j)

>{'count': 108, 'total': 108, 'data': [{'symbol': 'CLY00', 'contractSymbol': 'CLY00 (Cash)', ........
Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.