I tried to scrape the table data from a binary signals website. The data updates periodically, and I want to capture it as it updates. The problem is that my scraper returns empty values, even though the data sits in a regular table tag.

I'm not sure whether the page uses something other than plain HTML, because it updates without reloading. I had to send a browser user agent to get past the site's security.

When I print just the table (using the two-line snippet that follows the markup), the output looks correct, although I have noticed that the signal id increments by 1:

<table class="ui stripe hover dt-center table" id="isosignal-table" style="width:100%"><thead><tr><th></th><th class="no-sort">Current Price</th><th class="no-sort">Direction</th><th class="no-sort">Asset</th><th class="no-sort">Strike Price</th><th class="no-sort">Expiry Time</th></tr></thead><tbody><tr :class="[ signal.direction.toLowerCase() == 'call' ? 'call' : 'put' ]" :id="'signal-' + signal.id" :key="signal.id" ref="signals" v-for="signal in signals"><td style="display: none;" v-text="signal.id"></td><td v-text="signal.current_price"></td><td v-html="showDirection(signal.direction)"></td><td v-text="signal.asset"></td><td v-text="signal.strike_price"></td><td v-text="parseTime(signal.expiry)"></td></tr></tbody></table>


table = soup.table
print(table)

But when I run the whole script, it prints this:

[]
['', '', '', '', '', '']

from bs4 import BeautifulSoup
from urllib.request import Request, urlopen

url = "https://signals.investingstockonline.com/free-binary-signal-page"
req = Request(url, headers={'User-Agent': 'Mozilla/5.0'})
page = urlopen(req)
data = page.read()

soup = BeautifulSoup(data, 'html.parser')
table = soup.table
table_rows = table.find_all('tr')

for tr in table_rows:
    td = tr.find_all('td')
    row = [i.text for i in td]
    if len(row) < 1:
         pass
    print(row)

I thought it would display the whole table but it just displayed empty strings. What could be the problem?

1 Answer

In the HTML you've provided, the elements have no text content, so your scraper is correctly reporting what it received. The v-for and v-text attributes are Vue-style template directives: placeholders that only get filled in once the page's JavaScript runs in a browser. On the live website, the text in the table is inserted dynamically by JS fetching information from a server via AJAX. In other words, if you perform a plain request, you'll get the skeleton (HTML) but no meat (live data).
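
To see this without touching the network, here is a minimal sketch that parses a trimmed copy of the table markup from the question using only the standard library. The class name `CellInspector` and the shortened `RAW_HTML` string are my own illustrative choices, not part of the site or of BeautifulSoup. It shows that each cell carries a binding expression as an attribute but no literal text.

```python
from html.parser import HTMLParser

# Trimmed copy of the table markup from the question (one template row).
RAW_HTML = """
<table id="isosignal-table"><tbody>
<tr v-for="signal in signals">
<td v-text="signal.current_price"></td>
<td v-text="signal.asset"></td>
<td v-text="signal.strike_price"></td>
</tr>
</tbody></table>
"""


class CellInspector(HTMLParser):
    """Collect each <td>'s v-text binding and its literal text content."""

    def __init__(self):
        super().__init__()
        self.in_td = False
        self.bindings = []  # expressions found in v-text attributes
        self.texts = []     # literal text found inside each <td>

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self.in_td = True
            self.bindings.append(dict(attrs).get("v-text"))
            self.texts.append("")

    def handle_data(self, data):
        # Only record text that appears between <td> and </td>.
        if self.in_td:
            self.texts[-1] += data

    def handle_endtag(self, tag):
        if tag == "td":
            self.in_td = False


inspector = CellInspector()
inspector.feed(RAW_HTML)
print(inspector.bindings)  # the template expressions
print(inspector.texts)     # empty strings: nothing has been rendered yet
```

The empty strings are the same ones the scraper in the question prints: the server only ships the template, and the values arrive later in the browser.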

You can use something like Selenium to extract this information as follows:

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By

# Run Chrome without opening a visible window.
chrome_options = Options()
chrome_options.add_argument("--headless")
driver = webdriver.Chrome(options=chrome_options)
driver.get("https://signals.investingstockonline.com/free-binary-signal-page")

# Print the rendered text of every cell in every row.
for tr in driver.find_elements(By.TAG_NAME, "tr"):
    for td in tr.find_elements(By.TAG_NAME, "td"):
        print(td.get_attribute("innerText"))

driver.quit()  # close the browser when done

Output (truncated):

EURJPY
126.044
22:00:00
1.50318

EURCAD
1.50332
22:00:00
1.12595

EURUSD
1.12604
22:00:00
0.86732

EURGBP
0.86743
22:00:00
1.29825

GBPUSD
1.29841
22:00:00
145.320
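
The loop above prints one cell per line, which makes it hard to tell where one signal ends and the next begins. As a rough sketch, assuming each data row yields six cell texts in the order of the `td` elements in the question's markup (hidden id first), the cells can be grouped into one dict per row. The names `COLUMNS`, `row_to_record` and `rows_to_records` are hypothetical helpers, and the id and direction values in the sample row are made up for illustration; only the EURCAD figures are taken from the output above.

```python
# Column names follow the <td> order in the question's markup,
# starting with the hidden id cell. The key names are my own choice.
COLUMNS = ["id", "current_price", "direction", "asset", "strike_price", "expiry"]


def row_to_record(cells):
    """Map one row's cell texts to a dict, or return None if it is not a data row."""
    cleaned = [c.strip() for c in cells]
    if len(cleaned) != len(COLUMNS):
        return None  # e.g. the header row, which has <th> cells but no <td>
    return dict(zip(COLUMNS, cleaned))


def rows_to_records(rows):
    """Convert many rows, skipping anything that is not a six-cell data row."""
    records = (row_to_record(r) for r in rows)
    return [r for r in records if r is not None]


# Illustrative input: "101" and "CALL" are placeholders, not real site data.
sample_rows = [
    [],  # header row: no <td> cells
    ["101", "1.50318", "CALL", "EURCAD", "1.50332", "22:00:00"],
]
records = rows_to_records(sample_rows)
print(records)
```

With Selenium, the input would be one list per `tr`, built from the `innerText` of its `td` elements instead of printing each cell directly.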

4 Comments

Thank you ggorlen for your answer. I was skeptical when I saw the rows being dynamically added. Unfortunately, I'm using the Firefox version of Selenium, how does it translate because it does not run?
Thank you @ggorlen, I just changed every chrome to option.
@MarkGacoka if the answer resolved the problem, it's customary to accept the solution.
Ok. I basically changed all the 'Chrome' to 'Firefox' and everything worked out perfectly. Thank you.
