
I am new to Python and web scraping, so please bear with me. I have been trying to build a web scraping tool that opens a web page, logs in, and retrieves a certain value. So far I can open the page and log in, but I cannot find a way to retrieve (print) the value I need. This is my current code:

    from selenium import webdriver
    from bs4 import BeautifulSoup

    driver = webdriver.Chrome(executable_path=r'C:/Users/User/Downloads/chromedriver.exe')

    url = "xxxxxxxx"
    driver.get(url)
    driver.find_element_by_name("username").send_keys("xxxxx")
    driver.find_element_by_name("password").send_keys("xxxxx")
    elem = driver.find_element_by_css_selector("form#frmMain > a:nth-child(4)")
    elem.click()

    html = '''<p class="value noWrap" data-bind="text: MarketValue">R 4 516 469.32</p>'''
    soup = BeautifulSoup(html, 'lxml')

    for p in soup.find_all('p'):
        print(p.string)

    driver.quit()

The value I need is the text of the <p> element that I have hard-coded in the html variable above: "R 4 516 469.32". However, this value changes daily. I have tried XPath and CSS selectors, but the value seems to be hidden for some reason. How can I refer to the element dynamically so that I retrieve the new value every day?

Please note: I have blanked out the url as this is a website used for company purposes.

Please help!

Thanks so much

  • What do you mean by "the value in question seems to be hidden"? Commented Mar 19, 2019 at 11:48
  • @JackFleeting If I print out the page, this is how the HTML for that element appears: <p class="value noWrap" data-bind="text: MarketValue"></p> Commented Mar 19, 2019 at 12:09
  • I just copied and pasted the main 4 lines of your code (beginning with html = ) and ended up with R 4 516 469.32. So I can't see what the problem is. Same thing if I change the last line to print(p.text). Commented Mar 19, 2019 at 12:32
  • I may not have been clear: the HTML string in my code above hard-codes the market value. I am looking for a dynamic solution, because the market value changes every day and I do not want to copy the HTML string by hand each time. Commented Mar 19, 2019 at 12:41

1 Answer

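The desired element is dynamic: its text is filled in by JavaScript after the page loads, which is why the printed page source can still show an empty <p>. To extract the text, read the element from the live page instead of a hard-coded HTML string, and use WebDriverWait with visibility_of_element_located() so that Selenium waits for the element before reading it. Either of the following solutions works: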

  • Using CSS_SELECTOR:

    print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "p.value.noWrap[data-bind$='MarketValue']"))).get_attribute("innerHTML"))
    
  • Using XPATH:

    print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//p[@class='value noWrap' and contains(@data-bind,'MarketValue')]"))).get_attribute("innerHTML"))
    
  • Note: You have to add the following imports:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
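
If the <p> becomes visible before its text has been bound, the calls above could still return an empty string. One possible workaround is a custom wait condition that polls until the element's text is non-empty. This is only a sketch, not part of the Selenium API: non_empty_text and read_market_value are hypothetical helper names, the selector is the one from the CSS example above, "css selector" is the string value of By.CSS_SELECTOR, and the driver is assumed to be the already logged-in driver from the question.

```python
def non_empty_text(by, selector):
    """Build a wait condition that returns the first non-empty element text.

    Hypothetical helper: WebDriverWait.until() accepts any callable that
    takes the driver and returns a truthy value once the wait is over.
    """
    def condition(driver):
        for element in driver.find_elements(by, selector):
            text = element.text.strip()
            if text:
                return text
        return False  # falsy result: WebDriverWait polls again
    return condition


def read_market_value(driver, timeout=20):
    """Wait up to `timeout` seconds for the market value text and return it."""
    # Imported here so that non_empty_text can be tried without a browser.
    from selenium.webdriver.support.ui import WebDriverWait

    # Selector taken from the CSS_SELECTOR example; adjust it to the real page.
    selector = "p.value.noWrap[data-bind$='MarketValue']"
    return WebDriverWait(driver, timeout).until(
        non_empty_text("css selector", selector)
    )
```

In the question's script, print(read_market_value(driver)) would go after elem.click() and before driver.quit(), replacing the hard-coded html string and the BeautifulSoup loop.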
    