from selenium.webdriver import Chrome
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.keys import Keys

opts = Options()
opts.set_headless()
assert opts.headless  # Operating in headless mode
browser = Chrome(executable_path=r"C:\Users\taksh\AppData\Local\Programs\Python\Python37-32\chromedriver.exe", options=opts)
browser.implicitly_wait(3)
browser.get('https://ca.finance.yahoo.com/quote/AMZN/profile?p=AMZN')

results = browser.find_elements_by_xpath('//*[@id="quote-header-info"]/div[3]/div/div/span[1]')
print(results)

And I get back:

[<selenium.webdriver.remote.webelement.WebElement (session="b3f4e2760ffec62836828e62530f082e", element="3e2741ee-8e7e-4181-9b76-e3a731cefecf")>]

What I actually want Selenium to scrape is the price of the stock. I thought I was doing it correctly, because this XPath found the element when I used Selenium on Chrome without headless mode. How can I scrape the actual data from the website in headless mode?

2 Answers


You need to further extract the data after getting all the elements back as a list.

results = browser.find_elements_by_xpath('//*[@id="quote-header-info"]/div[3]/div/div/span[1]')

for result in results:
    print(result.text)

This will print the text of every element in the list.


4 Comments

I see! So basically, if I am using Selenium in headless mode, any sort of data that I scrape will need this for loop to display it, correct?
@JackJones, exactly, you should write a loop to extract the data, no matter whether it's GUI mode or headless. find_elements returns a list of WebElements, not a list of strings; .text is what gets an individual web element's text. In your case, when you print results it prints all the WebElements present in that list, nothing else. If there is a single element, go with find_element.
I see, so basically, if for some reason you may get an error when trying to scrape the data, it isn't a bad idea to try find_element instead of find_elements, because you might have multiple elements of that type, correct?
If there is only one element, use find_element; if scraping multiple elements of the same type, use find_elements. If the element is not there, find_element will throw NoSuchElementException, while find_elements returns an empty list in that case. So to avoid the exception, you can use find_elements and check whether the list's length is greater than 0: if it is, the element is there; otherwise it isn't, and you never face the exception.
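The length-check pattern described in this comment can be sketched like this (a minimal sketch using the asker's XPath and the Selenium 3 find_elements_by_xpath API; get_price_text is a hypothetical helper name):

```python
# XPath from the question, pointing at the quote price span.
PRICE_XPATH = '//*[@id="quote-header-info"]/div[3]/div/div/span[1]'

def get_price_text(browser):
    """Return the first matching element's text, or None if absent."""
    results = browser.find_elements_by_xpath(PRICE_XPATH)
    if len(results) > 0:        # at least one element matched
        return results[0].text  # .text extracts the visible string
    return None                 # no match: no exception is raised
```

Because find_elements returns an empty list instead of raising, this same code works in both GUI and headless mode.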

The same XPath locator could be matching multiple times in the HTML, so put this code in a try/except block while checking it in headless mode.

Headless mode basically scans the HTML only, so to debug better, try a different version of the XPath, such as going to the span's parent element and then traversing down to it.
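A minimal sketch of the try/except approach this answer suggests (scrape_price is a hypothetical helper name; the import fallback is only there so the sketch runs even without Selenium installed):

```python
try:
    from selenium.common.exceptions import NoSuchElementException
except ImportError:  # stub fallback so this sketch is self-contained
    class NoSuchElementException(Exception):
        pass

def scrape_price(browser, xpath):
    """Return the element's text, or None if the locator matches nothing."""
    try:
        return browser.find_element_by_xpath(xpath).text
    except NoSuchElementException:
        return None  # the headless DOM did not contain the element
```

Catching NoSuchElementException keeps the script alive when the headless page renders differently from the GUI page, so you can fall back to an alternative XPath.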

