Extracting user data from stackoverflow using Selenium

Question

I'm pretty new to web scraping (data extraction). I want to extract a user's reputation from their Stack Overflow account using Selenium. I've logged in successfully, but I can't get the data from the next URL, which is http://stackoverflow.com

This is my code:

from selenium import webdriver
from selenium.webdriver.support import ui


def page_is_loaded(driver):
    # True as soon as a <body> element can be found on the current page
    return driver.find_element_by_tag_name("body") is not None


chromedriver = 'C:\\chromedriver.exe'
browser = webdriver.Chrome(chromedriver)
browser.get('https://stackoverflow.com/users/login')

# Fill in the login form
username = browser.find_element_by_id("email")
password = browser.find_element_by_id("password")

username.send_keys("emailID")
password.send_keys("password")

browser.find_element_by_name("submit-button").click()

# Wait up to 10 seconds for the page_is_loaded condition
wait = ui.WebDriverWait(browser, 10)
wait.until(page_is_loaded)

print(browser.current_url)

It works and I get redirected to the next page, but the last command still prints: https://stackoverflow.com/users/login

Thanks in advance. I'm sure I'm missing something little.

@LukasGraf The API describes the information that's on the next page, i.e. the user page. I've tried extracting the data using those tags, but it doesn't work.
– Malik Usman, Commented Jun 7, 2016 at 11:01

praba230890 · Accepted Answer · 2016-06-07 11:50:24Z


browser.current_url is not updated immediately after the redirect, so reading it right after click() can still return the login URL. You can call either browser.refresh() or time.sleep() before reading it to get the updated value.

from selenium import webdriver

browser = webdriver.Chrome()
browser.get('https://stackoverflow.com/users/login')
username = browser.find_element_by_id("email")
password = browser.find_element_by_id("password")
username.send_keys("emailID")
password.send_keys("password")
browser.find_element_by_name("submit-button").click()
browser.refresh()  # reload before reading current_url
print(browser.current_url)

The output of the code below should help you understand this better.

import time
from selenium import webdriver

browser = webdriver.Chrome()
browser.get('https://stackoverflow.com/users/login')
username = browser.find_element_by_id("email")
password = browser.find_element_by_id("password")
username.send_keys("emailID")
password.send_keys("password")
browser.find_element_by_name("submit-button").click()

# Print the URL once per second to see when it changes
for i in range(5):
    print("{} - loop {}".format(browser.current_url, i))
    time.sleep(1)
print(browser.current_url)
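
As an alternative to a fixed sleep, the same idea can be expressed as a polling loop that returns as soon as the URL changes. This is only a minimal sketch in plain Python 3, not part of the original answer: wait_for_url_change and FakeDriver are hypothetical names, and FakeDriver merely stands in for a real browser so the example runs without Selenium.

```python
import time


def wait_for_url_change(driver, old_url, timeout=10.0, poll=0.5):
    """Poll driver.current_url until it differs from old_url.

    Returns the new URL, or raises RuntimeError once `timeout` seconds pass.
    `driver` can be any object with a `current_url` attribute.
    """
    deadline = time.monotonic() + timeout
    while True:
        current = driver.current_url
        if current != old_url:
            return current
        if time.monotonic() >= deadline:
            raise RuntimeError("URL still %r after %.1f s" % (old_url, timeout))
        time.sleep(poll)


class FakeDriver(object):
    """Hypothetical stand-in for a browser that redirects after a few reads."""

    def __init__(self, urls):
        self._urls = list(urls)

    @property
    def current_url(self):
        # Hand out the queued URLs in order; keep repeating the last one.
        if len(self._urls) > 1:
            return self._urls.pop(0)
        return self._urls[0]


login = "https://stackoverflow.com/users/login"
driver = FakeDriver([login, login, "https://stackoverflow.com/"])
new_url = wait_for_url_change(driver, login, timeout=2.0, poll=0.01)
print(new_url)
```

With a real browser you would store browser.current_url before calling click() and pass browser in place of FakeDriver. The WebDriverWait object from the question could be used the same way by giving wait.until() a condition that compares current_url to the old value instead of checking for a body tag.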

edited Jun 7, 2016 at 11:50

answered Jun 7, 2016 at 11:39


2 Comments

Malik Usman Over a year ago

I need your help with another thing: reputation = browser.find_element_by_class_name("reputation"). Why does this code return this: <selenium.webdriver.remote.webelement.WebElement (session="6324aa926438c0c8ff58c9bd8b1f73c4", element="0.19992890885383097-1")>

AntonB Over a year ago

You're asking Selenium to find_element, so what you get back is a web element. You should use something like reputation = browser.find_element_by_class_name("reputation").text if you want the text inside the element.
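
Since .text gives back a string, a small conversion step may be needed before treating the reputation as a number. The sketch below is an illustration only: parse_reputation and FakeElement are hypothetical names, and it assumes the text is plain digits with optional thousands separators (for example "2,320"); abbreviated forms are not handled.

```python
def parse_reputation(text):
    """Convert display text such as '2,320' into the integer 2320.

    Assumes digits with optional comma separators and surrounding whitespace.
    """
    return int(text.strip().replace(",", ""))


class FakeElement(object):
    """Hypothetical stand-in for a WebElement; only the .text attribute is used."""

    text = " 2,320 "


reputation = parse_reputation(FakeElement().text)
print(reputation)
```

With a real page, the string passed in would be the .text of the element found by class name, as suggested in the comment above.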
