I want to scrape the dynamic form on [this page][1]. I'm currently using Selenium for that and getting some results.

My questions:

1. Is it possible to replace the Selenium + WebDriver code with a POST request? (I have worked with Requests before, but only when an API was available. I can't figure out how to reverse-engineer this form.)
2. Is there a better way to clean up the result page to get only the table? (In my example, the resulting `data` variable is a mess, but I did get the last value, which was the main purpose of the script.)
3. Any recommendations?

My code:
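
    from bs4 import BeautifulSoup
    import pandas as pd
    from selenium import webdriver
    from selenium.webdriver.common.by import By


    def get_tables(htmldoc):
        # Name the parser explicitly to avoid bs4's "no parser specified" warning.
        soup = BeautifulSoup(htmldoc, "html.parser")
        return soup.find_all("table")


    driver = webdriver.Chrome()
    driver.get("http://dgasatel.mop.cl/visita_new.asp")

    # Type the station code, then tick the variable, period, and report type.
    # find_element(By...) replaces the find_element_by_* helpers removed in Selenium 4.
    estacion1 = driver.find_element(By.NAME, "estacion1")
    estacion1.send_keys("08370007-6")
    driver.find_element(By.XPATH, "//input[@name='chk_estacion1a' and @value='08370007-6_29']").click()
    driver.find_element(By.XPATH, "//input[@name='period' and @value='1d']").click()
    driver.find_element(By.XPATH, "//input[@name='tiporep' and @value='I']").click()
    driver.find_element(By.NAME, "button22").click()

    # Parse every table on the result page; take index 2 of the last row of the fifth table.
    data = pd.read_html(driver.page_source)
    print(data[4].tail(1).iloc[0][2])
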
Thanks in advance.

  [1]: http://dgasatel.mop.cl/visita_new.asp

The `requests-html` package might be a possibility. It has an option to render the page (executing its JavaScript) before you pull the source HTML.
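
As a rough sketch of how that could look with `requests-html`, the snippet below builds a form payload from the field names used in the question's Selenium code and posts it through an `HTMLSession`. Several things here are assumptions rather than facts about the site: the helper names `build_form_payload` and `fetch_result_html` are made up for illustration, `action_url` has to be read from the form's `action` attribute in the page source, and the form may need extra hidden fields that are not listed here.

```python
def build_form_payload(station, variable_code, period="1d", report_type="I"):
    """Mirror the fields the Selenium script fills in (names taken from the question)."""
    return {
        "estacion1": station,
        # The checkbox value in the question is "<station>_<variable code>".
        "chk_estacion1a": f"{station}_{variable_code}",
        "period": period,
        "tiporep": report_type,
    }


def fetch_result_html(action_url, payload, render_js=False):
    """POST the payload and return the response HTML (hypothetical helper)."""
    # Imported lazily: requests-html is third-party (pip install requests-html).
    from requests_html import HTMLSession

    session = HTMLSession()  # behaves like requests.Session
    response = session.post(action_url, data=payload)
    response.raise_for_status()
    if render_js:
        # reload=False renders the HTML already received instead of
        # re-requesting the URL with a GET, which would drop the form data.
        # The first call downloads a headless Chromium build.
        response.html.render(reload=False, sleep=1)
    return response.html.html
```

If this works for the site, `pd.read_html` can be run on the string returned by `fetch_result_html(action_url, build_form_payload("08370007-6", "29"))` exactly as it is run on `driver.page_source`. If the server builds the result table without JavaScript, plain `requests.post` with the same payload should be enough and rendering can stay off.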