Link to website: http://www.tennisabstract.com/cgi-bin/player-classic.cgi?p=RafaelNadal

I am trying to write code that goes through each row in a table and extracts each element from that row. I am aiming for output in the following layout:

Row1Element1, Row1Element2, Row1Element3 
Row2Element1, Row2Element2, Row2Element3
Row3Element1, Row3Element2, Row3Element3

I have had two major attempts at coding this.

Attempt 1:

rows = driver.find_elements_by_xpath('//table//body//tr')
elements = rows.find_elements_by_xpath('//td')
# this gets all rows in the table, but then gets all elements on the page,
# not just those in the table

Attempt 2:

driver.find_elements_by_xpath('//table//body//tr//td')
# this gets all the elements that I want, but makes no distinction as to which
# row each element belongs to

Any help is appreciated.

  • Can you provide a link to the site you're scraping from? Commented Oct 8, 2019 at 2:44
  • Sure, there you go Commented Oct 8, 2019 at 2:56
  • There are many tables on the page; which one do you mean? Commented Oct 8, 2019 at 3:31
  • The large one at the bottom Commented Oct 8, 2019 at 3:39

2 Answers


You can read the table headers and use their indexes to pick the right cells out of each row.

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get("http://www.tennisabstract.com/cgi-bin/player-classic.cgi?p=RafaelNadal")

# find_elements(By..., ...) replaces the find_elements_by_* helpers removed in Selenium 4
# Header texts, in column order
table_headers = [th.text.strip() for th in driver.find_elements(By.CSS_SELECTOR, "#matchheader th")]
rows = driver.find_elements(By.CSS_SELECTOR, "#matches tbody > tr")

# Column positions of the fields to print
date_index = table_headers.index("Date")
tournament_index = table_headers.index("Tournament")
score_index = table_headers.index("Score")

for row in rows:
    # Searching from the row element keeps the cells scoped to that row
    table_data = row.find_elements(By.TAG_NAME, "td")
    print(table_data[date_index].text, table_data[tournament_index].text, table_data[score_index].text)
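
The header-index idea can be taken one step further by pairing every cell text with its header name, so columns are looked up by name rather than by position. The following is only a sketch: rows_to_records is a hypothetical helper, the sample values are made up rather than scraped, and it assumes rows whose cell count differs from the header count can be skipped. If two headers share the same text (for example, two blank headers), the later column overwrites the earlier one.

```python
from typing import Dict, List


def rows_to_records(headers: List[str], rows: List[List[str]]) -> List[Dict[str, str]]:
    """Pair each row's cell texts with the header names, skipping ragged rows."""
    records = []
    for cells in rows:
        if len(cells) != len(headers):
            # Assumption: a row with a different cell count is a separator
            # or spanning row and carries no per-column data.
            continue
        records.append(dict(zip(headers, cells)))
    return records


# Illustrative values only; in practice these would come from th.text / td.text.
headers = ["Date", "Tournament", "Score"]
rows = [
    ["01-Jan-2019", "Example Open", "6-4 6-3"],
    ["spanning row"],
]

for record in rows_to_records(headers, rows):
    print(record["Date"], record["Tournament"], record["Score"])
```

With a live driver, the inputs could be built as [td.text for td in row.find_elements(By.TAG_NAME, "td")] for each row.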

1 Comment

@Michael Happy to help. If this answer or any other one solved your issue, please mark it as accepted.

This XPath locates each row of the table you mean: //table[@id="matches"]//tbody//tr

First, add the following imports:

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By

To print each row:

driver.get('http://www.tennisabstract.com/cgi-bin/player-classic.cgi?p=RafaelNadal')

rows = WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, '//table[@id="matches"]//tbody//tr')))

for row in rows:
    print(row.text)

Or to print each cell:

for row in rows:
    # Search from the row element so only that row's cells are returned
    cols = row.find_elements(By.TAG_NAME, 'td')
    for col in cols:
        print(col.text)
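
As for why Attempt 1 in the question returned cells from the whole page (presumably when run per row): an XPath that starts with // is evaluated from the document root even when it is called on a row element. Prefixing it with a dot, as in ./td or .//td, makes it relative to that row. Below is a rough sketch that collects the cells row by row and prints them in the comma-separated layout the question asks for. extract_rows and format_rows are hypothetical names, and the default row XPath is the one from this answer.

```python
from typing import List

# By.XPATH in selenium.webdriver.common.by is the plain string "xpath";
# it is spelled out here so the helpers have no import-time dependency.
XPATH = "xpath"


def extract_rows(driver, row_xpath: str = '//table[@id="matches"]//tbody//tr') -> List[List[str]]:
    """Return one list of cell texts per table row."""
    table = []
    for row in driver.find_elements(XPATH, row_xpath):
        # "./td" is relative to this row; "//td" would search the whole page.
        cells = row.find_elements(XPATH, "./td")
        table.append([cell.text.strip() for cell in cells])
    return table


def format_rows(table: List[List[str]]) -> str:
    """Join cells with ', ' and rows with newlines."""
    return "\n".join(", ".join(cells) for cells in table)
```

With the driver from above, print(format_rows(extract_rows(driver))) should print one line per row. The helpers only rely on find_elements and .text, so they can also be exercised with simple stand-in objects.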
