I am new to Python and need some guidance on extracting values from specific cells of an HTML table.

The URL that I am working on can be found here

I am looking to get the first 5 values only in the Month and Settlement columns and subsequently display them as:

"MAR 14:426'6"

The problems I am facing are:

  1. How do I get the loop to start from the 3rd "TR" in the table?
  2. How do I get only the values for td[0] and td[6]?
  3. How do I restrict the loop to retrieve values for only 5 rows?

This is the code that I am working on:

tableData = soup1.find("table", id="DailySettlementTable")
for rows in tableData.findAll('tr'):
    month = rows.find('td')
    print month

Thank you, I appreciate any form of guidance!

  • It may be cleaner and easier to extract all the data into a list of lists and then pick out the fields you want. It's not as performant, of course, but you probably don't need to worry about that just yet, especially since you're new to Python. Commented Dec 18, 2013 at 17:19

1 Answer

You probably want to use slicing.
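As a quick illustration of how slice bounds behave, here is a minimal sketch on a plain list. The `rows` list and its contents are made up for the example and simply stand in for the `tr` elements the parser would return.

```python
# Hypothetical stand-in for the list of <tr> elements returned by the parser.
rows = ["header 1", "header 2", "row A", "row B", "row C", "row D", "row E", "row F"]

# Start at index 2 (the third item) and stop before index 7,
# which yields exactly five items: indices 2, 3, 4, 5 and 6.
selected = rows[2:7]
print(selected)
```

The stop index is exclusive, so the number of items is simply `7 - 2 = 5`.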

Here's a modified snippet for your code:

table = soup.find('table', id='DailySettlementTable')

# The slice notation below, [2:7], takes the rows from the third (index 2)
# up to and including the seventh (index 6); index 7 itself is excluded,
# so the loop sees exactly five rows.
for rows in table.find_all('tr')[2:7]:
    cells = rows.find_all('td')
    month = cells[0]
    settle = cells[6]

    print(month.string + ':' + settle.string)
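
If you want to try the idea without fetching the real page, the following is a self-contained sketch. It assumes the `bs4` package is installed and that the table has two header rows followed by data rows with at least seven cells. The helper names `build_sample_html` and `first_settlements` and the placeholder cell values are invented for illustration and are not part of the original page or of Beautiful Soup.

```python
from bs4 import BeautifulSoup


def build_sample_html(n_rows=8):
    """Build a made-up table: two header rows, then n_rows data rows."""
    header = "<tr><th>Month</th></tr><tr><th>header row 2</th></tr>"
    body = ""
    for i in range(n_rows):
        # Month in column 0, settlement in column 6, filler elsewhere.
        cells = ["M%d" % i] + ["x"] * 5 + ["S%d" % i] + ["y"]
        body += "<tr>" + "".join("<td>%s</td>" % c for c in cells) + "</tr>"
    return '<table id="DailySettlementTable">' + header + body + "</table>"


def first_settlements(html, skip=2, count=5):
    """Return 'month:settle' strings for up to `count` rows after `skip` rows."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", id="DailySettlementTable")
    results = []
    for row in table.find_all("tr")[skip:skip + count]:
        cells = row.find_all("td")
        if len(cells) < 7:
            continue  # skip rows that have no settlement column
        month = cells[0].get_text(strip=True)
        settle = cells[6].get_text(strip=True)
        results.append(month + ":" + settle)
    return results


for line in first_settlements(build_sample_html()):
    print(line)
```

This sketch uses `get_text(strip=True)` rather than `.string` because `.string` is `None` when a cell contains more than one child node. Whether that matters for the real table is an assumption worth checking against the actual markup.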