I'm working with BeautifulSoup in Python to scrape a webpage. The HTML in question looks like this:
<td><a href="blah.html">blahblah</a></td>
<td>line2</td>
<td></td>
I want to extract the contents of each td tag. For the first td, I need the "blahblah" text; for the next td, I want to write "line2"; and for the last td, "blank", because it has no content.
My code snippet looks like this:
row = []
for each_td in td:
    link = each_td.find_all('a')
    if link:
        row.append(link[0].contents[0])
        row.append(link[0]['href'])
    elif each_td.contents[0] is None:
        row.append('blank')
    else:
        row.append(each_td.contents[0])
print row
However, when I run it, I get this error:
elif each_td.contents[0] is None:
IndexError: list index out of range
Note: I am working with BeautifulSoup.
How do I test for the "no-content td" and write "blank" in that case? Why is the "... is None" check not working?
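For reference, here is a minimal sketch of one way the check could be written. It assumes the bs4 package (BeautifulSoup 4) and the built-in 'html.parser'; the surrounding table/tr wrapper and the `soup` and `html` names are made up for illustration. The idea is that an empty td has `contents == []`, so indexing `contents[0]` raises IndexError before the `is None` comparison ever runs; testing the list for emptiness avoids that.

```python
from bs4 import BeautifulSoup

# Hypothetical wrapper markup around the three cells from the question.
html = '''<table><tr>
<td><a href="blah.html">blahblah</a></td>
<td>line2</td>
<td></td>
</tr></table>'''

soup = BeautifulSoup(html, 'html.parser')

row = []
for each_td in soup.find_all('td'):
    link = each_td.find('a')  # first <a> inside the cell, or None
    if link is not None:
        row.append(link.get_text())
        row.append(link['href'])
    elif not each_td.contents:
        # <td></td> has an empty contents list, so contents[0] would
        # raise IndexError; an empty list is falsy, so test that instead.
        row.append('blank')
    else:
        row.append(each_td.get_text())

print(row)
```

Using `get_text()` instead of `contents[0]` is a design choice in this sketch: it returns a plain string even when the cell holds nested tags.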