I'm very new to Python 2.7 and I have a task to read a table from a URL.

I can fetch the page and find the table, but I need only the cell data and my output still contains the HTML tags. Please help me. Thank you in advance.

from bs4 import BeautifulSoup
import urllib2

response = urllib2.urlopen('https://www.somewebsite.com/')
html = response.read()
soup = BeautifulSoup(html)

tabulka = soup.find("table", {"class" : "defaultTableStyle tableFontMD tableNoBorder"})

records = []
for row in tabulka.findAll('tr'):
    col = row.findAll('td')

    print col

1 Answer

You have to use the .text attribute of each cell, which returns its contents without the tags:

from bs4 import BeautifulSoup
import urllib2

response = urllib2.urlopen('https://www.somewebsite.com/')
html = response.read()
soup = BeautifulSoup(html)

tabulka = soup.find("table", {"class" : "defaultTableStyle tableFontMD tableNoBorder"})

records = []
for row in tabulka.findAll('tr'):
    col = row.findAll('td')

    # .text gives the contents of each cell without the surrounding tags
    print [coli.text for coli in col]

6 Comments

Thanks for your answer. One issue remains: every row is printed like [u'Type', u'Name', u'Discovered'], but there is no u in the HTML.
The u prefix means the value is a unicode string. It is not part of the text; it appears because printing a list shows the repr of each element. You can change to coli.text.encode('utf-8') to get rid of it.
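To illustrate the point about the u prefix, here is a minimal sketch that needs no network access. The cells list is a hypothetical stand-in for [coli.text for coli in col]; joining the elements prints the strings themselves rather than the list's repr, so no prefix or quotes appear.

```python
from __future__ import print_function

# Hypothetical cell texts, standing in for [coli.text for coli in col].
cells = [u"Type", u"Name", u"Discovered"]

# Printing the list shows the repr of each element. On Python 2.7 that
# repr includes the u prefix: [u'Type', u'Name', u'Discovered'].
print(cells)

# Joining the elements prints the strings themselves, so the prefix and
# the quotes disappear.
line = u", ".join(cells)
print(line)
```

This leaves the values as unicode, which is usually what you want when writing them to a file or a database later.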
Thank you very much. One last question: a table row contains an <a> tag, i.e. <a href="www.somewebsite.com/quesitons"></a>, with a link. How can I read that link?
Repeat the above with response = urllib2.urlopen('https://www.somewebsite.com/quesitons')
Actually, the HTML looks like this: <td>text</td><td><a href="link">text</a></td><td>text</td>. I am able to get the text of all the td cells, but I also need the link. How can I get both at the same time?
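One possible way to get the text and the link in a single pass is to look for an <a> tag inside each cell while looping. This is only a sketch: the inline HTML, the URL in it, and the (text, href) tuple layout are assumptions made for illustration, and the explicit "html.parser" argument is a choice made here so the example does not depend on an extra parser being installed.

```python
from __future__ import print_function
from bs4 import BeautifulSoup

# Hypothetical sample row mirroring the structure described in the comment.
html = """
<table>
  <tr>
    <td>first</td>
    <td><a href="https://www.somewebsite.com/questions">second</a></td>
    <td>third</td>
  </tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

rows = []
for tr in soup.find_all("tr"):
    cells = []
    for td in tr.find_all("td"):
        text = td.get_text(strip=True)
        # find() returns None when the cell contains no <a href=...> tag
        link = td.find("a", href=True)
        href = link["href"] if link is not None else None
        cells.append((text, href))
    rows.append(cells)

print(rows)
```

Each cell becomes a (text, href) pair, with href set to None for cells that have no link, so the text and the link stay together for every row.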