Beautifulsoup AttributeError: 'list' object has no attribute 'text'

Question

I have the following HTML:

<div>
    <span class="test">
     <span class="f1">
      5 times
     </span>
    </span>

    </span>
   </div>

<div>

</div>

<div>
    <span class="test">
     <span class="f1">
      6 times
     </span>
    </span>

    </span>
   </div>

I managed to navigate the tree, but when trying to print I get the following error:

AttributeError: 'list' object has no attribute 'text'

This Python code works:

x=soup.select('.f1')
print(x)

gives the following:

[]
[]
[]
[]
[<span class="f1"> 19 times</span>]
[<span class="f1"> 12 times</span>]
[<span class="f1"> 6 times</span>]
[]
[]
[]
[<span class="f1"> 6 times</span>]
[<span class="f1"> 1 time</span>]
[<span class="f1"> 11 times</span>]

However, print(x.prettify) throws the error above. I am trying to get the text between the span tags for every instance: blank when there is none, and the string when it is available.

Shouldn't it throw: AttributeError: 'list' object has no attribute 'prettify'?
– navyad, Commented Oct 9, 2018 at 12:39

r.ook · Accepted Answer · 2018-10-09 13:02:01Z

1

select() always returns a list of results, even when nothing matches and the list is empty. Since a list object does not have a text attribute, accessing .text on it raises the AttributeError.

Likewise, prettify() is a method of the soup or of a single tag that makes the HTML more readable; it is not a way to interpret the list.
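
To make the point above concrete, here is a minimal sketch. The markup is a trimmed-down stand-in for the question's HTML and the variable names are only illustrative. The exact wording of the error depends on the bs4 version (newer releases return a list subclass with its own message), so the sketch only checks that an AttributeError is raised.

```python
from bs4 import BeautifulSoup

# Trimmed-down stand-in for the markup in the question (an assumption)
html = """
<div><span class="test"><span class="f1"> 5 times </span></span></div>
<div></div>
<div><span class="test"><span class="f1"> 6 times </span></span></div>
"""

soup = BeautifulSoup(html, "html.parser")
matches = soup.select(".f1")

# select() hands back a list (or list subclass), even for zero matches
is_list = isinstance(matches, list)
no_hits = soup.select(".missing")

# Asking the collection itself for .text fails ...
try:
    matches.text
    raised = False
except AttributeError:
    raised = True

# ... while each element inside it is a Tag with its own .text
first_text = matches[0].text

print(is_list, len(no_hits), raised, repr(first_text))
```

So the fix is always to index into the list or loop over it, and call .text (or .prettify()) on the individual tags.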

If all you're looking to do is extract the texts when available:

texts = [''.join(i.stripped_strings) for i in x if i]

# ['5 times', '6 times']

This removes all superfluous space/newline characters from each string and gives you just the bare text. The trailing if i skips any falsy item; select() never puts None into the list and Tag objects are always truthy, so here it acts only as a safeguard.

If you actually care for the spaces/newlines, do this instead:

texts = [i.text for i in x if i]

# ['\n      5 times\n     ', '\n      6 times\n     ']
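
Both list comprehensions above only collect text where a match exists. If you also want a blank entry for each container without a match, as the question asks, one possible approach is to walk the containers and look inside each one. This is a sketch that assumes every top-level div is one container; the markup is a trimmed-down stand-in for the question's HTML and the names are illustrative.

```python
from bs4 import BeautifulSoup

# Trimmed-down stand-in for the markup in the question (an assumption)
html = """
<div><span class="test"><span class="f1"> 5 times </span></span></div>
<div></div>
<div><span class="test"><span class="f1"> 6 times </span></span></div>
"""

soup = BeautifulSoup(html, "html.parser")

texts = []
for div in soup.find_all("div"):
    # select_one() returns the first matching Tag, or None if there is none
    hit = div.select_one(".f1")
    texts.append(hit.get_text(strip=True) if hit is not None else "")

print(texts)  # ['5 times', '', '6 times']
```

Here select_one() is used instead of select() because it returns a single tag or None, which is easier to test than an empty list.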

answered Oct 9, 2018 at 13:02


Sohan Das · Accepted Answer · 2018-10-09 12:47:44Z

0

from bs4 import BeautifulSoup
html = '''<div>
    <span class="test">
     <span class="f1">
      5 times
     </span>
    </span>
    </span>
   </div>
<div>
</div>
<div>
    <span class="test">
     <span class="f1">
      6 times
     </span>
    </span>
    </span>
   </div>'''


soup = BeautifulSoup(html, 'html.parser')
aaa = soup.find_all('span', attrs={'class':'f1'})
for i in aaa:
    print(i.text)

Output (surrounding whitespace and blank lines omitted):

5 times
6 times

answered Oct 9, 2018 at 12:47


Desiigner · Accepted Answer · 2018-10-09 13:01:13Z

0

I'd recommend using the .findAll method (find_all is the current spelling; findAll is the older alias) and looping over the matched spans.

Example:

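from bs4 import BeautifulSoup

# html holds the markup from the question; the 'lxml' parser needs the lxml package installed
soup = BeautifulSoup(html, 'lxml')

for span in soup.findAll("span", class_="f1"):
    # skip spans whose text is nothing but whitespace
    if span.text.isspace():
        continue
    else:
        print(span.text)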

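The .isspace() method checks whether a string consists only of whitespace (testing the string's truthiness won't work here, since an "empty" HTML span still contains spaces and newlines). Note that isspace() returns False for a zero-length string, so a span with no content at all is not skipped by this check.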
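
The difference between these checks is easy to see on plain strings. The sketch below uses a hypothetical helper, has_text, which is not part of BeautifulSoup; it is just one way to treat whitespace-only and zero-length strings alike.

```python
def has_text(s):
    """Hypothetical helper: True only if s contains a non-whitespace character."""
    return bool(s.strip())

samples = ["\n      \n", "", "  5 times  "]

for s in samples:
    # columns: the string, its truthiness, isspace(), has_text()
    print(repr(s), bool(s), s.isspace(), has_text(s))
```

A whitespace-only string is truthy but isspace() is True; a zero-length string is falsy and isspace() is False; has_text() rejects both.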

edited Oct 9, 2018 at 13:01

answered Oct 9, 2018 at 12:51
