I am scraping a site for some stats and getting the values I expect, but I can't get the final output onto a single line as one string. I have searched and tried everything I can find: strip(), append(), replace('\n'), replace('\n\t\r'), and a few dozen other things. I also get an error at the end of the output, because the list contains some additional items that I don't want.

Output I get:

81
79
55
12
76
AttributeError: ResultSet object has no attribute 'text'. You're probably treating a list of elements like a single element. Did you call find_all() when you meant to call find()?

Output I want:

81 79 55 12 76

Here is a sample of what I am scraping:

</li>, <li><span class="bp3-tag p p-81">81</span> f1</span>
</li>, <li><span class="bp3-tag p p-79">79</span> f2</span>
</li>, <li><span class="bp3-tag p p-55">55</span> f3</span>
</li>, <li><span class="bp3-tag p p-12">12</span> f4</span>
</li>, <li><span class="bp3-tag p p-76">76</span> f5</span>
[<li><span class="tooltip multiline" data-tooltip="some text i don't care about.">

My code looks like this, where a_stats is the list of fields being searched (f1, f2, ...)

dws = soup.find_all('div', {'class': 'col-3'})
more_lis = [div.find_all('li') for div in dws]
lis = soup.find_all('li') + more_lis
for li in lis:
    for stats in a_stats:
        if stats in li.text:
            t = re.findall(r'\d+', li.text)
            ti = " ".join(t)
            print(ti)

I'm very much a novice, and this feels like it should be easy but I just can't get there yet. Help appreciated. Many thanks in advance.

2 Answers

Instead of print(ti), try print(ti, end=" ").

EDIT

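dws = soup.find_all('div', {'class': 'col-3'})
more_lis = [div.find_all('li') for div in dws]
lis = soup.find_all('li') + more_lis
for li in lis:
    for stats in a_stats:
        try:
            if stats in li.text:
                t = re.findall(r'\d+', li.text)
                ti = " ".join(t)
                # end=" " keeps every value on the same line
                print(ti, end=" ")
        except AttributeError:
            # The entries that come from more_lis are ResultSet objects,
            # not tags, so they have no .text; skip them
            pass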

I added a try/except block to handle the AttributeError.

The end argument of print decides what follows the printed object. By default it is \n, so each value lands on a new line. Change it to white space, such as " ", and the values are printed on one line.
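
The AttributeError can also be avoided at its source instead of being caught. In the question's code, more_lis is a list of ResultSet objects (one per div), so adding it to soup.find_all('li') appends whole result sets, which have no .text, to lis. Below is a minimal sketch of flattening the results and joining the values once at the end. The inline HTML is a made-up stand-in for the real page, a_stats is assumed to hold the placeholder names f1 and f2, and the regex uses \b so that the digit inside a name like f1 is not picked up:

```python
import re
from bs4 import BeautifulSoup

# Hypothetical stand-in for the scraped page, modelled on the question's sample
html = """
<div class="col-3"><ul>
<li><span class="bp3-tag p p-81">81</span> f1</li>
<li><span class="bp3-tag p p-79">79</span> f2</li>
<li><span class="tooltip multiline" data-tooltip="ignored">no stat here</span></li>
</ul></div>
"""
a_stats = ["f1", "f2"]  # assumed field names

soup = BeautifulSoup(html, "html.parser")

# Flatten into one list of <li> tags instead of a list of ResultSets
lis = [li
       for div in soup.find_all("div", class_="col-3")
       for li in div.find_all("li")]

values = []
for li in lis:
    text = li.get_text()
    if any(stat in text for stat in a_stats):
        # \b\d+\b matches standalone numbers only, not the "1" in "f1"
        values.extend(re.findall(r"\b\d+\b", text))

# Build the final string once, rather than printing value by value
result = " ".join(values)
print(result)
```

Collecting the values in a list also leaves you with an actual string (result) to reuse, which print(..., end=" ") alone does not give you.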

4 Comments

Close! I got the integers as desired, but it's still throwing the AttributeError for the unwanted tooltip. Any thoughts?
@mayord Kindly provide the exact line number where you are encountering the error.
It's in line 6: 'if stats in li.text:'
I have made changes to the answer.

Here's an example that reads the HTML from a file. Adapt the input source to your use case:

from bs4 import BeautifulSoup

with open('/Users/andy/dummy.html') as html:
    soup = BeautifulSoup(html, 'html.parser')

vals = []
# Only look at <li> tags inside the col-3 divs
for div in soup.find_all('div', class_='col-3'):
    for li in div.find_all('li'):
        vals.append(li.text.strip())
# Join once at the end so there is no trailing space
print(' '.join(vals))
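
If only the numbers are wanted, and not the rest of each li's text, one option is to read them from the span elements directly. This is a sketch under the assumption that every wanted value sits in a span carrying the bp3-tag class, as in the question's sample; the inline HTML below is a made-up stand-in for the real page:

```python
from bs4 import BeautifulSoup

# Hypothetical sample modelled on the question's snippet
html = """
<div class="col-3"><ul>
<li><span class="bp3-tag p p-81">81</span> f1</li>
<li><span class="bp3-tag p p-55">55</span> f3</li>
<li><span class="tooltip multiline" data-tooltip="ignored">tooltip</span></li>
</ul></div>
"""

soup = BeautifulSoup(html, "html.parser")

numbers = []
for div in soup.find_all("div", class_="col-3"):
    # class_="bp3-tag" matches any span whose class list contains bp3-tag,
    # so the tooltip span is never visited
    for span in div.find_all("span", class_="bp3-tag"):
        numbers.append(span.get_text(strip=True))

line = " ".join(numbers)
print(line)
```

Because the tooltip span does not carry that class, it is skipped without any error handling.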
