The tags from a web page are as follows:
<div class="lg_col MT5">
<p>
<span class="sp starGryB">4.4</span>
</p>
<p class="MT5 UC">
<span class="gd10gb">141 Ratings</span>
</p>
</div>
I am trying to retrieve the values "4.4" and "141 Ratings" from every div with the class "lg_col MT5".
The nested for loop that I use isn't working as expected. It seems as if the hierarchy of the tags isn't taken into account.
import requests
import sys
from bs4 import BeautifulSoup

HEADERS = {"User-Agent": "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:20.0) Gecko/20100101 Firefox/20.0"}

def test_function():
    url = "http://www.burrp.com/chennai/search.html?q=buffet"
    source_code = requests.get(url, headers=HEADERS)
    plain_text = source_code.text
    soup = BeautifulSoup(plain_text)
    for tag in soup.select('div.lg_col.MT5'):
        for tag1 in soup.select('span.sp.starGryB'):
            try:
                print(tag1.string)
            except KeyError:
                pass
        for tag2 in soup.select('span.gd10gb'):
            try:
                print(tag2.string)
            except KeyError:
                pass

test_function()
The expected output is 4.4 followed by 141 Ratings for each of the div tags on the page.
The actual output is all of the starGryB values followed by all of the gd10gb values, and this whole sequence is printed over and over again.
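A likely cause, judging only from the code as posted: the inner loops call soup.select(...), which searches the whole document on every pass, so the outer div is never used to narrow the search. Below is a minimal sketch of scoping the inner lookups to each div by calling select_one on the div itself. It runs against inline sample markup rather than the live page, so the second div, the SAMPLE_HTML name, and the extract_ratings helper are assumptions made up for illustration, and the real page may be structured differently.

```python
from bs4 import BeautifulSoup

# Sample markup modeled on the snippet in the question.
# The second div is invented so the pairing per div is visible.
SAMPLE_HTML = """
<div class="lg_col MT5">
  <p><span class="sp starGryB">4.4</span></p>
  <p class="MT5 UC"><span class="gd10gb">141 Ratings</span></p>
</div>
<div class="lg_col MT5">
  <p><span class="sp starGryB">3.9</span></p>
  <p class="MT5 UC"><span class="gd10gb">87 Ratings</span></p>
</div>
"""


def extract_ratings(html):
    """Return one (rating, count) pair per matching div (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    results = []
    for block in soup.select("div.lg_col.MT5"):
        # Search inside this div only, not the whole document.
        star = block.select_one("span.sp.starGryB")
        count = block.select_one("span.gd10gb")
        results.append((
            star.get_text(strip=True) if star else None,
            count.get_text(strip=True) if count else None,
        ))
    return results


for rating, count in extract_ratings(SAMPLE_HTML):
    print(rating, count)
```

A missing span yields None for that field instead of raising, which keeps the pairs aligned with their divs. Naming the parser explicitly ("html.parser") also avoids the warning that BeautifulSoup emits when none is given.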
starGryB class in an example you posted. Is it a typo? Also, "does not work as expected" is not very descriptive. How exactly does it work, and what do you expect from it?