I would like to extract information about website similarity from this link:
https://www.alexa.com/siteinfo/amazon.com
I am looking at class='site', trying to extract information from
<a href="/siteinfo/ebay.com" class="truncation">ebay.com</a>
but I only get one value. Is it possible to extract all four sites and their overlap scores?
What I am trying to achieve is a table which includes this information:
W amazon.com
eBay.com 70.1
pinterest.com 54.7
wikipedia.org 51.3
facebook.com 50.4
I have tried:
from bs4 import BeautifulSoup
# data is assumed to already hold the HTML of the page above
soup = BeautifulSoup(data, "html.parser")
print([item.get_text(strip=True) for item in soup.select("span.site")])
but this does not seem to be enough to get the information, probably because the selector in the code is wrong.
Possible selectors: span.truncation, a.truncation, or div.site.

The page may use JavaScript to add elements, but BeautifulSoup and requests can't run JavaScript. You may need Selenium to control a real web browser, which can run JavaScript.

a.truncation is the element that you've shown in the question. The scores look like <span class="truncation">38.0</span>, so select them with span.truncation. As for the site class, it appears only on div elements, which is why span.site matches nothing.
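The selector advice above can be tried out with a minimal sketch like the one below. It is only an illustration under stated assumptions: SAMPLE_HTML is hypothetical markup modeled on the two snippets quoted in this thread (the real page may be structured differently or be filled in by JavaScript), extract_overlap is a made-up helper name, and the sketch assumes each score span follows its site div in document order.

```python
from bs4 import BeautifulSoup

# Hypothetical markup modeled on the snippets quoted in this thread.
# The wrapping "row" and "overlap" divs are assumptions, not the real page.
SAMPLE_HTML = """
<div class="row">
  <div class="site"><a href="/siteinfo/ebay.com" class="truncation">ebay.com</a></div>
  <div class="overlap"><span class="truncation">70.1</span></div>
</div>
<div class="row">
  <div class="site"><a href="/siteinfo/pinterest.com" class="truncation">pinterest.com</a></div>
  <div class="overlap"><span class="truncation">54.7</span></div>
</div>
<div class="row">
  <div class="site"><a href="/siteinfo/wikipedia.org" class="truncation">wikipedia.org</a></div>
  <div class="overlap"><span class="truncation">51.3</span></div>
</div>
<div class="row">
  <div class="site"><a href="/siteinfo/facebook.com" class="truncation">facebook.com</a></div>
  <div class="overlap"><span class="truncation">50.4</span></div>
</div>
"""


def extract_overlap(html):
    """Return a list of (site, overlap_score) pairs found in the HTML."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    # The site class sits on div elements, so start from div.site.
    for site_div in soup.select("div.site"):
        link = site_div.select_one("a.truncation")
        if link is None:
            continue  # e.g. a header cell without a link
        # Assumption: the score is the next span.truncation in document order.
        score = site_div.find_next("span", class_="truncation")
        if score is None:
            continue
        rows.append((link.get_text(strip=True), float(score.get_text(strip=True))))
    return rows


# Print the pairs as a simple two-column table.
for site, overlap in extract_overlap(SAMPLE_HTML):
    print(f"{site:<15} {overlap}")
```

If the same code returns an empty list on the HTML fetched with requests, that would suggest the elements are added by JavaScript, in which case the page source would have to come from a browser-driven tool such as Selenium before being passed to BeautifulSoup.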