I'm using Scrapy to crawl and scrape a website. I need the whole HTML of an element rather than its individual components. Extracting the components with XPath selectors is easy, but is there a way to extract the whole HTML block for a given class? For example, in the HTML below, I need the exact source of the entire div block with class prod-basic-info. Is there any way I can do this?

<div class="block prod-basic-info">
 <h2>Product information</h2>
 <p class="product-info-label">Category</p>
  <p>
   <a href="xyz.html"></a>
 </p>
</div>

1 Answer

Just point your XPath expression or CSS selector at the element itself and call extract() on it. Either of the following returns the full HTML of the first matching div:

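# XPath: match any div whose class attribute contains "prod-basic-info"
response.xpath('//div[contains(@class, "prod-basic-info")]').extract()[0]
# CSS alternative: match div elements that carry the prod-basic-info class
response.css('div.prod-basic-info').extract()[0]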