Web Scraping Return Empty Value Using Xpath in Scrapy

Question

Really need the help from this community.

When I run the following code in the Scrapy shell to extract the vendor name:

response.xpath("//div[contains(@class,'check-prices-widget-not-sponsored')]/a/div[contains(@class,'check-prices-widget-not-sponsored-link')]").extract()

the output is empty. I don't know why this happens. My guess is that the website content is being updated dynamically?

The URL I am scraping is https://cruiseline.com/cruise/7-night-bahamas-florida-new-york-roundtrip-32860, and what I need is the vendor name and price for each vendor. The attached picture is a screenshot of the browser's Inspect panel.

Really appreciate the help!

Xavier Guihot · Accepted Answer · 2018-03-30 07:33:37Z


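You should always check the raw HTML source in your browser (usually with Ctrl+U) rather than relying on the Inspect panel, which shows the page after JavaScript has modified it.

This way you'll find that the information you want is not in the HTML elements at all; it is embedded inside JavaScript variables as JSON: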

var partnerPrices = [{"pool":"9a316391b6550eef969c8559c14a380f","partner":"ncl.com","priority":0,"currency":"USD","data":{"32860":{"2018-02-25":{"Inside":579,"Suite":1199,"Balcony":699,"Oceanview":629},....
var sponsored_partners = [{"code":"CDCNA","name":"cruises.com","value":"cruises.com","logo":"\/images\/partner-logo-cruises-sm.png","logo_sprite":"partner-logo-cruises-com"},...

So you need to import json, extract the two array literals from response.body (using re or another method), and then pass each extracted string to json.loads() so that you can iterate through the two arrays.
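The steps above can be sketched roughly as follows. This is a minimal illustration, not a tested spider for the site: the helper name extract_js_array and the shortened sample_html are made up for the example, and the sketch assumes each variable is assigned a valid JSON literal directly after "var name =". Instead of a regex that tries to match the whole array, it uses re only to find the assignment and lets json.JSONDecoder.raw_decode read exactly one JSON value from that position. Inside a Scrapy callback you would pass response.text in place of sample_html.

```python
import json
import re


def extract_js_array(html, var_name):
    """Return the JSON value assigned to `var <var_name> = ...`, or None.

    Hypothetical helper for illustration; assumes the assigned value is
    valid JSON (true for the two variables shown in the page source).
    """
    match = re.search(r"var\s+" + re.escape(var_name) + r"\s*=\s*", html)
    if match is None:
        return None
    # raw_decode parses a single JSON value starting at the given index and
    # ignores whatever follows it (the trailing ";" and the rest of the script).
    value, _end = json.JSONDecoder().raw_decode(html, match.end())
    return value


# Shortened, made-up sample shaped like the variables in the page source.
sample_html = """
<script>
var partnerPrices = [{"partner":"ncl.com","currency":"USD","data":{"32860":{"2018-02-25":{"Inside":579,"Suite":1199}}}}];
var sponsored_partners = [{"code":"CDCNA","name":"cruises.com"}];
</script>
"""

prices = extract_js_array(sample_html, "partnerPrices")
partners = extract_js_array(sample_html, "sponsored_partners")

# Walk the nested structure: partner -> cruise id -> sailing date -> cabin prices.
for entry in prices:
    for cruise_id, dates in entry["data"].items():
        for date, cabins in dates.items():
            print(entry["partner"], cruise_id, date, cabins)

for partner in partners:
    print(partner["code"], partner["name"])
```

How the two arrays relate to each other (for example, which field links a price entry to a sponsored partner) is not visible in the truncated source above, so that mapping would need to be checked against the full page.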

edited Mar 30, 2018 at 7:33

Xavier Guihot


answered Feb 11, 2018 at 11:04

gangabass

