How can I get the data from a css selector using scrapy?

Question

I'm new to web development and Scrapy, so apologies if this is a basic question.

I've got this: item['title'] = response.css('.pt-title a ::title').extract()

And I want to get the title from this:

<a href="http://www.heresyodomain.com/" title="Here's the title!">Here's the title!</a>

I was previously using item['title'] = response.css('.pt-title a::text').extract(), but I realized that this only returns the link text between the tags, not the value of the title attribute.

I've tried a few iterations of what I have above, that's just the last one I left off on. A little guidance would be much appreciated.

GHajba · Accepted Answer · 2015-08-05 07:42:50Z

3

Your query selects the text of the a tag because of a::text. If you need the value of the title attribute, try the following:

item['title'] = response.css('.pt-title a::attr(title)').extract()

Note that extract() returns a list, so you need to handle that too, for example with item['title'] = response.css('.pt-title a::attr(title)').extract()[0], after first checking that the list is not empty.



Community · Accepted Answer · 2017-05-23 11:43:52Z

1

Thanks to the question "python scrapy get href using css selector", I found the answer.

I used this: item['title'] = response.css('.pt-title a::attr(title)').extract()

edited May 23, 2017 at 11:43


answered Aug 5, 2015 at 7:38

SirRupertIII
