I'm new to web scraping and Scrapy, so apologies if this question is basic.

I've got this: item['title'] = response.css('.pt-title a ::title').extract()

And I want to get the title from this:

<a href="http://www.heresyodomain.com/" title="Here's the title!">Here's the title!</a>

I was previously using item['title'] = response.css('.pt-title a::text').extract(), but I realized that this only returns the link text between the tags, not the value of the title attribute.

I've tried a few variations of the selector above; the one shown is just the last one I tried. Any guidance would be much appreciated.

2 Answers

Your query selects the text content of the a tag because of the ::text pseudo-element. To get the value of the title attribute instead, use ::attr(title):

item['title'] = response.css('.pt-title a::attr(title)').extract()

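Note that extract() returns a list, so to get a single string you need to take the first element, e.g. item['title'] = response.css('.pt-title a::attr(title)').extract()[0], after checking that the list is not empty. Alternatively, extract_first() returns the first match, or None if nothing matched.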

I found the answer thanks to this question: "python scrapy get href using css selector".

I used this: item['title'] = response.css('.pt-title a::attr(title)').extract()
