I'm using Scrapy and I'd like to extract the text from a list with the following HTML structure:

u'<div id="someId">'
u'<p><strong>Text1:</strong> next to text 1</p>'
u'<p><strong>Text2:</strong> next to text 2</p>'
u'<p><strong>Text3:</strong> next to text 3</p>'
u'</div>'

So I'd like to get just the text:

Text1: next to text 1

Text2: next to text 2

Text3: next to text 3

I want to do as much of the extraction as possible with XPath. I've tried a few XPath predicates, but none of them solved the problem.

With

response.xpath('//*[@id="someId"]/p/text()').extract()

I don't get the text of the strong tag inside each p.

Any help would be much appreciated.

1 Answer

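You were close. Use // before text() instead of /:

'//*[@id="someId"]/p//text()'

This gets you a list of every text node inside each p tag, including the text nested in strong, because // selects descendants at any depth while / selects direct children only. Note that attribute values are case-sensitive in XPath, so the id has to be written exactly as in the HTML (someId).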

2 Comments

Thanks, I wasn't aware of "//"
my pleasure @jack.the.ripper
