I'm trying to scrape data from Wikipedia in Python using XPath, but I'm getting an empty list. What am I doing wrong?

import requests
from lxml import html

pageContent = requests.get(
    'https://en.wikipedia.org/wiki/List_of_Olympic_medalists_in_judo'
)

# Parse the raw response body into an element tree
tree = html.fromstring(pageContent.content)

# XPath copied from the browser; this is the query that comes back empty
name = tree.xpath('//*[@id="mw-content-text"]/div/table[1]/tbody/tr[2]/td[2]/a[1]/text()')

print(name)

1 Answer

This is a very common mistake when copying an XPath for a table from the browser. The browser normally inserts a tbody tag inside table elements when it builds the DOM, but that tag doesn't actually exist in the response body, so the copied path matches nothing.

So just remove it, and the expression should look like this:

'//*[@id="mw-content-text"]/div/table[1]//tr[2]/td[2]/a[1]/text()'
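
To see the difference without hitting the network, here is a minimal sketch that runs both expressions against a hand-written fragment. The markup and the names in it (RAW_HTML, "Example Athlete") are made up for illustration and only mimic the shape of a raw response that has no tbody; it is not the real Wikipedia page.

```python
from lxml import html

# Hypothetical markup standing in for the raw server response.
# Note there is no <tbody> element, unlike what the browser's inspector shows.
RAW_HTML = """
<div id="mw-content-text">
  <div>
    <table>
      <tr><th>Games</th><th>Gold</th></tr>
      <tr><td>Example Games</td><td><a href="/wiki/A">Example Athlete</a></td></tr>
    </table>
  </div>
</div>
"""

tree = html.fromstring(RAW_HTML)

# XPath as copied from the browser: requires a tbody child that is not there.
with_tbody = tree.xpath(
    '//*[@id="mw-content-text"]/div/table[1]/tbody/tr[2]/td[2]/a[1]/text()'
)

# Same XPath with tbody dropped; '//tr' matches rows at any depth under the table.
without_tbody = tree.xpath(
    '//*[@id="mw-content-text"]/div/table[1]//tr[2]/td[2]/a[1]/text()'
)

print(with_tbody)     # empty list: nothing matches
print(without_tbody)  # the link text from the second row
```

Because `//tr` matches rows at any depth, the second expression should keep working whether or not a tbody is present in the markup.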

2 Comments

Awesome, that worked, thanks. What about something like this: bittrex.com/Market/Index?MarketName=btc-nxt? I'm trying to get the XPath data for the price in dollars with //*[@id="rowChart"]/div[2]/div/div[2]/div/div[2]/span/text(), but it's also giving me a blank value.
@TarikKoric that's a totally different question and a totally different case. First, the site is protected by Cloudflare, so getting any kind of data from it will be extremely difficult. It also looks like the entire site is generated dynamically with JavaScript, so a normal request won't work.
