I am trying to scrape the year from the HTML below (https://www.espncricinfo.com/series/indian-premier-league-2022-1298423/punjab-kings-vs-delhi-capitals-64th-match-1304110/full-scorecard). Because of the way the site is coded, I first have to identify the table cell that contains the word "Season" and then get the year from the next cell (2022 in this example).

I thought this would get it but it doesn't. There are no errors, just no results. I've not used the following-sibling approach before so I'd be grateful if someone could point out where I've messed up.

l.add_xpath(
            'Season',
            "//td[contains(text(),'Season')]/following-sibling::td[1]/a/text()")

html:

<tr class="ds-border-b ds-border-line">
    <td class="ds-min-w-max ds-border-r ds-border-line">
        <span class="ds-text-tight-s ds-font-medium">Season</span>
    </td>
    <td class="ds-min-w-max">
        <span class="ds-inline-flex ds-items-center ds-leading-none">
            <a href="https://www.espncricinfo.com/ci/engine/series/index.html?season2022" class="ds-text-ui-typo ds-underline ds-underline-offset-4 ds-decoration-ui-stroke hover:ds-text-ui-typo-primary hover:ds-decoration-ui-stroke-primary ds-block">
                <span class="ds-text-tight-s ds-font-medium">2022</span>
            </a>
        </span>
    </td>
</tr>

1 Answer

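Your XPath matches nothing because the word "Season" is not a direct text node of the <td>; it sits inside a child <span>. The year is likewise inside a <span> within the <a>, so a/text() would not reach it either. Try anchoring on the span instead: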

//span[contains(text(),"Season")]/../following-sibling::td/span/a/span/text()
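
To check the expressions outside a spider, here is a minimal sketch using lxml, the library that Scrapy's selectors are built on. It assumes the row from the question wrapped in a <table> element (the wrapper and the shortened class attribute on the <a> are assumptions made so the fragment parses on its own). The last expression is a variant of my own that keeps the td-based anchor from the question; it is not part of the answer above.

```python
from lxml import html

# The row from the question, wrapped in a <table> so it parses standalone.
SNIPPET = """<table>
<tr class="ds-border-b ds-border-line">
    <td class="ds-min-w-max ds-border-r ds-border-line">
        <span class="ds-text-tight-s ds-font-medium">Season</span>
    </td>
    <td class="ds-min-w-max">
        <span class="ds-inline-flex ds-items-center ds-leading-none">
            <a href="https://www.espncricinfo.com/ci/engine/series/index.html?season2022" class="ds-block">
                <span class="ds-text-tight-s ds-font-medium">2022</span>
            </a>
        </span>
    </td>
</tr>
</table>"""

tree = html.fromstring(SNIPPET)

# XPath from the question: the <td> has no direct text containing "Season".
original = tree.xpath(
    "//td[contains(text(),'Season')]/following-sibling::td[1]/a/text()"
)

# XPath from the answer: anchor on the <span>, step up to its <td>,
# then walk down the sibling cell to the inner <span>.
fixed = tree.xpath(
    '//span[contains(text(),"Season")]/../following-sibling::td/span/a/span/text()'
)

# Variant (an assumption, not from the answer): contains(., ...) tests the
# whole string value of the <td>, and normalize-space() trims the whitespace.
variant = tree.xpath(
    "normalize-space(//td[contains(., 'Season')]/following-sibling::td[1])"
)

print(original, fixed, variant)
```

Since Scrapy evaluates XPath through the same engine, the second or third expression should be usable as the argument to l.add_xpath('Season', ...), although the live page may differ from the snippet shown in the question.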

2 Comments

Thanks @F.Hoque, however there is a lot of different content with a similar path. That's why I tried using contains.
@Andy, I've updated the answer and tested it against your URL; it now selects only the text value 2022.
