0

Background: I understand this question has been asked, but I'm really struggling with finding the right Xpath formula. I'm working with the iTunes XML file. So Apple is really annoying how they formatted this file.... Instead of making a tag, and then the text being the ID, they only using , , <some_value> tags. This is making it REALLY confusing trying to find the right element.

I've been trying to following this question: Python ElementTree: find element by its child's text using XPath but it's just not working for me. I've been reading this (https://lxml.de/tutorial.html) document, but i'm not finding the answer I need. I'm fairly certain the Xpath is the way to go, but I'll take any better suggestions.

I did try using some itunes specific libraries. It seems like they are all out of date, or just not working how I need. I was initially using Elementtree, but the getnext() feature from lxml is a savior when it Apple formats the file this way.

Example XML file:

<plist version="1.0">
<dict>
    <key>Library Persistent ID</key><string>6948B4402F0EEFFF</string>
    <key>Tracks</key>
    <dict>
        <key>18051</key>
        <dict>
            <key>Track ID</key><integer>18051</integer>
            <key>Size</key><integer>7930116</integer>
            <key>Total Time</key><integer>196336</integer>
            <key>BPM</key><integer>86</integer>
            <key>Date Modified</key><date>2018-10-23T12:41:05Z</date>
            <key>Date Added</key><date>2017-07-25T02:49:11Z</date>
        </dict>
        <key>18053</key>
        <dict>
            <key>Track ID</key><integer>18053</integer>
            <key>Size</key><integer>9780560</integer>
            <key>Total Time</key><integer>243513</integer>
            <key>Year</key><integer>2010</integer>
            <key>BPM</key><integer>74</integer>
            <key>Date Modified</key><date>2018-10-23T12:41:09Z</date>
            <key>Date Added</key><date>2017-07-25T02:49:11Z</date>
        </dict>
        <key>18055</key>
        <dict>
            <key>Track ID</key><integer>18055</integer>
            <key>Size</key><integer>12995663</integer>
            <key>Total Time</key><integer>323604</integer>
            <key>Year</key><integer>2005</integer>
            <key>BPM</key><integer>76</integer>
            <key>Date Modified</key><date>2018-10-23T12:41:14Z</date>
        <key>Date Added</key><date>2017-07-25T02:49:11Z</date>
        </dict>

The Approach: So let's say I need to find the element that has the ID '18053', Instead of iterating over everything recursively (which I've figured out how to do), it would be much more efficient if I could check for the ID #. Then get the element of the that's after

I've tried the following:

root = etree.parse(xml_file_path)
key_found = root.xpath("//key[text()='18053']")
element_wanted = key_found.getnext()

But get an error because key_found is a list, not an element.

Using find, should return an element, but using the following:

key_found = root.find("//key[text()='18053']")

I get an error "bad predicate"

Any help is appreciated. I've been working on this one for a few days now. Blame Apple! Thanks!

1
  • What's wrong with the fact that xpath returns a list? Just ask for element_wanted = key_found[0].getnext() (maybe check that key_found is not empty first). Commented Feb 5, 2021 at 2:58

1 Answer 1

1

The xpath method should return a list, so

element_wanted = key_found.getnext()

would be

element_wanted = key_found[0].getnext()

Provided you also test if the list contains elements

Sign up to request clarification or add additional context in comments.

1 Comment

Geez.... Clearly I've been staring at this too long. Thanks for being a second pair of eyes.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.