I am working on a personal project and using Rails to learn the framework. The project is music-based, and I'm using ChartLyrics.com's API to retrieve song lyrics. The API returns XML, and I am having trouble extracting the actual Lyric element from it.

I have installed the Nokogiri gem to help parse the XML. The following is what I'm using to retrieve the data, from the Rails console:

require 'open-uri'
doc = Nokogiri::XML(URI.open("http://api.chartlyrics.com/apiv1.asmx/SearchLyricDirect?artist=michael%20jackson&song=bad"))
puts doc

<?xml version="1.0" encoding="utf-8"?>
<GetLyricResult xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns="http://api.chartlyrics.com/">
  <TrackId>0</TrackId>
  <LyricChecksum>a4a56a99ee00cd8e67872a7764d6f9c6</LyricChecksum>
  <LyricId>1710</LyricId>
  <LyricSong>Bad</LyricSong>
  <LyricArtist>Michael Jackson</LyricArtist>
  <LyricUrl>http://www.chartlyrics.com/28h-8gWvNk-Rbj1X-R7PXg/Bad.aspx</LyricUrl>
  <LyricCovertArtUrl>http://ec1.images-amazon.com/images/P/B000CNET66.02.MZZZZZZZ.jpg</LyricCovertArtUrl>
  <LyricRank>9</LyricRank>
  <LyricCorrectUrl>http://www.chartlyrics.com/app/correct.aspx?lid=MQA3ADEAMAA=</LyricCorrectUrl>
  <Lyric>
     Because I'm bad (bad-bad), I'm bad, come on (really, really bad)
     You know I'm bad (bad-bad), I'm bad, you know it (really, really bad)
     You know I'm bad (bad-bad), I'm bad, you know it (really, really bad) you know
     And the whole world has to answer right now
     Just to tell you once again
  </Lyric>
</GetLyricResult>

I shortened the lyrics to save space. How do I extract the Lyric element? I've tried all of the following:

> lyrics = doc.xpath('//Lyric')
=> []

> lyrics = doc.xpath('/Lyric')
=> []

> lyrics = doc.xpath('//GetLyricResult/Lyric')
=> []

> lyrics = doc.xpath('//GetLyricResult//Lyric')
=> []

> lyrics = doc.xpath('/GetLyricResult/Lyric')
=> []

lyrics is empty every time. Can anyone tell me what I'm doing wrong?

1 Answer

By default, an unprefixed name in a Nokogiri XPath query only matches elements that are not in any namespace, but every element in this document is in the default namespace declared on the root:

doc.namespaces
#=> {"xmlns:xsi"=>"http://www.w3.org/2001/XMLSchema-instance", "xmlns:xsd"=>"http://www.w3.org/2001/XMLSchema", "xmlns"=>"http://api.chartlyrics.com/"}

So you have to prefix the tag you're searching for with xmlns (you can leave out the actual URL, since Nokogiri registers the namespaces declared on the root element and makes the default namespace available under the xmlns prefix):

doc.xpath('//xmlns:Lyric')

Alternatively, you can search using CSS, which applies the document's default namespace for you:

doc.css('Lyric')

See also: Why doesn't Nokogiri xpath like xmlns declarations

2 Comments

That worked!!! Thank you very much, shioyama! PHP isn't crazy about namespaces either. Should've known.
Namespaces are a necessary evil when dealing with complex XML. They are massive overkill for normal XML and confuse people no end. Nokogiri's use of CSS to step around the namespacing problem is a nice icing layer, and is one reason why I prefer using CSS over XPath when parsing.
