
How do I access the value of a particular node in an XML file using R? I am new to R and would also like to know why xmltop[[1]]$IP returns NULL. What am I doing wrong?

library('XML')
xmlfile <- xmlTreeParse("E:\\R Scripts\\Data\\Ipdata.xml")
xmltop = xmlRoot(xmlfile)
xmltop[[1]]$IP    # returns NULL
xmlValue(xmltop[[1]]$IP)    # returns NA

XML:

<Response>
<location>
 <IP>213.139.122.103</IP>
 <CountryCode>FR</CountryCode>
 <CountryName>France</CountryName>
 <RegionCode/>
 <RegionName/>
 <City/>
 <ZipCode/>
 <TimeZone>Europe/Paris</TimeZone>
 <Latitude>48.86</Latitude>
 <Longitude>2.35</Longitude>
 <MetroCode>0</MetroCode>
 </location>
 <location>
 <IP>213.139.122.102</IP>
 <CountryCode>INR</CountryCode>
 <CountryName>India</CountryName>
 <RegionCode/>
 <RegionName/>
 <City/>
 <ZipCode/>
 <TimeZone>Chennai</TimeZone>
 <Latitude>48.83</Latitude>
 <Longitude>2.34</Longitude>
 <MetroCode>0</MetroCode>
 </location>
</Response>
  • xml2 is pretty nice for parsing, though you'll need your XPath skills. If you want the text contents of all <IP> nodes, library(xml2) ; xml %>% read_xml() %>% xml_find_all('//IP') %>% xml_text() where xml is the XML text or a path to the file. Commented May 17, 2016 at 5:17
  • @kumar Did my answer resolve your query? If yes, then accept and upvote. Commented May 17, 2016 at 7:59
  • xmltop[[1]][["IP"]] won't give you just the IP. It returns the node. Commented May 17, 2016 at 7:59

2 Answers


You can access it by this command:

xmltop[[1]][["IP"]]
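As a comment on the question points out, this expression returns the node rather than its text. A minimal sketch of getting from the node to the value, assuming a shortened inline copy of the question's XML (the `xml_string` variable is made up here so the example runs without the file):

```r
library(XML)

# Hypothetical inline, shortened copy of the question's XML (assumption for this sketch)
xml_string <- "<Response>
  <location><IP>213.139.122.103</IP><CountryCode>FR</CountryCode></location>
  <location><IP>213.139.122.102</IP><CountryCode>INR</CountryCode></location>
</Response>"

# Parse into the default R-level tree, as in the question
xmlfile <- xmlTreeParse(xml_string, asText = TRUE)
xmltop  <- xmlRoot(xmlfile)

# [["IP"]] selects the child element named IP of the first <location>
ip_node <- xmltop[[1]][["IP"]]

# xmlValue() extracts the text content of that node
ip_value <- xmlValue(ip_node)
print(ip_value)
```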

Even better, you can use XPath through the xpathApply or xpathSApply functions to access all IP tags:

xpathApply(xmltop, "//IP")

Then you can extract information from these nodes with functions such as xmlValue:

xpathApply(xmltop, "//IP", xmlValue)

EDIT: You need to modify your original code a little (parse the document into XMLInternalNode objects) to use functions such as xmlValue, as follows:

xmlfile <- xmlTreeParse("Ipdata.xml", useInternalNodes = TRUE)
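Putting the pieces of this answer together, the following is a rough sketch of the XPath approach on internal nodes. It assumes a shortened inline copy of the question's XML (`xml_string` is a made-up name) in place of the Ipdata.xml file:

```r
library(XML)

# Hypothetical inline, shortened copy of the question's XML (assumption for this sketch)
xml_string <- "<Response>
  <location><IP>213.139.122.103</IP><CountryCode>FR</CountryCode></location>
  <location><IP>213.139.122.102</IP><CountryCode>INR</CountryCode></location>
</Response>"

# Parse into internal (C-level) nodes so the XPath functions can be applied
xmlfile <- xmlTreeParse(xml_string, asText = TRUE, useInternalNodes = TRUE)
xmltop  <- xmlRoot(xmlfile)

# Apply xmlValue to every <IP> element; xpathSApply simplifies the result to a vector
ips <- xpathSApply(xmltop, "//IP", xmlValue)
print(ips)
```

xpathSApply is used here instead of xpathApply only because a character vector is usually easier to work with than a list.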

1 Comment

Thanks, it was very helpful.

It can be accessed using xmltop[[1]][[1]][[1]], xmlValue(xmltop[[1]][[1]]), or xmltop[[1]][["IP"]][1]$text. The elements aren't named after the nodes.

I would recommend converting it to a data frame or a list using this code:

Data frame:

xmldataframe <- xmlToDataFrame("E:\\R Scripts\\Data\\Ipdata.xml", stringsAsFactors=FALSE)

xmldataframe$IP[1]

List:

xmllist <- xmlToList("E:\\R Scripts\\Data\\Ipdata.xml")

xmllist[[1]]$IP
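For a version that runs without the file, here is a small sketch of both conversions. It assumes a shortened inline copy of the question's XML; the `xml_string`, `doc`, `df` and `lst` names are made up for this example:

```r
library(XML)

# Hypothetical inline, shortened copy of the question's XML (assumption for this sketch)
xml_string <- "<Response>
  <location><IP>213.139.122.103</IP><CountryCode>FR</CountryCode></location>
  <location><IP>213.139.122.102</IP><CountryCode>INR</CountryCode></location>
</Response>"

# Parse the text once and reuse the document for both conversions
doc <- xmlParse(xml_string, asText = TRUE)

# One row per <location>, one column per child element
df <- xmlToDataFrame(doc, stringsAsFactors = FALSE)
print(df$IP[1])

# Nested list: one entry per <location>, each holding its child values by name
lst <- xmlToList(doc)
print(lst[[1]]$IP)
```

With the list form, `$IP` works because xmlToList names each entry after its element, unlike the raw node objects in the question.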

2 Comments

Thanks, it was very helpful. If I could, I would mark both as answers.
@kumar Welcome. But you can at least upvote my answer, as mine was posted first.
