I have an XML file (base.xml) that I read into a DF called "billing". The file looks like this:

<?xml version="1.0" encoding="ISO-8859-1" ?>


<test:TASS xmlns="http://www.vvv.com/schemas"  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"  xsi:schemaLocation="http://www.vvv.com/schemas http://www.vvv.com/schemas/testV2_02_03.xsd"  xmlns:test="http://www.vvv.com/schemas" >
    <test:house>
                <test:billing>
                    <test:proceduresummary>
                        <test:guidenumber>X2030</test:guidenumber>
                            <test:diagnosis>
                                <test:table>ICD-10</test:table>
                                <test:diagnosiscod>J441</test:diagnosiscod>
                                <test:description>CHRONIC OBSTRUCTIVE PULMONARY DISEASE WITH (ACUTE) EXACERBATION</test:description>
                            </test:diagnosis>
                            <test:procedure>
                                <test:procedure>
                                    <test:description>HOSPITAL</test:description>
                                </test:procedure>
                                <test:amount>12</test:amount>
                            </test:procedure>
                    </test:proceduresummary>
                </test:billing>
                    <test:billing>
                    <test:proceduresummary>
                        <test:guidenumber>Y6055</test:guidenumber>
                            <test:diagnosis>
                                <test:table>ICD-10</test:table>
                                <test:diagnosiscod>I21</test:diagnosiscod>
                                <test:description>ACUTE MYOCARDIAL INFARCTION</test:description>
                            </test:diagnosis>
                            <test:procedure>
                                <test:procedure>
                                    <test:description>HOSPITAL</test:description>
                                </test:procedure>
                                <test:amount>8</test:amount>
                            </test:procedure>
                    </test:proceduresummary>
                </test:billing>
                    <test:billing>
                    <test:proceduresummary>
                        <test:guidenumber>Z9088</test:guidenumber>
                            <test:diagnosis>
                                <test:table>ICD-10</test:table>
                                <test:diagnosiscod>F20</test:diagnosiscod>
                                <test:description>SCHIZOPHRENIA</test:description>
                            </test:diagnosis>
                            <test:procedure>
                                <test:procedure>
                                    <test:description>HOSPITAL</test:description>
                                </test:procedure>
                                <test:amount>1</test:amount>
                            </test:procedure>
                    </test:proceduresummary>
                </test:billing>
    </test:house>
</test:TASS>

My code:

require(tidyverse)
require(xml2)
setwd("D:/")
page<- read_xml("base.xml")

Converting to a data frame:

ns<- page %>% xml_find_all(".//test:billing")
billing<-xml2::as_list(ns) %>% jsonlite::toJSON() %>% jsonlite::fromJSON()

In the resulting DF, each variable contains further nested variables (lists or data frames). I would like to turn these sub-variables into standard columns (integer, character, ...) and build a DF without the nested lists and data frames. Is it possible?

DF should look like this.

guidenumber<- c('X2030','Y6055','Z9088')
table<- c('ICD-10','ICD-10','ICD-10')
diagnosiscod<- c('J441','I21','F20')
description<- c('CHRONIC OBSTRUCTIVE PULMONARY DISEASE WITH (ACUTE) EXACERBATION','ACUTE MYOCARDIAL INFARCTION','SCHIZOPHRENIA')
procedure<- c('HOSPITAL','HOSPITAL','HOSPITAL')
amount<- c(12,8,1)
DF<- data.frame(guidenumber,table,diagnosiscod,description,procedure,amount)

1 Answer

Here is one way to do it:

library(xml2)

page <- read_xml("base.xml")

# xml_text() returns the text of every matched node as a character vector
guidenumber  <- xml_text(xml_find_all(page, ".//test:guidenumber"))
table        <- xml_text(xml_find_all(page, ".//test:table"))
diagnosiscod <- xml_text(xml_find_all(page, ".//test:diagnosiscod"))
# <test:description> occurs under both diagnosis and procedure, so qualify the path
description  <- xml_text(xml_find_all(page, ".//test:diagnosis//test:description"))
procedure    <- xml_text(xml_find_all(page, ".//test:procedure//test:description"))
# the desired DF stores amount as a number, not as text
amount       <- as.numeric(xml_text(xml_find_all(page, ".//test:amount")))

DF <- data.frame(guidenumber, table, diagnosiscod, description, procedure, amount)
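
Note that building each column separately only lines up if every billing record contains every node. A possible variation, sketched below, walks the billing nodes and uses xml_find_first(), which returns a missing node (read back as NA) when a record lacks an element. The two-record sample XML and the helper name first_text are assumptions made for illustration; they are not part of the original file or of xml2.

```r
library(xml2)

# Hypothetical sample: the second record deliberately has no <test:amount>
xml_txt <- '<test:TASS xmlns:test="http://www.vvv.com/schemas">
  <test:house>
    <test:billing>
      <test:proceduresummary>
        <test:guidenumber>X2030</test:guidenumber>
        <test:diagnosis><test:diagnosiscod>J441</test:diagnosiscod></test:diagnosis>
        <test:procedure>
          <test:procedure><test:description>HOSPITAL</test:description></test:procedure>
          <test:amount>12</test:amount>
        </test:procedure>
      </test:proceduresummary>
    </test:billing>
    <test:billing>
      <test:proceduresummary>
        <test:guidenumber>Y6055</test:guidenumber>
        <test:diagnosis><test:diagnosiscod>I21</test:diagnosiscod></test:diagnosis>
        <test:procedure>
          <test:procedure><test:description>HOSPITAL</test:description></test:procedure>
        </test:procedure>
      </test:proceduresummary>
    </test:billing>
  </test:house>
</test:TASS>'

page    <- read_xml(xml_txt)
ns      <- xml_ns(page)
billing <- xml_find_all(page, ".//test:billing", ns)

# Illustrative helper: one value per billing record, NA when the node is absent
first_text <- function(nodes, xpath) xml_text(xml_find_first(nodes, xpath, ns))

DF <- data.frame(
  guidenumber  = first_text(billing, ".//test:guidenumber"),
  diagnosiscod = first_text(billing, ".//test:diagnosiscod"),
  procedure    = first_text(billing, ".//test:procedure/test:procedure/test:description"),
  amount       = as.numeric(first_text(billing, ".//test:amount")),
  stringsAsFactors = FALSE
)
```

Because every column is derived from the same billing node set, each row always describes a single record, even when some elements are missing.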

2 Comments

Thanks! The XML I am working on has many more nodes, so I would rather not pick individual nodes to turn into a DF; I would like to convert the whole file. Is that possible?
Yes, I am sure there is more than 1 way to do it. XPath expressions are pretty flexible.
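
As one possible illustration of that flexibility, the sketch below converts every leaf element of each billing record without naming the nodes in advance. The XPath .//*[not(*)] selects elements that have no child elements. The sample XML, the helper name flatten_record, and the parent_child column naming are assumptions chosen for this example; the naming is only there to keep the two description nodes apart. It also assumes every record has the same set of leaves, since rbind() fails otherwise (dplyr::bind_rows() could be swapped in to fill gaps with NA).

```r
library(xml2)

# Hypothetical sample with the same nesting as the file in the question
xml_txt <- '<test:TASS xmlns:test="http://www.vvv.com/schemas">
  <test:house>
    <test:billing>
      <test:proceduresummary>
        <test:guidenumber>X2030</test:guidenumber>
        <test:diagnosis>
          <test:diagnosiscod>J441</test:diagnosiscod>
          <test:description>COPD</test:description>
        </test:diagnosis>
        <test:procedure>
          <test:procedure><test:description>HOSPITAL</test:description></test:procedure>
          <test:amount>12</test:amount>
        </test:procedure>
      </test:proceduresummary>
    </test:billing>
    <test:billing>
      <test:proceduresummary>
        <test:guidenumber>Y6055</test:guidenumber>
        <test:diagnosis>
          <test:diagnosiscod>I21</test:diagnosiscod>
          <test:description>INFARCTION</test:description>
        </test:diagnosis>
        <test:procedure>
          <test:procedure><test:description>HOSPITAL</test:description></test:procedure>
          <test:amount>8</test:amount>
        </test:procedure>
      </test:proceduresummary>
    </test:billing>
  </test:house>
</test:TASS>'

page    <- read_xml(xml_txt)
billing <- xml_find_all(page, ".//test:billing", xml_ns(page))

# Illustrative helper: turn one billing node into a one-row data frame
flatten_record <- function(node) {
  leaves  <- xml_find_all(node, ".//*[not(*)]")  # elements without child elements
  parents <- vapply(leaves, function(leaf) xml_name(xml_parent(leaf)), character(1))
  cols    <- paste(parents, xml_name(leaves), sep = "_")
  as.data.frame(as.list(setNames(xml_text(leaves), cols)), stringsAsFactors = FALSE)
}

DF <- do.call(rbind, lapply(billing, flatten_record))
# Let R guess column types (for example, amount becomes integer)
DF <- type.convert(DF, as.is = TRUE)
```

The resulting columns are named after the parent and the leaf (for instance diagnosis_description and procedure_description), so they can be renamed afterwards to match the desired DF.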
