I have generated an XML document. It contains a few empty nodes that I want to remove.

My XML

https://pastebin.com/wzjmZChU

I want to remove all empty nodes from my XML. Using XPath, I tried:

$xpath = '//*[not(node())]';
foreach ($xml->xpath($xpath) as $remove) {
    unset($remove[0]);
}

The code above works up to a point, but it does not remove all of the empty nodes.

Edit

I have tried the code above, and it only works for a single level.

1 Answer

If you consider any element node without child nodes to be empty, //*[not(node())] will select those. But removing them can leave their parents empty in turn, so you need an expression that selects not only the element nodes that are currently empty, but also those whose descendants are all empty (recursively). Additionally, you might want to avoid removing the document element even if it is empty, because that would result in an invalid document.
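To illustrate the problem, here is a minimal sketch of a single pass with the expression from the question. The sample document is made up for this illustration; it is not the XML from the pastebin link.

```php
<?php
// Hypothetical sample: <bar> only becomes empty once <empty/> is removed.
$xml = new SimpleXMLElement('<foo><bar><empty/></bar><bar>text</bar></foo>');

// Single pass with the expression from the question.
foreach ($xml->xpath('//*[not(node())]') as $remove) {
    unset($remove[0]);
}

// <empty/> is gone, but the first <bar> is now empty itself.
// It was not part of the result set, so it stays in the document.
$afterOnePass = $xml->asXML();
echo $afterOnePass;
```

Repeating the pass until nothing matches would be one way around this, but whitespace-only text nodes left behind in indented documents would still count as child nodes, which is why the expression below tests the text content instead.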

Building up the expression

  • Select the document element
    /*
  • Any descendant of the document element
    /*//*
  • ...with only whitespace as text content (this includes descendants)
    /*//*[normalize-space(.) = ""]
  • ...and no attributes
    /*//*[normalize-space(.) = "" and not(@*)]
  • ...or descendants with attributes
    /*//*[normalize-space(.) = "" and not(@* or .//*[@*])]
  • ...or a comment
    /*//*[normalize-space(.) = "" and not(@* or .//*[@*] or .//comment())]
  • ...or a processing instruction
    /*//*[ normalize-space(.) = "" and not(@* or .//*[@*] or .//comment() or .//processing-instruction()) ]
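The effect of each step can be checked by counting the matches against a small document. This is only a sketch: the sample XML and the variable names ($steps, $counts) are assumptions made for this illustration.

```php
<?php
// Hypothetical sample covering the cases the expression distinguishes.
$xmlString = <<<'XML'
<foo>
  <empty/>
  <bar><empty/></bar>
  <bar attr="value"><empty/></bar>
  <bar>text</bar>
  <bar><!-- comment --></bar>
</foo>
XML;

$xml = new SimpleXMLElement($xmlString);

// The expressions from the list above, from least to most restrictive.
$steps = [
    '/*//*',
    '/*//*[normalize-space(.) = ""]',
    '/*//*[normalize-space(.) = "" and not(@*)]',
    '/*//*[normalize-space(.) = "" and not(@* or .//*[@*])]',
    '/*//*[normalize-space(.) = "" and not(@* or .//*[@*] or .//comment())]',
    '/*//*[normalize-space(.) = "" and not(@* or .//*[@*] or .//comment() or .//processing-instruction())]',
];

// Count how many elements each expression selects.
$counts = [];
foreach ($steps as $expr) {
    $counts[$expr] = count($xml->xpath($expr));
    printf("%2d  %s\n", $counts[$expr], $expr);
}
```

Each added condition can only keep the number of matches the same or lower it, so a step that does not change the count simply has no matching case in the sample.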

Put together

Iterate the result in reverse order, so that child nodes are deleted before parents.

$xmlString = <<<'XML'
<foo>
  <empty/>
  <empty></empty>
  <bar><empty/></bar>
  <bar attr="value"><empty/></bar>
  <bar>text</bar>
  <bar>
   <empty/>
   text
  </bar>
  <bar>
   <!-- comment -->
  </bar>
</foo>
XML;

$xml = new SimpleXMLElement($xmlString);

$xpath = '/*//*[
  normalize-space(.) = "" and
  not(
    @* or 
    .//*[@*] or 
    .//comment() or
    .//processing-instruction()
  )
]';
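// Reverse the matches so that children are removed before their parents.
foreach (array_reverse($xml->xpath($xpath)) as $remove) {
  // unset($remove[0]) detaches the matched element itself from the document.
  unset($remove[0]);
}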

echo $xml->asXml();

Output:

<?xml version="1.0"?>
<foo>



  <bar attr="value"/>
  <bar>text</bar>
  <bar>

   text
  </bar>
  <bar>
   <!-- comment -->
  </bar>
</foo>
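The blank lines in the output are the whitespace text nodes that surrounded the removed elements. If they bother you, one option is to reload the result through DOMDocument with preserveWhiteSpace disabled and formatOutput enabled. This is a sketch under the assumption that the document contains no significant whitespace; the input below is a made-up stand-in for an already cleaned document, and whitespace inside mixed content (text next to elements) may still be kept.

```php
<?php
// Hypothetical stand-in for a document after the empty elements were removed.
$xml = new SimpleXMLElement(
    "<foo>\n  \n  \n  <bar attr=\"value\"/>\n  <bar>text</bar>\n</foo>"
);

$dom = new DOMDocument();
$dom->preserveWhiteSpace = false; // must be set before loading the XML
$dom->formatOutput = true;        // indent the output when serializing
$dom->loadXML($xml->asXML());

$pretty = $dom->saveXML();
echo $pretty;
```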