I have the following XML, which I need to process:

<table>
<col1>check1</col1>
<col2>check2</col2>
<col3>check3</col3>
<content>
    <data>gt1</data>
    <data>check_gt1</data>
</content>
</table>

I want to get "<content><data>gt1</data><data>check_gt1</data></content>" from the parser.

My parsing code is as follows:

my $parser = XML::LibXML->new();
my $respDom = $parser->parse_string($xmldata);
print "content is ".$respDom->getDocumentElement->findnodes("//content");

The above code prints only the text content inside the nodes. How can I get the markup I mentioned above?

  • Note that you can call findnodes directly on the document node: $respDom->findnodes("//content") Commented Jan 12, 2016 at 13:11
  • Thanks. Is there any way to get only "<data>gt1</data><data>check_gt1</data>" in Perl without doing the parsing? Commented Jan 13, 2016 at 10:20

1 Answer

The XML::LibXML::Node objects have a method toString. That's what you need. I found it with a quick search of the XML::LibXML documentation.

use strict;
use warnings;
use XML::LibXML;

my $xmldata = <<'XML';
<table>
<col1>check1</col1>
<col2>check2</col2>
<col3>check3</col3>
<content>
    <data>gt1</data>
    <data>check_gt1</data>
</content>
</table>
XML

my $parser = XML::LibXML->new();
my $respDom = $parser->parse_string($xmldata);
print "content is "
  . $respDom->getDocumentElement->findnodes("//content")->[0]->toString;

This will print:

content is <content>
    <data>gt1</data>
    <data>check_gt1</data>
</content>

In general, I always search for either to_string, as_string, stringify or simply string if I need something like that and am not sure how that works in a specific module. It's almost always one of those.


Update

To get only the inner XML of the <content> element, you have to grab its child nodes and call toString on each of them. The whole map call needs to be evaluated in list context, or it will throw an error. Note how I changed the . to a , in the print statement.

print "content is "
  , $respDom->getDocumentElement->findnodes("//content")->[0]->childNodes->map(sub {$_->toString});
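
If the inner XML is needed as a single string rather than printed piece by piece, one option is to join the serialized children. The following is a minimal sketch, not the only way to do it. It assumes a shortened copy of the sample document, uses the variable names $dom, $content and $inner purely for illustration, and selects only element children with the XPath ./* so that the whitespace-only text nodes between the <data> elements are left out.

```perl
use strict;
use warnings;
use XML::LibXML;

# Shortened copy of the sample document (assumption: only <content> matters here).
my $xmldata = <<'XML';
<table>
<content>
    <data>gt1</data>
    <data>check_gt1</data>
</content>
</table>
XML

my $dom = XML::LibXML->load_xml(string => $xmldata);

# In list context findnodes returns a plain list; take the first match.
my ($content) = $dom->findnodes('//content');

# Serialize each element child and concatenate the results.
# './*' matches element children only, so indentation text nodes are skipped.
my $inner = join '', map { $_->toString } $content->findnodes('./*');

# toString on a non-document node returns a character string,
# so set an output encoding before printing (matters for non-ASCII data).
binmode(STDOUT, ':encoding(UTF-8)');
print "content is $inner\n";
```

Replacing ./* with a call to childNodes would keep the original whitespace between the elements, which is closer to what the print statement above produces.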

5 Comments

Note that toString on a non-document node produces text that's not encoded.
Is there any way to get only "<data>gt1</data><data>check_gt1</data>" in Perl without doing the parsing?
I'm not sure I understand @Aravind.
The answer above gives "<content><data>gt1</data><data>check_gt1</data></content>" as the result. Is it possible to get "<data>gt1</data><data>check_gt1</data>" instead (without the enclosing content element)?
@Aravind have you looked at the documentation I linked at all?
