In the XML we have some tags like

<string1 : string2>

and many more like this.

I need to write a regular expression that deletes everything up to and including the ":" (here, string1 and the ":" itself). The match must always be inside < >.

E.g. input = <string1 : string2>

output = <string2>

  • Do you need to catch closing tags with the same format? Commented May 11, 2011 at 13:50

5 Answers

This is how you do it in PHP:

<?php
 $str = "<string1 : string2>";
 $s = preg_replace('~(</?)[^>:]*:\s*~', "$1", $str);
 var_dump($s);
?>

EDIT: In Java

String str = "<ns2:senderId xmlns=\"netapp.com/fsoCanonical\">NetApp</ns2:senderId>";
System.out.println(str.replaceAll("(</?)[^>:]*:\\s*", "$1"));

Output

<senderId xmlns="netapp.com/fsoCanonical">NetApp</senderId>

1 Comment

@satyam: based on comments I have provided you one example in Java above. Hope that meets your requirement.
Search for

<[^>:]*:\s*([^>]*)>

and replace each match with <$1>.
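A minimal Java sketch of this search-and-replace, for illustration only. The class and method names are made up. It also shows one limitation: on a closing tag such as </ns2:senderId> the pattern consumes the slash together with the prefix. The second method is a variant (an assumption, not part of the answer) that captures the optional slash and puts it back.

```java
import java.util.regex.Pattern;

// Hypothetical helper; the names are illustrative and not from the answer.
final class TagPrefixStripper {
    // Pattern from the answer: drop everything up to the first ':' inside <...>.
    private static final Pattern ORIGINAL =
            Pattern.compile("<[^>:]*:\\s*([^>]*)>");

    // Assumed variant: capture an optional leading '/' so closing tags stay closing tags.
    private static final Pattern KEEP_SLASH =
            Pattern.compile("<(/?)[^>:]*:\\s*([^>]*)>");

    static String stripOriginal(String xml) {
        return ORIGINAL.matcher(xml).replaceAll("<$1>");
    }

    static String stripKeepingSlash(String xml) {
        return KEEP_SLASH.matcher(xml).replaceAll("<$1$2>");
    }

    public static void main(String[] args) {
        String xml = "<ns2:senderId xmlns=\"netapp.com/fsoCanonical\">NetApp</ns2:senderId>";
        System.out.println(stripOriginal("<string1 : string2>"));
        System.out.println(stripOriginal(xml));     // the closing tag loses its '/'
        System.out.println(stripKeepingSlash(xml)); // the closing tag is preserved
    }
}
```

Tags without a prefix are left alone by both patterns as long as they contain no ":" at all; an attribute value such as a URL with "http://" would still be matched, so this remains a sketch rather than a general XML tool.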

2 Comments

Hi Jeff, thanks a lot, it's working, but there is one more exception. Say we have input XML like this: <ns1:fso xmlns:ns1="netapp.com/fsoCanonical"> <ns2:senderId xmlns="netapp.com/fsoCanonical">NetApp</ns2:senderId> <receiverId xmlns="netapp.com/fsoCanonical">Unisys</receiverId> <messageId xmlns="netapp.com/…> </ns1:fso> and we need the output as below:
<fso xmlns:ns1="netapp.com/fsoCanonical"> <senderId xmlns="netapp.com/fsoCanonical">NetApp</senderId> <receiverId xmlns="netapp.com/fsoCanonical">Unisys</receiverId> <messageId xmlns="netapp.com/…> </fso>

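With an XSLT processor you can use templates like these. The element template rebuilds each element from its local name, which is what drops the prefix (a plain xsl:copy would keep it):

<xsl:template match="*">
    <xsl:element name="{local-name()}">
        <xsl:apply-templates select="@*|node()"/>
    </xsl:element>
</xsl:template>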
<xsl:template match="@*|text()|comment()|processing-instruction()">
    <xsl:copy-of select="."/>
</xsl:template>

This link is relevant for your question.
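As a rough sketch of how such a stylesheet could be run from Java, the example below uses the standard javax.xml.transform API. It builds each element with xsl:element and local-name(), which is what removes the prefix. The class name, the sample document and the urn:example namespace are made up for illustration. Note that this approach drops the namespace itself, not just the prefix text, so the output differs from a purely textual replacement.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper; the name is illustrative only.
final class XsltPrefixStripper {
    // XSLT 1.0 stylesheet: rebuild every element under its local name,
    // copy attributes, text, comments and processing instructions unchanged.
    private static final String STYLESHEET =
        "<xsl:stylesheet version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>"
      + "<xsl:template match=\"*\">"
      + "<xsl:element name=\"{local-name()}\">"
      + "<xsl:apply-templates select=\"@*|node()\"/>"
      + "</xsl:element>"
      + "</xsl:template>"
      + "<xsl:template match=\"@*|text()|comment()|processing-instruction()\">"
      + "<xsl:copy-of select=\".\"/>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    static String strip(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            // Wrapped so callers of this sketch need no checked-exception handling.
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample input; urn:example is a placeholder namespace.
        String xml = "<ns1:fso xmlns:ns1=\"urn:example\">"
                   + "<ns1:senderId>NetApp</ns1:senderId></ns1:fso>";
        System.out.println(strip(xml));
    }
}
```

Unlike the regex answers, this needs well-formed input with every prefix declared, but it is not confused by colons in text or attribute values.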

(?<=<).*?:\s*(?=.*?>)

The lookbehind ensures there is a < immediately to the left; the pattern then matches your string, including the : and any optional whitespace. The lookahead that follows ensures that the rest of the tag, up to the >, is present.

Match this and replace it with an empty string.

You can see it online at http://regexr.com

regular-expressions.info/java.html explains how to apply regexes in Java.
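A minimal Java sketch of the lookaround approach, under a few assumptions. The class and method names are invented. The "tight" pattern is a variant that is not part of the answer: it replaces the lazy .*? with character classes that cannot cross a < or >, so a colon in element text is not picked up.

```java
import java.util.regex.Pattern;

// Hypothetical helper; the names are illustrative only.
final class LookaroundStripper {
    // Pattern from the answer: anything after '<' up to the first ':'.
    private static final Pattern LOOSE =
            Pattern.compile("(?<=<).*?:\\s*(?=.*?>)");

    // Assumed variant: the match may not run past a '<' or '>'.
    private static final Pattern TIGHT =
            Pattern.compile("(?<=<)[^<>:]*:\\s*(?=[^<>]*>)");

    static String stripLoose(String s) {
        return LOOSE.matcher(s).replaceAll("");
    }

    static String stripTight(String s) {
        return TIGHT.matcher(s).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(stripLoose("<string1 : string2>"));
        System.out.println(stripTight("<string1 : string2>"));
        // A colon in element text: the loose pattern reaches across the '>'.
        System.out.println(stripLoose("<a>x: y</a>"));
        System.out.println(stripTight("<a>x: y</a>"));
    }
}
```

Neither pattern treats the "/" of a closing tag specially, so a prefixed closing tag would lose its slash along with the prefix; that case would need extra handling.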


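<.*?:\s*(.*?)> should capture only the part you are interested in. How to do the regex replacement varies from one programming language to another.

In Java you could do the following (the backslash must be doubled inside a Java string literal, and the result must be assigned because strings are immutable):

String result = string.replaceAll("<.*?:\\s*(.*?)>", "<$1>");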

1 Comment

So could you tell me how to do it in Java?
