I have a malformed XML file that was generated with incorrect closing tags, as follows.

<Root>
.
.
<Question id='1' type='text'>London</Question id='1' type='text'>
<Question id='2' type='radio'>4</Question id='2' type='radio'>
<Question id='3' type='check'>6</Question id='3' type='check'>
.
.
</Root>

I need to fix this XML file so that it has proper closing tags, as follows.

<Question id='1' type='text'>London</Question>

In summary, closing tags like

</Question id='some id' type='some type'> should be replaced with </Question>

There are hundreds of tags in the file. How can I use string operations with a regex to process the file and produce a well-formed XML file?

Thanks,

Chatur

  • You should totally drop that and use a tolerant XML Parser. Commented Nov 28, 2012 at 12:56
  • Why not use a regex in Notepad++ to rewrite the closing tags instead of using C#? Commented Nov 28, 2012 at 13:04

1 Answer

Assuming str is the malformed XML string:

string fixedXml = Regex.Replace(str, @"</([^\s]+)[^>]+>", "</$1>");

A useful tool for testing regular expressions is Regex Designer from Rad Software. It is free, fully .NET-compatible, and has built-in help.

2 Comments

Thanks Artem... it worked for all closing tags except the root. The only issue is that it replaces the root closing tag with </Roo>
@chatura: Change the pattern to </([^\s]+)\s+[^>]+> to fix the problem with </Root>.
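
To see why </Root> turned into </Roo>: in the first pattern, [^>]+ must consume at least one character, so the regex engine backtracks and takes the "t" away from the captured name. Requiring whitespace after the name avoids that. Below is a minimal sketch of a slightly stricter variant, not the answer's exact pattern: it also excludes '>' from the name, so a correct closing tag followed by a newline and another tag cannot be swallowed. The names CloseTagFixer and Fix are made up for illustration, and the sketch assumes that no stray attribute value contains a '>' character.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CloseTagFixer
{
    // Matches a closing tag that has extra text after the element name,
    // e.g. </Question id='1' type='text'>.
    // The name ([^\s>]+) stops at whitespace or '>', and at least one
    // whitespace character must follow it, so a correct tag such as
    // </Root> is not matched and stays untouched.
    private static readonly Regex BadCloseTag =
        new Regex(@"</([^\s>]+)\s+[^>]*>");

    // Returns the input with every such closing tag reduced to </Name>.
    public static string Fix(string xml)
    {
        return BadCloseTag.Replace(xml, "</$1>");
    }
}
```

Calling CloseTagFixer.Fix(str) on the sample from the question should leave </Root> alone and rewrite only the Question closing tags. Passing the result to XDocument.Parse (from System.Xml.Linq) is a quick way to confirm the output is well-formed, since it throws an XmlException otherwise.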
