How can I check whether an XML file is well formed, without invalid characters or tags?

For example, consider this XML:

<?xml version="1.0"?>
<PARTS>
   <TITLE>Computer Parts</TITLE>
   <PART>
      <ITEM>Motherboard</ITEM>
      <MANUFACTURER>ASUS</MANUFACTURER>
      <MODEL>P3B-F</MODEL>
      <COST> 123.00</COST>
   </PART>
   <PART>
      <ITEM>Video Card</ITEM>
      <MANUFACTURER>ATI</MANUFACTURER>
      <MODEL>All-in-Wonder Pro</MODEL>
      <COST> 160.00</COST>
   </PART>
</PARTSx>

The last tag, </PARTSx>, must be </PARTS>.

  • Your title talks about validation, but the body just seems to be about well-formed XML. Those are two different things. Without a DTD or XSD, you cannot validate XML. Are you sure you know what you're asking for? Commented Apr 2, 2011 at 2:11
  • Agreed with Rob: you need to read up on well-formed XML first, then go for valid XML (for which you need an XML Schema). Any XML parser can check well-formedness or … Commented Apr 2, 2011 at 7:15

2 Answers

You can use the IXMLDOMParseError interface returned by the MSXML DOMDocument.

This interface exposes a set of properties that help you identify the problem:

  • errorCode Contains the error code of the last parse error. Read-only.
  • filepos Contains the absolute file position where the error occurred. Read-only.
  • line Specifies the line number that contains the error. Read-only.
  • linepos Contains the character position within the line where the error occurred. Read-only.
  • reason Describes the reason for the error. Read-only.
  • srcText Returns the full text of the line containing the error. Read-only.
  • url Contains the URL of the XML document containing the last error. Read-only.

Check these two functions, which use MSXML 6.0 (you can use other versions as well):

uses
  Variants,
  Comobj,
  SysUtils;

// Parses an XML string; returns False and fills ErrorMsg if parsing fails.
function IsValidXML(const XmlStr :string;var ErrorMsg:string) : Boolean;
var
  XmlDoc : OleVariant;
begin
  XmlDoc := CreateOleObject('Msxml2.DOMDocument.6.0');
  try
    // Load synchronously so parseError is populated when LoadXML returns.
    XmlDoc.Async := False;
    XmlDoc.validateOnParse := True;
    Result:=(XmlDoc.LoadXML(XmlStr)) and (XmlDoc.parseError.errorCode = 0);
    if not Result then
     ErrorMsg:=Format('Error Code : %s  Msg : %s line : %s Character  Position : %s Pos in file : %s',
     [XmlDoc.parseError.errorCode,XmlDoc.parseError.reason,XmlDoc.parseError.Line,XmlDoc.parseError.linepos,XmlDoc.parseError.filepos]);
  finally
    // Release the COM object.
    XmlDoc:=Unassigned;
  end;
end;

// Same check as IsValidXML, but loads the XML from a file on disk.
function IsValidXMLFile(const XmlFile :TFileName;var ErrorMsg:string) : Boolean;
var
  XmlDoc : OleVariant;
begin
  XmlDoc := CreateOleObject('Msxml2.DOMDocument.6.0');
  try
    // Load synchronously so parseError is populated when Load returns.
    XmlDoc.Async := False;
    XmlDoc.validateOnParse := True;
    Result:=(XmlDoc.Load(XmlFile)) and (XmlDoc.parseError.errorCode = 0);
    if not Result then
     ErrorMsg:=Format('Error Code : %s  Msg : %s line : %s Character  Position : %s Pos in file : %s',
     [XmlDoc.parseError.errorCode,XmlDoc.parseError.reason,XmlDoc.parseError.Line,XmlDoc.parseError.linepos,XmlDoc.parseError.filepos]);
  finally
    // Release the COM object.
    XmlDoc:=Unassigned;
  end;
end;
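
If you need to check a whole folder of generated files, the same parseError approach can be wrapped in a small console program. The following is only a minimal sketch under some assumptions: the program name CheckXmlFiles and the helper IsWellFormedFile are hypothetical names, the folder is taken from the first command-line argument, and validateOnParse is set to False so that only well-formedness is checked (no DTD validation). A console program has to initialize COM itself, hence the CoInitialize call.

```delphi
program CheckXmlFiles;

{$APPTYPE CONSOLE}

uses
  SysUtils, Variants, ActiveX, ComObj;

// Hypothetical helper: returns True when MSXML can parse the file.
// On failure, ErrorMsg receives the parser's reason and position.
function IsWellFormedFile(const FileName: string; out ErrorMsg: string): Boolean;
var
  XmlDoc: OleVariant;
begin
  ErrorMsg := '';
  XmlDoc := CreateOleObject('Msxml2.DOMDocument.6.0');
  XmlDoc.async := False;
  // Assumption: we only care about well-formedness, not DTD validation.
  XmlDoc.validateOnParse := False;
  Result := XmlDoc.load(ExpandFileName(FileName));
  if not Result then
    ErrorMsg := Format('%s (line %s, column %s)',
      [Trim(VarToStr(XmlDoc.parseError.reason)),
       VarToStr(XmlDoc.parseError.line),
       VarToStr(XmlDoc.parseError.linepos)]);
end;

var
  SR: TSearchRec;
  Dir, Msg: string;
begin
  if ParamCount < 1 then
  begin
    Writeln('Usage: CheckXmlFiles <folder>');
    Exit;
  end;
  // COM must be initialized before creating the MSXML object.
  CoInitialize(nil);
  try
    Dir := IncludeTrailingPathDelimiter(ParamStr(1));
    if FindFirst(Dir + '*.xml', faAnyFile, SR) = 0 then
    try
      // Report one line per file: OK, or the parser's error message.
      repeat
        if IsWellFormedFile(Dir + SR.Name, Msg) then
          Writeln(SR.Name, ': OK')
        else
          Writeln(SR.Name, ': ', Msg);
      until FindNext(SR) <> 0;
    finally
      FindClose(SR);
    end;
  finally
    CoUninitialize;
  end;
end.
```

Run it as CheckXmlFiles C:\SomeFolder (the path is just an example). Files that fail to parse are listed together with the reason and position reported by MSXML.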

How are you creating/receiving the XML? Any sensible parser would catch this.

For example, using OmniXML:

uses
  OmniXML;

type
  TForm1=class(TForm)
    Memo1: TMemo;
    // Event handlers wired up in the form designer must stay published.
    procedure FormCreate(Sender: TObject);
    //...
  private
    FXMLDoc: IXMLDocument;
    procedure CheckXML;
  end;

implementation

uses
  OmniXMLUtils;

procedure TForm1.FormCreate(Sender: TObject);
begin
  // Load your sample XML. Can also do Memo1.Text := YourXML
  Memo1.Lines.LoadFromFile('YourXMLFile.xml');
end;

procedure TForm1.CheckXML;
begin
  FXMLDoc := CreateXMLDoc;
  // The next line raises an exception with your sample file.
  XMLLoadFromAnsiString(FXMLDoc, Memo1.Text); 
end;

2 Comments

Ken, I am processing many XML files generated by an external application, and sometimes they have invalid characters or unterminated tags.
@Salvador: How are you processing them once received? As I said, any decent parser will raise an exception for a malformed or invalid file; if you're parsing them with your own code, you shouldn't. :)
