31

I have the following XML file

<lexicon>
<word>
  <base>a</base>
  <category>determiner</category>
  <id>E0006419</id>
</word>
<word>
  <base>abandon</base>
  <category>verb</category>
  <id>E0006429</id>
  <ditransitive/>
  <transitive/>
</word>
<word>
  <base>abbey</base>
  <category>noun</category>
  <id>E0203496</id>
</word>
<word>
  <base>ability</base>
  <category>noun</category>
  <id>E0006490</id>
</word>
<word>
  <base>able</base>
  <category>adjective</category>
  <id>E0006510</id>
  <predicative/>
  <qualitative/>
</word>
<word>
  <base>abnormal</base>
  <category>adjective</category>
  <id>E0006517</id>
  <predicative/>
  <qualitative/>
</word>
<word>
  <base>abolish</base>
  <category>verb</category>
  <id>E0006524</id>
  <transitive/>
</word>
</lexicon>

I need to read this file with C# application, and if only the category is verb I want to print its entire element word.
How can I do that?

3
  • 2
    have u tried using Linq to XML ? Commented Jan 1, 2013 at 11:59
  • Can you show me how to do it? Commented Jan 1, 2013 at 12:00
  • @Ruba you are supposed to do your own research first then come when you are having trouble. A google search or a search of SO would have shown you how do parse an XML. Commented Jan 1, 2013 at 12:02

5 Answers 5

32

You could use linq to xml.

    var xmlStr = File.ReadAllText("fileName.xml");


    var str = XElement.Parse(xmlStr);

    var result = str.Elements("word").
Where(x => x.Element("category").Value.Equals("verb")).ToList();

    Console.WriteLine(result);
Sign up to request clarification or add additional context in comments.

1 Comment

NB: the XElement class is in the System.Xml.Linq namespace
8

You could use an XPath, too. A bit old fashioned but still effective:

using System.Xml;

...

XmlDocument xmlDocument;

xmlDocument = new XmlDocument();
xmlDocument.LoadXml(xml);

foreach (XmlElement xmlElement in 
    xmlDocument.DocumentElement.SelectNodes("word[category='verb']"))
{
    Console.Out.WriteLine(xmlElement.OuterXml);
}

3 Comments

I disagree that XPath is old fashioned, I like XPath a lot
I mean.. you can like it all you want, but compared to linq to xml it is old fashioned.
@BrootsWaymb Can you do: xDoc.XPathSelectElement("/root/child/element") with linq-to-xml in a single row? without using Descendants because that looks everywhere.
5

This is how I would do it (the code below has been tested, full source provided below), begin by creating a class with common properties

    class Word
    {
        public string Base { get; set; }
        public string Category { get; set; }
        public string Id { get; set; }
    }

load using XDocument with INPUT_DATA for demonstration purposes and find element name with lexicon . . .

    XDocument doc = XDocument.Parse(INPUT_DATA);
    XElement lex = doc.Element("lexicon");

make sure there is a value and use linq to extract the word elements from it . . .

    Word[] catWords = null;
    if (lex != null)
    {
        IEnumerable<XElement> words = lex.Elements("word");
        catWords = (from itm in words
                    where itm.Element("category") != null
                        && itm.Element("category").Value == "verb"
                        && itm.Element("id") != null
                        && itm.Element("base") != null
                    select new Word() 
                    {
                        Base = itm.Element("base").Value,
                        Category = itm.Element("category").Value,
                        Id = itm.Element("id").Value,
                    }).ToArray<Word>();
    }

The where statement checks if the category element exists and that the category value is not null and then check it again that it is a verb. Then check that the other nodes also exists . . .

The linq query will return an IEnumerable< Typename > object, so we can call ToArray< Typename >() to cast the entire collection into the type we want.

Then print it to get . . .

[Found]
 Id: E0006429
 Base: abandon
 Category: verb

[Found]
 Id: E0006524
 Base: abolish
 Category: verb

Full Source:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

namespace test
{
    class Program
    {

        class Word
        {
            public string Base { get; set; }
            public string Category { get; set; }
            public string Id { get; set; }
        }

        static void Main(string[] args)
        {
            XDocument doc = XDocument.Parse(INPUT_DATA);
            XElement lex = doc.Element("lexicon");
            Word[] catWords = null;
            if (lex != null)
            {
                IEnumerable<XElement> words = lex.Elements("word");
                catWords = (from itm in words
                            where itm.Element("category") != null
                                && itm.Element("category").Value == "verb"
                                && itm.Element("id") != null
                                && itm.Element("base") != null
                            select new Word() 
                            {
                                Base = itm.Element("base").Value,
                                Category = itm.Element("category").Value,
                                Id = itm.Element("id").Value,
                            }).ToArray<Word>();
            }

            //print it
            if (catWords != null)
            {
                Console.WriteLine("Words with <category> and value verb:\n");
                foreach (Word itm in catWords)
                    Console.WriteLine("[Found]\n Id: {0}\n Base: {1}\n Category: {2}\n", 
                        itm.Id, itm.Base, itm.Category);
            }
        }

        const string INPUT_DATA =
        @"<?xml version=""1.0""?>
        <lexicon>
        <word>
          <base>a</base>
          <category>determiner</category>
          <id>E0006419</id>
        </word>
        <word>
          <base>abandon</base>
          <category>verb</category>
          <id>E0006429</id>
          <ditransitive/>
          <transitive/>
        </word>
        <word>
          <base>abbey</base>
          <category>noun</category>
          <id>E0203496</id>
        </word>
        <word>
          <base>ability</base>
          <category>noun</category>
          <id>E0006490</id>
        </word>
        <word>
          <base>able</base>
          <category>adjective</category>
          <id>E0006510</id>
          <predicative/>
          <qualitative/>
        </word>
        <word>
          <base>abnormal</base>
          <category>adjective</category>
          <id>E0006517</id>
          <predicative/>
          <qualitative/>
        </word>
        <word>
          <base>abolish</base>
          <category>verb</category>
          <id>E0006524</id>
          <transitive/>
        </word>
        </lexicon>";

    }
}

Comments

2

Alternatively, you can use XPath query via XPathSelectElements method:

var document = XDocument.Parse(yourXmlAsString);
var words = document.XPathSelectElements("//word[./category[text() = 'verb']]");

Comments

2
XDocument xdoc = XDocument.Load(path_to_xml);
var word = xdoc.Elements("word")
               .SingleOrDefault(w => (string)w.Element("category") == "verb");

This query will return whole word XElement. If there is more than one word element with category verb, than you will get an InvalidOperationException. If there is no elements with category verb, result will be null.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.