XML Deserializing returns "some" null values

Question

Here's my XML:

<Events>
  <Event>
    <content_id>6442452774</content_id>
    <content_title>Title of the event</content_title>
    <content_html>
<Date>2015-11-18</Date>
<EventType>Events</EventType>
<Description>
<p>this is an "event"</p>
</Description>
<Speakers>speaker1 LLC<br />speaker2<br />Jspeaker3</Speakers>
<Time>5:30 - 6:00pm Registration<br />6:00 - 7:00pm Panel Discussion<br />7:00 - 8:00pm Networking Reception</Time>
<Where></Where>
<RegistrationInfo>Please contact <a href="mailto:[email protected]">[email protected]</a> to register for this event.</RegistrationInfo>
<Registration>false</Registration>
</content_html>
    <date_created>2015-10-24T14:24:12.333</date_created>
    <folder_id>262</folder_id>
    <content_teaser>this is the content "can you display it."</content_teaser>
    <content_text>text of the vent "more text" a lot of text here </content_text>
    <end_date>2015-11-19T21:35:00</end_date>
    <content_type>1</content_type>
    <template_id>43</template_id>
    <content_status>A</content_status>
  </Event>
<Event>.... Other events   </Event>
</Events>

and here are my classes:

    public class Serializable_Events
    {
        [XmlElement("Event")]
        public List<Serializable_Event> EventList = new List<Serializable_Event>();
    }
    public class Serializable_Event
    {
        [XmlElement("content_id")]
        public string content_id { get; set; }

        [XmlElement("content_title")]
        public string content_title { get; set; }

        [XmlElement("content_html")]
        public Serializable_Event_ContentHTML ContentHTML { get; set; }
        [XmlText]
        public string content_teaser { get; set; }
        [XmlElement("content_text")]
        public string content_text { get; set; }
    }
    public class Serializable_Event_ContentHTML
    {
        [XmlElement("Date")]
        public string Date { get; set; }

        [XmlElement("EventType")]
        public string EventType { get; set; }

        [XmlElement("Description")]
        public string Description { get; set; }
        [XmlElement("Speakers")]
        public string Speakers { get; set; }

        [XmlElement("Time")]
        public string Time { get; set; }


        [XmlElement("Where")]
        public string Where { get; set; }

        [XmlElement("RegistrationInfo")]
        public string RegistrationInfo { get; set; }

        [XmlElement("Registration")]
        public bool Registration { get; set; }

        //ignored html tags
        [XmlIgnore]
        public string p { get; set; }
        [XmlIgnore]
        public string br { get; set; }
        [XmlIgnore]
        public string a { get; set; }

    }

Implementation:

XmlSerializer ser = new XmlSerializer(typeof(Serializable_Events));
var data = (Serializable_Events)ser.Deserialize(new StreamReader(@"events.xml"));

My problem is that some properties are null and some are not (see the screenshot).

Dmitriy Khaykin · Accepted Answer · 2015-11-11 15:35:44Z

The ones that are null, like <Description>, are due to malformed XML.

You are storing HTML directly in the XML, with text mixed in between the tags, and the serializer is not expecting that. Further, you are trying to make the serializer ignore the HTML tags with XmlIgnore. That is the wrong use of XmlIgnore: it tells the serializer to skip a public field or property of your class; it does not strip markup out of the incoming document.

All XML which contains non-XML mark-up should be wrapped in CDATA sections; this will solve your problem. Further, you can remove all of the XmlIgnore code as well since it's not needed.

Your XML should look like this:

<Events>
    <Event>
        <content_id>6442452774</content_id>
        <content_title>Title of the event</content_title>
        <content_html>
            <Date>2015-11-18</Date>
            <EventType>Events</EventType>
            <Description>
                <![CDATA[<p>this is an "event"</p>]]>
            </Description>
            <Speakers>
                <![CDATA[speaker1 LLC<br />speaker2<br />Jspeaker3]]>
            </Speakers>
            <Time>
                <![CDATA[5:30 - 6:00pm Registration<br />6:00 - 7:00pm Panel Discussion<br />7:00 - 8:00pm Networking Reception]]>
            </Time>
            <Where></Where>
            <RegistrationInfo>
                <![CDATA[Please contact <a href='mailto:[email protected]'>[email protected]</a> to register for this event.]]>
            </RegistrationInfo>
            <Registration>false</Registration>
        </content_html>
        <date_created>2015-10-24T14:24:12.333</date_created>
        <folder_id>262</folder_id>
        <content_teaser>this is the content 'can you display it.'</content_teaser>
        <content_text>text of the vent 'more text' a lot of text here </content_text>
        <end_date>2015-11-19T21:35:00</end_date>
        <content_type>1</content_type>
        <template_id>43</template_id>
        <content_status>A</content_status>
    </Event>
</Events>
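
To illustrate the CDATA approach, here is a minimal sketch that deserializes one CDATA-wrapped field. The `ContentHtml` class and the `CDataDemo.ReadDescription` helper are hypothetical, trimmed-down stand-ins for the classes in the question, and passing an `XmlRootAttribute` to the constructor is just one way to match the `content_html` root of the fragment.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical, trimmed-down model: only one field from the question is kept.
public class ContentHtml
{
    [XmlElement("Description")]
    public string Description { get; set; }
}

public static class CDataDemo
{
    // Deserializes a <content_html> fragment and returns the Description text.
    public static string ReadDescription(string xml)
    {
        var serializer = new XmlSerializer(
            typeof(ContentHtml), new XmlRootAttribute("content_html"));
        using (var reader = new StringReader(xml))
        {
            var result = (ContentHtml)serializer.Deserialize(reader);
            return result.Description;
        }
    }
}
```

Given a fragment whose `<Description>` body is wrapped in a CDATA section, `ReadDescription` should hand back the HTML markup as an ordinary string, because the serializer reads the CDATA section as text rather than as child elements.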

edited Nov 11, 2015 at 15:35

answered Nov 11, 2015 at 15:31


6 Comments

IndieTech Solutions Over a year ago

@Dmitriy Thanks for the response. I was expecting that, which is why I used [XmlIgnore] for the HTML tags, but it doesn't seem to fix the issue. My problem is that I inherited a bunch of XML files that I need to manage, and it will be hard for me to add CDATA around all of the HTML tags. But if that's the only way, then I will go ahead and do it. Please advise!

IndieTech Solutions Over a year ago

Is there a way to get the whole block after <Description> and not parse it?

Dmitriy Khaykin Over a year ago

See my updated answer. If you are getting malformed XML like this, and you need to normalize it, I would look at an HTML parser (like HtmlAgilityPack) to read the bad XML and rebuild it into good XML, and then run your serializer code on it. Clearly, that's a different problem / question though...

Dmitriy Khaykin Over a year ago

@Alundrathedreamwalker yes but not with an XML serializer; this is where I would use an HTML parser and clean this stuff up. XML is much more structured than HTML and when there are random nodes like <p>, <br /> everywhere in between text, the XML serializer doesn't know what to do with that (rightfully so).

Dmitriy Khaykin Over a year ago

Yeah it clearly is not working for the OP. XmlIgnore is meant to ignore XML elements and not parse them, not to fix bad XML and magically make it into good XML by ignoring HTML markup and then assuming the rest of it is good -- think about it -- how would it know which elements to keep and what the structure looks like?
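
For files like the one in the question, where the embedded HTML happens to be well-formed XML (`<br />` rather than `<br>`), the clean-up step discussed in these comments could be sketched with LINQ to XML instead of an HTML parser. This is not the HtmlAgilityPack approach suggested above, only a lighter alternative under that assumption. The `CDataWrapper` class and the list of field names are hypothetical and would need adjusting to the real files.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class CDataWrapper
{
    // Assumption: these are the elements that carry embedded HTML.
    private static readonly string[] HtmlFields =
        { "Description", "Speakers", "Time", "RegistrationInfo" };

    // Returns the document with the content of each HTML field wrapped in CDATA.
    public static string WrapHtmlFields(string xml)
    {
        XDocument doc = XDocument.Parse(xml);

        // Materialize the query first so the tree is not mutated while iterating.
        var targets = doc.Descendants()
            .Where(e => HtmlFields.Contains(e.Name.LocalName))
            .ToList();

        foreach (XElement element in targets)
        {
            // Serialize the child nodes back into markup text.
            string inner = string.Concat(
                element.Nodes().Select(n => n.ToString(SaveOptions.DisableFormatting)));

            // Replace the mixed content with a single CDATA section.
            element.ReplaceNodes(new XCData(inner.Trim()));
        }

        return doc.ToString(SaveOptions.DisableFormatting);
    }
}
```

The returned string can then be passed to the existing `XmlSerializer` code. Input that is not well-formed XML will make `XDocument.Parse` throw, and that is the case where an HTML parser is the better tool.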
