I have the following string:

    string myHtml = "<input type='text' value='123' class='myClass'></input>";

I want to read or cast myHtml into some sort of C# HTML object so I can do something like:

    DesiredHTMLClass obj = new DesiredHTMLClass(myHtml);
    string val = obj.value;     // Would return 123
    string mClass = obj.@class; // Would return myClass ("class" is a C# keyword, hence the @)

I cannot use something like the HTML Agility Pack; it has to be plain C#.

Thanks

  • Why can't you use the HTML Agility Pack? What you are asking for is not simple. There is nothing built into the BCL that will do it. Commented Mar 26, 2012 at 18:39
  • I cannot use any third party stuff. Commented Mar 26, 2012 at 18:40
  • You are supposed to work with both hands tied behind your back? Why? The HAP is open source and can be inspected and vetted (it is not that large a codebase). Commented Mar 26, 2012 at 18:43
  • @abbas Then you have a lot of code to write. HTML parsers are super-complicated, because there are a ton of corner cases to be dealt with. If you're OK with ignoring them, you can probably get away with some regexes, though that will increase your chances of error significantly. You should at least look at the source of the HTML Agility Pack to see what's going on under the hood, and maybe get some ideas about how to proceed. Commented Mar 26, 2012 at 18:43

2 Answers

You can use a regex to detect tags and map their attributes to properties of your own HTML objects, but it is painful work.

Edit: If you only need a small number of tags and you know them in advance, you can parse them with a regex. If you need to parse a whole HTML document, you are in trouble.
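
As a rough illustration of the regex route for a single known tag, here is a minimal sketch. `TagAttributes` and its `Parse` method are hypothetical names, not BCL types, and the sketch assumes quoted attribute values and no `>` inside them. It ignores entities, unquoted values, and nested markup, so it is only suitable for snippets you control.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper (not part of the BCL): collects the attributes
// of the first opening tag found at the start of a snippet.
public static class TagAttributes
{
    // Matches name='value' or name="value". Unquoted values are not handled.
    private static readonly Regex AttributePattern =
        new Regex(@"([\w-]+)\s*=\s*(?:'([^']*)'|""([^""]*)"")");

    public static Dictionary<string, string> Parse(string html)
    {
        // Attribute names are compared case-insensitively, as in HTML.
        var attributes = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);

        // Assumption: the first '>' closes the opening tag.
        int end = html.IndexOf('>');
        string openingTag = end >= 0 ? html.Substring(0, end) : html;

        foreach (Match match in AttributePattern.Matches(openingTag))
        {
            // Group 2 holds a single-quoted value, group 3 a double-quoted one.
            string value = match.Groups[2].Success
                ? match.Groups[2].Value
                : match.Groups[3].Value;
            attributes[match.Groups[1].Value] = value;
        }
        return attributes;
    }
}
```

With the string from the question, `TagAttributes.Parse(myHtml)["value"]` yields `123` and `["class"]` yields `myClass`. A dictionary is used instead of concrete properties because the set of attributes is not known at compile time.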

2 Comments

Isn't there any simple way?
It's the simplest approach that comes to mind. Do you only need to parse input tags?

If you can only use 'simple C#' then you'll have to parse the string manually, which won't be fun, but I suppose it's possible. Also, it will be difficult to expose attributes as concrete properties of the parsed object.

What you can do is use something like an SGML reader to convert the snippet to XML and then read it. If your HTML is well-formed and you know it always will be, you can skip the SGML step and parse it directly with something like LINQ to XML. You still won't get an object with concrete properties, though; you'll have to query the attribute values yourself.
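
For the well-formed case, a minimal sketch using only `System.Xml.Linq` from the framework might look like this. `HtmlElementReader` is a hypothetical wrapper name chosen for illustration, and the sketch assumes the snippet is a single well-formed element such as the one in the question.

```csharp
using System;
using System.Xml.Linq;

// Hypothetical wrapper (illustrative name): exposes the attributes
// of a single well-formed element through an indexer.
public sealed class HtmlElementReader
{
    private readonly XElement _element;

    public HtmlElementReader(string wellFormedHtml)
    {
        // XElement.Parse throws an XmlException if the markup is not well-formed XML.
        _element = XElement.Parse(wellFormedHtml);
    }

    // Name of the parsed element, e.g. "input".
    public string TagName
    {
        get { return _element.Name.LocalName; }
    }

    // Returns the attribute value, or null when the attribute is absent.
    public string this[string attributeName]
    {
        get { return (string)_element.Attribute(attributeName); }
    }
}
```

Usage would be along the lines of `var obj = new HtmlElementReader(myHtml); string val = obj["value"];`. Note that XML attribute names are case-sensitive, so `obj["Value"]` returns null here.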

1 Comment

Well, like I said, if you know the HTML will be well-formed you can use an XML document or LINQ to XML to parse it. But if it's not, then you'll have to do it manually.
