
I have an HTML file in my C# .NET application that contains a table along with other content.

I want to parse only some columns of the table. Should I use an HTML parser or the Replace method of Regex in .NET?

If I use a parser, how do I use it? Will the parser extract the information between the tags? If so, how? Please show an example if possible, because I am new to parsers.

If I use the Replace method of the Regex class, how do I pass it the name of the file I want to extract the information from?

Edit: I want to extract information from the table in the HTML file. How can I use the HTML Agility Pack for that? What code should I write to use that parser?

1 Answer


Try the HTML Agility Pack.

Here's an example:

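 // Requires the HtmlAgilityPack NuGet package and: using HtmlAgilityPack;
 HtmlDocument doc = new HtmlDocument();
 doc.Load("file.htm");
 // SelectNodes returns null when no node matches, so guard before iterating.
 HtmlNodeCollection links = doc.DocumentNode.SelectNodes("//a[@href]");
 if (links != null)
 {
     foreach (HtmlNode link in links)
     {
         HtmlAttribute att = link.Attributes["href"];
         att.Value = FixLink(att); // FixLink is a placeholder you must write yourself
     }
 }
 doc.Save("file.htm");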

As for your additional question about regex: do not use Regex to parse HTML. It is not a robust solution, because real-world HTML is nested and often malformed. The library above does a much better job.
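
For the table case in the question, something along these lines might work. It is only a minimal sketch: it assumes you want the first table in the document and that you select columns by zero-based index. The class name TableExtractor and the method ExtractColumns are made up for this illustration and are not part of the HTML Agility Pack.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack; // NuGet package "HtmlAgilityPack"

// Hypothetical helper for illustration; not part of the library.
public static class TableExtractor
{
    // For each row of the first <table>, returns the text of the requested
    // zero-based columns. Rows that have too few cells are skipped.
    public static List<string[]> ExtractColumns(string html, params int[] columns)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html); // to read from disk, call doc.Load("file.htm") instead

        var result = new List<string[]>();

        // SelectSingleNode and SelectNodes return null when nothing matches.
        HtmlNode table = doc.DocumentNode.SelectSingleNode("//table");
        if (table == null) return result;

        HtmlNodeCollection rows = table.SelectNodes(".//tr");
        if (rows == null) return result;

        foreach (HtmlNode row in rows)
        {
            // Direct child cells of this row, header or data.
            HtmlNodeCollection cells = row.SelectNodes("th|td");
            if (cells == null) continue;

            var values = new string[columns.Length];
            bool complete = true;
            for (int i = 0; i < columns.Length; i++)
            {
                int index = columns[i];
                if (index < 0 || index >= cells.Count) { complete = false; break; }

                // InnerText drops nested tags; DeEntitize turns &amp; etc. into characters.
                values[i] = HtmlEntity.DeEntitize(cells[index].InnerText).Trim();
            }

            if (complete) result.Add(values);
        }

        return result;
    }
}
```

You would call it as TableExtractor.ExtractColumns(html, 0, 2) to get the first and third column of every row. The header row is included if the table has one, so skip the first entry if you only want the data. Cells that use colspan or rowspan are not handled here and would shift the indexes.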


1 Comment

The function FixLink is not defined, so this won't compile. It is just an example of what the code might look like - you can't just copy and paste it into your project. Also, you haven't told us what exactly you need to do, so it's very unlikely that this code snippet will be exactly what you need.
