-2

<tr><TD><FONT size="2">My Value 1</FONT></TD></tr>

I need to extract "My Value 1". Please help me. I am using C#.

2
  • Will the tags always be in the same format (e.g. three layers deep with a tr, a td and a font)? Commented Mar 3, 2010 at 16:04
  • What language are you using to parse this? Commented Mar 3, 2010 at 16:11

9 Answers

7

As HTML code is very "unpredictable", I would recommend using an HTML parsing library. Which programming language do you use? In .NET I have used HTML Agility Pack with great success. In Java, HTML Parser might be handy (though I have not worked with it yet).

3

You cannot properly parse HTML with regular expressions, because regexps can't handle the nesting that HTML allows. For the one line you show, a regexp will work, but you can't count on that line staying identical, so in general you should use a SAX or DOM parser for the task.
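To make the nesting problem concrete, here is a minimal JavaScript sketch. The helper name `firstDivContents` and the sample markup are made up for illustration and are not from the question; the point is only that a lazy match stops at the first closing tag it sees, whichever element that tag belongs to.

```javascript
// Hypothetical helper: grab the contents of the first <div>...</div> with a regex.
function firstDivContents(html) {
  const match = /<div>(.*?)<\/div>/i.exec(html);
  return match ? match[1] : null;
}

// Flat markup: the regex gives the expected answer.
const flat = firstDivContents('<div>My Value 1</div>');

// Nested markup: the lazy match ends at the first </div>, which closes the
// inner element, so the captured text is a truncated fragment.
const nested = firstDivContents('<div>outer <div>inner</div> tail</div>');

console.log(flat);   // My Value 1
console.log(nested); // outer <div>inner
```

Making the quantifier greedy does not rescue this: it would then run to the last closing tag in the input, which is wrong as soon as there are two sibling elements.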

2 Comments

But you can parse a fixed string that happens to be HTML with regular expressions. While there are lots of issues with doing this, they're problems the OP probably doesn't have.
@Tom - agreed. But although the OP doesn't have them today, he might tomorrow and won't be left wondering what happened, hopefully.
3

I don't think parsing HTML with regexes is a wise idea, as spa highlighted. A classic answer to a similar question is "RegEx match open tags except XHTML self-contained tags".

1

In C#:

 using System.Text.RegularExpressions;

 string input = "<tr><TD><FONT size=\"2\">My Value 1</FONT></TD></tr>";
 string pattern = @"<[^>]*?>"; // matches any single tag
 string output = Regex.Replace(input, pattern, ""); // "My Value 1"

This simply removes all HTML tags and leaves the text between them.

0
function stripTags(markup){
  return markup.replace(/\s*<[^>]*?>\s*/gim,'');
}

This assumes all you really want is the inner text represented by "My Value 1" above.
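As a quick check of what this does in practice, here is a small sketch. It repeats the function so the snippet runs on its own; the multi-line input and the `Hello <b>World</b>` string are invented test cases, not part of the question. Note that the surrounding `\s*` also swallows whitespace next to inline tags, which may or may not be what you want.

```javascript
// Same function as in the answer, repeated so this snippet is self-contained.
function stripTags(markup) {
  return markup.replace(/\s*<[^>]*?>\s*/gim, '');
}

// Whitespace between tags is removed along with the tags themselves.
const row = stripTags('<tr>\n  <TD><FONT size="2">My Value 1</FONT></TD>\n</tr>');

// Caveat: whitespace that separates words from an inline tag disappears too.
const inline = stripTags('Hello <b>World</b>');

console.log(row);    // My Value 1
console.log(inline); // HelloWorld
```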

0

Try:

/<tr>\s*<td>\s*<font.*?>(.*?)<\/font>\s*<\/td>\s*<\/tr>/i

Used in PHP:

<?php

if(preg_match('/<tr>\s*<td>\s*<font.*?>(.*?)<\/font>\s*<\/td>\s*<\/tr>/i',
              '<tr><TD><FONT size="2">My Value 1</FONT></TD></tr>',$matches))
        echo $matches[1]; // prints My Value 1
?>

1 Comment

@codaddict I need this in C# :-)
0

If you are using PHP, split on </FONT>:

$string='<tr><TD><FONT size="2">My Value 1</FONT></TD></tr>';
$s = explode('</FONT>',$string);
foreach ($s as $v){
     if ( strpos($v,"<FONT") !==FALSE) {
        $t = explode(">",$v);
        print end($t)."\n";
    }

}

Output:

$ php test.php
My Value 1

0

In Perl I would use

my $string='<tr><TD><FONT size="2">My Value 1</FONT></TD></tr>';
$string =~ m/(<.*?>)*([^<]*)(<.*?>)*/;
print $2;

to get the desired result. The last group is not strictly necessary;

(<.*?>)*([^<]*)

will work as well.

0

If you want to get the contents within the tags, I think the following regexp is enough:

^<.*>([^<>]+)<.*>$

It only matches if there is actually some text between the tags; otherwise it gives no match.
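A small JavaScript sketch of how this pattern behaves. The wrapper name `innerText` and the second and third inputs are made up for illustration. Because the leading `<.*>` is greedy, the capture group ends up on the last run of text in the line, so with several text nodes you get the final one.

```javascript
// Hypothetical wrapper around the anchored pattern from this answer.
function innerText(line) {
  const match = /^<.*>([^<>]+)<.*>$/.exec(line);
  return match ? match[1] : null;
}

// The line from the question: the text between the tags is captured.
const withText = innerText('<tr><TD><FONT size="2">My Value 1</FONT></TD></tr>');

// No text between the tags: the pattern does not match at all.
const withoutText = innerText('<tr><TD><FONT size="2"></FONT></TD></tr>');

// Two text nodes: the greedy prefix leaves only the last one to capture.
const lastOnly = innerText('<td>a</td><td>b</td>');

console.log(withText);    // My Value 1
console.log(withoutText); // null
console.log(lastOnly);    // b
```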

1 Comment

I need only the value "My Value 1".
