
I need to use a JavaScript RegExp to match the string bimbo999 in this a tag: <a href="/game.php?village=828&amp;screen=info_player&amp;id=29956" >bimbo999</a>

The numbers in the URL parameters (village and id) change every time, so the RegExp has to match them without knowing their values in advance.

</tr>
                    <tr><td>Sent</td><td >Oct 22, 2011  17:00:31</td></tr>
                                <tr>
                        <td colspan="2" valign="top" height="160" style="border: solid 1px black; padding: 4px;">
                            <table width="100%">
    <tr><th width="60">Supported player:</th><th>
    <a href="/game.php?village=828&amp;screen=info_player&amp;id=29956" >bimbo999</a></th></tr>
    <tr><td>Village:</td><td><a href="/game.php?village=828&amp;screen=info_village&amp;id=848" >bimbo999s village (515|520) K55</a></td></tr>
    <tr><td>Origin of the troops:</td><td><a href="/game.php?village=828&amp;screen=info_village&amp;id=828" >KaLa I (514|520) K55</a></td></tr>
    </table><br />

    <h4>Units:</h4>
    <table class="vis">

I tried with this:

var match = h.match(/Supported player:</th>(.*)<\/a><\/th></i);

but it is not working. Can you help me?

  • Why are you manipulating the HTML directly? It's much safer (and usually easier) to work through the DOM. Find the right <table>, then the appropriate <a> tags in the table using jQuery or a cross-browser selector library like Sizzle, and then just get the innerHTML of the <a> tag to get bimbo999. Commented Oct 23, 2011 at 6:40
  • Using regex to traverse html tags is not very good practice. Have you tried making a DOM element from the tag and getting innerHTML? Commented Oct 23, 2011 at 6:41

2 Answers


Try this:

/<a[^>]*>([\s\S]*?)<\/a>/
  • <a[^>]*> matches the opening a tag
  • ([\s\S]*?) matches any characters before the closing tag, as few as possible
  • <\/a> matches the closing tag

The ([\s\S]*?) group captures the text between the tags; it is available at index 1 of the array returned from an exec or (non-global) match call.
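As a minimal sketch of how that capture can be read, here is the pattern applied to a fragment of the markup from the question. The helper firstCapture and the narrower playerAnchor pattern (which additionally requires screen=info_player inside the tag) are illustrative assumptions, not part of the original answer.

```javascript
// Fragment of the markup posted in the question.
var html = [
    '<tr><th width="60">Supported player:</th><th>',
    '<a href="/game.php?village=828&amp;screen=info_player&amp;id=29956" >bimbo999</a></th></tr>',
    '<tr><td>Village:</td><td><a href="/game.php?village=828&amp;screen=info_village&amp;id=848" >bimbo999s village (515|520) K55</a></td></tr>'
].join('\n');

// Pattern from this answer: captures the text of the first <a> element.
var anyAnchor = /<a[^>]*>([\s\S]*?)<\/a>/;

// Hypothetical narrower variant: only accept a tag whose attributes
// contain screen=info_player, whatever the village and id numbers are.
var playerAnchor = /<a[^>]*screen=info_player[^>]*>([\s\S]*?)<\/a>/;

// Hypothetical helper: return capture group 1, or null when nothing matches.
function firstCapture(re, str) {
    var m = re.exec(str);
    return m ? m[1] : null;
}

var playerName = firstCapture(playerAnchor, html);
console.log(playerName); // "bimbo999"
```

The narrower pattern does not depend on the player link being the first link in the fragment, which may matter if the surrounding markup changes.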

This is really only good for finding the text within <a> elements. It's not especially safe or reliable (for example, it breaks on nested markup or on a > inside an attribute value), but if you've got a big page of links and you just need their text, this will do it.
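For the "big page of links" case, one possible sketch adds the g flag and calls exec in a loop to collect every link text. The function name getAnchorTextsByRegex and the sample string are assumptions made for illustration. Note also that <a[^>]*> would match other tags whose names start with "a", such as <abbr>, which is one reason this approach is fragile.

```javascript
// Hypothetical helper: collect the text of every <a>...</a> pair in a string.
function getAnchorTextsByRegex(htmlStr) {
    // The g flag makes exec resume after the previous match on each call.
    var pattern = /<a[^>]*>([\s\S]*?)<\/a>/g;
    var texts = [];
    var m;
    while ((m = pattern.exec(htmlStr)) !== null) {
        texts.push(m[1]);
    }
    return texts;
}

// Two links taken from the markup in the question.
var sample =
    '<td><a href="/game.php?village=828&amp;screen=info_player&amp;id=29956" >bimbo999</a></td>' +
    '<td><a href="/game.php?village=828&amp;screen=info_village&amp;id=828" >KaLa I (514|520) K55</a></td>';

var linkTexts = getAnchorTextsByRegex(sample);
console.log(linkTexts.length); // 2
```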


A much safer way to do this without RegExp would be:

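function getAnchorTexts(htmlStr) {
    var div,
        anchors,
        i,
        texts;
    // Let the browser parse the markup inside a detached element.
    div = document.createElement('div');
    div.innerHTML = htmlStr;
    // Collect every <a> element found in the parsed fragment.
    anchors = div.getElementsByTagName('a');
    texts = [];
    for (i = 0; i < anchors.length; i += 1) {
        // The text property of an anchor holds its text content.
        texts.push(anchors[i].text);
    }
    return texts;
}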

2 Comments

/<a[^>]*>((?:.|\r?\n)*?)<\/a>/ is also handy for matching to the next closing tag over multiple lines.
It would already match over multiple lines: \s matches any whitespace character (including \r, \n, \t, \f and space), so [\s\S] matches any character at all.
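The difference these two comments discuss can be checked with a small sketch. The variable names and the two-line sample string are assumptions for illustration. Without the s flag, a plain . does not match a newline, while both [\s\S] and (?:.|\r?\n) do.

```javascript
// Link text that spans two lines.
var multiLine = '<a href="#">first\nsecond</a>';

// A plain dot stops at the newline, so this pattern finds no match.
var dotPattern = /<a[^>]*>(.*?)<\/a>/;

// [\s\S] matches any character, newlines included.
var anyCharPattern = /<a[^>]*>([\s\S]*?)<\/a>/;

// Alternation from the first comment: any non-newline character, or a line break.
var alternationPattern = /<a[^>]*>((?:.|\r?\n)*?)<\/a>/;

var dotMatches = dotPattern.test(multiLine);
var anyCharText = anyCharPattern.exec(multiLine)[1];
var alternationText = alternationPattern.exec(multiLine)[1];

console.log(dotMatches);                      // false
console.log(anyCharText === alternationText); // true
```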

I don't have experience with regex, but I think you can use jQuery's .text() method!

jQuery API - .text()

I mean, if you use:

var hrefText = $("a").text(); 

You will get your text without using Regex!

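Note that when the selector matches several a elements, .text() returns their combined text. To handle each link separately, call .find("a") on a container to get the list of matching a elements, loop over that list with .each(), and read each element's text with .text().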

Or you can use a class selector, an id, or any other selector you want!

2 Comments

This could also be done with plain JavaScript using getElementsByTagName('a'). Not a bad idea.
As a side note, it's not a good idea to use regex to parse HTML :)
