2

I'm looking for a regex that replaces every link of the form <a href="javascript://potentiallybadstuff"> Link </a> with a warning. I've been experimenting without success so far. I've always been bad with regex; can someone point me in the right direction?

Edit: To those saying not to use regex: the HTML is the output of a Markdown parser, with all HTML tags in the Markdown source stripped. I therefore know that every link in the output will be formatted as shown above, so regex should be a good tool in this particular situation. I am not allowing users to enter raw HTML. Stack Overflow does something very similar: try creating a javascript link, and it will be removed.

This is what I have so far:

<?php
//Javascript link filter test
if(isset($_POST['jsfilter'])){
    $html = "<a href=\"". $_POST['jsfilter']."\"> JS Link </a>";
    $pattern = "/ href\\s*?=\\s*?[\"']\\s*?(javascript)\\s*?(:).*?([\"']) /is";
    $replacement = "\"javascript: alert('Javascript links have been blocked');\"";
    $html = preg_replace($pattern, $replacement, $html);
    echo $html;
}
?>
<form method="post">
<input type="text" name="jsfilter" />
<button type="submit">Submit</button>
</form>
  • Don't. Just don't. It looks like you're accepting HTML tags. Accept BBCode instead. A tags aren't the only thing to worry about: there are also img tags, form tags, script tags, and everything else that has onload attributes and such. Commented Jun 20, 2013 at 12:53
  • @h2ooooooo Well, I'm accepting Markdown with HTML tags stripped. I want links to be available, just not javascript ones. I am not allowing any images, forms, or scripts; just links. Commented Jun 20, 2013 at 12:55

4 Answers

3

The right regex should be:

$pattern = '/href="javascript:[^"]+"/i';
$replacement = 'href="javascript:alert(\'Javascript links have been blocked\')"';
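
To see how that pattern behaves, here is a minimal sketch that wraps it in a function and runs it on two sample links. The function name `blockJsLinks` and the sample inputs are made up for illustration, and the `i` flag is used so that `JavaScript:` in mixed case is also caught.

```php
<?php
// Hypothetical helper for illustration; pattern and replacement follow the answer.
function blockJsLinks(string $html): string
{
    // [^"]+ stops at the first double quote, so only the href value is consumed.
    $pattern = '/href="javascript:[^"]+"/i';
    $replacement = 'href="javascript:alert(\'Javascript links have been blocked\')"';
    return preg_replace($pattern, $replacement, $html);
}

// A javascript: link is rewritten to the warning.
echo blockJsLinks('<a href="JavaScript:doEvil()"> Link </a>'), "\n";
// An ordinary link passes through unchanged.
echo blockJsLinks('<a href="https://example.com"> Link </a>'), "\n";
```

This only handles double-quoted attributes with no whitespace around `=`, which matches the fixed link format described in the question.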

1 Comment

Thank you for answering the question, instead of criticizing :)
1

Use strip_tags() and htmlspecialchars() to display user-generated content. If you want to let users use specific tags, look at BBCode.
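
A minimal sketch of that approach, assuming the content should be shown with no markup at all. The function name `renderPlain` and the sample input are hypothetical.

```php
<?php
// Hypothetical helper: display untrusted text with every tag removed.
function renderPlain(string $userInput): string
{
    // strip_tags() drops the tags themselves; htmlspecialchars() then
    // escapes the special characters that remain (&, <, >, and quotes).
    return htmlspecialchars(strip_tags($userInput), ENT_QUOTES, 'UTF-8');
}

echo renderPlain('<a href="javascript:alert(1)">Click</a> & "enjoy"'), "\n";
// prints: Click &amp; &quot;enjoy&quot;
```

Note that this removes links entirely, so it only fits if clickable links are not required.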

1 Comment

Just tested BBCode, and it does not block javascript links
1

You should handle both single and double quotes, whitespace, and so on:

    $html = preg_replace( '/href\s*=\s*"javascript:[^"]+"/i' , 'href="#"' , $html );
    $html = preg_replace( '/href\s*=\s*\'javascript:[^\']+\'/i' , 'href=\'#\'' , $html );
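
The two calls can also be folded into one by capturing the opening quote and requiring the same quote to close the value. The following is a sketch under that assumption, not a complete sanitizer: the function name `neutraliseJsHrefs` is hypothetical, and the pattern does not cover obfuscated forms such as entity-encoded schemes.

```php
<?php
// Hypothetical helper combining both quote styles in a single pattern.
function neutraliseJsHrefs(string $html): string
{
    // (["\']) captures the opening quote; \1 demands the same quote at the end.
    // \s* tolerates whitespace around "=" and before the scheme.
    $pattern = '/href\s*=\s*(["\'])\s*javascript\s*:.*?\1/is';
    return preg_replace($pattern, 'href="#"', $html);
}

echo neutraliseJsHrefs("<a href = 'JavaScript:alert(\"x\")'>a</a>"), "\n";
echo neutraliseJsHrefs('<a href="javascript:void(0)">b</a>'), "\n";
echo neutraliseJsHrefs('<a href="/page">c</a>'), "\n";
```

The first two links come out with `href="#"`; the third has no `javascript:` scheme and is left as it is.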


0

Try this code; I think it will help.

<?php
//Javascript link filter test
if(isset($_POST['jsfilter'])){
    $html = "<a href=\"". $_POST['jsfilter']."\"> JS Link </a>";
    $pattern = '/a href="javascript:(.*?)"/i';
    $replacement = 'a href="javascript: alert(\'Javascript links have been blocked\');"';
    $html = preg_replace($pattern, $replacement, $html);
    echo $html;
}
?>

1 Comment

Are you sure about that? I recently wrote a regex like this in a script, and it selected everything from the href string onward. You should use [^"]* instead of (.*?), because otherwise the second quote may be treated as still being part of the href.
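
To make the difference concrete, here is a small sketch comparing a greedy `.*`, the lazy `(.*?)`, and the negated class `[^"]*` on one sample tag. The input string and variable names are made up for illustration. On this input, the overmatching described in the comment is what the greedy form does.

```php
<?php
// Sample tag with a second attribute, so there is a later quote to overrun into.
$html = '<a href="javascript:evil()" title="x"> Link </a>';

preg_match('/href="javascript:(.*)"/i', $html, $greedy);       // greedy
preg_match('/href="javascript:(.*?)"/i', $html, $lazy);        // lazy
preg_match('/href="javascript:([^"]*)"/i', $html, $negated);   // negated class

echo $greedy[1], "\n";   // evil()" title="x
echo $lazy[1], "\n";     // evil()
echo $negated[1], "\n";  // evil()
```

The negated class cannot cross a double quote by construction, which makes its behaviour easier to reason about than relying on laziness.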
