7

I have a string that contains PHP code, and I need to remove the PHP code from it. For example:

<?php $db1 = new ps_DB() ?><p>Dummy</p>

Should return <p>Dummy</p>

And a string with no php for example <p>Dummy</p> should return the same string.

I know this can be done with a regular expression, but after 4h I haven't found a solution.

2
  • Pro-tip: you won't cover all cases of bracket matching <? .... ?> with a regular expression. If you know there will only ever be one set of tags, or you have some other constraint, a regex might be possible. Brace matching is a non regular language. :P Commented Jul 15, 2010 at 18:12
  • Can you give more context? There may be a way to achieve what you are looking for without having to use a variable that itself stores PHP. Commented Jul 15, 2010 at 18:13

5 Answers

8
 <?php
 function filter_html_tokens($a){
    return is_array($a) && $a[0] == T_INLINE_HTML ?
      $a[1]:
      '';
 }
 $htmlphpstring = '<a>foo</a> something <?php $db1 = new ps_DB() ?><p>Dummy</p>';
 echo implode('',array_map('filter_html_tokens',token_get_all($htmlphpstring)));
 ?>

As ircmaxell pointed out: this would require valid PHP!

A regex route would be the following. It allows for short tags without 'php' and for a missing closing ?> at the end of the string or file (for some reason Zend recommends this), and of course it uses an ungreedy, DOTALL pattern:

preg_replace('/<\\?.*(\\?>|$)/Us', '',$htmlphpstring);

3 Comments

Just note that you may not get valid HTML out of the regex solution... <?php $foo='?>'; $bar = 'something';?><b>foo</b> will yield '; $bar = 'something';?><b>foo</b>. The short of it is that there's no perfect solution... Combine the two to get a "best"...
Indeed, no perfect solution. If the actual problem can be solved higher up so our thought-up kludges don't have to be used, that would be far preferable.
When you need something accurate, this solution does an amazing job. Thank you.
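
To make the trade-off discussed in these comments concrete, here is a minimal sketch that runs the regex from the answer over a few inputs. The wrapper name strip_php_regex and the sample strings are assumptions made for illustration; only the pattern itself comes from the answer.

```php
<?php
// Hypothetical helper wrapping the regex from the answer above.
function strip_php_regex(string $s): string
{
    // U = ungreedy, s = dot also matches newlines; "|$" covers a missing closing tag.
    return preg_replace('/<\\?.*(\\?>|$)/Us', '', $s);
}

$samples = [
    '<?php $db1 = new ps_DB() ?><p>Dummy</p>',   // PHP block removed
    '<p>Dummy</p>',                               // no PHP: unchanged
    '<p>a</p><?php echo 1;',                      // unclosed tag: removed to end of string
    "<?php \$foo='?>'; \$bar = 'something';?><b>foo</b>", // '?>' inside a PHP string
];

foreach ($samples as $sample) {
    echo strip_php_regex($sample), PHP_EOL;
}
```

The last sample stops at the '?>' inside the quoted PHP string, so part of the PHP statement is left behind in the output. That is the failure mode described in the first comment.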
1

Well, you can use DomDocument to do it...

function stripPHPFromHTML($html) {
    $dom = new DomDocument();
    $dom->loadHtml($html);
    removeProcessingInstructions($dom);
    // Import the <body> element of the parsed document
    $simple = simplexml_import_dom($dom->getElementsByTagName('body')->item(0));
    return $simple->children()->asXml();
}

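function removeProcessingInstructions(DOMNode $node) {
    // Leaf nodes (text, comments, ...) have nothing to walk
    if (!$node->hasChildNodes()) {
        return;
    }
    // Copy the live node list first so removals don't skip siblings
    foreach (iterator_to_array($node->childNodes) as $child) {
        if ($child instanceof DOMProcessingInstruction) {
            $node->removeChild($child);
        } else {
            removeProcessingInstructions($child);
        }
    }
}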

Those two functions will turn $str below into the string shown as $html:

$str = '<?php echo "foo"; ?><b>Bar</b>';
$clean = stripPHPFromHTML($str);
// $clean should now equal:
$html = '<b>Bar</b>';

Edit: Actually, after looking at Wrikken's answer, I realized that both methods have a disadvantage... Mine requires somewhat valid HTML markup (Dom is decent, but it won't parse <b>foo</b><?php echo $bar). Wrikken's requires valid PHP (any syntax errors and it'll fail). So perhaps a combination of the two (try one first. If it fails, try the other. If both fail, there's really not much you can do without trying to figure out the exact reason they failed)...

1 Comment

Good point, with invalid PHP mine would indeed fail. Added it to the answer for good measure.
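
The "try one first, then the other" idea could be sketched as follows. This is only an illustration under stated assumptions: the function name strip_php_combined is hypothetical, it needs PHP 7.0 or later for the TOKEN_PARSE flag and ParseError, and it pairs the tokenizer with the regex fallback rather than with the DOM approach.

```php
<?php
// Hypothetical combination of the tokenizer and regex approaches.
function strip_php_combined(string $s): string
{
    try {
        // TOKEN_PARSE makes token_get_all() throw ParseError on invalid PHP.
        $tokens = token_get_all($s, TOKEN_PARSE);
    } catch (ParseError $e) {
        // Fallback: the ungreedy regex, which does not need valid PHP.
        return preg_replace('/<\\?.*(\\?>|$)/Us', '', $s);
    }

    $html = '';
    foreach ($tokens as $token) {
        // Keep only the text that lies outside PHP tags.
        if (is_array($token) && $token[0] === T_INLINE_HTML) {
            $html .= $token[1];
        }
    }
    return $html;
}

echo strip_php_combined('<?php $db1 = new ps_DB() ?><p>Dummy</p>'), PHP_EOL;
```

When the PHP is valid, the tokenizer handles a '?>' inside a quoted string correctly. The regex is only used when parsing fails, so its rough edges are confined to input that is already broken.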
1

A simple solution is to explode into arrays using the php tags to remove any content between and implode back to a string.

function strip_php($str) {

  //split on opening tag
  $parts = explode('<?',$str);

  //the first part precedes any opening tag, so it is never php code
  $newstr = array_shift($parts);

  foreach($parts as $part) {

      //split on closing tag
      $partlings = explode('?>',$part);

      //remove content before closing tag
      $partlings[0] = '';

      //append to string
      $newstr .= implode('',$partlings);
  }
  return $newstr;
}

This is slower than regex but doesn't require valid html or php; it only requires all php tags to be closed.

For files which don't always include a final closing tag and for general error checking you could count the tags and append a closing tag if it's missing or notify if the opening and closing tags don't add up as expected, e.g. add the code below at the start of the function. This would slow it down a bit more though :)

  $tag_diff = substr_count($str,'<?') - substr_count($str,'?>');

  //Append if there's one less closing tag
  if($tag_diff == 1) $str .= '?>';

  //Parse error if the tags don't add up
  if($tag_diff < 0 || $tag_diff > 1) die('Error: Tag mismatch. 
  (Opening minus closing tags = '.$tag_diff.')<br><br>
  Dumping content:<br><hr><br>'.htmlentities($str));
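
As an alternative to counting tags, a variant could simply treat an unclosed final block as running to the end of the string. The following is a rough sketch of that idea; the name strip_php_lenient is hypothetical and this is not the code from the answer above.

```php
<?php
// Hypothetical variant of strip_php(): an unclosed final block runs to the end.
function strip_php_lenient(string $str): string
{
    $parts = explode('<?', $str);

    // Text before the first opening tag is never PHP code.
    $out = array_shift($parts);

    foreach ($parts as $part) {
        // Limit 2: split at the first closing tag only.
        $pieces = explode('?>', $part, 2);

        // With no closing tag, the whole part is PHP and is dropped.
        $out .= $pieces[1] ?? '';
    }
    return $out;
}

echo strip_php_lenient('<p>a</p><?php echo 1;'), PHP_EOL;
```

Like any plain string split, this still stops at a '?>' that appears inside a quoted PHP string.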


1

This is an enhanced version of the strip_php function suggested by @jon; it can also replace each PHP block with another string:

/**
 * Remove PHP code part from a string.
 *
 * @param   string  $str            String to clean
 * @param   string  $replacewith    String to use as replacement
 * @return  string                  Result string without php code
 */
function dolStripPhpCode($str, $replacewith='')
{
    $newstr = '';

    //split on each opening tag
    $parts = explode('<?php',$str);
    if (!empty($parts))
    {
        $i=0;
        foreach($parts as $part)
        {
            if ($i == 0)    // The first part is never php code
            {
                $i++;
                $newstr .= $part;
                continue;
            }
            //split on closing tag
            $partlings = explode('?>', $part);
            if (!empty($partlings))
            {
                //remove content before closing tag
                if (count($partlings) > 1) $partlings[0] = '';
                //append to out string
                $newstr .= $replacewith.implode('',$partlings);
            }
        }
    }
    return $newstr;
}


0

If you are using PHP, you just need to use a regular expression to replace anything that matches PHP code.

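The following statement removes a PHP block at the very start of the string (the pattern is anchored with ^; being greedy and lacking the s modifier, it matches up to the last ?> on the first line):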

preg_replace('/^<\?php.*\?\>/', '', '<?php $db1 = new ps_DB() ?><p>Dummy</p>');

If it doesn't find any match, it won't replace anything.
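
To illustrate how the anchor and the greedy quantifier affect the result, here is a small comparison. The second pattern and the sample input are assumptions added for illustration, not part of the answer.

```php
<?php
$input = '<?php $a = 1 ?><p>Dummy</p><?php $b = 2 ?>';

// Pattern from the answer: anchored with ^ and greedy.
$anchored = preg_replace('/^<\?php.*\?\>/', '', $input);

// Assumed alternative: no anchor, lazy quantifier, s modifier for multi-line blocks.
$lazy = preg_replace('/<\?php.*?\?>/s', '', $input);

var_dump($anchored); // string(0) ""
var_dump($lazy);     // string(12) "<p>Dummy</p>"
```

The greedy pattern runs from the first opening tag to the last closing tag and takes the HTML in between with it. The lazy one removes each block separately, though it still cannot cope with a '?>' inside a PHP string.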

