How to remove php code from a string?

Question

I have a string that contains PHP code, and I need to remove that code from the string. For example:

<?php $db1 = new ps_DB() ?><p>Dummy</p>

Should return Dummy

A string with no PHP, for example Dummy, should be returned unchanged.

I know this can be done with a regular expression, but after four hours I haven't found a solution.

Pro-tip: you won't cover all cases of matching <? ... ?> pairs with a regular expression. If you know there will only ever be one set of tags, or you have some other constraint, a regex might be possible. Brace matching is not a regular language. :P
– Stefan Kendall, Commented Jul 15, 2010 at 18:12
Can you give more context? There may be a way to achieve what you are looking for without having to store PHP code in a variable.
– Mark Grey, Commented Jul 15, 2010 at 18:13

Community · Accepted Answer · 2012-10-05 11:54:03Z

8

 <?php
 function filter_html_tokens($a){
    return is_array($a) && $a[0] == T_INLINE_HTML ?
      $a[1]:
      '';
 }
 $htmlphpstring = '<a>foo</a> something <?php $db1 = new ps_DB() ?><p>Dummy</p>';
 echo implode('',array_map('filter_html_tokens',token_get_all($htmlphpstring)));
 ?>

As ircmaxell pointed out: this would require valid PHP!

A regex route would be the following. It allows for short tags (no 'php' after <?) and for a missing closing ?> at the end of the string or file (for some reason Zend recommends this?), and of course it uses an ungreedy, DOTALL pattern:

preg_replace('/<\\?.*(\\?>|$)/Us', '',$htmlphpstring);
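To compare the two routes on the same inputs, here is a minimal sketch. The wrapper names strip_php_tokens and strip_php_regex are made up for illustration; their bodies are the tokenizer snippet and the regex shown above. The last input has a ?> inside a string literal, which is where the two routes diverge:

```php
<?php
// Hypothetical wrapper around the tokenizer route: keep only inline HTML tokens.
function strip_php_tokens(string $source): string
{
    $html = '';
    foreach (token_get_all($source) as $token) {
        if (is_array($token) && $token[0] === T_INLINE_HTML) {
            $html .= $token[1];
        }
    }
    return $html;
}

// Hypothetical wrapper around the regex route (ungreedy and DOTALL, as above).
function strip_php_regex(string $source): string
{
    return preg_replace('/<\\?.*(\\?>|$)/Us', '', $source);
}

$plain    = '<?php $db1 = new ps_DB() ?><p>Dummy</p>';
$unclosed = '<p>Dummy</p><?php $db1 = new ps_DB();';
$tricky   = "<?php \$foo='?>'; \$bar = 'something';?>foo";

$fromTokens  = strip_php_tokens($plain);
$fromRegex   = strip_php_regex($plain);
$cutAtEnd    = strip_php_regex($unclosed);
$trickyToken = strip_php_tokens($tricky);
$trickyRegex = strip_php_regex($tricky);
```

Both routes return <p>Dummy</p> for the first input, and the regex also handles the unclosed block. On the tricky input the tokenizer returns foo, while the regex stops at the first ?> and leaves '; $bar = 'something';?>foo behind.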

edited Oct 5, 2012 at 11:54

CommunityBot

answered Jul 15, 2010 at 18:31

Wrikken

3 Comments

ircmaxell Over a year ago

Just note that you may not get valid HTML out of the regex solution... <?php $foo='?>'; $bar = 'something';?>foo will yield '; $bar = 'something';?>foo. The short of it is that there's no perfect solution... Combine each to get a "best"...

Wrikken Over a year ago

Indeed, no perfect solution. If the actual problem can be solved higher up, so that our thought-up kludges don't have to be used, that would be far preferable.

brenjt Over a year ago

When you need something accurate, this solution does an amazing job. Thank you.

ircmaxell · Accepted Answer · 2010-07-15 18:44:31Z

1

Well, you can use DomDocument to do it...

function stripPHPFromHTML($html) {
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    removeProcessingInstructions($dom);
    $simple = simplexml_import_dom($dom->getElementsByTagName('body')->item(0));
    return $simple->children()->asXML();
}

function removeProcessingInstructions(DOMNode $node) {
    if (!$node->hasChildNodes()) {
        return;
    }
    // Copy the child list first: removing nodes while iterating the live list skips siblings
    foreach (iterator_to_array($node->childNodes) as $child) {
        if ($child instanceof DOMProcessingInstruction) {
            $node->removeChild($child);
        } else {
            removeProcessingInstructions($child);
        }
    }
}

With those two functions, the PHP is stripped and only the markup remains:

$str = '<?php echo "foo"; ?><b>Bar</b>';
$clean = stripPHPFromHTML($str);
// expected result: '<b>Bar</b>'

Edit: Actually, after looking at Wrikken's answer, I realized that both methods have a disadvantage. Mine requires somewhat valid HTML markup (DOM is decent, but it won't parse foo<?php echo $bar). Wrikken's requires valid PHP (any syntax errors and it'll fail). So perhaps use a combination of the two: try one first and, if it fails, try the other. If both fail, there's really not much you can do without figuring out the exact reason they failed.
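One way to sketch the "try one, then the other" idea is shown below. It rests on a few assumptions: PHP 7 or later, where token_get_all() accepts the TOKEN_PARSE flag and throws a ParseError on invalid code; a hypothetical function name, strip_php_with_fallback; and, to keep the example short, the regex from Wrikken's answer as the fallback rather than the DOM approach.

```php
<?php
// Hypothetical helper: tokenizer first, regex as the fallback for invalid PHP.
function strip_php_with_fallback(string $source): string
{
    try {
        // TOKEN_PARSE makes the tokenizer reject code that does not parse.
        $tokens = token_get_all($source, TOKEN_PARSE);
    } catch (ParseError $e) {
        // Fallback: ungreedy, DOTALL regex that also tolerates a missing closing tag.
        return preg_replace('/<\\?.*(\\?>|$)/Us', '', $source);
    }

    $html = '';
    foreach ($tokens as $token) {
        // Everything outside the PHP tags arrives as T_INLINE_HTML tokens.
        if (is_array($token) && $token[0] === T_INLINE_HTML) {
            $html .= $token[1];
        }
    }
    return $html;
}

$valid   = strip_php_with_fallback('<?php $db1 = new ps_DB() ?><p>Dummy</p>');
$invalid = strip_php_with_fallback('<?php $foo = ; ?><p>Dummy</p>');
```

Both calls return <p>Dummy</p>: the first through the tokenizer, the second through the regex after the parse error. The fallback inherits the regex's weaknesses, so it is a best effort rather than a guarantee.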

edited Jul 15, 2010 at 18:44

answered Jul 15, 2010 at 18:35

ircmaxell

1 Comment

Wrikken Over a year ago

Good point, with invalid PHP mine would indeed fail. Added it to the answer for good measure.

Jon · Accepted Answer · 2017-07-04 15:53:30Z

A simple solution is to explode the string into arrays on the PHP tags, drop any content between them, and implode the pieces back into a string.

function strip_php($str) {

  //split on opening tag
  $parts = explode('<?',$str);

  //the first part comes before any opening tag, so it is never php code
  $newstr = array_shift($parts);

  foreach($parts as $part) {

      //split on closing tag
      $partlings = explode('?>',$part);

      //remove content before closing tag
      $partlings[0] = '';

      //append to string
      $newstr .= implode('',$partlings);
  }
  return $newstr;
}

This is slower than regex but doesn't require valid html or php; it only requires all php tags to be closed.

For files that don't always include a final closing tag, and for general error checking, you could count the tags and append a closing tag if one is missing, or report an error if the opening and closing tags don't add up as expected. For example, add the code below at the start of the function. This would slow it down a bit more, though :)

  $tag_diff = substr_count($str,'<?') - substr_count($str,'?>');

  //Append if there's one less closing tag
  if($tag_diff == 1) $str .= '?>';

  //Parse error if the tags don't add up
  if($tag_diff < 0 || $tag_diff > 1) die('Error: Tag mismatch. 
  (Opening minus closing tags = '.$tag_diff.')<br><br>
  Dumping content:<br><hr><br>'.htmlentities($str));
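The same check could also be wrapped in a function that throws instead of calling die(), so the caller decides how to report the problem. This is only a sketch: balance_php_tags is a hypothetical name, and the counts are plain substring counts, so they also include things like <?xml and any ?> that sits inside a string literal.

```php
<?php
// Hypothetical helper: append a missing final closing tag, or throw when the
// opening and closing tag counts cannot be reconciled.
function balance_php_tags(string $str): string
{
    $tag_diff = substr_count($str, '<?') - substr_count($str, '?>');

    // Exactly one closing tag is missing: assume it is the final one.
    if ($tag_diff === 1) {
        return $str . '?>';
    }

    // Any other non-zero difference is treated as a mismatch.
    if ($tag_diff !== 0) {
        throw new InvalidArgumentException(
            'Tag mismatch: opening minus closing tags = ' . $tag_diff
        );
    }

    return $str;
}

$balanced = balance_php_tags('<p>a</p><?php echo 1;');
```

Here $balanced ends with the appended closing tag, and an already balanced string comes back unchanged.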

Eldy · Accepted Answer · 2018-11-27 15:09:27Z

This is an enhanced version of the strip_php function suggested by @jon that can also replace each PHP section with another string:

/**
 * Remove PHP code part from a string.
 *
 * @param   string  $str            String to clean
 * @param   string  $replacewith    String to use as replacement
 * @return  string                  Result string without php code
 */
function dolStripPhpCode($str, $replacewith='')
{
    $newstr = '';

    //split on each opening tag
    $parts = explode('<?php',$str);
    if (!empty($parts))
    {
        $i=0;
        foreach($parts as $part)
        {
            if ($i == 0)    // The first part is never php code
            {
                $i++;
                $newstr .= $part;
                continue;
            }
            //split on closing tag
            $partlings = explode('?>', $part);
            if (!empty($partlings))
            {
                //remove content before closing tag
                if (count($partlings) > 1) $partlings[0] = '';
                //append to out string
                $newstr .= $replacewith.implode('',$partlings);
            }
        }
    }
    return $newstr;
}

jeph perro · Accepted Answer · 2010-07-15 18:28:34Z

0

If you are using PHP, you just need to use a regular expression to replace anything that matches PHP code.

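The following statement removes a PHP block from the start of the string. The pattern is anchored with ^ and is greedy, so it only matches when the string begins with <?php, and it removes everything up to the last ?> on that first line: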

preg_replace('/^<\?php.*\?\>/', '', '<?php $db1 = new ps_DB() ?><p>Dummy</p>');

If it doesn't find any match, it won't replace anything.
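To make the limits of that pattern concrete, here is a small sketch that sets it beside an unanchored, lazy, DOTALL variant. The function names strip_anchored and strip_anywhere are hypothetical, and the second pattern is an illustrative alternative, not part of the answer above.

```php
<?php
// The pattern from the answer above: anchored to the start and greedy.
function strip_anchored(string $source): string
{
    return preg_replace('/^<\?php.*\?\>/', '', $source);
}

// Illustrative variant: no anchor, lazy quantifier, and the s flag so that
// a block spanning several lines is matched too.
function strip_anywhere(string $source): string
{
    return preg_replace('/<\?php.*?\?>/s', '', $source);
}

$leading = '<p>a</p><?php echo 1; ?><p>Dummy</p>';
$twice   = '<?php a(); ?><p>x</p><?php b(); ?><p>y</p>';

$keptAsIs  = strip_anchored($leading);
$stripped  = strip_anywhere($leading);
$overeager = strip_anchored($twice);
$both      = strip_anywhere($twice);
```

With the anchored pattern, $keptAsIs is the unchanged input because the string does not start with <?php, and $overeager is just <p>y</p> because the greedy match swallows <p>x</p> as well. The variant gives <p>a</p><p>Dummy</p> and <p>x</p><p>y</p>. Neither pattern handles short tags or a ?> inside a string literal.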

answered Jul 15, 2010 at 18:28

jeph perro
