0

I have the HTML string below. I've collected all of its opening tags into the array $opened and all of its closing tags into the array $closed:

'<div>
    Test
 </div>

 <div>
    <blockquote>
       <p>The quick</p>
       <blockquote>
          <p>brown fox <span>jumps <span>over <img src="#" /> the'

This results in these two arrays:

$opened =

array(8) {
  [0]=> string(3)  "div"         // Needs to be removed
  [1]=> string(3)  "div"
  [2]=> string(10) "blockquote"
  [3]=> string(1)  "p"           // Needs to be removed
  [4]=> string(10) "blockquote"
  [5]=> string(1)  "p"
  [6]=> string(4)  "span"
  [7]=> string(4)  "span"
}

$closed =

array(2) {
  [0]=> string(3) "div"
  [1]=> string(1) "p"
}

I need to somehow say:

Find the first occurrence of $closed[0] (which is "div") in the $opened array and remove it from $opened, then repeat for each remaining entry until all $closed tags ("div" and "p") have been removed from the top of $opened.

  • Are you using the same snippet I gave in the other question? :p Commented Nov 15, 2009 at 15:02
  • Have you looked at it though? I would highly recommend using it instead of this. Trust me, DOM isn't really all that hard, and would be far easier than doing what you're trying to do now. Commented Nov 15, 2009 at 15:05
  • As many have insisted, regex is not the way to deal with parsing HTML, plus it's harder than the DOM method I suggested. Can you take a look at it and try it instead? Comment on my answer if you have specific questions. Commented Nov 15, 2009 at 15:07
  • C'mon, confess: you're just posting such a question in order to provoke bobince, right? :) Commented Nov 15, 2009 at 15:09
  • To have at least some kind of parser, I would do the following: Split the input at the tags while preserving them (see preg_split's PREG_SPLIT_DELIM_CAPTURE). Then iterate over the parts, put the opening tags on a stack and see if there is a corresponding closing tag and vice versa. If the opening and closing tags match, remove the opening tag from the stack. That way you can find mismatched opening/closing tags and remove them or add the counterpart at the right position. Commented Nov 15, 2009 at 15:18
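The stack approach from the last comment could be sketched roughly as follows. This is a minimal illustration, not a real HTML parser: the function name unclosed_tags() and both regular expressions are assumptions made for this example. It treats only tags written with a trailing "/" as self-closing, and it will misread attribute values that contain ">".

```php
<?php

/**
 * Return the names of tags that are opened but never closed in $html.
 * Hypothetical helper for illustration only.
 */
function unclosed_tags(string $html): array
{
    // Split at the tags while keeping the tags themselves in the result.
    $parts = preg_split('/(<[^>]+>)/', $html, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);

    $stack = [];
    foreach ($parts as $part) {
        // Skip text and self-closing tags such as <img src="#" />.
        if (!preg_match('~^<(/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*?(/?)>$~', $part, $m) || $m[3] === '/') {
            continue;
        }
        [, $closing, $name] = $m;
        $name = strtolower($name);

        if ($closing === '') {
            $stack[] = $name;           // opening tag: push onto the stack
        } elseif (end($stack) === $name) {
            array_pop($stack);          // closing tag matches the top: pop
        }
        // A closing tag that does not match the top of the stack is ignored here.
    }

    return $stack;
}
```

Called on the string from the question, this returns div, blockquote, blockquote, p, span, span, which is the $opened array with the two marked entries removed.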

3 Answers

1

Hope this helps someone. This is what I came up with:

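<?php

    // For each closed tag, remove its first occurrence from $opened.
    foreach ( $closed as $tag )
    {
        $key = array_search( $tag, $opened, true );

        // array_search() returns false on no match; unset( $opened[ false ] ) would remove index 0.
        if ( $key !== false )
        {
            unset( $opened[ $key ] );
        }
    }

?>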

I also came up with a for loop which worked, but you had to manipulate the $opened[$i] and $closed[$n] independently, and it was a bit more code, so I ultimately decided on this one.
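To check this approach against the arrays from the question, here is a self-contained run. It guards against array_search() returning false and reindexes the result with array_values(); the variable name $remaining is only illustrative.

```php
<?php

// Sample data copied from the question.
$opened = ['div', 'div', 'blockquote', 'p', 'blockquote', 'p', 'span', 'span'];
$closed = ['div', 'p'];

foreach ($closed as $tag) {
    $key = array_search($tag, $opened, true);
    if ($key !== false) {   // false means the tag was not found
        unset($opened[$key]);
    }
}

// unset() leaves gaps in the keys (0 and 3 are gone); array_values() reindexes from 0.
$remaining = array_values($opened);

print_r($remaining);
```

The remaining entries are div, blockquote, blockquote, p, span, span, so the two entries marked in the question are the ones removed.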


0

I am not sure if this is what you are looking for, but this removes the first occurrence of a given value ($find) from an array and then reindexes the array.

<?php
ini_set('display_errors', 1);
error_reporting(E_ALL);

$opened_tags = array("div", "div", "blockquote", "p", "blockquote", "p", "span", "span");
$closed_tags = array("div", "p");

$find = "div";

$size = sizeof($closed_tags);
for($i=0; $i<$size; $i++) {
    if($closed_tags[$i] == $find){
        unset($closed_tags[$i]);
        break;
    }
}

echo "closed_tags with empty spaces: ".print_r($closed_tags, true)."<br /><br />";

$closed_tags = array_values($closed_tags);
echo "closed_tags array indexed correctly: ".print_r($closed_tags, true)."<br />";
?>


0

I cannot tell from your question, but if your aim is to clean up a snippet of HTML, why not simply use the HTMLTidy extension?

Tutorial
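As a rough sketch of what that could look like, assuming PHP's tidy extension is installed and enabled: the wrapper name repair_fragment() and the chosen config options are assumptions for this example, and the exact output formatting depends on the Tidy version.

```php
<?php

/**
 * Let Tidy close any unclosed tags in an HTML fragment.
 * Hypothetical wrapper; returns null if the tidy extension is missing.
 */
function repair_fragment(string $html): ?string
{
    if (!function_exists('tidy_repair_string')) {
        return null; // tidy extension not available
    }

    $config = [
        'show-body-only' => true, // return only the fragment, not a full document
        'indent'         => true,
    ];

    $repaired = tidy_repair_string($html, $config, 'utf8');

    return $repaired === false ? null : $repaired;
}

echo repair_fragment('<div><blockquote><p>brown fox <span>jumps <span>over <img src="#" /> the');
```

This hands the whole problem to Tidy, so there is no need to track opening and closing tags by hand.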
