
Consider the code below. Why is it not working?

<?php

$str = "
<h4>
   title
</h4>
";

$result = preg_match_all ('/<h4>([\d\D])<\/h4>/mi', $str, $matches);
var_dump($matches);
1 Answer

You probably meant

$str = "
<h4>
   title
</h4>
";

$result = preg_match_all ('/<h4>(.+?)<\/h4>/si', $str, $matches);
var_dump($matches);

The regex you applied, '/<h4>([\d\D])<\/h4>/mi', means "match an opening h4, exactly one character that is either a digit or not a digit, and a closing h4." There are many characters between your tags, so you need a quantifier: + means "one or more". Update: use the non-greedy form, +?, if the input contains more than one h4 (very likely!); a greedy + would match from the first <h4> right up to the last </h4>. The class [\d\D] can be reduced to "any character", written as a dot (.). One more point: by default the dot does not match a newline, so you need /s instead of /m to get the behaviour you want; /m only changes how ^ and $ match, and this pattern uses neither.

This will probably include the newlines in your match!
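To check the points above, here is a small sketch that runs both patterns against a sample string. The second h4 element and the variable names ($broken, $fixed, $titles) are assumptions added for illustration; they are not part of the original question.

```php
<?php

// Sample input: the original multi-line h4 plus a second, single-line h4
// (the second element is an assumption, added to exercise the lazy quantifier).
$str = "
<h4>
   title
</h4>
<h4>second</h4>
";

// Original pattern: the group allows exactly one character between the tags,
// so neither h4 element matches.
$broken = preg_match_all('/<h4>([\d\D])<\/h4>/mi', $str, $m1);

// Corrected pattern: lazy "one or more" quantifier, with /s so that the dot
// also matches newlines.
$fixed = preg_match_all('/<h4>(.+?)<\/h4>/si', $str, $m2);

// The captured text keeps the newlines and indentation inside the tags;
// trim() strips them if only the title text is wanted.
$titles = array_map('trim', $m2[1]);

var_dump($broken);  // int(0)
var_dump($fixed);   // int(2)
var_dump($m2[1]);   // first capture is "\n   title\n", second is "second"
var_dump($titles);  // "title" and "second"
```

So the capture group does include the newlines here, and trim() is one simple way to drop them after matching.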


4 Comments

Maybe use .* instead of .+ in case of an empty node. It shouldn't happen, but depending on how the HTML source is generated... better safe than sorry.
I removed the greediness, because that might match right up to the end tag of the very last h4 in the document! I hope to receive some feedback from Howard.
Will it "probably include the newlines in your match"; I had no idea the matching was a matter of probablies.
No, it isn't. I meant "My personal opinion is that it will include the newlines in your match, but I have too little experience with PHP's PCRE (and I didn't test it), so I'm not sure if this will happen!" You're invited (by me, but more importantly, by SO) to edit my answer and confirm or deny the statement, or comment on it explaining if this will happen.
