I've noticed a strange preg_replace() behaviour when I'm dealing with strings that start with a numeric character: The replacement strings have their first character (first digit) cut off. I'm seeing it in PHP 5.6.36 and PHP 7.0.30.
This code:
<?php
$items = array(
    '1234567890'   => '<a href="http://example.com/1234567890">1234567890</a>',
    '1234567890 A' => '<a href="http://example.com/123456789-a">1234567890 A</a>',
    'A 1234567890' => '<a href="http://example.com/a-1234567890">A 1234567890</a>',
    'Only Text'    => '<a href="http://example.com/only-text">Only Text</a>',
);
foreach ( $items as $title => $item ) {
    $search  = '/(<a href="[^"]+">)[^<]+(<\/a>)/';
    $replace = '$1' . $title . '$2';
    // Preserve for re-use.
    $_item = $item;
    // Doesn't work -- the titles starting with a number are wonky.
    $item = preg_replace( $search, $replace, $item );
    echo 'Broken: ' . $item . PHP_EOL;
    // Ugly hack to fix the issue.
    if ( is_numeric( substr( $title, 0, 1 ) ) ) {
        $title = ' ' . $title;
    }
    $replace = '$1' . $title . '$2';
    $_item   = preg_replace( $search, $replace, $_item );
    echo 'Fixed: ' . $_item . PHP_EOL;
}
produces this result:
Broken: 234567890</a>
Fixed: <a href="http://example.com/1234567890"> 1234567890</a>
Broken: 234567890 A</a>
Fixed: <a href="http://example.com/123456789-a"> 1234567890 A</a>
Broken: <a href="http://example.com/a-1234567890">A 1234567890</a>
Fixed: <a href="http://example.com/a-1234567890">A 1234567890</a>
Broken: <a href="http://example.com/only-text">Only Text</a>
Fixed: <a href="http://example.com/only-text">Only Text</a>
I've tested my regex online at https://regex101.com/, and as far as I can tell, it's written correctly. (It's not terribly complex, IMHO.)
Is this a PHP bug, or am I not completely grokking my regex?
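'$1' . '1234...' . '$2' is being interpreted as $11234...$2. A backreference in the replacement string can have up to two digits, so PHP reads $11 as a reference to group 11 rather than group 1 followed by a literal 1. Group 11 doesn't exist, so it is replaced with an empty string, and all that remains is 234... followed by $2. That is why both the opening tag and the first digit disappear.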
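One way around this, as far as I can tell, is the braced reference syntax ${1}, which ends the group number before any digits that follow. Here is a minimal sketch using the pattern from the question; the sample $item and the variable names $broken, $fixed and $safe are made up for illustration.

```php
<?php
$search = '/(<a href="[^"]+">)[^<]+(<\/a>)/';
$item   = '<a href="http://example.com/1234567890">old title</a>'; // illustrative input
$title  = '1234567890';

// Ambiguous: "$1" followed by "1" is parsed as group 11, which does not exist.
$broken = preg_replace( $search, '$1' . $title . '$2', $item );

// Unambiguous: the braces end the group number before the title's digits.
$fixed = preg_replace( $search, '${1}' . $title . '${2}', $item );

// Alternative: a callback never parses $title as replacement syntax at all.
$safe = preg_replace_callback( $search, function ( $m ) use ( $title ) {
    return $m[1] . $title . $m[2];
}, $item );

echo 'Broken:   ' . $broken . PHP_EOL;
echo 'Fixed:    ' . $fixed . PHP_EOL;
echo 'Callback: ' . $safe . PHP_EOL;
```

The callback version is probably the safer choice if the titles come from user input, since a title containing something like $1 or \1 would otherwise be treated as a backreference too.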