There are several points of best practice involved in this seemingly innocuous task.
- When parsing HTML, use a legitimate DOM parsing tool (like DOMDocument) instead of regex.
- When parsing a URL, use a legitimate URL parsing tool (like parse_url()) instead of regex.
- When parsing a URL's query string, use a legitimate query string parsing tool (like parse_str()) instead of regex.
- When building a query string for a URL (especially one being printed to an HTML document), use a legitimate query string builder tool (like http_build_query()).
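To illustrate the last point, here is a minimal sketch of why a builder is safer than manual concatenation. The search term and variable names are made up for illustration, and the default `&` output separator is assumed.

```php
<?php
// Hypothetical parameters; the ampersand inside the search term is the troublemaker.
$params = ['q' => 'rock & roll', 'page' => 2];

// Manual concatenation leaves the inner "&" unescaped, so it reads as a separator.
$naive = 'q=' . $params['q'] . '&page=' . $params['page'];

// http_build_query() encodes each value (spaces become "+", "&" becomes "%26").
$built = http_build_query($params);

// parse_str() reverses the encoding; note that every value comes back as a string.
parse_str($built, $roundTrip);

echo $naive, "\n"; // q=rock & roll&page=2
echo $built, "\n"; // q=rock+%26+roll&page=2
var_export($roundTrip === ['q' => 'rock & roll', 'page' => '2']); // true
```

The built string survives a round trip through parse_str(), whereas the concatenated one would be split at the ampersand inside the search term.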
More considerations and safeguards can be added, but for your current example, I believe the following uses the most reliable and reasonable tools rather than seeking shortcuts. Demo:
$html = <<<HTML
<a href="/xsearch2?q=some search/21">21</a>
HTML;
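$dom = new DOMDocument();
// The flags keep libxml from wrapping the fragment in <html>/<body> and a doctype.
$dom->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
$a = $dom->getElementsByTagName('a')->item(0);
// Split the href into its path and raw query string.
['path' => $path, 'query' => $qs] = parse_url($a->getAttribute('href'));
// Decode the query string into an associative array.
parse_str($qs, $qsArray);
// Move a trailing "/<digits>" out of q and into its own page parameter.
if (isset($qsArray['q']) && preg_match('#(.*)/(\d+)$#', $qsArray['q'], $m)) {
    $qsArray['q'] = $m[1];
    $qsArray['page'] = $m[2];
    $a->setAttribute('href', "$path?" . http_build_query($qsArray));
    $html = $dom->saveHTML();
}
echo $html;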
Output:
<a href="/xsearch2?q=some+search&amp;page=21">21</a>
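As one example of the extra safeguards mentioned above, the URL rewriting step could be pulled into a small function that returns the href untouched whenever something is missing. This is only a sketch: the function name `movePageIntoQuery()` is hypothetical, and it assumes that everything before the first `?` of the href should be kept verbatim.

```php
<?php
/**
 * Hypothetical helper: moves a trailing "/<digits>" from the q parameter
 * into a separate page parameter. Returns the href unchanged when it
 * cannot be parsed or there is nothing to move.
 */
function movePageIntoQuery(string $href): string
{
    $parts = parse_url($href);
    // parse_url() returns false on seriously malformed URLs and omits
    // the "query" key entirely when the URL has no query string.
    if ($parts === false || !isset($parts['query'])) {
        return $href;
    }

    parse_str($parts['query'], $qsArray);
    // q must be a string (q[]=... would arrive as an array) ending in "/<digits>".
    if (!isset($qsArray['q']) || !is_string($qsArray['q'])
        || !preg_match('#(.*)/(\d+)$#', $qsArray['q'], $m)) {
        return $href;
    }

    $qsArray['q'] = $m[1];
    $qsArray['page'] = $m[2];

    // Keep everything before the first "?" as-is, then re-append any fragment.
    $rebuilt = strstr($href, '?', true) . '?' . http_build_query($qsArray);
    if (isset($parts['fragment'])) {
        $rebuilt .= '#' . $parts['fragment'];
    }
    return $rebuilt;
}

echo movePageIntoQuery('/xsearch2?q=some search/21'), "\n"; // /xsearch2?q=some+search&page=21
echo movePageIntoQuery('/about'), "\n";                     // /about
```

The return value can be passed straight to setAttribute('href', ...), so the DOM handling stays separate from the URL handling.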
What you should definitely NOT do is give in to the unreliable shortcut of a single preg_replace() call (even if it happens to work on this input). Demo:
$html = <<<HTML
<a href="/xsearch2?q=some search/21">21</a>
HTML;
echo preg_replace('#/(\d+)(?=">)#', '&page=$1', $html, 1);
Like most innocent-ish regex patterns, this one will work... until it doesn't.
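Two small variations show how quickly that happens. The markup below is invented for illustration; the pattern and replacement are the ones from the preg_replace() call above.

```php
<?php
$pattern = '#/(\d+)(?=">)#';

// Case 1: an extra attribute after href. The lookahead expects `">` right
// after the digits, so nothing matches and the link is silently left alone.
$extraAttr = '<a href="/xsearch2?q=some search/21" class="page">21</a>';
$result1 = preg_replace($pattern, '&page=$1', $extraAttr, 1);
echo $result1, "\n";
// <a href="/xsearch2?q=some search/21" class="page">21</a>

// Case 2: an earlier attribute that happens to end in "/<digits>". With a
// limit of 1, the alt text is rewritten and the real link is never touched.
$decoy = '<img alt="ratio 1/2"><a href="/xsearch2?q=some search/21">21</a>';
$result2 = preg_replace($pattern, '&page=$1', $decoy, 1);
echo $result2, "\n";
// <img alt="ratio 1&page=2"><a href="/xsearch2?q=some search/21">21</a>
```

Neither case raises an error, which is the real problem: the regex has no notion of elements or attributes, so it cannot tell that it missed the link or hit the wrong target. Even when it does match, the space in the search term is left unencoded.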
some searchstring?$pagination. So how did you end up with this hyperlink in a string? Where is the rest of the code? Are you planning to use urlencode() on the $key?