Replace the querystring of an href declaration in an <a> tag

Question

I want to replace the following hyperlinks dynamically

from

<a href="/xsearch2?q=some search/21">21</a>

to

<a href="/xsearch2?q=some search&page=21">21</a>

How can I do that dynamically? I have tried the following:

preg_replace(
    '#<a.*?>([^>]*)</a>#i',
    '<a href="/xsearch2?q=' . $key . '&page=$1">$1</a>',
    $pagination
);

However, this rewrites the whole hyperlink. I only want to remove the last slash (/) from the href and put &page= in its place.

Just to ask: is it possible to solve this by changing whatever generates the links this way?
– Chris Haas, Commented Dec 1, 2024 at 2:41
This is not a job that needs, or should use, regex. Is this input text part of a larger string/document, or is this the whole isolated string?
– mickmackusa ♦, Commented Dec 1, 2024 at 3:51
What happens when the user adds a forward slash to the "some search" string?
– mickmackusa ♦, Commented Dec 1, 2024 at 3:58
It is unclear what you're working with; you don't say. I can guess from the single line of code that the input is a string called $pagination. So how did you end up with this hyperlink in a string? Where is the rest of the code? Are you planning to use urlencode() on the $key?
– KIKO Software, Commented Dec 1, 2024 at 7:42

mickmackusa · Accepted Answer · 2024-12-01 04:29:32Z

There are several points of best practice involved in this seemingly innocuous task.

When parsing HTML, use a legitimate DOM parsing tool (like DOMDocument) instead of regex.
When parsing a URL, use a legitimate URL parsing tool (like parse_url()) instead of regex.
When parsing a URL's query string, use a legitimate query string parsing tool (like parse_str()) instead of regex.
When building a query string for a URL (especially one being printed to an HTML document), use a legitimate query string builder tool (like http_build_query()).
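As a rough sketch of how the three URL-level tools in that list fit together on a bare href, before any DOM handling is involved. The function name movePageIntoQuery() is hypothetical, and the second and third sample URLs are made up for illustration; it also assumes relative hrefs like the one in the question.

```php
<?php
/**
 * Hypothetical helper: move a trailing "/<digits>" from the "q" parameter
 * into a separate "page" parameter. Returns the href unchanged when it
 * does not have that shape. Assumes a relative href; scheme and host
 * are not reassembled.
 */
function movePageIntoQuery(string $href): string
{
    $parts = parse_url($href);
    // parse_url() returns false on seriously malformed URLs and omits absent keys
    if ($parts === false || !isset($parts['query'])) {
        return $href;
    }

    parse_str($parts['query'], $params);
    // "q" may be missing, or an array if the link used q[]=...
    if (!isset($params['q']) || !is_string($params['q'])
        || !preg_match('#(.*)/(\d+)$#', $params['q'], $m)) {
        return $href;
    }

    // The greedy (.*) keeps any earlier slashes inside the search text
    $params['q'] = $m[1];
    $params['page'] = $m[2];

    // http_build_query() takes care of URL-encoding each value
    return ($parts['path'] ?? '') . '?' . http_build_query($params);
}

echo movePageIntoQuery('/xsearch2?q=some search/21'), "\n"; // /xsearch2?q=some+search&page=21
echo movePageIntoQuery('/xsearch2?q=tcp/ip/3'), "\n";       // /xsearch2?q=tcp%2Fip&page=3
echo movePageIntoQuery('/about'), "\n";                     // /about
```

A search whose own text ends in a slash followed by digits (say, AC/DC/1984) cannot be told apart from a page suffix here. That is the ambiguity raised in the comments, and changing whatever generates the links would avoid it.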

More considerations and safeguards can be implemented, but with your current example, I believe this uses the most reliable/reasonable tools rather than seeking shortcuts.

$html = <<<HTML
<a href="/xsearch2?q=some search/21">21</a>
HTML;

// Parse the fragment without adding <html>/<body> wrappers or a doctype
$dom = new DOMDocument();
$dom->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
$a = $dom->getElementsByTagName('a')->item(0);

// Split the href into its path and query string, then the query string into an array
['path' => $path, 'query' => $qs] = parse_url($a->getAttribute('href'));
parse_str($qs, $qsArray);

// Move a trailing "/<digits>" out of q and into its own page parameter
if (isset($qsArray['q']) && preg_match('#(.*)/(\d+)$#', $qsArray['q'], $m)) {
    $qsArray['q'] = $m[1];
    $qsArray['page'] = $m[2];
    $a->setAttribute('href', "$path?" . http_build_query($qsArray));
    $html = $dom->saveHTML();
}
echo $html;

Output:

<a href="/xsearch2?q=some+search&amp;page=21">21</a>
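If the pagination markup holds several links, as the $pagination variable in the question suggests, the same approach can be looped over every anchor. The following is only a sketch under assumptions: the function name rewritePaginationLinks() and the sample markup are invented for illustration, and the fragment is assumed to have a single root element, because LIBXML_HTML_NOIMPLIED can rearrange fragments with several top-level siblings.

```php
<?php
// Hypothetical helper: apply the q/page rewrite to every <a> in a fragment.
function rewritePaginationLinks(string $html): string
{
    $dom = new DOMDocument();
    $dom->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);

    foreach ($dom->getElementsByTagName('a') as $a) {
        $parts = parse_url($a->getAttribute('href'));
        // Skip links without both a path and a query string
        if ($parts === false || !isset($parts['path'], $parts['query'])) {
            continue;
        }

        parse_str($parts['query'], $params);
        if (isset($params['q']) && is_string($params['q'])
            && preg_match('#(.*)/(\d+)$#', $params['q'], $m)) {
            $params['q'] = $m[1];
            $params['page'] = $m[2];
            $a->setAttribute('href', $parts['path'] . '?' . http_build_query($params));
        }
    }

    return $dom->saveHTML();
}

// Assumed sample input, not taken from the question
$pagination = <<<HTML
<div><a href="/xsearch2?q=some search/1">1</a> <a href="/xsearch2?q=some search/2">2</a> <a href="/about">About</a></div>
HTML;

echo rewritePaginationLinks($pagination);
```

Links that do not carry a q parameter ending in a slash and digits, such as /about here, are left alone, which is the part a blanket regex replacement has trouble guaranteeing.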

What you should definitely NOT do is be tempted by the unreliable shortcut of a single preg_replace() call (even if it does work).

$html = <<<HTML
<a href="/xsearch2?q=some search/21">21</a>
HTML;

echo preg_replace('#/(\d+)(?=">)#', '&page=$1', $html, 1);

Like most innocent-ish regex patterns, this one will work... until it doesn't.
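To make "until it doesn't" concrete, here is a small sketch that runs the same pattern over three inputs. The wrapper name naiveFix() and all three inputs are assumptions made up for illustration; none of them appear in the question.

```php
<?php
// Hypothetical wrapper around the shortcut pattern shown above
function naiveFix(string $html): string
{
    return preg_replace('#/(\d+)(?=">)#', '&page=$1', $html, 1);
}

$inputs = [
    // Unrelated link whose path ends in digits: gets mangled
    '<a href="/archive/2024">2024</a>',
    // Another attribute follows href, so the `">` lookahead fails: left untouched
    '<a href="/xsearch2?q=some search/21" class="page">21</a>',
    // Single-quoted attribute value: left untouched
    "<a href='/xsearch2?q=some search/21'>21</a>",
];

foreach ($inputs as $input) {
    echo naiveFix($input), "\n";
}
// Prints:
// <a href="/archive&page=2024">2024</a>
// <a href="/xsearch2?q=some search/21" class="page">21</a>
// <a href='/xsearch2?q=some search/21'>21</a>
```

The first link is corrupted even though it has nothing to do with search pagination, and the other two are silently skipped. Parsing the document and the URL makes it possible to decide per link whether a rewrite applies.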
