I'm making a search page where you type a search query and the form is submitted to search.php?query=your query. Which PHP function should I use for encoding and decoding the search query?
6 Answers
For the URI query value use urlencode/urldecode; for anything else use rawurlencode/rawurldecode.
To create an entire query string, use http_build_query().
The difference between urlencode and rawurlencode is that urlencode encodes according to application/x-www-form-urlencoded (space is encoded with +) while rawurlencode encodes according to plain Percent-Encoding (space is encoded with %20).
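As a minimal sketch of that difference (the sample search text and variable names are made up for illustration), the two functions and http_build_query() behave like this:

```php
<?php
// Hypothetical search text, chosen to contain spaces and a reserved character.
$query = 'php url encoding & spaces';

// application/x-www-form-urlencoded: space becomes "+", "&" becomes "%26".
$formEncoded = urlencode($query);

// Plain percent-encoding: space becomes "%20", "&" becomes "%26".
$percentEncoded = rawurlencode($query);

// Build a whole query string; the separator is passed explicitly so the
// result does not depend on the arg_separator.output ini setting.
$queryString = http_build_query(['query' => $query, 'page' => 2], '', '&');

echo $formEncoded, "\n";    // php+url+encoding+%26+spaces
echo $percentEncoded, "\n"; // php%20url%20encoding%20%26%20spaces
echo $queryString, "\n";    // query=php+url+encoding+%26+spaces&page=2
```

Note that http_build_query() uses the urlencode() style by default, which is what a submitted search form produces.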
10 Comments
+ instead of %20. And besides that, application/x-www-form-urlencoded is used to encode form data while the Percent-Encoding has a more general usage.
urldecode. Then, what about the URI path (e.g. /a/path with spaces/) and URI fragment (e.g. #fragment). Should I always use rawurldecode for these two?
rawurlencode; but for POST and GET fields go with urlencode (like /?folder=my+folder)
The cunningly-named urlencode() and urldecode().
However, you shouldn't need to use urldecode() on variables that appear in $_POST and $_GET.
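To illustrate why a second decode is harmful, here is a small sketch. It uses parse_str() as a stand-in for the decoding PHP performs when it fills $_GET, and the search value "C++" is an invented example:

```php
<?php
// Simulate what PHP does when it fills $_GET: parse_str() decodes the
// raw query string once. The value "C++" is a made-up search query.
$rawQueryString = 'query=' . urlencode('C++'); // query=C%2B%2B
parse_str($rawQueryString, $get);

$once = $get['query'];             // already decoded: "C++"
$twice = urldecode($get['query']); // decoded again: each "+" turns into a space

var_dump($once);  // string(3) "C++"
var_dump($twice); // string(3) "C  "
```

The second urldecode() silently changes the user's input, so a search for "C++" would become a search for "C" followed by two spaces.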
5 Comments
"name=b&age=c&location=d") sent to a PHP file via AJAX?
$_GET, that data will get decoded by default without the need for you to run it through urldecode() too.
urldecode() on $_GET, the php.net manual states: "The superglobals $_GET and $_REQUEST are already decoded. Using urldecode() on an element in $_GET or $_REQUEST could have unexpected and dangerous results."
Here is my use case, which requires an exceptional amount of encoding. Maybe you think it is contrived, but we run this in production. Coincidentally, this covers every type of encoding, so I'm posting it as a tutorial.
Use case description
Somebody just bought a prepaid gift card ("token") on our website. Tokens have corresponding URLs to redeem them. This customer wants to email the URL to someone else. Our web page includes a mailto link that lets them do that.
PHP code
// The order system generates some opaque token
$token = 'w%a&!e#"^2(^@azW';
// Here is a URL to redeem that token
$redeemUrl = 'https://httpbin.org/get?token=' . urlencode($token);
// Actual contents we want for the email
$subject = 'I just bought this for you';
$body = 'Please enter your shipping details here: ' . $redeemUrl;
// A URI for the email as prescribed
$mailToUri = 'mailto:?subject=' . rawurlencode($subject) . '&body=' . rawurlencode($body);
// Print an HTML element with that mailto link
echo '<a href="' . htmlspecialchars($mailToUri) . '">Email your friend</a>';
Note: the above assumes you are outputting to a text/html document. If your output media type is application/json then simply use $retval['url'] = $mailToUri; because output encoding is handled by json_encode().
Test case
- Run the code on a PHP test site (is there a canonical one I should mention here?)
- Click the link
- Send the email
- Get the email
- Click that link
You should see:
"args": {
"token": "w%a&!e#\"^2(^@azW"
},
And of course this is the JSON representation of $token above.
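If you would rather check the layers without sending an email, the following sketch undoes each encoding in reverse order. It is only an approximation of what the browser, mail client and server do; the variable names ($prefix, $fields, $args) and the manual parsing of the mailto URI are assumptions made for this illustration.

```php
<?php
// Same inputs as the answer above.
$token = 'w%a&!e#"^2(^@azW';
$prefix = 'Please enter your shipping details here: ';
$redeemUrl = 'https://httpbin.org/get?token=' . urlencode($token);
$mailToUri = 'mailto:?subject=' . rawurlencode('I just bought this for you')
    . '&body=' . rawurlencode($prefix . $redeemUrl);
$href = htmlspecialchars($mailToUri, ENT_QUOTES);

// Layer 1: the browser undoes the HTML escaping of the attribute value.
$uriFromHtml = htmlspecialchars_decode($href, ENT_QUOTES);

// Layer 2: the mail client percent-decodes each mailto field.
$fields = [];
foreach (explode('&', substr($uriFromHtml, strlen('mailto:?'))) as $pair) {
    [$name, $value] = explode('=', $pair, 2);
    $fields[rawurldecode($name)] = rawurldecode($value);
}

// Layer 3: the server receiving the redeem URL decodes its query string.
$urlFromEmail = substr($fields['body'], strlen($prefix));
parse_str(parse_url($urlFromEmail, PHP_URL_QUERY), $args);

var_dump($args['token'] === $token); // bool(true)
```

Each layer is decoded exactly once, which is why the token survives unchanged.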
1 Comment
mailto: is not HTTP), you can use $mailToUri = 'mailto:?' . http_build_query(['subject' => $subject, 'body' => $body], '', '&', PHP_QUERY_RFC3986);.
You can use URL encoding functions. PHP has the rawurlencode() function. ASP.NET has the Server.URLEncode() function. In JavaScript, you can use the encodeURIComponent() function.
Depending on which RFC standard you want to follow, or if you need to customize your encoding, you might want to create your own class.
/**
 * UrlEncoder makes it easy to encode your URL
 */
class UrlEncoder
{
    public const STANDARD_RFC1738 = 1;
    public const STANDARD_RFC3986 = 2;
    public const STANDARD_CUSTOM_RFC3986_ISH = 3;
    // add more here

    public static function encode(string $string, int $rfc): string
    {
        switch ($rfc) {
            case self::STANDARD_RFC1738:
                // Form-style encoding: space becomes +
                return urlencode($string);
            case self::STANDARD_RFC3986:
                // Plain percent-encoding: space becomes %20
                return rawurlencode($string);
            case self::STANDARD_CUSTOM_RFC3986_ISH:
                // Add your custom encoding; here the reserved characters are restored after urlencode()
                $entities = ['%21', '%2A', '%27', '%28', '%29', '%3B', '%3A', '%40', '%26', '%3D', '%2B', '%24', '%2C', '%2F', '%3F', '%25', '%23', '%5B', '%5D'];
                $replacements = ['!', '*', "'", "(", ")", ";", ":", "@", "&", "=", "+", "$", ",", "/", "?", "%", "#", "[", "]"];
                return str_replace($entities, $replacements, urlencode($string));
            default:
                throw new Exception("Invalid RFC encoder - See class const for reference");
        }
    }
}
Use example:
$dataString = "https://www.google.pl/search?q=PHP is **great**!&id=123&css=#kolo&email=me@liszka.com)";
$dataStringUrlEncodedRFC1738 = UrlEncoder::encode($dataString, UrlEncoder::STANDARD_RFC1738);
$dataStringUrlEncodedRFC3986 = UrlEncoder::encode($dataString, UrlEncoder::STANDARD_RFC3986);
$dataStringUrlEncodedCustom = UrlEncoder::encode($dataString, UrlEncoder::STANDARD_CUSTOM_RFC3986_ISH);
Dumping each result with var_dump() will output:
string(126) "https%3A%2F%2Fwww.google.pl%2Fsearch%3Fq%3DPHP+is+%2A%2Agreat%2A%2A%21%26id%3D123%26css%3D%23kolo%26email%3Dme%40liszka.com%29"
string(130) "https%3A%2F%2Fwww.google.pl%2Fsearch%3Fq%3DPHP%20is%20%2A%2Agreat%2A%2A%21%26id%3D123%26css%3D%23kolo%26email%3Dme%40liszka.com%29"
string(86) "https://www.google.pl/search?q=PHP+is+**great**!&id=123&css=#kolo&email=me@liszka.com)"
* Find out more about RFC standards: https://datatracker.ietf.org/doc/rfc3986/ and urlencode vs rawurlencode?
You know how people keep saying things like: "Never manually craft a JSON string in PHP -- always call json_encode() for stability/reliability."?
Well, if you are building a query string, then I say: "Never manually craft a URL query string in PHP—always call http_build_query() for stability/reliability."
Demo:
$array = [
'query' => 'your query',
'example' => null,
'Qbert says:' => '&%=#?/'
];
echo http_build_query($array);
echo "\n---\n";
echo http_build_query($array, '', '&amp;');
Output:
query=your+query&Qbert+says%3A=%26%25%3D%23%3F%2F
---
query=your+query&amp;Qbert+says%3A=%26%25%3D%23%3F%2F
The fine print on this function is that if an element in the input array has a null value, then that element will not be included in the output string.
Here is an educational answer on the Joomla Stack Exchange site which encourages the use of &amp; as the custom delimiter: Why are Joomla URL query strings commonly delimited with "&amp;" instead of "&"?
Initially packaging your query string data in array form offers a compact and readable structure, then the call of http_build_query() does the hard work and can prevent data corruption. I generally opt for this technique even for small query string construction.
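Applied to the search page from the question, a sketch might look like the following. The 'page' and 'filter' parameters are hypothetical additions for illustration; only 'query' comes from the question.

```php
<?php
// Hypothetical parameters for the search page; 'filter' is null on purpose.
$params = ['query' => 'your query', 'page' => 2, 'filter' => null];

// Default encoding (PHP_QUERY_RFC1738): spaces become "+"; the null element is dropped.
$formStyle = http_build_query($params, '', '&');

// PHP_QUERY_RFC3986: spaces become "%20" instead.
$percentStyle = http_build_query($params, '', '&', PHP_QUERY_RFC3986);

// Reading it back: parse_str() decodes the values, all as strings.
parse_str($formStyle, $decoded);

echo 'search.php?' . $formStyle, "\n";    // search.php?query=your+query&page=2
echo 'search.php?' . $percentStyle, "\n"; // search.php?query=your%20query&page=2
var_dump($decoded === ['query' => 'your query', 'page' => '2']); // bool(true)
```

Passing the separator explicitly keeps the result independent of the arg_separator.output ini setting.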
foo bar in a text field, creates foo+bar in the URL).
file_get_contents