
The name of my HTML checkbox form field is something like this:

name = "Some File name.pdf"

The PHP POST array looks something like this (some characters replaced by underscore):

array(
 "Some_File_name_pdf" => "on"
)

Is there a PHP function I can use to convert a filename string into exactly the form in which it appears in the POST array? I am not interested in str_replace. I want to be able to do this:

$myfilename = $obj->getFileName(); // returns "Some File name.pdf"
$result = isset($_POST[some_encoding_function($myfilename)]);

The some_encoding_function should take a string like "Some File name.pdf" and return something like "Some_File_name_pdf".

  • Since both spaces and dots seem to be converted to underscores, there can't be a function that reliably converts them back. But then again, using non-URL-compatible characters in name attributes is probably a bad idea anyway. Commented Jun 15, 2015 at 6:58
  • I don't mean converting back. I mean converting a filename into the same format again, i.e. replacing dots, spaces (and I don't know what else) with underscores. Commented Jun 15, 2015 at 7:15
  • The more practical solutions to this problem might be to a) not use arbitrary user supplied values as keys and/or b) encode such keys in a way that does not break after processing (i.e. encode it to something like Some_File_name_pdf yourself using your own method). Commented Jun 15, 2015 at 9:39
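The suggestion in the last comment, encoding the key yourself, could be sketched roughly as follows. This is only an illustration: encodeFieldName, decodeFieldName and the "file_" prefix are hypothetical names, and hex encoding is just one possible choice of a representation that contains no dots, spaces or brackets.

```php
<?php
// Hypothetical helper: hex-encode the filename so the field name
// contains only the characters 0-9, a-f and the "file_" prefix.
function encodeFieldName($filename)
{
    return 'file_' . bin2hex($filename);
}

// Hypothetical helper: recover the original filename from a field name.
function decodeFieldName($fieldName)
{
    return hex2bin(substr($fieldName, strlen('file_')));
}

$name = encodeFieldName('Some File name.pdf');
echo $name, "\n";                  // usable as the checkbox name attribute
echo decodeFieldName($name), "\n"; // Some File name.pdf
```

With this approach the same encodeFieldName call would be used both when rendering the checkbox and when looking the key up in $_POST, so no knowledge of PHP's own name mangling is needed.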

1 Answer


According to the PHP documentation:

Dots and spaces in variable names are converted to underscores.

This would require only a trivial string substitution.
However, according to a comment on that page:

The full list of field-name characters that PHP converts to _ (underscore) is the following (not just dot):
chr(32) ( ) (space)
chr(46) (.) (dot)
chr(91) ([) (open square bracket)
chr(128) - chr(159) (various)

If that comment is correct, then you should be good to go with a function like

function underscorify($s)
{
    // Replace space, dot, "[" and the bytes 0x80-0x9F with an underscore.
    return preg_replace('/[ \.\[\x80-\x9F]/', '_', $s);
}
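As a rough usage sketch for the check from the question: the function is repeated here only so the snippet runs on its own, and $post is a hypothetical stand-in for $_POST, just as the literal filename stands in for $obj->getFileName().

```php
<?php
// Repeated from above so that this snippet is self-contained.
function underscorify($s)
{
    return preg_replace('/[ \.\[\x80-\x9F]/', '_', $s);
}

// Hypothetical stand-ins for $obj->getFileName() and $_POST.
$myfilename = 'Some File name.pdf';
$post = array('Some_File_name_pdf' => 'on');

$key = underscorify($myfilename);
$result = isset($post[$key]);

echo $key, "\n";   // Some_File_name_pdf
var_dump($result); // bool(true)
```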

Note, however, that chr(128) - chr(159) is ambiguous, as it is not mentioned whether this is character-encoding-dependent or not.
It may refer to the extended-ASCII characters from € to Ÿ (as in Windows-1252), it may refer to the Unicode C1 control characters U+0080 to U+009F, or it may simply be hardcoded to check the byte value for b >= 128 && b <= 159.
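To illustrate what the byte-value reading would mean in practice, here is a small sketch. It only shows how the regex above behaves on UTF-8 input (without the /u modifier, preg_replace compares raw bytes); whether PHP's own request parsing treats field names the same way is an assumption, not something this demonstrates.

```php
<?php
function underscorify($s)
{
    // No /u modifier: the pattern is matched against raw bytes,
    // not against Unicode code points.
    return preg_replace('/[ \.\[\x80-\x9F]/', '_', $s);
}

$utf8AGrave = "\xC3\x80"; // "À" in UTF-8: bytes C3 80
$utf8EAcute = "\xC3\xA9"; // "é" in UTF-8: bytes C3 A9

// The continuation byte 0x80 falls in the range and is replaced by "_" (0x5F).
echo bin2hex(underscorify($utf8AGrave)), "\n"; // c35f
// Neither byte of "é" falls in 0x80-0x9F, so it is left unchanged.
echo bin2hex(underscorify($utf8EAcute)), "\n"; // c3a9
```

Under this reading, a multi-byte character can be mangled in the middle, so it is worth testing with the non-ASCII filenames you actually expect.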


1 Comment

I'd put my money on it meaning any byte between 128 and 159. PHP is not encoding-aware in its string handling and couldn't be for arbitrary requests, and the code chr(128) literally produces the byte 0x80.
