1

When a user passes a directory name to my program, I check it against

private static final Pattern DIRECTORY_PATTERN =
        Pattern.compile("/*?([a-zA-Z_0-9]+)/*?", Pattern.CASE_INSENSITIVE);

From what we've seen so far, this works, but I suspect this regex is incomplete.

Do you know of, or can you suggest, a more complete regex that would validate a directory name?

2
  • What is an example of a string you want it to not accept? As far as I know, any string can be a valid directory name. Commented Jul 10, 2012 at 16:17
  • This is OS and File System dependent. stackoverflow.com/questions/537772/… provides a nice explanation. Commented Jul 10, 2012 at 16:19

2 Answers

3

Actually, there are a great many more characters you can use in a file name, even heinous things like backspaces and newline characters. What is allowed depends on the underlying file system. On typical Unix file systems, for example, a name can contain anything except the path separator '/' and the NUL character.

One thing I always consider when deciding whether something is valid is simply to use it. For example, you can validate the format of an email address with a (complex) regex, but the only way to be certain it's fully valid is to send a mail containing a confirmation link to it and verify that it's received.

In your particular case, if you want to create a file with that name, you can try to create a temporary file, in a directory you're actually allowed to create files in. If the file is created successfully, you can be pretty sure it's a valid name :-) Of course, if you're creating a file, you may just want to create the real file. If you're opening an existing file, forget the regex, just try to open the file - no amount of complication in your regex will tell you if the file exists or is readable by you.
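As a rough illustration of the "just try it" approach, here is a minimal sketch using java.nio.file. The class and method names (DirectoryNameProbe, canCreateDirectory) are made up for this example, and it assumes you have a parent directory you are allowed to write to. It is only a probe under those assumptions, not a complete validator.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.InvalidPathException;
import java.nio.file.Path;

// Hypothetical helper: asks the file system itself whether a name is usable.
final class DirectoryNameProbe {

    // Returns true if a directory called `name` could be created directly
    // inside `parent`. The probe directory is removed again afterwards.
    static boolean canCreateDirectory(Path parent, String name) {
        Path candidate;
        try {
            candidate = parent.resolve(name);
        } catch (InvalidPathException e) {
            // The platform rejected the string outright (e.g. a NUL character).
            return false;
        }

        // Reject anything that is not a single path element, such as "a/b",
        // an empty string, or an absolute path.
        Path fileName = candidate.getFileName();
        if (fileName == null || !fileName.toString().equals(name)) {
            return false;
        }

        try {
            Files.createDirectory(candidate);
        } catch (IOException e) {
            // Already exists, not permitted, or not a valid name on this file system.
            return false;
        }

        try {
            Files.delete(candidate);
        } catch (IOException ignored) {
            // Creation succeeded, so the name is usable; cleanup is best effort.
        }
        return true;
    }
}
```

You would call it as something like DirectoryNameProbe.canCreateDirectory(Paths.get("/some/writable/dir"), userInput). Note that the answer is only valid for the file system that holds the parent directory, and that a name which already exists there is reported as unusable.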

To be frank, though, I'd consider placing your own limitations on the allowed characters. I have, in the past, cursed people who were silly enough to create file names with control characters in them, or one called -rf that the rm command had trouble with (until you figure out how to get around that).
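If you do go for your own limitations, a deliberately strict whitelist might look like the sketch below. The exact character set, the rule that a name must not start with '-' or '.', and the 255-character cap are all assumptions picked for illustration, not rules taken from any particular file system. The class name DirectoryNamePolicy is hypothetical too.

```java
import java.util.regex.Pattern;

// Hypothetical application-level policy, stricter than what file systems allow.
final class DirectoryNamePolicy {

    // First character: letter, digit or underscore (so no leading '-' or '.').
    // Remaining characters: the same set plus '.' and '-'.
    // Total length is capped at 255 characters (an assumed limit).
    private static final Pattern ALLOWED =
            Pattern.compile("[A-Za-z0-9_][A-Za-z0-9_.-]{0,254}");

    static boolean isAllowed(String name) {
        // matches() requires the whole string to fit the pattern, not just a part of it.
        return name != null && ALLOWED.matcher(name).matches();
    }
}
```

With this policy, names like logs_2012 or v1.2-beta pass, while -rf, .., a/b, the empty string and anything containing a newline are refused.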

1 Comment

You should definitely read the edit. Don't allow things like the slash '/' or the null character '\0', and it might be better to disallow semicolons as well.
0

This is file-system specific. Check the documentation of the file systems you expect to work with for the list of accepted symbols and the limitations on directory names. Your pattern already rejects lots of punctuation and thousands of non-Latin symbols that pretty much every modern file system accepts.
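To make the non-Latin point concrete, the sketch below compares the pattern from the question with a variant that uses Unicode property classes. It assumes the pattern is applied with matches(), which the question does not say, and the UNICODE_AWARE pattern and the class name are only illustrations, not a recommended final regex.

```java
import java.util.regex.Pattern;

// Hypothetical comparison of the question's pattern with a Unicode-aware variant.
final class UnicodeDirectoryNames {

    // The pattern from the question, copied unchanged.
    static final Pattern ASCII_ONLY =
            Pattern.compile("/*?([a-zA-Z_0-9]+)/*?", Pattern.CASE_INSENSITIVE);

    // Illustrative variant: \p{L} is any Unicode letter, \p{N} any Unicode number.
    static final Pattern UNICODE_AWARE =
            Pattern.compile("/*?([\\p{L}\\p{N}_]+)/*?");

    // Assumes whole-string matching.
    static boolean accepts(Pattern pattern, String name) {
        return pattern.matcher(name).matches();
    }
}
```

A name such as "donn\u00e9es" or "\u6587\u6863" is refused by ASCII_ONLY and accepted by UNICODE_AWARE. Both still refuse ordinary punctuation such as the hyphen in my-dir, so this only addresses the letters, not the punctuation.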
