I am using a regular expression to validate image file names. The main reason I'm using a regex is to prevent several files from existing for the exact same purpose.

A file name can take either of two forms:

1)    img_0F_16_-32_0.png
2)    img_65_32_x.png

As you might have noticed, "img_" is the general prefix. It is followed by a two-digit hexadecimal number. After another underscore comes an integer that has to be a power of two from 1 through 512, followed by yet another underscore.

Up to this point, my regular expression works flawlessly. The rest is what I'm having problems with: what follows is either a pair of integer coordinates (each may be 0) separated by an underscore, or a single x. After this comes the final ".png".

The main problem is that both variants have to be accepted, and it is essential that no coordinate can be written in more than one way. Most importantly, integers, whether positive or negative, must never start with one or more zeros. Otherwise the same value could appear under different names, such as:

401 = 00401
-10 = -0010

This is my first attempt:

img_[0-9a-fA-F]{2}_(1|2|4|8|16|32|64|128|256|512)_([-]?[1-9])?[0-9]*_([-]?[1-9])?[0-9]*[.]png

Thanks for your help in advance,

Tom S.

1 Answer

Why use regular expressions? Why not create a class that decomposes either variant of the file name into a canonical String and give that class hashCode() and equals() methods based on this canonical String? You can then put these objects in a HashSet to make sure that only one file of each kind exists.
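
A minimal sketch of that idea follows. The class name `ImageKey` and its `parse` method are made up for illustration, and the sketch assumes that names differing only in leading zeros, in "-0" versus "0", or in the case of the hex digits should count as the same file.

```java
import java.util.Locale;

// Hypothetical value class: file names that describe the same image compare equal.
final class ImageKey {
    private final String canonical;

    private ImageKey(String canonical) {
        this.canonical = canonical;
    }

    // Returns null when the name fits neither img_HH_SIZE_X_Y.png nor img_HH_SIZE_x.png.
    static ImageKey parse(String fileName) {
        if (!fileName.startsWith("img_") || !fileName.endsWith(".png")) {
            return null;
        }
        // Strip prefix and extension; the limit -1 keeps empty fields so they get rejected.
        String[] parts = fileName.substring(4, fileName.length() - 4).split("_", -1);
        if (parts.length != 3 && parts.length != 4) {
            return null;
        }
        try {
            String hex = parts[0];
            if (hex.length() != 2
                    || Character.digit(hex.charAt(0), 16) < 0
                    || Character.digit(hex.charAt(1), 16) < 0) {
                return null;
            }
            int size = Integer.parseInt(parts[1]);
            // A power of two has exactly one bit set.
            if (size < 1 || size > 512 || Integer.bitCount(size) != 1) {
                return null;
            }
            String head = "img_" + hex.toUpperCase(Locale.ROOT) + "_" + size + "_";
            if (parts.length == 3) {
                return parts[2].equals("x") ? new ImageKey(head + "x.png") : null;
            }
            // Parsing and re-printing drops leading zeros: "00401" -> 401, "-0010" -> -10.
            int x = Integer.parseInt(parts[2]);
            int y = Integer.parseInt(parts[3]);
            return new ImageKey(head + x + "_" + y + ".png");
        } catch (NumberFormatException e) {
            return null;
        }
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof ImageKey && canonical.equals(((ImageKey) o).canonical);
    }

    @Override
    public int hashCode() {
        return canonical.hashCode();
    }

    @Override
    public String toString() {
        return canonical;
    }
}
```

Adding each parsed key to a `HashSet<ImageKey>` then exposes duplicates, because `Set.add` returns `false` when an equal key is already present.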

4 Comments

+1. I find that string-parsing code is often faster to write, easier to understand, and easier to debug if you just write it as normal code, rather than trying to use a regular expression for every purpose.
Actually, I am using the regular expression in a filename filter. This is important because the directory I am loading the files from contains roughly 2500 files.
If you're dealing with that many files, then you have one more reason not to use a regex. It's slower than a simple parser.
@TomS.: You can write a FilenameFilter that performs any logic you like. There's no need to restrict it to regexes. And for the length of strings that we're talking about here (filenames), you've got a good chance at being faster than a regex with some simple parsing code. Either way, 2500 string parses is far too few to be a real performance hit on any reasonably fast machine.
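
The FilenameFilter suggestion from the last comment could look roughly like the sketch below. The class `ImageNameFilter` and its helper methods are hypothetical names, and the rules are taken from the question: two hex digits, a power of two from 1 through 512, then either two integers without leading zeros or a single x. Rejecting "-0" is an added assumption, since it would duplicate "0".

```java
import java.io.File;
import java.io.FilenameFilter;

// Hypothetical filter that validates names with plain string code instead of a regex.
final class ImageNameFilter implements FilenameFilter {
    @Override
    public boolean accept(File dir, String name) {
        return isValidName(name);
    }

    static boolean isValidName(String name) {
        if (!name.startsWith("img_") || !name.endsWith(".png")) {
            return false;
        }
        // Strip prefix and extension; the limit -1 keeps empty fields so they fail the checks.
        String[] parts = name.substring(4, name.length() - 4).split("_", -1);
        if (parts.length == 3) {
            return isHexByte(parts[0]) && isSize(parts[1]) && parts[2].equals("x");
        }
        if (parts.length == 4) {
            return isHexByte(parts[0]) && isSize(parts[1])
                    && isCanonicalInt(parts[2]) && isCanonicalInt(parts[3]);
        }
        return false;
    }

    // Exactly two hex digits; both cases are accepted, as in the question's pattern.
    static boolean isHexByte(String s) {
        return s.length() == 2 && isHexDigit(s.charAt(0)) && isHexDigit(s.charAt(1));
    }

    static boolean isHexDigit(char c) {
        return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F');
    }

    // The powers of two from 1 through 512, written without leading zeros.
    static boolean isSize(String s) {
        switch (s) {
            case "1": case "2": case "4": case "8": case "16":
            case "32": case "64": case "128": case "256": case "512":
                return true;
            default:
                return false;
        }
    }

    // Accepts "0", or an optional '-' followed by digits that do not start with '0'.
    static boolean isCanonicalInt(String s) {
        if (s.equals("0")) {
            return true;
        }
        int start = s.startsWith("-") ? 1 : 0;
        if (start == s.length() || s.charAt(start) < '1' || s.charAt(start) > '9') {
            return false;
        }
        for (int i = start + 1; i < s.length(); i++) {
            if (s.charAt(i) < '0' || s.charAt(i) > '9') {
                return false;
            }
        }
        return true;
    }
}
```

The filter plugs into the standard API the usual way, for example `directory.list(new ImageNameFilter())` on a `java.io.File` that points at the image directory.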
