4
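import java.io.File;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

File file = new File("file-type-string-i-want-2000-01-01-01-01-01.conf.gz");
String fileName = file.getName();
String nameFragment = null;
// Current attempt: match everything from the first hyphen up to "-yyyy".
Matcher matcher = Pattern.compile("\\-(.*)\\-\\d{4}").matcher(fileName);
StringBuilder sb = new StringBuilder();
while (matcher.find()) {
    sb.append(matcher.group());
}
List<String> stringList = Arrays.asList(sb.toString().split("-"));
if (stringList.size() >= 2) {
    nameFragment = stringList.get(stringList.size() - 2);
}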

Desired result is to extract

string-iwant 

from strings that look like this

file-type-string-iwant-2000-01-01-01-01-01.conf.gz 

Unfortunately, "string-iwant" is a variable-length run of alphanumeric characters that contains exactly ONE hyphen and never starts with a hyphen. The date format is consistent and the year always comes directly after the string, so my current approach is to match on the -year, but I'm having difficulty excluding the part at the beginning.

Thanks for any thoughts or ideas

Edit: updated strings

5
  • What about the file-type part, can that contain a hyphen? If yes, what else makes it different from string-i-want? Commented Mar 27, 2012 at 14:57
  • file-type may contain a hyphen or may not Commented Mar 27, 2012 at 15:03
  • You cannot exclude the part at the beginning because you did not define what it is with enough clarity. From your current description, an extra assumption is needed to tell "type-string-i-want" from "string-i-want" or even from "i-want" or "want". Commented Mar 27, 2012 at 15:04
  • @Hoofamon - in that case you need to find another quality that differentiates the two parts (e.g. string-i-want always contains a specific amount of hyphens), otherwise there is no way to tell the difference with a regexp Commented Mar 27, 2012 at 15:05
  • @Hoofamon If file-type may or may not contain a hyphen, how do you tell apart "string-i-want" and "type-string-i-want"? Commented Mar 27, 2012 at 15:06

3 Answers

4

Here's the regex you need:

\\-([^-]+\\-[^-]+)\\-\\d{4}\\-

Basically it means:

  • \\- starts with a minus sign
  • ([^-]+\\-[^-]+) one or more non-minus characters, then a minus, then one or more non-minus characters. This part is captured.
  • \\-\\d{4}\\- a minus sign, 4 digits, and another minus sign

However, that will only work if the part you need has exactly one hyphen (or a constant number of hyphens, in which case the regex needs adjusting). Otherwise, given the string file-type-string-i-want, there is no way to know whether the word type belongs to the string you want or not.
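
As a minimal sketch of how this pattern could be applied in Java, assuming the wanted part contains exactly one hyphen (the class and method names OneHyphenExtractor and extract are made up for illustration and are not part of the original answer):

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, named for illustration only.
final class OneHyphenExtractor {
    // "-", then a captured "<non-hyphens>-<non-hyphens>", then "-yyyy-".
    private static final Pattern NAME =
            Pattern.compile("\\-([^-]+\\-[^-]+)\\-\\d{4}\\-");

    static Optional<String> extract(String fileName) {
        Matcher m = NAME.matcher(fileName);
        // find() looks for the first position where the whole pattern fits;
        // group(1) is the text inside the parentheses.
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String name = "file-type-string-iwant-2000-01-01-01-01-01.conf.gz";
        // prints: string-iwant
        System.out.println(extract(name).orElse("(no match)"));
    }
}
```

Because the capture is limited to two hyphen-free segments, a name such as file-type-string-i-want-2000-... would yield only i-want, which is the limitation described above.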

Added:

In case the file-type always contains exactly one hyphen, you can capture the required part this way:

[^-]+\\-[^-]+\\-(.*)\\-\\d{4}\\-

Explanation:

  • [^-]+\\-[^-]+\\- some non-hyphen characters, then a hyphen, then more non-hyphens, then another hyphen. This skips the file-type string together with the hyphen that follows it.
  • \\-\\d{4}\\- a hyphen, 4 digits, then another hyphen
  • (.*) everything between the two parts above is captured as the string you need to select
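
A rough Java sketch of this second variant, under the stated assumption that file-type always contains exactly one hyphen (FixedPrefixExtractor and extract are hypothetical names used only for this example):

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, named for illustration only.
final class FixedPrefixExtractor {
    // Skip "<part>-<part>-" (the file type), capture everything up to "-yyyy-".
    private static final Pattern NAME =
            Pattern.compile("[^-]+\\-[^-]+\\-(.*)\\-\\d{4}\\-");

    static Optional<String> extract(String fileName) {
        Matcher m = NAME.matcher(fileName);
        // (.*) is greedy, so it extends to the last "-yyyy-" in the name.
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // prints: string-i-want
        System.out.println(
                extract("file-type-string-i-want-2000-01-01-01-01-01.conf.gz")
                        .orElse("(no match)"));
    }
}
```

Here the captured part may contain any number of hyphens, because the ambiguity is resolved by fixing the shape of the prefix instead.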

4 Comments

I don't get the impression that the capture should contain exactly one hyphen . . . just that hyphen is a valid character for that portion of the string.
The question is a bit confusing in this part: "Unfortunatly, the format for "string-i-want" is a non-fixed length of alpha-numeric characters that will include a hyphen BUT never start with a hyphen." Is it ONE hyphen, or ANY number of hyphens?
sorry, it is indeed only 1 hyphen so I should have said string-iwant
@Hoofamon then the first part (before "Added:") will work for you.
0

If it were PHP I would use something like the following to capture that string.

/^(\w+\-){2}(?<string>.+?)\-\d{4}(\-\d{2}){5}(\.\w+){2}$/
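
For comparison, the same idea could be written in Java roughly as follows. This is only a sketch: it assumes the file type is exactly two hyphen-separated words, the date has six numeric fields, and the name ends in two extensions, as the PHP pattern does. NamedGroupExtractor and extract are hypothetical names.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, named for illustration only.
final class NamedGroupExtractor {
    // Java port of the PHP pattern; (?<string>...) is a named capturing group.
    private static final Pattern NAME = Pattern.compile(
            "^(\\w+\\-){2}(?<string>.+?)\\-\\d{4}(\\-\\d{2}){5}(\\.\\w+){2}$");

    static Optional<String> extract(String fileName) {
        Matcher m = NAME.matcher(fileName);
        return m.find() ? Optional.of(m.group("string")) : Optional.empty();
    }

    public static void main(String[] args) {
        // prints: string-iwant
        System.out.println(
                extract("file-type-string-iwant-2000-01-01-01-01-01.conf.gz")
                        .orElse("(no match)"));
    }
}
```

Anchoring with ^ and $ makes the pattern validate the whole file name, so a name with a shorter date or a single extension produces no match at all.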


0

The regex I would use for this purpose relies on a positive lookahead:

Pattern p = Pattern.compile("[^-]+-[^-]+(?=-\\d{4})");

This simply means: match text containing exactly one hyphen, but only where it is immediately followed by a hyphen and a 4-digit year. The lookahead checks for the year without including it in the match.

After a successful matcher.find(), you can simply take matcher.group(0) as your matched text, which will be string-iwant in this case.
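
A short sketch of how the lookahead pattern might be used end to end (LookaheadExtractor and extract are hypothetical names chosen for this example; only the pattern itself comes from the answer):

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, named for illustration only.
final class LookaheadExtractor {
    // "<non-hyphens>-<non-hyphens>" that is directly followed by "-yyyy".
    private static final Pattern NAME =
            Pattern.compile("[^-]+-[^-]+(?=-\\d{4})");

    static Optional<String> extract(String fileName) {
        Matcher m = NAME.matcher(fileName);
        // The lookahead is zero-width, so group(0) holds only the name part.
        return m.find() ? Optional.of(m.group(0)) : Optional.empty();
    }

    public static void main(String[] args) {
        // prints: string-iwant
        System.out.println(
                extract("file-type-string-iwant-2000-01-01-01-01-01.conf.gz")
                        .orElse("(no match)"));
    }
}
```

No capturing group is needed here, since the year is checked but never consumed.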
