I need a regex that matches the nth whitespace-separated item in a line. What I have so far gets the 3rd item, but only as a capture group. Is it possible to modify the regex so that the 3rd item is the match itself rather than a group?

https://regex101.com/r/FKscLq/1

Also is there an equivalent regex to match the nth number (whitespace separated)?

E.g. for the string below, the 2nd number should match 2323, and there should be no match for the 3rd number.

Fiji 123545 27.10.1981 Westpac 2323 Bank 232dcc desc

Edit: I have got the regex to match nth word now. See below, it works beautifully. https://regex101.com/r/2F4J9o/1

I still need to get the nth number match though.

  • It would be fairly easy to do in two passes, the first picks out non-whitespace groups separated by whitespace. The second pass would iterate over the results, looking for items that match ^[0-9]?$ . If the separators are just spaces and tabs, then you could use string.Split for the first pass Commented Oct 19, 2022 at 0:16
  • Oops, that should be ^[0-9]+$ Commented Oct 19, 2022 at 4:22

2 Answers

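To get a match only for the second number, you can use a positive lookbehind assertion with a quantifier {n} that skips the n numbers preceding the one you want. A number is matched as digits delimited by whitespace (or the start or end of the string) using (?<!\S)\d+(?!\S), so that it will not match parts of, for example, 27.10.1981. The lookbehind has variable length, which .NET supports.

In this pattern, {n} is {1}, because one number precedes the second one: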

(?<=^(?:(?:(?!(?<!\S)\d+(?!\S)).)*(?<!\S)\d+(?!\S)){1}(?:(?!(?<!\S)\d+(?!\S)).)*)(?<!\S)\d+(?!\S)
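As a rough C# sketch of the lookbehind approach, the pattern can be built for any n. The class name NthNumberLookbehind and the helper Find are made up for illustration; the quantifier is n - 1 because that many numbers precede the n-th one, and a single-line input is assumed.

```csharp
using System;
using System.Text.RegularExpressions;

static class NthNumberLookbehind
{
    // A "number" is a run of digits with whitespace (or the string edge) on both sides.
    const string Number = @"(?<!\S)\d+(?!\S)";

    // Hypothetical helper: returns the n-th (1-based) number, or null if there is none.
    public static string Find(string input, int n)
    {
        if (n < 1) throw new ArgumentOutOfRangeException(nameof(n));

        // Any run of characters that does not start a number.
        var skip = $@"(?:(?!{Number}).)*";

        // Lookbehind: exactly n - 1 numbers precede the match; then match the number itself.
        var pattern = $@"(?<=^(?:{skip}{Number}){{{n - 1}}}{skip}){Number}";

        var match = Regex.Match(input, pattern);
        return match.Success ? match.Value : null;
    }
}
```

With the sample line from the question, Find(line, 2) should give 2323 and Find(line, 3) should give null, since 27.10.1981 and 232dcc are not counted as numbers.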

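Note that it is much easier to use a capture group, with {n} set to the position of the number you want. The captured digits need the same whitespace boundaries as the skipped part; with a bare (\d+) the pattern can backtrack and capture a digit from inside another token (for example the last 2 of 232dcc when asking for the 3rd number):

^(?:(?:(?!(?<!\S)\d+(?!\S)).)*((?<!\S)\d+(?!\S))){2}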
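A minimal C# sketch of the capture-group variant follows. It assumes the captured digits are wrapped in the same whitespace boundaries as the skipped part, so that backtracking cannot capture digits from inside a token such as 232dcc; the names NthNumberCapture and Find are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

static class NthNumberCapture
{
    // Hypothetical helper: returns the n-th (1-based) number, or null if there is none.
    public static string Find(string input, int n)
    {
        if (n < 1) throw new ArgumentOutOfRangeException(nameof(n));

        // Digits delimited by whitespace or the string edge.
        const string number = @"(?<!\S)\d+(?!\S)";

        // Repeat n times: skip characters that do not start a number, then capture a number.
        var pattern = $@"^(?:(?:(?!{number}).)*({number})){{{n}}}";

        var match = Regex.Match(input, pattern);

        // A repeated group reports its last capture, which is the n-th number.
        return match.Success ? match.Groups[1].Value : null;
    }
}
```

Here the overall match runs from the start of the line up to the n-th number, so the result has to be read from group 1 rather than from the match value.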


To match the 3rd whitespace-separated item, you don't need a capture group:

(?<=^(?:\S+\s){2})\S+
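The same idea can be sketched in C# for an arbitrary n. Two things here are assumptions rather than part of the pattern above: the names NthWord and Find are hypothetical, and \s+ is used instead of \s so that runs of several spaces or tabs between items are tolerated.

```csharp
using System;
using System.Text.RegularExpressions;

static class NthWord
{
    // Hypothetical helper: returns the n-th (1-based) whitespace-separated item, or null.
    public static string Find(string input, int n)
    {
        if (n < 1) throw new ArgumentOutOfRangeException(nameof(n));

        // Lookbehind: exactly n - 1 items, each followed by whitespace, precede the match.
        var pattern = $@"(?<=^(?:\S+\s+){{{n - 1}}})\S+";

        var match = Regex.Match(input, pattern);
        return match.Success ? match.Value : null;
    }
}
```

For the sample line, Find(line, 3) should give 27.10.1981, and asking for the 9th item should give null because the line has only eight items. Input with leading whitespace is not handled by this sketch.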


A simple (and likely faster) solution that doesn't completely rely on Regex:

// Requires: using System; using System.Text.RegularExpressions;
static string GetNthNumber(string input, int whichOne)
{
    // First pass: split the line into whitespace-separated words.
    var words = input.Split(new[] { ' ', '\t' }, StringSplitOptions.RemoveEmptyEntries);
    // Second pass: count the words that consist only of digits.
    var regex = new Regex(@"^[0-9]+$");
    var index = 0;
    foreach (var word in words)
    {
        if (regex.IsMatch(word) && ++index == whichOne)
        {
            return word;
        }
    }
    // The input contains fewer than whichOne numbers.
    return null;
}

Some test code:

const string input = "Fiji 123545 27.10.1981 Westpac 2323 Bank 232dcc desc";
foreach (var index in Enumerable.Range(1, 4))
{
    var result = GetNthNumber(input, index);
    var show = result ?? "(null)";
    Console.WriteLine($"{index}: {show}");
}

Results:

1: 123545
2: 2323
3: (null)
4: (null)

If there are other characters that you consider whitespace, just add them to the array argument to string.Split.
