I want (in C#) to check the syntax of a string and extract some data from it. The string must have the form "someWord IS someWord( OR someWord){1-infinite}". I want to extract every word, and the group for the first word should be named "switch".

This is my string :

string text = "[bird] IS blue OR yellow OR green";

So I use this regex:

string switchPattern = @"\s*(?<switch>.+?)\s+IS\s+(.+?)(?:\s+OR\s+(.+?))+$";

And extract with:

Match switchCaseMatch = Regex.Match(text, switchPattern);

This gives me a group collection with 4 elements:

[0]: [bird] IS blue OR yellow OR green
[1]: green
[2]: blue
[3]: [bird]  named switch

but I want:

[0]: [bird] IS blue OR yellow OR green
[1]: green
[2]: yellow
[3]: blue
[4]: [bird]  named switch

I hoped that the last "(.+?)" would create a group for every matching case, but it creates only one, for the last occurrence. I tried Regex.Matches, with the same result.

I know that I could do it with two regexes (a Regex.Match, then a Regex.Matches for the "someWord( OR someWord){1-infinite}" part), but I want to know whether it is possible with only one regex.

Thanks

  • Considering your given string, what is your expected result? Commented Apr 19, 2018 at 16:33
  • Group.Captures instead of Match.Groups might give you what you want. See this article Commented Apr 19, 2018 at 16:35

2 Answers

Actually you can do it with Regex.Match, using Captures as I said in my comment. Here is a code sample:

        // Requires: using System; using System.Text.RegularExpressions;
        string text = "[bird] IS blue OR yellow OR green";
        string switchPattern = @"\s*(?<switch>.+?)\s+IS\s+(.+?)(?:\s+OR\s+(.+?))+$";

        Match switchCaseMatch = Regex.Match(text, switchPattern);
        foreach (Group group in switchCaseMatch.Groups)
        {
            // A group inside a repeated construct keeps one Capture per iteration;
            // group.Value alone would only return the last of them.
            foreach (Capture cap in group.Captures)
                Console.WriteLine(cap.Value);
        }

This results in:

[bird] IS blue OR yellow OR green
blue
yellow
green
[bird]

See the Microsoft MSDN page for Captures for more information.
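
If you only need the values, the same idea can be wrapped in a small helper that reads the groups directly instead of looping over all of them. This is just a sketch: the class name SwitchCaptures and the method Parse are made up for illustration, and it assumes the pattern from the question is used unchanged, so the first alternative is in group 1 and the repeated ones are in group 2.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
public static class SwitchCaptures
{
    // Pattern taken from the question, unchanged.
    private const string SwitchPattern =
        @"\s*(?<switch>.+?)\s+IS\s+(.+?)(?:\s+OR\s+(.+?))+$";

    // Returns the switch text followed by every alternative,
    // or an empty list when the input does not match the pattern.
    public static List<string> Parse(string text)
    {
        var result = new List<string>();
        Match m = Regex.Match(text, SwitchPattern);
        if (!m.Success)
            return result;

        result.Add(m.Groups["switch"].Value); // named group
        result.Add(m.Groups[1].Value);        // first alternative, right after IS

        // Group 2 sits inside the repeated (?: ... )+ part, so its Captures
        // collection holds one entry per OR iteration, in match order.
        foreach (Capture c in m.Groups[2].Captures)
            result.Add(c.Value);

        return result;
    }

    public static void Main()
    {
        List<string> parts = Parse("[bird] IS blue OR yellow OR green");
        Console.WriteLine(string.Join(", ", parts)); // [bird], blue, yellow, green
    }
}
```

An empty list is returned for non-matching input so the caller can use the same call as the syntax check.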

1 Comment

Thanks a lot, that's exactly what I was looking for.
I think using groups will be difficult because you need to anticipate how many groups you will have. I suggest using the Matches method and MatchCollection instead. You'll have access to the named group inside of there as well as capturing all occurrences of the target strings you are after.

e.g.

string text = "[bird] IS blue OR yellow OR green";
string switchPattern = @"(?<=(?<switch>\S+)\s+IS.*?)(\w+(?=\s+OR)|(?<=OR\s+)\w+)";
MatchCollection switchCaseMatch = Regex.Matches(text, switchPattern);

foreach (Match m in switchCaseMatch)
{
    Console.WriteLine(m.Groups["switch"].Value);
    Console.WriteLine(m.Value);
}

You construct the pattern to use an unbounded lookbehind to search for the switch (group) text. This forces every occurrence of a color to follow that text. The dot-star in that lookbehind will consume all color texts that have been matched in previous iterations. Then, you use either a lookahead to find a color that is followed by "OR" (every color except the last) or a lookbehind to find a color that is preceded by "OR" (which picks up the last one).

Then it's just a matter of evaluating the Value property of each Match object in the MatchCollection. The named group is captured in each Match, so you'll have access to that as well.
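
If you would rather end up with a list than print inside the loop, something like the following should work. It is only a sketch: the class name SwitchMatches and the method Alternatives are invented for illustration, and the pattern is the one from this answer, unchanged. Note that this pattern only finds the colors; it does not check that the whole string has the expected form.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
public static class SwitchMatches
{
    // Pattern taken from the answer above, unchanged.
    private const string SwitchPattern =
        @"(?<=(?<switch>\S+)\s+IS.*?)(\w+(?=\s+OR)|(?<=OR\s+)\w+)";

    // Returns one (switch, alternative) pair per match.
    public static List<KeyValuePair<string, string>> Alternatives(string text)
    {
        return Regex.Matches(text, SwitchPattern)
            .Cast<Match>() // MatchCollection is non-generic on older frameworks
            .Select(m => new KeyValuePair<string, string>(
                m.Groups["switch"].Value, m.Value))
            .ToList();
    }

    public static void Main()
    {
        foreach (var pair in Alternatives("[bird] IS blue OR yellow OR green"))
            Console.WriteLine(pair.Key + " -> " + pair.Value);
    }
}
```

For the sample string this should give three pairs, all with the key [bird] and the values blue, yellow and green.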

2 Comments

You can do it using Regex.Match, by using the Captures property within a Group...see my comment and answer for more info.
Yeah, I got ya. I haven't used Captures much, but OP was trying to use Groups which I was pretty sure wouldn't work.
