1

I'm not familiar with regex, but I think it could help me a lot with my problem.

I have two kinds of strings in a big List<string> str (with or without a description):

str[0] = "[toto]";
str[1] = "[toto] descriptionToto";
str[2] = "[titi]";
str[3] = "[titi] descriptionTiti";
str[4] = "[tata]";
str[5] = "[tata] descriptionTata";

The list isn't really ordered. I want to go through the whole list and reformat each entry depending on what I find in it.

If I find "[toto]", I would like to set str[0] = "toto",

and if I find "[toto] descriptionToto", I would like to set str[1] = "descriptionToto".

Do you have any ideas on the best way to get this result?

5
  • In the first case I just want to get rid of the "[" and "]", and in the other I want to delete the "[contains] " part. I could cut the string at the space if there is one, and otherwise just use Replace("[", "").Replace("]", ""). But is it better/faster to use Replace than regex? Commented Apr 2, 2014 at 17:38
  • Usually regex isn't faster, but it does require fewer lines of code and can produce more readable code. Commented Apr 2, 2014 at 17:40
  • You could use the String.Trim(char[]) method. Is there a requirement that there be a [toto] to match with a [toto] descriptionToto? Commented Apr 2, 2014 at 17:42
  • Can't you order the data, e.g. str.OrderBy(x => x)? Commented Apr 2, 2014 at 17:44
  • No, it's just a transformation; there is no link between the entries. The result just has to be: [toto] => toto and [toto] description => description. Commented Apr 2, 2014 at 17:45

5 Answers

1

There are two regex options if you ask me:

  1. Make a regex pattern with two capturing groups, then use group 1 or group 2 depending on whether group 1 is empty. In this case you'd use named capturing groups to get a clear relationship between the pattern and the code

  2. Make a regex that matches string type 1 or string type 2, in which case you would get your end result directly from regex
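As a rough sketch of the first option, the pattern below uses two named groups and picks one depending on which of them matched. The class name BracketParser and the group names tag and desc are made up for illustration, and returning non-matching input unchanged is an assumption, not something the question specifies. Note that it checks whether the description group matched rather than whether the first group is empty.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper; the names are illustrative only.
static class BracketParser
{
    // "tag" captures the text between the brackets; "desc" captures an
    // optional description that follows the closing bracket and whitespace.
    private static readonly Regex Pattern =
        new Regex(@"^\[(?<tag>[^\]]*)\](?:\s+(?<desc>.+))?$");

    // Returns the description when one is present, otherwise the tag.
    // Assumption: input that does not match is returned unchanged.
    public static string Extract(string input)
    {
        Match m = Pattern.Match(input);
        if (!m.Success) return input;

        Group desc = m.Groups["desc"];
        return desc.Success ? desc.Value : m.Groups["tag"].Value;
    }
}

static class Program
{
    static void Main()
    {
        Console.WriteLine(BracketParser.Extract("[toto]"));
        Console.WriteLine(BracketParser.Extract("[toto] descriptionToto"));
    }
}
```

The named groups keep the link between the pattern and the calling code visible: the code asks for "desc" or "tag" instead of a numeric index.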

If you're going for speed, using str[0].IndexOf(']') would get most of the job done.
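The IndexOf suggestion could look roughly like the following. This is only a sketch: the class name BracketSlicer is hypothetical, and leaving entries untouched when they do not start with "[" or contain no "]" is an assumption.

```csharp
using System;

// Hypothetical helper; the name is illustrative only.
static class BracketSlicer
{
    public static string Extract(string input)
    {
        int close = input.IndexOf(']');

        // Assumption: anything that is not of the form "[...]..." is left as-is.
        if (close < 0 || input[0] != '[') return input;

        // Everything after the first ']' is the description, if there is one.
        string rest = input.Substring(close + 1).Trim();

        // No description: fall back to the text between '[' and ']'.
        return rest.Length > 0 ? rest : input.Substring(1, close - 1);
    }
}

static class Program
{
    static void Main()
    {
        Console.WriteLine(BracketSlicer.Extract("[toto]"));
        Console.WriteLine(BracketSlicer.Extract("[toto] descriptionToto"));
    }
}
```

This avoids the regex engine entirely and scans each string once to find the closing bracket.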


1

Rather than regex, I'd be inclined to just use string.Split, something along the lines of:

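// "[toto]" splits into { "", "toto", "" };
// "[toto] descriptionToto" splits into { "", "toto", " descriptionToto" }.
string[] tokens = str[0].Split(new char[] {'[', ']'});
if (tokens[2] == "") {
    // No description: keep the text between the brackets.
    str[0] = tokens[1];
} else {
    // Keep the description, dropping the space that follows ']'.
    str[0] = tokens[2].Trim();
}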

1 Comment

It works, thanks. The "[" and "]" only ever appear as the delimiters in each string, unlike spaces or other characters.
1

You can use a single regex:

string s = Regex.Match(str[0], @"(?<=\[)[^\]]*(?=]$)|(?<=] ).*").Value;

The idea is simple: if the text ends with ] and contains no other ], take everything between [ and ]; otherwise take everything after the first "] " (closing bracket plus space).

Sample code:

List<string> strList = new List<string> {
    "[toto]",
    "[toto] descriptionToto",
    "[titi]",
    "[titi] descriptionTiti",
    "[tata]",
    "[tata] descriptionTata" };
foreach(string str in strList)
    Console.WriteLine(Regex.Match(str, @"(?<=\[)[^\]]*(?=]$)|(?<=] ).*").Value);

Sample output:

toto
descriptionToto
titi
descriptionTiti
tata
descriptionTata

8 Comments

Would this handle see ref [5] in the description?
Console.WriteLine(Regex.Match("[toto] see ref [5]", @"(?<=\[).*(?=]$)|(?<=] ).*").Value);
@sln Sorry, I didn't understand your comment correctly. Thanks for pointing it out. That case can also be resolved by replacing the first .* with [^\]]*.
What about [toto see ref [5] (malformed)?
@sln There is no such requirement in the original question. It still matches toto see ref [5, though, which I presume is correct.
0

If you only need the description for the entries that have one:

You can split on the space character " " and store the second element of the resulting array back into the entry; that element is the description. If there is no description, the string contains no space. So loop over the list and call str[i].Split(' ') on each entry, which splits an entry with a description into two elements:

for (int i = 0; i < str.Count; i++)
{
    // Entries without a space (no description) are left unchanged.
    string[] words = str[i].Split(' ');
    if (words.Length > 1)
    {
        str[i] = words[1];
    }
}


0

If those are code strings and not literal variable notation this should work.
The replacement just concatenates capture groups 1 and 2.

Find: ^\s*(?:\[([^\[\]]*)\]\s*|\[[^\[\]]*\]\s*((?:\s*\S)+\s*))$
Replace: "$1$2"

 ^ 
 \s* 
 (?:
      \[  
      ( [^\[\]]* )                # (1)
      \]   \s* 
   |  
      \[  [^\[\]]* \]
      \s*  
      (                           # (2 start)
           (?: \s* \S )+
           \s* 
      )                           # (2 end)
 )
 $

.NET test case

 string str1 = "[titi]";
 Console.WriteLine( Regex.Replace(str1, @"^\s*(?:\[([^\[\]]*)\]\s*|\[[^\[\]]*\]\s*((?:\s*\S)+\s*))$", @"$1$2"));
 string str2 = "[titi] descriptionTiti";
 Console.WriteLine( Regex.Replace(str2, @"^\s*(?:\[([^\[\]]*)\]\s*|\[[^\[\]]*\]\s*((?:\s*\S)+\s*))$", @"$1$2"));

Output >>

 titi
 descriptionTiti
