2

I'm fairly new to regular expressions and I'm trying to get one to match. It works when I paste it into an online validator, but not when it's placed in the code-behind of a .NET 2.0 C# page.

The offending code is supposed to split on a single semicolon but not on a double semicolon. However, when I use the string

"entry;entry2;entry3;entry4;"

I get a nonsense array that contains empty values, the last letter of the previous entry, and the semicolons themselves. The online JavaScript validator splits it correctly. Please help!

My regex:

((;;|[^;])+)
3
  • Can you remove the javascript tag as this is a .NET question... Commented Jan 29, 2010 at 15:58
  • nregex.com/nregex/default.aspx is useful for checking regexes easily online, and so is sourceforge.net/projects/regulator on the desktop. The second is also very useful for learning them. Commented Jan 29, 2010 at 16:00
  • I used a JavaScript validator to validate my original regex. The difference between C# and JavaScript regex handling is apparently my problem, which is why I tagged it javascript. Commented Jan 29, 2010 at 16:04
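
A note on the symptoms described in the question: the question does not show the calling code, so this assumes the pattern was passed to Regex.Split. The pattern ((;;|[^;])+) matches the entries rather than the delimiters, and Regex.Split also returns the text of any capturing groups, which would explain the empty values, the stray last letters and the semicolons. A minimal sketch contrasting that with Regex.Matches, where the same pattern returns the entries (the class and method names are made up for illustration):

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class OriginalPatternDemo
{
    // The pattern from the question: it matches entries, not separators.
    public const string Pattern = @"((;;|[^;])+)";

    // Splitting on the entries leaves the separators behind, and the
    // captured groups are added to the result array as well.
    public static string[] ViaSplit(string input)
    {
        return Regex.Split(input, Pattern);
    }

    // Collecting the matches themselves yields one item per entry.
    public static string[] ViaMatches(string input)
    {
        List<string> entries = new List<string>();
        foreach (Match m in Regex.Matches(input, Pattern))
            entries.Add(m.Value);
        return entries.ToArray();
    }

    public static void Main()
    {
        string input = "entry;entry2;entry3;entry4;";

        Console.WriteLine("Regex.Split:");
        foreach (string s in ViaSplit(input))
            Console.WriteLine("[{0}]", s);

        Console.WriteLine("Regex.Matches:");
        foreach (string s in ViaMatches(input))
            Console.WriteLine("[{0}]", s);
    }
}
```

With Regex.Matches the original pattern needs no lookarounds, although it treats a run of three semicolons differently from the lookaround-based split in the answers below.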

3 Answers

5

Split on the following regular expression:

(?<!;);(?!;)

It means: match semicolons that are neither preceded nor followed by another semicolon.

For example, this code

var input = "entry;entry2;entry3;entry4;";
foreach (var s in Regex.Split(input, @"(?<!;);(?!;)"))
    Console.WriteLine("[{0}]", s);

produces the following output:

[entry]
[entry2]
[entry3]
[entry4]
[]

The final empty field is a result of the semicolon at the end of the input.

If the semicolon is a terminator at the end of each field rather than a separator between consecutive fields, then use Regex.Matches instead:

foreach (Match m in Regex.Matches(input, @"(.+?)(?<!;);(?!;)"))
    Console.WriteLine("[{0}]", m.Groups[1].Value);

to get

[entry]
[entry2]
[entry3]
[entry4]

1 Comment

Thanks! Too bad I was so far off with my original. This one leaves a trailing empty entry, any thoughts on how to get rid of that one?
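
One way to drop the trailing empty entry, sketched here as an assumption about what is wanted rather than as the answerer's own suggestion, is to filter the result of Regex.Split and keep only non-empty fields. The class and method names are hypothetical, and the code sticks to constructs available in C# 2.0:

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class SemicolonSplitter
{
    // Splits on semicolons that have no semicolon on either side,
    // then discards empty fields (such as the one produced by a
    // trailing semicolon).
    public static string[] SplitFields(string input)
    {
        List<string> fields = new List<string>();
        foreach (string s in Regex.Split(input, @"(?<!;);(?!;)"))
        {
            if (s.Length > 0)
                fields.Add(s);
        }
        return fields.ToArray();
    }

    public static void Main()
    {
        foreach (string s in SplitFields("entry;entry2;;still2;entry3;"))
            Console.WriteLine("[{0}]", s);
        // prints:
        // [entry]
        // [entry2;;still2]
        // [entry3]
    }
}
```

This also removes an empty field caused by a leading semicolon, so it only fits if empty fields are never meaningful in the input.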
1

Why not use String.Split on the semicolon?

string sInput = "Entry1;entry2;entry3;entry4";
string[] sEntries = sInput.Split(';');
// Do what you have to do with the entries in the array...

Hope this helps, Best regards, Tom.

5 Comments

The "but not on double semicolons" requirement makes this kind of ugly.
The problem is that he doesn't want to split on a double semicolon (;;), hence String.Split() is inadequate for him.
Sorry Tom, this would not work because it would split on ALL semicolons and I need it to skip over double semi-colons, as stated in the original question.
@DrJokepu: If you look at his sample input, there is no double semicolon... and anyway, if there were, it would just produce an empty element at that position in the array.
You can tell Split not to return empty values by using StringSplitOptions: msdn.microsoft.com/en-us/library/system.stringsplitoptions.aspx
1

As tommieb75 wrote, you can use String.Split with the StringSplitOptions enumeration to control which entries end up in the resulting array:

string input = "entry1;;entry2;;;entry3;entry4;;";
char[] charSeparators = new char[] {';'};
// Split a string delimited by characters and return all non-empty elements.
string[] result = input.Split(charSeparators, StringSplitOptions.RemoveEmptyEntries);

The result would contain only 4 elements like this:

<entry1><entry2><entry3><entry4>

2 Comments

Please read the original question to see why this does not work. I knew only regex would work before I asked the question.
So you would like to split a;b;c;;d to [a][b][c;;d] or [a][b][c][d]. If it's the second, you can still use Split, but if it's the first I will delete my answer.
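
The two outcomes mentioned in the last comment can be compared side by side. This is a rough sketch with hypothetical class and method names: String.Split with RemoveEmptyEntries gives [a][b][c][d], while the lookaround-based Regex.Split from the first answer gives [a][b][c;;d].

```csharp
using System;
using System.Text.RegularExpressions;

public static class SplitComparison
{
    // Splits on every semicolon and drops the empty entries that
    // consecutive semicolons would otherwise produce.
    public static string[] ByStringSplit(string input)
    {
        return input.Split(new char[] { ';' }, StringSplitOptions.RemoveEmptyEntries);
    }

    // Splits only on semicolons with no semicolon on either side,
    // so a double semicolon stays inside its field.
    public static string[] ByRegexSplit(string input)
    {
        return Regex.Split(input, @"(?<!;);(?!;)");
    }

    public static void Main()
    {
        string input = "a;b;c;;d";

        // prints [a][b][c][d]
        foreach (string s in ByStringSplit(input))
            Console.Write("[{0}]", s);
        Console.WriteLine();

        // prints [a][b][c;;d]
        foreach (string s in ByRegexSplit(input))
            Console.Write("[{0}]", s);
        Console.WriteLine();
    }
}
```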
