2

I want to split strings similar to abc123, abcdefgh12 or a123456 into letters and numbers, so that the result will be {"abc", "123"} etc.

What is the simplest way to do it in C# 4.0? I want to do it with one regex.

0

5 Answers

1

Why regex?

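    // Digits to search for; the first one found marks the split point.
    static readonly char[] digits = {'0','1','2','3','4','5','6','7','8','9'};
    // ...
    string s = "abcdefgh12", x = s, y = "";
    int i = s.IndexOfAny(digits);
    if (i >= 0) {
        x = s.Substring(0, i); // letters before the first digit
        y = s.Substring(i);    // everything from the first digit to the end
    }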
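As one of the comments on this answer suggests, the snippet can be refactored into a separate method. The following is only a sketch under assumptions: the class name `LetterDigitSplitter` and the method name `SplitAtFirstDigit` are hypothetical, and it returns a two-element array rather than the two variables used above.

```csharp
using System;

static class LetterDigitSplitter
{
    private static readonly char[] Digits =
        { '0', '1', '2', '3', '4', '5', '6', '7', '8', '9' };

    // Hypothetical helper: splits the input at the first ASCII digit.
    // Returns { letters, digits }; the second element is "" when no digit is found.
    public static string[] SplitAtFirstDigit(string s)
    {
        int i = s.IndexOfAny(Digits);
        if (i < 0)
        {
            return new[] { s, "" };
        }
        return new[] { s.Substring(0, i), s.Substring(i) };
    }
}
```

Calling `LetterDigitSplitter.SplitAtFirstDigit("abcdefgh12")` would then give the two parts in a single line at the call site.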

7 Comments

@Ilya Kogan: If you don't know the solution, then how can you determine whether it is simple?
@phresnel It's not that I don't know the solution, I'm just looking for something concise, something that will make the code look short and clear.
@Ilya - if you want concise, then refactor it away into a separate method and call that when desired. The implementation presented is simple, correct and efficient. I'll take that over "1 line" any day of the week.
Hi @Marc, I guess we just see different things as simple. I agree that regexes can be easily overused, and @phresnel's link shows one really horrible example. But in this case I find @Josh's answer (linked by @Mehdi) more readable. It saves the need of calculating indexes and offsets inside the string.
@Ilya - I look at it this way: if I saw just that regex code, how long would it take me to grok what it is doing? Now compare to the index code "it is finding the first occurrence of a digit, and splitting there"
1

In addition to Marc Gravell's answer, read http://www.codinghorror.com/blog/2008/06/regular-expressions-now-you-have-two-problems.html

What is the simplest way to do it in C# 4.0? I want to do it with one regex.

That's practically an oxymoron in your case. The simplest way of splitting by a fixed pattern is not with regexes.

Comments

1

Unless I'm missing something, this should do the trick... ([a-z]*)([0-9]*)

3 Comments

Either use [a-zA-Z]* or pass the case-insensitive option (I don't know how it works in C#; in PHP and JS you append /i at the end of the regex). Also keep in mind that the pattern will match ABC and 123 as well (i.e. only letters or only numbers). To avoid that, replace the *s with +s.
Would the regex not be better with one-or-more-matching?
That's what I wrote in the comment above: replace the *s with +s.
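For reference, a minimal C# sketch of the pattern from this answer with the comment suggestions applied: `+` instead of `*`, and `RegexOptions.IgnoreCase` as the C# counterpart of the `/i` flag. The `^` and `$` anchors are an added assumption (they reject input that is not just letters followed by digits), and the names `LetterDigitRegex` and `Split` are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

static class LetterDigitRegex
{
    // Group 1 captures the letters, group 2 the digits.
    // The ^ and $ anchors are an assumption, added so partial matches are rejected.
    private static readonly Regex Pattern =
        new Regex(@"^([a-z]+)([0-9]+)$", RegexOptions.IgnoreCase);

    // Hypothetical helper: returns { letters, digits }, or null when the input does not match.
    public static string[] Split(string s)
    {
        Match m = Pattern.Match(s);
        if (!m.Success)
        {
            return null;
        }
        return new[] { m.Groups[1].Value, m.Groups[2].Value };
    }
}
```

With `+` in both groups, input such as "ABC" or "123" alone does not match, so `Split` returns null for it.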
1

"Only numbers or only letters" can be represented using [a-zA-Z]*|[0-9]*. All you have to do is look for all matches of that regular expression in your string. Note that non-alphanumeric characters will not be returned, but will still split the strings (so "123-456" would yield { "123", "456"}).

EDIT: I've interpreted your question as stating that your strings can be a sequence of letters and numbers in any order - if your string is merely one or more letters followed by one or more numbers, a regular expression is unnecessary: look for the first digit and split the string.

1 Comment

Splitting is done by calling Regex.Matches (msdn.microsoft.com/en-us/library/e7sf90t3.aspx) and then reading through the returned MatchCollection
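A rough sketch of the `Regex.Matches` approach described in this comment. The class name `RunSplitter` and method name `SplitRuns` are hypothetical, and the pattern uses `+` so that every match is non-empty; treat it as an illustration, not a definitive implementation.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class RunSplitter
{
    // Hypothetical helper: returns each maximal run of ASCII letters or digits, in order.
    // Characters that are neither letters nor digits are skipped and act as separators.
    public static List<string> SplitRuns(string s)
    {
        var runs = new List<string>();
        foreach (Match m in Regex.Matches(s, "[a-zA-Z]+|[0-9]+"))
        {
            runs.Add(m.Value);
        }
        return runs;
    }
}
```

For example, `RunSplitter.SplitRuns("abc123")` yields "abc" and "123", and `RunSplitter.SplitRuns("123-456")` yields "123" and "456".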
-1

You could create one group for letters and one for numbers. Use this guide for further info: http://www.regular-expressions.info/reference.html HTH!

Comments
