1

I want to split a string like this:

"---hello--- hello ------- hello --- bye"

into an array like this:

"hello" ; "hello ------- hello" ; "bye"

I tried it with this command:

test.Split(new string[] {"---"}, StringSplitOptions.RemoveEmptyEntries);

But that doesn't work: it also splits inside the "-------", leaving pieces like this: "---- hello".

Edit:

I can't modify the text; it is input, and I don't know what it looks like before I have to process it.

Another example would be:

--- example ---

--------- example text --------

--- example 2 ---

It should split only on runs of exactly 3 hyphens, not on longer ones.

7
  • That's normal. ------- contains --- Commented Apr 4, 2017 at 12:16
  • @RegisPortalez Yes, but that's the problem: I want to split it exactly like this, even if it contains it. Commented Apr 4, 2017 at 12:17
  • String.Split is not the correct choice here. It cannot do what you want. I think regex will be your friend. Commented Apr 4, 2017 at 12:19
  • Do you have a few other examples? Otherwise someone may propose a solution that works only on the given example. It's hard to provide a robust solution when we don't know which parts of your string could change. Commented Apr 4, 2017 at 12:19
  • @swe do you have a solution with regex? I'm really bad at it. Commented Apr 4, 2017 at 12:21

5 Answers 5

6

You can use Regex.Split. The regex uses a negative lookbehind (?<!-) and a negative lookahead (?!-) so that it matches only runs of exactly three hyphens. See also Get exact match of the word using Regex in C#.

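using System;
using System.Text.RegularExpressions;

string sentence = "---hello--- hello ------- hello --- bye";
// Split only on "---" that is neither preceded nor followed by another "-".
var result = Regex.Split(sentence, @"(?<!-)---(?!-)");
foreach (string value in result) {
   Console.WriteLine(value.Trim());
}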
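If the trimmed, non-empty pieces are needed as an array (as in the question), the same pattern can be wrapped in a small helper. This is only a sketch: the names HyphenSplitter and SplitOnTripleHyphen are made up for illustration, and trimming each piece is an assumption based on the desired output shown in the question.

```csharp
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper; the class and method names are not from the original answer.
public static class HyphenSplitter
{
    // "---" that is neither preceded nor followed by another "-".
    private static readonly Regex ExactlyThreeHyphens = new Regex(@"(?<!-)---(?!-)");

    public static string[] SplitOnTripleHyphen(string input)
    {
        return ExactlyThreeHyphens
            .Split(input)
            .Select(part => part.Trim())      // assumption: surrounding whitespace is not wanted
            .Where(part => part.Length > 0)   // same effect as RemoveEmptyEntries
            .ToArray();
    }
}
```

Called with the string from the question, SplitOnTripleHyphen should return the three pieces "hello", "hello ------- hello" and "bye".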


1 Comment

same answer, same time :)
4

Solution to find your Tokens with regex:

(?<!-)---(?!-)

Console.WriteLine(String.Join(",", System.Text.RegularExpressions.Regex.Split("---hello--- hello ------- hello --- bye", @"(?<!-)---(?!-)")));
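To check where the pattern actually matches before splitting, the delimiter positions can be listed with Regex.Matches. A minimal sketch; DelimiterFinder and FindDelimiterIndexes are hypothetical names used only for this illustration.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper for inspecting where the delimiter pattern matches.
public static class DelimiterFinder
{
    public static List<int> FindDelimiterIndexes(string input)
    {
        var indexes = new List<int>();
        // Same pattern as above: exactly three hyphens, no hyphen on either side.
        foreach (Match match in Regex.Matches(input, @"(?<!-)---(?!-)"))
        {
            indexes.Add(match.Index);
        }
        return indexes;
    }
}
```

For the string from the question this should report matches at indexes 0, 8 and 32; the run of seven hyphens starting at index 18 is not matched.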

3

I suggest trying Regex.Split instead of string.Split:

  string source = "---hello1--- hello2 ------- hello3 --- bye";

  var result = Regex
    .Split(source, @"(?<=[^-]|^)-{3}(?=[^-]|$)") // splitter is three "-" only
    .Where(item => !string.IsNullOrEmpty(item))  // Removing Empty Entries
    .ToArray();

  Console.Write(string.Join(";", result));

Outcome:

  hello1; hello2 ------- hello3 ; bye

1 Comment

That was exactly what I was looking for, thank you :)
2
  1. Replace ------- with something that never occurs in your text, like @@@: test.Replace("-------", "@@@")
  2. Split your string
  3. Replace @@@ with -------

2 Comments

What about sequences of 4, 6, 7, 8 etc?
The problem is that I don't know how many characters the string contains; the input can be different every time, so that solution doesn't work for me.
0

I would suggest using a neutral delimiter like "/split" or something similar. Then you can use test.Split(...) without worrying that it splits on something you didn't intend. Your code would then look something like this:

string test = "hello/split hello ------- hello /split bye";
string[] parts = test.Split(new[] { "/split" }, StringSplitOptions.RemoveEmptyEntries);

2 Comments

You are assuming that OP has the choice to change the input, which I very much doubt here.
@DavidG exactly, I can't do that.
