How do I read the following text into a single string using a C# regex?

*EDIT * :70://this is a string //this is continuation of string even more text 13

This is stored in a C# List object.

So, for example, the above needs to return:

this is a string this is continuation of string even more text

I thought something like this would do the job, but it doesn't return any group values:

foreach (string in inputstring)
{
   string[] words
   words = str.Split(default(string[]), StringSplitOptions.RemoveEmptyEntries);
   foreach (string word in words)
   {
      stringbuilder.Append(word + " ");
   }
 }
 Match strMatch = Regex.Match(stringBuilder, @"[^\W\d]+");
 if(strMatch.Success)
 {
     string key = strMatch.Groups[1].Value;
 }

Perhaps I'm going about this all wrong, but I need to use a regular expression to formulate a single string from the example string.

  • This code won't compile - you are missing a variable name for string in your foreach. Also, you're not using said variable anywhere Commented Sep 11, 2013 at 11:15
  • Search for a regex that replaces any non-alpha characters, i.e. anything outside [a-zA-Z], including carriage returns and line feeds. By replacing everything you don't need with string.Empty, you should achieve your goal. Commented Sep 11, 2013 at 11:16
  • Kind of like this question: stackoverflow.com/questions/4220172/… Commented Sep 11, 2013 at 11:17
  • I've not listed all the code, so I'm not too concerned that it won't compile! It's just example code. Hey, but thanks for the minus :-) Commented Sep 11, 2013 at 11:32
  • For some reason the example string ended up on one line; this may have led to the confusion and to some of the responses. Writing a regex for a single string would be easy. Commented Sep 11, 2013 at 11:35

1 Answer

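// Requires: using System.Text.RegularExpressions;
var input = @":70://this is a string //this is continuation of string even more text 13";

// Remove every character that is neither a word character nor whitespace, plus every digit.
var result = Regex.Replace(input, @"[^\w\s]|\d", "").Trim();
// result: this is a string this is continuation of string even more text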

Explanation of regex:

[^ ... ] = character set not matching what's inside
\w       = word character
\s       = whitespace character
|        = or
\d       = digit

Alternatively, you could use the regex [^A-Za-z\s], which matches any character that is not an uppercase letter, a lowercase letter or whitespace. Unlike the pattern above, it also removes underscores and non-ASCII letters.
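
Since the question mentions that the text lives in a List, here is a minimal sketch of how the same idea might be applied to several lines at once. The way the example is split into list entries is an assumption (the original line breaks are not shown), and Flatten is a hypothetical helper name, not something from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class Program
{
    // Hypothetical helper: joins the lines with spaces, removes everything
    // except ASCII letters and whitespace, then collapses whitespace runs.
    public static string Flatten(IEnumerable<string> lines)
    {
        string joined = string.Join(" ", lines);
        string lettersOnly = Regex.Replace(joined, @"[^A-Za-z\s]", "");
        return Regex.Replace(lettersOnly, @"\s+", " ").Trim();
    }

    public static void Main()
    {
        // Assumed split of the example text into list entries.
        var lines = new List<string>
        {
            ":70://this is a string",
            "//this is continuation of string",
            "even more text",
            "13"
        };

        Console.WriteLine(Flatten(lines));
        // prints: this is a string this is continuation of string even more text
    }
}
```

Collapsing whitespace in a second pass keeps the result on one line even if the list entries contain stray line breaks or the removed characters leave double spaces behind.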
