Since this is my first question here on Stack Overflow, I hope it is asked correctly.

Basically, I have a plain .txt file which contains arbitrary text, like:

car accident
people died
cat without owner


<!-- Text added at 6/29/2011 9:20:38 AM -->

Some addintional Text
other Text added
add Text

I have a write/append function which allows the user to append some text and set a little timestamp.

So my problem is: with another function, you can search and replace text in the text file, but as you can guess, if someone wants to replace the word "Text", it will be replaced in the XML-style comment (the timestamp) as well.

My attempt so far is:

content = Regex.Replace(content,"[^<+.*"+input+".*>+]*", replace);
//content = content of the .txt file, input = search term, replace = string to replace

But this fails miserably, as regex pros will see without even executing it.

I hope that a regex pro can help me out here and provide a search pattern which replaces the normal text but ignores the timestamp.

I don't really grasp the logic of regex yet, although I understand the individual expressions, so this would be a good starting point for me to understand regex properly.

Thanks in advance.

  • The regex: [^<+.*BAR.*>+]* matches any character except '<', '+', '.', '*', 'B', 'A', 'R' and '>' zero or more times. Note that this regex always matches between any 2 characters (it matches the empty string!). Better do a bit of reading before continuing with regex, IMO: regular-expressions.info/charclass.html and regular-expressions.info Commented Jun 29, 2011 at 7:53
  • Thanks for that, I'll go through them. Commented Jun 29, 2011 at 8:06
  • Sorry, "matches between any 2 characters" is wrong, I meant: "matches before and after any character". Commented Jun 29, 2011 at 8:12
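
To make the point about the character class concrete, here is a minimal sketch using the pattern from the comment above with "BAR" as the search term. The class name CharClassDemo and the sample strings are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CharClassDemo
{
    // The question's pattern with the search term "BAR" filled in.
    public const string Pattern = "[^<+.*BAR.*>+]*";

    public static void Main()
    {
        // Everything between [^ and ] is a set of excluded single characters,
        // not a sub-pattern. The trailing * allows zero repetitions, so the
        // pattern also matches the empty string.
        Console.WriteLine(Regex.IsMatch("", Pattern));             // True
        Console.WriteLine(Regex.Match("BAR", Pattern).Length);     // 0 (empty match before 'B')
        Console.WriteLine(Regex.Match("foo BAR", Pattern).Value);  // "foo " (stops at 'B')
    }
}
```

Because the pattern can match nothing at all, Regex.Replace also inserts the replacement at positions where no excluded character run exists, which is why the original attempt mangles the file.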

1 Answer

If I understand your question correctly, you want to replace every instance of "Text" except for the one(s) inside the comment.

The easiest way is to use a negative lookbehind (fantastic description here), as below:

content = Regex.Replace(content, @"(?<!<!--.*?)" + input, replace);

Your current pattern, by contrast, replaces any run (of any length, including zero) of characters that are NOT one of <, +, ., *, > or a character contained in input with the value in replace.

If you're going to be working a lot with regex, I would HIGHLY recommend giving the website above a good read. It's hands down the best intro to regex that I've found, and the time spent now will save you lots of headaches later!
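
As a rough sketch of how the lookbehind replacement could be wrapped up, assuming the search term should be treated as literal text and not as a pattern: the helper name ReplaceOutsideComments and the sample content are hypothetical, and Regex.Escape is an addition that is not part of the one-liner above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CommentAwareReplace
{
    // Hypothetical helper: replaces `input` everywhere except where "<!--"
    // appears earlier on the same line.
    public static string ReplaceOutsideComments(string content, string input, string replace)
    {
        // Regex.Escape keeps user input such as "9:20" or "a.b" literal,
        // so metacharacters in the search term cannot alter the pattern.
        string pattern = @"(?<!<!--.*?)" + Regex.Escape(input);
        return Regex.Replace(content, pattern, replace);
    }

    public static void Main()
    {
        string content =
            "car accident\n" +
            "<!-- Text added at 6/29/2011 9:20:38 AM -->\n" +
            "other Text added\n";

        // The "Text" inside the timestamp comment is left alone;
        // the one on the following line is replaced.
        Console.WriteLine(ReplaceOutsideComments(content, "Text", "Word"));
    }
}
```

One thing to keep in mind: by default . does not match a newline, so the lookbehind only looks back to the start of the current line. That is what makes it work for one-line timestamp comments, but it also means that any text following --> on the same line would be left untouched.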

Edit

Updated to add flexibility thanks to @stema

5 Comments

Yup, thanks, that helped me out and it's working, so I will continue reading regex tutorials (thanks again for the link).
@Matthias R., this will work only if you want to replace "Text", because of the \s+ at the end of the negative lookbehind. This allows only whitespace between the <!-- and the searched word. If you want to be more flexible, replace it with .*?, which matches any character non-greedily. So @"(?<!<!--.*?)" is more flexible. But anyway, +1 to @elmugrat
I see, you're right: if I replace "AM", the "AM" in the comment is replaced too. With your pattern it is not, so thank you for that.
@elmugrat, The flexibility is needed, because the OP wants the user to be able to replace a text of his choice, so if this is "add", "2011", "20" or "AM" it will fail with just \s+
Yep you're right. When I ran through it in my head I somehow was thinking Regex.IsMatch instead of Regex.Replace and thought that after matching the first Text it would return, but of course it doesn't...
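
The difference between the two lookbehind bodies discussed in these comments can be checked with a small sketch like the one below. The class LookbehindComparison and its ReplaceWith helper are made up for this comparison; the sample line is the timestamp comment from the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookbehindComparison
{
    // Hypothetical helper: builds (?<!<!--BODY)TERM with the given lookbehind
    // body and runs the replacement.
    public static string ReplaceWith(string lookbehindBody, string content, string input, string replace)
    {
        string pattern = "(?<!<!--" + lookbehindBody + ")" + Regex.Escape(input);
        return Regex.Replace(content, pattern, replace);
    }

    public static void Main()
    {
        string line = "<!-- Text added at 6/29/2011 9:20:38 AM -->";

        // \s+ only protects a word that directly follows "<!-- ", so "AM" is replaced.
        Console.WriteLine(ReplaceWith(@"\s+", line, "AM", "PM"));

        // .*? protects everything after "<!--" on the same line, so "AM" is kept.
        Console.WriteLine(ReplaceWith(@".*?", line, "AM", "PM"));
    }
}
```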
