I am trying to perform regular expression matching and replacement on the same line in Ruby. I have some libraries that manipulate strings and add special formatting characters to them. The formatting can be applied in any order. When I change a string's formatting, I want to keep some of the original formatting, and I'm using a regex for that. The regular expression correctly matches what I need:

mystring.gsub(/[(\e\[([1-9]|[1,2,4,5,6,7,8]{2}m))|(\e\[[3,9][0-8]m)]*Text/, 'New Text')

However, what I really want is the matching from the first grouping found in:

(\e\[([1-9]|[1,2,4,5,6,7,8]{2}m))

to be placed in front of New Text in the replacement, rather than replacing the whole match with just New Text. I'm trying to reference the match in the form of

mystring.gsub(/[(\e\[([1-9]|[1,2,4,5,6,7,8]{2}m))|(\e\[[3,9][0-8]m)]*Text/, '\1' + 'New Text')

but my understanding is that \1 only works when using \d or \k. Is there any way to reference that specific capturing group in my replacement string? Additionally, since I am using an asterisk after the [], I know that this grouping could occur more than once. Therefore, I would like the last matching occurrence to be the one that is used.

My expected input/output with a sample is:

Input:  "\e[1mHello there\e[34m\e[40mText\e[0m\e[0m\e[22m"
Output: "\e[1mHello there\e[40mNew Text\e[0m\e[0m\e[22m"

Input:  "\e[1mHello there\e[44m\e[34m\e[40mText\e[0m\e[0m\e[22m"
Output: "\e[1mHello there\e[40mNew Text\e[0m\e[0m\e[22m"

So the last matching escape sequence is found and kept in front of New Text.

1 Answer

You can use the following regex with back-reference \\1 in the replacement:

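# Match one or more escape sequences followed by Text; group 1 keeps the last sequence.
reg = /(\\e\[(?:[0-9]{1,2}|[3,9][0-8])m)+Text/
mystring = "\\e[1mHello there\\e[34m\\e[40mText\\e[0m\\e[0m\\e[22m"
puts mystring.gsub(reg, '\\1New Text')

# Second sample: three sequences precede Text, and only the last one is kept.
mystring = "\\e[1mHello there\\e[44m\\e[34m\\e[40mText\\e[0m\\e[0m\\e[22m"
puts mystring.gsub(reg, '\\1New Text')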

Output of the IDEONE demo:

\e[1mHello there\e[40mNew Text\e[0m\e[0m\e[22m
\e[1mHello there\e[40mNew Text\e[0m\e[0m\e[22m
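
The reason \\1 yields \e[40m is that a quantified capturing group is overwritten on each repetition, so only the final iteration is kept. Here is a minimal sketch of that behaviour, using the same literal-backslash input as the demo. The variable names are illustrative, and the block form of gsub is shown only as an alternative to the replacement string.

```ruby
# Same pattern as in the answer: it matches a literal backslash followed by "e",
# not the ESC control character.
reg = /(\\e\[(?:[0-9]{1,2}|[3,9][0-8])m)+Text/

# Single quotes keep \e as two characters (backslash + e).
input = '\e[1mHello there\e[44m\e[34m\e[40mText\e[0m'

# A repeated group is overwritten on every iteration, so group 1 holds the last one.
last_seq = input[reg, 1]

# Block form: gsub sets $1 for each match before calling the block.
result = input.gsub(reg) { "#{$1}New Text" }

puts last_seq
puts result
```

The block form can be handy when the replacement needs more logic than a back-reference string allows.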

Mind that your input contains a backslash \ that needs escaping in a regular string literal. To match it inside the regex, we use a double backslash, as we are looking for a literal backslash.
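
If the strings instead come from double-quoted Ruby literals written with a single backslash, "\e" is the ESC control character (byte 0x1B) rather than a backslash followed by e, and the regex needs \e instead of \\e. This is a sketch under that assumption, with the digit pattern simplified to [0-9]{1,2}. The variable names are illustrative.

```ruby
# In a double-quoted literal, "\e" is the single ESC character (0x1B).
colored = "\e[1mHello there\e[34m\e[40mText\e[0m\e[0m\e[22m"

# \e in a regex literal also means ESC, so no doubled backslash is needed here.
esc_reg = /(\e\[[0-9]{1,2}m)+Text/

# Single quotes keep '\1' intact so gsub treats it as a back-reference.
replaced = colored.gsub(esc_reg, '\1New Text')

p replaced
```

Which variant applies depends on whether the data holds real escape characters or the two-character text \e.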

10 Comments

I just noticed you are looking for the first group, but in the 1st example, it is the last group \e[40m that is captured, and my current regex will only use the last group match. Could you please clarify? Here is my testing demo, please see the groups highlighted.
My apologies. You are correct. I'll update that. To be consistent, I want the last match from that group to be used.
Please check it again. \e[40m is right before Text in both examples, and thus it is matched and used in the replacement. I also do not quite understand your (\e\[[3,9][0-8]m) subpattern: did you mean to capture \e[30m but not \e[21m? Right now, it also matches \e[,0m. I have added it back to the regex.
The original intent of having (\e\[[0-9]{1,2}m) and (\e\[[3,9][0-8]m) was that there could be a sequence of alternating patterns. But since one is a subset of the other, it was useless. I will update my original intent once I've verified it.
It is almost a subset. It is not a subset if you really plan to match a literal comma (,).
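
The point about the literal comma can be checked directly: inside a character class a comma is an ordinary character, not a separator. A small sketch, with hypothetical pattern names and anchors added so that only the whole string is tested:

```ruby
# Inside [...] every character is literal, so [3,9] means "3", "," or "9".
with_comma    = /\A[3,9][0-8]m\z/
without_comma = /\A[39][0-8]m\z/

comma_hit   = with_comma.match?(",0m")     # the stray comma is accepted
comma_miss  = without_comma.match?(",0m")  # no comma in the class, so no match
digit_match = without_comma.match?("30m")  # the intended input still matches

p comma_hit, comma_miss, digit_match
```

Writing the class as [39] keeps the intended digits without also accepting a comma.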