Ruby Regex Group Replacement

Question

I am trying to perform regular expression matching and replacement on the same line in Ruby. I have some libraries that manipulate strings in Ruby and add special formatting characters to them. The formatting can be applied in any order. However, when I change the string formatting, I want to keep some of the original formatting. I'm using regex for that. I have a regular expression that correctly matches what I need:

mystring.gsub(/[(\e\[([1-9]|[1,2,4,5,6,7,8]{2}m))|(\e\[[3,9][0-8]m)]*Text/, 'New Text')

However, what I really want is the matching from the first grouping found in:

(\e\[([1-9]|[1,2,4,5,6,7,8]{2}m))

to be appended to New Text and replaced as opposed to just New Text. I'm trying to reference the match in the form of

mystring.gsub(/[(\e\[([1-9]|[1,2,4,5,6,7,8]{2}m))|(\e\[[3,9][0-8]m)]*Text/, '\1' + 'New Text')

but my understanding is that \1 only works when using \d or \k. Is there any way to reference that specific capturing group in my replacement string? Additionally, since I am using an asterisk for the [], I know that this grouping could occur more than once. Therefore, I would like to have the last matching occurrence yielded.

My expected input/output with a sample is:

Input:  "\e[1mHello there\e[34m\e[40mText\e[0m\e[0m\e[22m"
Output: "\e[1mHello there\e[40mNew Text\e[0m\e[0m\e[22m"

Input:  "\e[1mHello there\e[44m\e[34m\e[40mText\e[0m\e[0m\e[22m"
Output: "\e[1mHello there\e[40mNew Text\e[0m\e[0m\e[22m"

So the last grouping is found and appended.

Wiktor Stribiżew · Accepted Answer · 2015-05-13 19:48:23Z


You can use the following regex with back-reference \\1 in the replacement:

reg = /(\\e\[(?:[0-9]{1,2}|[3,9][0-8])m)+Text/
mystring = "\\e[1mHello there\\e[34m\\e[40mText\\e[0m\\e[0m\\e[22m"
puts mystring.gsub(reg, '\\1New Text')

mystring = "\\e[1mHello there\\e[44m\\e[34m\\e[40mText\\e[0m\\e[0m\\e[22m"
puts mystring.gsub(reg, '\\1New Text')

Output of the IDEONE demo:

\e[1mHello there\e[40mNew Text\e[0m\e[0m\e[22m
\e[1mHello there\e[40mNew Text\e[0m\e[0m\e[22m
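The reason only \e[40m survives is that a quantified capturing group keeps just its final iteration. A minimal sketch of that behaviour, using a simplified pattern and a single-quoted sample string (both are illustrative assumptions, not the exact regex above), might look like this:

```ruby
# A repeated capture group, (...)+, retains only its last iteration,
# so \1 in the replacement is the final escape code before "Text".
# Simplified pattern for illustration; it matches a literal backslash + "e".
reg = /(\\e\[[0-9]{1,2}m)+Text/

# Single-quoted literal: \e stays as the two characters backslash and "e".
input = 'Hello\e[44m\e[34m\e[40mText\e[0m'

# String replacement: '\1' and '\\1' are the same two characters here.
result = input.gsub(reg, '\1New Text')

# Block form: $1 holds the same (last) capture for each match.
block_result = input.gsub(reg) { "#{$1}New Text" }

puts result
puts block_result
```

Both forms should give the same string; the block form can be easier to read when the replacement needs more logic than a back-reference.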

Mind that your input has a backslash \ that needs escaping in a regular (double-quoted) string literal. To match it inside the regex, we use a double backslash, as we are looking for a literal backslash.
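If the strings actually contain real escape characters (byte 0x1B, as terminal colour libraries typically emit) rather than a literal backslash followed by e, a single \e in the regex would be the one to use. The sketch below assumes that case; the variable names and the simplified pattern are illustrative only:

```ruby
# Assumption: the input holds real ESC characters, not backslash + "e".
# In a double-quoted Ruby string and in a Ruby regex, \e is the ESC character.
reg_esc = /(\e\[[0-9]{1,2}m)+Text/

colored = "\e[1mHello there\e[34m\e[40mText\e[0m"

# Keep the last escape code before "Text" and swap in the new text.
out = colored.gsub(reg_esc, '\1New Text')

# inspect shows the escape characters instead of sending them to the terminal.
puts out.inspect
```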

edited May 13, 2015 at 19:48

answered May 13, 2015 at 19:14


10 Comments

Wiktor Stribiżew Over a year ago

I just noticed you are looking for the first group, but in the 1st example, it is the last group \e[40m that is captured, and my current regex will only use the last group match. Could you please clarify? Here is my testing demo, please see the groups highlighted.

fowlball1010 Over a year ago

My apologies. You are correct. I'll update that. To be consistent, I want the last match from that group to be used.

Wiktor Stribiżew Over a year ago

Please check it again. \e[40m is right before Text in both examples, and thus it is matched and used in the replacement. I also do not quite understand your (\e\[[3,9][0-8]m) subpattern: did you mean to capture \e[30]m but not \e[21]m? Right now, it matches \e[,0]m. I have added it back to the regex.

fowlball1010 Over a year ago

The original intent of having (\e[[0-9]{1,2}m) and (\e[[3,9][0-8]m) was that there could be a sequence of alternating patterns. But since one is a subset of another, it was useless. I will update my original intent once I've verified it.

Wiktor Stribiżew Over a year ago

It is almost a subset. It is not a subset if you really plan to match a literal ,.
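The point about the literal comma can be checked directly: inside a character class a comma is an ordinary character, not a separator. A small sketch, with anchored patterns chosen only for illustration:

```ruby
# Inside [...] a comma is just another character to match.
with_comma    = /\A[3,9][0-8]\z/  # matches "3", "," or "9", then 0-8
without_comma = /\A[39][0-8]\z/   # matches "3" or "9", then 0-8
two_digits    = /\A[0-9]{1,2}\z/  # the broader one-or-two-digit pattern

comma_hit      = with_comma.match?(',0')     # the comma is accepted
comma_miss     = without_comma.match?(',0')  # no comma in this class
digits_hit     = without_comma.match?('30')
superset_miss  = two_digits.match?(',0')     # why it is only "almost" a subset

puts [comma_hit, comma_miss, digits_hit, superset_miss].inspect
```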
