Ruby: How to append to each line of a string based on a given regex?

Question

I want to append </tag> to each line where it's missing:

text = '<tag>line 1</tag>
        <tag>line2         # no closing tag, append
        <tag>line3         # no closing tag, append
             line4</tag>   # no opening tag, but has a closing tag, so ignore
        <tag>line5</tag>'

I tried to create a regular expression to match this, but I know it's wrong:

text.gsub! /.*?(<\/tag>)Z/, '</tag>'

How can I create a regular expression that conditionally appends to each line?

Are you absolutely sure that each line contains exactly one tag? Are there going to be nested tags? Seeing how support for negative lookbehind seems to be a bit funky in Ruby, it might be easier just to split these lines and look for a </tag> substring and append one if you can't find it.
– NullUserException, Commented Aug 23, 2013 at 21:51
In my example there should always be a </tag> at the end of a line.
– Andrew, Commented Aug 23, 2013 at 21:56
@NullUserException - what's funky about ruby lookbehind? I think you're imagining pre-1.9 scenarios.
– pguardiario, Commented Aug 24, 2013 at 0:36

Andrew · Accepted Answer · 2013-08-23 22:30:53Z

2

Here you go:

text.gsub!(%r{(?<!</tag>)$}, "</tag>")

Explanation:

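$ matches at the end of each line, whereas \z matches only at the end of the string. \Z is like \z, except that it also matches just before a newline that ends the string.

(?<! and ) together form a negative lookbehind: (?<!</tag>)$ matches a line end only when the text immediately before it is not </tag>.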
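
To make the anchor behaviour concrete, here is a minimal sketch on a made-up two-line string (the variable names are illustrative and not from the question). It assumes lines carry no trailing whitespace, because the lookbehind only inspects the characters immediately before the line end.

```ruby
# Hypothetical sample: the first line lacks its closing tag.
sample = "<tag>a\n<tag>b</tag>"

# $ matches at the end of every line, so each line is tested on its own.
per_line = sample.gsub(%r{(?<!</tag>)$}, "</tag>")

# \z matches only at the very end of the string, so only the last line is tested.
whole_string = sample.gsub(%r{(?<!</tag>)\z}, "</tag>")

puts per_line      # both lines now end with </tag>
puts whole_string  # unchanged: the string already ends with </tag>

# \Z also matches just before a newline that ends the string; \z does not.
puts "x\n".match?(/x\Z/)  # true
puts "x\n".match?(/x\z/)  # false
```

Note that gsub! returns nil when nothing was replaced, so the non-bang gsub is the safer choice if the result is assigned or chained.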

edited Aug 23, 2013 at 22:30

Andrew


answered Aug 23, 2013 at 22:02

sawa


Andrew Over a year ago

Thanks! I've deleted my comments and moved the explanation into the answer so it is easier for other people to see.

Chris Heald · Accepted Answer · 2013-08-23 21:57:22Z

0

Given the example provided, I'd just do something like this:

text.split(/<\/?tag>/).
     reject {|t| t.strip.length == 0 }.
     map {|t| "<tag>%s</tag>" % t.strip }.
     join("\n")

You're basically treating both <tag> and </tag> as record delimiters, so you can just split on them, reject any blank records, then construct a new combined string from the extracted values. This works nicely when you can't count on newlines being record delimiters and will generally be tolerant of missing tags.

If you're insistent on a pure regex solution, though, and your data format will always match the given format (one record per line), you can use a negative lookbehind:

text.strip.gsub(/(?<!<\/tag>)(\n|$)/, "</tag>\\1")
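
The two approaches do not produce the same result on the question's data, which may matter when choosing between them. The sketch below runs both on the sample text, with the explanatory comments and indentation removed as a simplifying assumption; the names by_split and by_regex are illustrative.

```ruby
# The question's data without the explanatory comments or indentation.
text = "<tag>line 1</tag>\n<tag>line2\n<tag>line3\nline4</tag>\n<tag>line5</tag>"

# Approach 1: split on either tag, drop blank records, re-wrap each record.
by_split = text.split(/<\/?tag>/)
               .reject { |t| t.strip.empty? }
               .map { |t| "<tag>%s</tag>" % t.strip }
               .join("\n")

# Approach 2: append </tag> at each line end that is not already preceded by it.
by_regex = text.strip.gsub(/(?<!<\/tag>)(\n|$)/, "</tag>\\1")

puts by_split
puts "--"
puts by_regex
```

On this input the split version sees no tag between line3 and line4, so it wraps them together as one record, while the regex version keeps every line separate and leaves line4 without an opening tag.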

answered Aug 23, 2013 at 21:57

Chris Heald


Plasmarob · Accepted Answer · 2013-08-23 22:06:58Z

0

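One that could work is:

/(<tag>[^\n ]+[^>])[\s]*(\n)/

This matches each <tag> line whose trailing newline is not preceded by ">", capturing the line content in group 1 and the newline in group 2.

Because gsub replaces the whole match, put the captured content back and follow it with "</tag>\n", i.e.

text.gsub!( /(<tag>[^\n ]+[^>])[\s]*(\n)/ , "\\1</tag>\n")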

For more polishing, try http://rubular.com/
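
One point worth checking with this kind of pattern is that gsub substitutes the entire matched text, not only the captured newline. The sketch below uses a made-up two-line string to compare a plain replacement with one that re-inserts the matched line through a capture group; the variable names and the extra group are illustrative assumptions.

```ruby
# Hypothetical sample: the first line lacks its closing tag.
sample = "<tag>line2\n<tag>line3</tag>\n"

# Plain replacement: the whole match, including "<tag>line2", is replaced.
plain = sample.gsub(/<tag>[^\n ]+[^>][\s]*(\n)/, "</tag>\n")

# Capturing the line content (group 1) and re-inserting it keeps the line intact.
kept = sample.gsub(/(<tag>[^\n ]+[^>])[\s]*(\n)/, "\\1</tag>\n")

puts plain
puts kept
```

In the first result the content of the unclosed line is lost, whereas the second keeps it and appends the closing tag.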

answered Aug 23, 2013 at 22:06

Plasmarob


7stud · Accepted Answer · 2013-08-24 05:31:50Z

0

text = '<tag>line 1</tag>
        <tag>line2        
        <tag>line3
        line4</tag>
        <tag>line5</tag>'

result = ""

text.each_line do |line|
  line.rstrip!
  line << "</tag>" unless line.end_with?("</tag>")
  result << line << "\n"
end

puts result

--output:--
<tag>line 1</tag>
        <tag>line2</tag>
        <tag>line3</tag>
        line4</tag>
        <tag>line5</tag>

answered Aug 24, 2013 at 5:31

7stud
