I have a string

text 6ffdfd <a href="http://worldnews.com" target="_blank">toto</a> sdsdsd

I would like to find a regex that would

    1. add an opening span tag just after the opening a tag of the HTML link (to be precise, after the string target="_blank">)
    2. add a closing span tag just before the closing a tag

The desired end result would be:

text 6ffdfd <a href="http://worldnews.com" target="_blank"><span>toto</span></a> sdsdsd

For the moment, I can't find how to achieve 1, and I have only partially managed 2, because my current code wrongly adds white space that I don't want between </span> and the closing a tag.

Current code

orig_string = 'text 6ffdfd <a href="http://example.com" target="_blank">toto</a> sdsdsd'
end_result = orig_string.gsub(/<\/a>/, '</span> \\0')
print end_result

I have set up an online editable demo here: https://repl.it/repls/SecondCapitalPika
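
For reference, the stray space in the current code comes from the literal space inside the replacement string '</span> \\0'. The following is a minimal sketch of both steps done with two plain gsub calls. It assumes every link contains the exact string target="_blank"> and that links are not nested. The variable name step2 is only illustrative.

```ruby
orig_string = 'text 6ffdfd <a href="http://example.com" target="_blank">toto</a> sdsdsd'

# Step 2: insert </span> before each </a>. '\0' stands for the whole match,
# and there is no space between '</span>' and '\0' this time.
step2 = orig_string.gsub(/<\/a>/, '</span>\0')

# Step 1: re-emit the matched 'target="_blank">' and append the opening span.
end_result = step2.gsub(/target="_blank">/, '\0<span>')

puts end_result
```

Note that step 2 touches every </a> in the string while step 1 only touches links with target="_blank">, so a link without that attribute would end up with an unbalanced </span>.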

  • Do not use regex to parse HTML. Use a proper HTML parser and add the <span> tag to the DOM. Commented Dec 6, 2017 at 14:14
  • Hi Stefan, I am doing it as an admin on Active Admin inputs. Why shouldn't I do it with a regexp? I'm a Ruby newbie so I thought that could work. Are there security issues? I want to change the input that I enter so that the version with <span> is saved to the database (instead of the one without <span>). Commented Dec 6, 2017 at 14:15
  • See The Stack Overflow Regular Expressions FAQ, there are several links explaining why you should not use regular expressions to parse HTML. Commented Dec 6, 2017 at 14:20
  • Thanks, will check it out. Learning something new every day :) Commented Dec 6, 2017 at 14:24
  • Thanks a lot. I went through the very "intense" debate between people for and against using regexp to parse HTML. In the end I'll go for using it, agreeing with this person: "If you have a small set of HTML pages that you want to scrape data from and then stuff into a database, regexes might work fine. For example, I recently wanted to get the names, parties, and districts of Australian federal Representatives, which I got off of the Parliament's web site. This was a limited, one-time job" (source: stackoverflow.com/a/1733489/1467802) Commented Dec 6, 2017 at 15:15

2 Answers

# present? comes from ActiveSupport (Rails); it is not available in plain Ruby.
orig_string =~ /(?<=>)([^<]*)(?=<\/a>)/
if $1.present?
  end_result = orig_string.gsub(/(?<=>)([^<]*)(?=<\/a>)/, '<span>\1</span>')
end

Break down

(?<=>)    # lookbehind: the match must be preceded by the character >
([^<]*)   # capture everything up to the next <, i.e. the text inside the a tag
(?=<\/a>) # lookahead: the match must be followed by </a>
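
The breakdown above can be checked with a small plain-Ruby sketch. The sample string and the variable names (sample, matches, wrapped) are assumptions made for illustration. Because the lookbehind and lookahead are zero-width, only the link text is part of the match, so only that text gets replaced.

```ruby
pattern = /(?<=>)([^<]*)(?=<\/a>)/

sample = '<a href="http://example.com" target="_blank">toto</a> and <a href="http://example.com">titi</a>'

# scan returns one array per match containing the capture group;
# flatten turns that into a flat list of link texts.
matches = sample.scan(pattern).flatten

# Wrap each captured link text in a span; the surrounding tags are untouched
# because the lookarounds do not consume any characters.
wrapped = sample.gsub(pattern, '<span>\1</span>')

puts matches.inspect
puts wrapped
```

One thing this makes visible: the pattern does not look at the target attribute at all, so the second link (without target="_blank") is wrapped as well.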

Will result in

print end_result
'text 6ffdfd <a href="http://example.com" target="_blank"><span>toto</span></a> sdsdsd'

5 Comments

Could you add just a little more info: indeed this works the first time, but if I change the text/input in my admin panel, then the script kicks in again and I end up with 4 spans instead of 2 :) Could I use something like "if the string contains no span yet, then do this"?
Awesome, let me check it. I must really try learning regexp, it seems powerful!
How does $1.present? work? I mean, don't you have to say where you check the presence of $1, like on orig_string?
I get undefined method `present?' for nil:NilClass :)
nil should return false for present?. Can you try unless $1.blank?
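
Regarding the two issues raised in these comments (spans being added again on every save, and present? being undefined outside Rails), here is a hedged plain-Ruby sketch. It is a variation on the answer's regex, not the answer's own code: the constant LINK and the method wrap_link_text are hypothetical names, and the sketch assumes links are not nested and that an already processed link starts with exactly <span> right after the opening tag.

```ruby
# Group 1: opening a tag that carries target="_blank".
# (?!<span>): skip links whose content already starts with <span>.
# Group 2: the link content. Group 3: the closing tag.
LINK = /(<a\b[^>]*\btarget="_blank"[^>]*>)(?!<span>)(.*?)(<\/a>)/m

def wrap_link_text(html)
  # gsub returns the string unchanged when nothing matches,
  # so no separate $1 / present? check is needed.
  html.gsub(LINK, '\1<span>\2</span>\3')
end

orig_string = 'text 6ffdfd <a href="http://example.com" target="_blank">toto</a> sdsdsd'

once  = wrap_link_text(orig_string)
twice = wrap_link_text(once)

puts once
puts once == twice
```

Running the method a second time leaves the string as it is, which is the behaviour asked for when the admin input is saved repeatedly.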

If you don't necessarily need a regex, then you could use Nokogiri:

require 'nokogiri'

text = <<-TEXT
  text 6ffdfd <a href="http://worldnews.com" target="_blank">toto</a> sdsdsd
  6ffdfd text <a href="http://worldnews.com" target="_blank">tete</a> sdsdsd
  6ffdfd text <a href="http://worldnews.com">titi</a> sdsdsd
TEXT

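doc = Nokogiri::HTML(text)
# Wrap the inner content of each matching anchor in a <span>,
# rather than adding a sibling element after the anchor.
doc.css('a[target="_blank"]').each { |anchor| anchor.inner_html = "<span>#{anchor.inner_html}</span>" }
puts doc.at('body').inner_html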

1 Comment

Hi Sebastian, thanks for your help. Please refer to my comments under my question, where I explain why in the end I still opted for a regexp. On top of those reasons, I also did not want to make my Ruby on Rails 4 app heavier than it already is, for a very simple need, by importing/requiring another library (Nokogiri).
