I have a news text that contains HTML attributes I don't need. How can I delete phrases like these in Ruby?

img width="750" alt="4.jg" c="/unload/medialiy/df6/4.jg" height="499" title=4.jg"

img width="770" alt="5.jg" c="/unload/medialiy/ty6/5.jg" height="499" title=5.jg"

So I need a regex, something like news.sub('/img*jg"/, ''), but it doesn't work.

1
  • "a text with news where i got html attributes" – what does that mean? Do you have HTML or text containing HTML? Why are the angle brackets missing? What does your actual input (i.e. news) look like, and what is your expected output? Commented Sep 11, 2017 at 11:34

2 Answers

1

I would use:

img .*\.jg"

If you want to say "any characters in any quantity" in a regex, use .* where the dot matches any character (except a newline, by default) and the star means any number of repetitions, including zero.
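As a minimal sketch of how that pattern could be applied, assuming the news text holds one bracketless tag per line as in the question (the `news` string below is a made-up sample, with the quotes around the title value restored):

```ruby
# Hypothetical sample input modelled on the question: tag bodies
# without angle brackets, one per line.
news = <<~TEXT
  First story.
  img width="750" alt="4.jg" c="/unload/medialiy/df6/4.jg" height="499" title="4.jg"
  Second story.
TEXT

# A regex literal goes between slashes with no surrounding quotes;
# a quoted '/.../' is treated as a literal string, not as a pattern.
pattern = /img .*\.jg"/

# gsub replaces every match; sub would replace only the first one.
cleaned = news.gsub(pattern, '')

puts cleaned
```

Because `.*` is greedy, the match runs to the last `.jg"` on the line, which is what removes the whole phrase here. The same greediness would also swallow ordinary text sitting between two such phrases on one line.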

But are you sure you don't want to include the angle brackets?

<img .*\.jg">

As an aside, what if the order of the attributes changes? Then the pattern will fail to match the img tag. What we really need is an img tag with the substring .jg" anywhere inside it.

<img [^>]*\.jg"[^>]*>
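A short sketch of this last pattern in Ruby, assuming the input really does contain angle brackets (the `html` string and its attribute values are invented for illustration):

```ruby
# Hypothetical HTML snippet; attribute order differs between the tags,
# and the last tag has no .jg" substring at all.
html = '<p>One <img width="750" src="/a/4.jg" alt="4.jg"> two ' \
       '<img alt="5.jg" height="499"> three <img src="logo.png"> end</p>'

# [^>]* cannot cross a closing bracket, so each match stays inside one tag.
img_with_jg = /<img [^>]*\.jg"[^>]*>/

# Remove every img tag that contains .jg" somewhere among its attributes.
cleaned = html.gsub(img_with_jg, '')

puts cleaned
```

The tag pointing at `logo.png` is left alone, since it has no `.jg"` before its closing bracket.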

2 Comments

"Dot means any symbol" – are you sure that you want .jg then? ;-)
Oh! How foolish of me to omit the backslashes! Practically, it was a longer substitution for '.?'. It was still a valid regex, so the tests passed, but it was not good for explanation. Thank you!
0

In your particular case you can do this:

element = '<img width="750" alt="4.jg" c="/unload/medialiy/df6/4.jg" height="499" title="4.jg">'

puts element.gsub(/(width|alt)=\"[^ ]+\" ?/, '')

You can also play around with this regex here.

But if you need a more robust solution, take a look at the Nokogiri gem. This SO question can help.
