I have HTML source code from which I want to filter out one or more links while keeping the others.
I have set up my filter with "*" as the wildcard:
<a*>Link1</a>, <a*>Link2</a>, or <a*>Link3</a>
<a*>A bad link*</a>
some text* <a*>update*</a>
other text right before link <a*>click here</a>
I would like to remove every instance of these links from the HTML source code using Python. I'm OK with loading the list of filters into an array, but I need help with the filtering itself. Each line above is a separate filter, and I only want to remove the link(s), not the surrounding text.
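To make the intended behaviour concrete, here is a rough sketch of one way this might work, using only the standard `re` module. The function names `compile_filter` and `remove_links` are made up for illustration. It assumes that `<a*>` in a filter stands for any opening anchor tag, that every other `*` matches any run of characters, and that "removing a link" means dropping the whole `<a ...>...</a>` element while keeping the text around it. Regex on HTML is fragile (nested tags, unusual whitespace), so this is a starting point rather than a robust solution.

```python
import re

# Finds each "<a*>...</a>" placeholder inside a filter line (assumed filter syntax).
LINK_IN_FILTER = re.compile(r"<a\*>(.*?)</a>")


def wildcard_to_regex(fragment):
    """Escape literal text, then turn each '*' into a non-greedy wildcard."""
    return re.escape(fragment).replace(r"\*", ".*?")


def compile_filter(line):
    """Build a regex from one filter line.

    Text outside the links is wrapped in capture groups so it can be kept;
    the link parts are matched but left uncaptured so they get dropped.
    """
    parts = []
    pos = 0
    for m in LINK_IN_FILTER.finditer(line):
        parts.append("(" + wildcard_to_regex(line[pos:m.start()]) + ")")
        # "<a*>" is assumed to mean: any opening <a> tag, whatever its attributes.
        parts.append(r"<a\b[^>]*>" + wildcard_to_regex(m.group(1)) + "</a>")
        pos = m.end()
    parts.append("(" + wildcard_to_regex(line[pos:]) + ")")
    return re.compile("".join(parts), re.DOTALL)


def remove_links(html, filter_lines):
    """Apply every filter in turn, keeping only the captured non-link text."""
    for line in filter_lines:
        if not line.strip():
            continue  # skip blank lines in the filter list
        pattern = compile_filter(line)
        html = pattern.sub(lambda m: "".join(m.groups()), html)
    return html


# One filter per line, as in the list above.
filters = """<a*>Link1</a>, <a*>Link2</a>, or <a*>Link3</a>
<a*>A bad link*</a>
some text* <a*>update*</a>
other text right before link <a*>click here</a>""".splitlines()

# Made-up sample input, purely for illustration.
sample = (
    'Try <a href="/1">Link1</a>, <a href="/2">Link2</a>, or <a href="/3">Link3</a>. '
    'Keep <a href="/ok">this one</a>. '
    'other text right before link <a href="#">click here</a>'
)

cleaned = remove_links(sample, filters)
print(cleaned)
```

In this sketch the separators between the three links (", " and ", or ") are kept because they are text, which may or may not be what is wanted. If the links should be located by parsing rather than by pattern, BeautifulSoup's `decompose()` on matching `<a>` tags would be the alternative to look at.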
I am still very new to Python and regex/BeautifulSoup. Even a pointer in the right direction would be greatly appreciated.