I have a nasty string that I converted from HTML code that looks like this:
<p><topic url="car-colours">Toyota Camry</topic> has <a href="/colours/dark-red">Dark Red</a><span> (2020)</span>, <a href="/colours/pearl-white">Pearl White</a><span> (2016 - 2017)
I want to extract the names of the colours from this string and put them in a list. I was thinking of extracting every substring between a ">" and a "<" character, since all the colours are wrapped that way, but I don't know how to do that.
My goal is to have a list that stores all the colours for the Toyota Camry, like:
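To make the idea concrete, here is a rough sketch using Python's built-in re module. It assumes the colour names are always the text of an <a> element, which holds for the sample string but may not hold for other pages. The variable names html and between_tags are just placeholders for illustration.

```python
import re

# The string from the question, truncated exactly as posted.
html = ('<p><topic url="car-colours">Toyota Camry</topic> has '
        '<a href="/colours/dark-red">Dark Red</a><span> (2020)</span>, '
        '<a href="/colours/pearl-white">Pearl White</a><span> (2016 - 2017)')

# Naive version: every run of text sitting between ">" and "<".
# This also picks up the car name, the years and the separators.
between_tags = re.findall(r'>([^<>]+)<', html)

# Narrower version (assumption: colours are always inside <a>...</a>).
toyota_camry_colours = re.findall(r'<a\b[^>]*>([^<]+)</a>', html)

print(toyota_camry_colours)
```

The naive pattern shows why "everything between > and <" is too broad on its own: it needs some extra rule, such as the tag name, to separate colours from the rest.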
toyota_camry_colours = ["Dark Red", "Pearl White"]
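An alternative that avoids pattern matching on raw markup is the standard library's html.parser module. The sketch below is only one possible way to get the list above; LinkTextCollector is a hypothetical helper name, and it again assumes each colour is the text of an <a> element.

```python
from html.parser import HTMLParser


class LinkTextCollector(HTMLParser):  # hypothetical helper, not a library class
    """Collects the text found inside each <a>...</a> element."""

    def __init__(self):
        super().__init__()
        self.in_link = False
        self.texts = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.in_link = True
            self.texts.append("")  # start a new entry for this link

    def handle_endtag(self, tag):
        if tag == "a":
            self.in_link = False

    def handle_data(self, data):
        if self.in_link:
            self.texts[-1] += data  # text may arrive in several chunks


# The string from the question, truncated exactly as posted.
html = ('<p><topic url="car-colours">Toyota Camry</topic> has '
        '<a href="/colours/dark-red">Dark Red</a><span> (2020)</span>, '
        '<a href="/colours/pearl-white">Pearl White</a><span> (2016 - 2017)')

parser = LinkTextCollector()
parser.feed(html)
parser.close()

toyota_camry_colours = [text.strip() for text in parser.texts]
print(toyota_camry_colours)
```

A parser like this tends to cope better with attributes, odd spacing and unclosed tags than a regular expression, at the cost of a few more lines.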
Any ideas how I can do this? In bash I would use something like grep or awk, but I don't know the equivalent in Python.