I'm an amateur programmer and new to this site. I searched for this question but could not find it answered here or anywhere else online.
I'm trying to grab all of the text between the opening and closing HTML paragraph tags (<p> and </p>). My findall statement matches every paragraph in the online articles I've tried, except paragraphs that contain a single or double quotation mark. There may well be a much better way to do this, or the statement may be easy to tweak so that it also matches paragraphs with quotes. Any advice would be greatly appreciated!
Here is my findall statement:
aText = findall("<p>[A-Za-z0-9<>=\"\:/\.\-,\+\?#@'<>;%&\$\*\^\(\)\[\]\{\}\|\\!_`~ ]+</p>",text)
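For comparison, here is a minimal sketch of one possible alternative. It assumes the failing paragraphs contain typographic (curly) quotes such as “ ” or ’, which would fall outside an ASCII-only character class; that is a guess about the articles, not something confirmed above. Instead of listing every allowed character, it matches anything between the tags non-greedily. The sample `text` and the name `paragraphs` are hypothetical.

```python
import re

# Hypothetical sample input; the curly quotes are an assumption about
# what the real articles contain.
text = (
    '<p>Plain words.</p>\n'
    '<p class="lead">She said \u201chello\u201d and it\u2019s fine.</p>'
)

# <p\b[^>]*>  matches an opening <p> tag, with or without attributes
# (.*?)       captures as little as possible, so each paragraph stops at its own </p>
# re.DOTALL   lets "." also match newlines inside a paragraph
paragraphs = re.findall(r"<p\b[^>]*>(.*?)</p>", text, flags=re.DOTALL)

for paragraph in paragraphs:
    print(paragraph)
```

This only captures the raw contents of each paragraph, so any nested tags (links, emphasis) stay in the result. For messier real-world pages, a proper HTML parser such as the standard library's html.parser module is usually more robust than a regular expression.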