Using lxml in Python, I wrote this XPath expression:
htmlPage.xpath("/html/body//a/text()")
It returns the text of all <a> tags in the parts of the HTML I am interested in. However, I have now found that some <a> tags look like this:
<a>This is a sentence with some <italic>italic text</italic>-formatting I want to parse.</a>
The XPath call returns a list with one more element than I expect. I checked and found that it splits the <a> tag shown above into two list elements instead of one. Instead of the string
"This is a sentence with some italic text-formatting I want to parse."
I get the two strings
"This is a sentence with some" # and
"-formatting I want to parse."
Is there a way to get the full text of each <a> tag, including the text of nested tags, as a single string?
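For reference, here is a minimal sketch that reproduces the behavior. The surrounding document and the use of lxml.html.fromstring are assumptions made for illustration; the real page is larger and may be parsed differently. The final lines show one possible way to obtain a single string per <a> tag: select the elements themselves and join all of their descendant text with itertext(), rather than selecting text() nodes directly.

```python
from lxml import html

# Hypothetical minimal document standing in for the real page.
doc = html.fromstring(
    "<html><body><div>"
    "<a>This is a sentence with some <italic>italic text</italic>"
    "-formatting I want to parse.</a>"
    "</div></body></html>"
)

# text() selects only the text nodes that are direct children of <a>,
# so the text inside <italic> is skipped and the rest comes back in two pieces.
texts = doc.xpath("/html/body//a/text()")
print(texts)

# Possible alternative: select the <a> elements, then join every text
# node below each one (including text inside nested tags).
joined = ["".join(a.itertext()) for a in doc.xpath("/html/body//a")]
print(joined)
```

Note that the first fragment keeps its trailing space ("...with some "), because whitespace before the nested tag is part of the text node.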