I am writing a Python script that extracts all of the text in an XML file. I am using the ElementTree library to parse the data, but I run into a problem when the data is structured like this:
<Segment StartTime="639.752" EndTime="642.270" Participant="fe016">
But I bet it's a good <Pause/> superset of it.
</Segment>
When I attempt to read out the text, I get the first half of the Segment ("Alright. So what we had") before the pause tag.
Is there a way to ignore the tags inside the data segments and print out all of the text?
The text quoted does not match the <Segment> text in your question.
Use itertext(): stackoverflow.com/q/19369901/407651
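A minimal sketch of the itertext() suggestion, assuming the standard-library xml.etree.ElementTree module and the Segment sample from the question held in a string (the variable names are illustrative only). It also shows why .text stops at the Pause tag: ElementTree stores the text that follows a child element in that child's .tail, not in the parent's .text.

```python
import xml.etree.ElementTree as ET

# Sample data copied from the question; in practice this would come from ET.parse(...).
xml_data = """<Segment StartTime="639.752" EndTime="642.270" Participant="fe016">
But I bet it's a good <Pause/> superset of it.
</Segment>"""

segment = ET.fromstring(xml_data)

# .text holds only the text before the first child element.
before_pause = segment.text

# The text after <Pause/> is stored on the child element as its .tail.
after_pause = segment.find("Pause").tail

# itertext() yields every text piece of the element and its descendants,
# in document order, so joining the pieces gives the full text without tags.
full_text = "".join(segment.itertext())

# Collapse newlines and doubled spaces left behind where the tag was.
clean_text = " ".join(full_text.split())
print(clean_text)
```

For a whole file, the same join could be applied to each Segment found with root.iter("Segment"), assuming that is the element name used throughout the document.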