Hot Linked Questions

0 votes

0 answers

636 views

Get text from xml that is behind an extra tag in python [duplicate]

I was trying to extract the text between tags from an .xml file, but I found that some parts of the text have an <italic> tag that cuts off the text after it and makes it impossible for me to get it....

smtn-nt

9

asked May 21, 2021 at 11:51

0 votes

1 answer

55 views

Parse XML to String with <b> tags [duplicate]

I'm only getting value1 when using .findall('string') and the rest is ignored. How do I get the whole value? xml file input: <resources> <string name="key">value1 value2...

Gorthez

421

asked Jul 15, 2024 at 7:48
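
A minimal sketch of one way to keep the inline tags when reading such an element, assuming the standard-library xml.etree.ElementTree. The sample document and the inner_xml helper are hypothetical (the input in the question above is truncated), so treat this as an illustration rather than the accepted answer:

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the real input in the question is truncated.
SAMPLE = '<resources><string name="key">value1 <b>value2</b> value3</string></resources>'


def inner_xml(elem):
    """Return the element's content as a string, keeping child tags."""
    # elem.text only holds the text before the first child (may be None).
    parts = [elem.text or ""]
    for child in elem:
        # ET.tostring() serializes the child together with its tail,
        # i.e. the text that follows the child's closing tag.
        parts.append(ET.tostring(child, encoding="unicode"))
    return "".join(parts)


root = ET.fromstring(SAMPLE)
string_elem = root.find("string")
print(repr(string_elem.text))  # 'value1 '  (stops at the <b> tag)
print(inner_xml(string_elem))  # value1 <b>value2</b> value3
```

The helper relies on ElementTree splitting mixed content into .text (before the first child) and each child's .tail (after its closing tag), which is why .text alone returns only value1.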

64 votes

12 answers

70k views

Remove a tag using BeautifulSoup but keep its contents

Currently I have code that does something like this: soup = BeautifulSoup(value) for tag in soup.findAll(True): if tag.name not in VALID_TAGS: tag.extract() soup.renderContents() Except ...

Jason Christa

12.5k

asked Nov 19, 2009 at 19:19

43 votes

2 answers

37k views

xpath to get all the childrens text

Is there any way to get all the child nodes' values within the ul tag? Input: <ul> <li class="type">Industry</li> <li><a href="/store/Browse/?N=355+361+...

pallavi

745

asked May 2, 2012 at 12:22

5 votes

1 answer

3k views

Python NLTK Shakespeare corpus

I am trying to import sentences from Shakespeare's NLTK corpus – following this help site – but I am having trouble getting access to the sentences (in order to train a word2vec model) : from nltk....

groromain92

95

asked May 1, 2017 at 14:55

0 votes

1 answer

1k views

Python lxml extract text when a tag exists in the middle of the text

I am trying to parse and extract all the text inside of the claim-text tag and prepare it for a csv. So each claim tag will have a column containing all the claim-text. Basically the claims are ...

theEconCsEngineer

75

asked Jul 1, 2020 at 3:57

0 votes

3 answers

2k views

python xml.etree.ElementTree remove empty tag in the middle of text

I have an xml document from which I want to extract text based on tags. The part that I want to extract text from looks something like this : <BlockText attr1="blah" attr2=657 ID="Bhf76" lang="en"&...

MMM

305

asked Feb 20, 2020 at 14:16

-1 votes

1 answer

1k views

Skipping "nested tags" when parsing XML with Python

I currently have an XML file that I'd like to parse with Python. I'm using Python's Element Tree and it works fine except I had a question. The file currently looks something like: <Instance> ...

Sean

3,460

asked Jan 17, 2020 at 7:04

0 votes

1 answer

760 views

How to parse XML file with xml.etree.ElementTree that have HTML content in its child

I have this XML file : <?xml version="1.0" encoding="UTF-8" standalone="true"?> <Component> <Custom/> <ID>1</ID> <LongDescription> <html><html> <...

Emna Jaoua

381

asked Jun 1, 2018 at 10:21

0 votes

1 answer

499 views

python Parsing xml: get text from tag which contains <i> or <b> or similar

I'm using xml.etree.ElementTree. When I try to get text from AbstractText, I get None or partial text if there are formatting tags like i, b or similar in the text. Here is an XML example <...

gabriele colia

65

asked Dec 8, 2019 at 20:15
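
A minimal sketch of why .text comes back partial here, and one common way around it, assuming the standard-library xml.etree.ElementTree. The AbstractText sample and the full_text helper are made up for illustration and are not taken from the question:

```python
import xml.etree.ElementTree as ET

# Hypothetical sample with inline formatting tags in the middle of the text.
SAMPLE = (
    "<AbstractText>Levels of <i>BRCA1</i> were <b>higher</b> "
    "in the treated group.</AbstractText>"
)


def full_text(elem):
    """Concatenate all text inside elem, ignoring the inline tags."""
    # itertext() yields the element's own text, then each descendant's
    # text and tail, in document order.
    return "".join(elem.itertext())


abstract = ET.fromstring(SAMPLE)
print(repr(abstract.text))  # 'Levels of '  (only the text before <i>)
print(full_text(abstract))  # Levels of BRCA1 were higher in the treated group.
```

If the element starts directly with a child tag, .text is None, which matches the "None or partial text" symptom described above; itertext() covers both cases.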

0 votes

1 answer

364 views

Python parse XML content of an element when there is a child element

I have an XML file as below: <?xml version="1.0" encoding="UTF-8"?> <data> <text> I have <num1>two</num1> apples and <num2&...

Shaghayegh L

1

asked Jun 23, 2022 at 9:17

1 vote

2 answers

209 views

Parsing an xml file with an emphasis tag in it in python

I am currently writing a Python script that can extract all of the text in an XML file. I am using the ElementTree library to interpret the data, but I am running into a problem when the ...

Handsome Alex

27

asked Apr 10, 2020 at 18:54

0 votes

0 answers

154 views

Parsing an XML file for multiple p tags to be merged with other p tags

I have an .xml file from which I'm trying to extract the text of all p tags. I've been able to implement the following code to do so, but it's not quite returning the output I want. tree = ET.parse(&...

spaghettiplants

9

asked Jul 29, 2021 at 0:20
