Copy content from HTML element and export to text file with Python3.x

Question

I am using a Python 3.x script to save a string to a text file:

nN = "hello"

f = open("file.txt", "w")
f.write(nN)
f.close()

Now I am trying to write the content of an h2 element scraped from a website (the page scraping itself works fine), but I get an error when I try this:

nN = driver.find_element_by_id("title")

f = open("file.txt", "w")
f.write(nN)
f.close()

where the HTML line is:

<h2 id="title">hello</h2>

The error is:

write() argument must be str, not WebElement

I tried converting nN to a string using the following:

f.write(str(nN))

and the new error is:

invalid syntax

How about f.write(str(nN))? I'm not sure what a driver object is, or what find_element_by_id() returns; but as the error states, f.write() must be given a str as its argument.
– pstatix, Commented Oct 5, 2017 at 14:40

james-see · Accepted Answer · 2017-10-05 15:14:14Z


It looks like you are using Selenium and its webdriver to parse the HTML content.

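The string conversion does not help because nN is a Selenium WebElement object, not the text of the heading. str(nN) would only give you a description of the object, and the "invalid syntax" error is raised when Python parses the script, so it most likely comes from a typo such as an unbalanced parenthesis rather than from the conversion itself. Try f.write(nN.text) instead; according to the documentation, the .text attribute of nN holds the element's text as a string.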
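As a minimal sketch of that fix, assuming nothing beyond the .text attribute: the helper name save_element_text is hypothetical, and a SimpleNamespace stands in for the WebElement so the snippet runs without a browser.

```python
import os
import tempfile
from types import SimpleNamespace


def save_element_text(element, path):
    """Write the .text of a WebElement-like object to a text file."""
    text = element.text  # a plain str, unlike the element object itself
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)
    return text


# Hypothetical stand-in for the element; with a real driver this would be
#   nN = driver.find_element_by_id("title")
nN = SimpleNamespace(text="hello")

# Written to a temporary directory so the sketch has no side effects;
# the question's own path would simply be "file.txt".
out_path = os.path.join(tempfile.mkdtemp(), "file.txt")
written = save_element_text(nN, out_path)
```

With a real driver, only the line that creates nN changes. Note that newer Selenium releases (4.x) removed find_element_by_id in favor of driver.find_element(By.ID, "title"), with By imported from selenium.webdriver.common.by.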

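For the larger task of parsing HTML, though, I would recommend Beautiful Soup. Install it with pip3 install beautifulsoup4 (the example below also uses the lxml parser, which is installed separately with pip3 install lxml), then import it with from bs4 import BeautifulSoup. For example: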

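from bs4 import BeautifulSoup

with open('file.html', 'r') as f:
    htmltext = f.read()  # change as necessary; any HTML string works here

soup = BeautifulSoup(htmltext, 'lxml')
h2found = soup.find('h2', id="title")  # first <h2> whose id is "title"
print(h2found)       # the whole tag
print(h2found.text)  # only the text inside the tag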
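To tie this back to the original goal of exporting the heading to a text file, here is a minimal sketch under a few assumptions: the helper name extract_heading_text is hypothetical, the HTML is passed in as a string, and the built-in 'html.parser' backend is used instead of 'lxml' so nothing beyond Beautiful Soup needs to be installed. It also guards against soup.find returning None when no matching element exists.

```python
import os
import tempfile

from bs4 import BeautifulSoup


def extract_heading_text(html, element_id="title"):
    """Return the text of the <h2> with the given id, or None if absent."""
    soup = BeautifulSoup(html, "html.parser")  # built-in parser backend
    h2 = soup.find("h2", id=element_id)
    # find() returns None when nothing matches, so check before using it.
    return h2.get_text(strip=True) if h2 is not None else None


html = '<h2 id="title">hello</h2>'
heading = extract_heading_text(html)

# Temporary directory keeps the sketch free of side effects;
# replace with "file.txt" to match the question.
out_path = os.path.join(tempfile.mkdtemp(), "file.txt")
if heading is not None:
    with open(out_path, "w", encoding="utf-8") as f:
        f.write(heading)
```

If the page is loaded through Selenium anyway, the HTML string could come from driver.page_source rather than from a file.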

Beautiful Soup has great documentation and is one of the most widely used libraries for parsing HTML in Python.

edited Oct 5, 2017 at 15:14

answered Oct 5, 2017 at 14:57

