I am using 'urllib.request.urlopen' to read the content of an HTML page. Afterwards, I want to write the content to a local file and then do something else with it (e.g. construct a parser on that page, such as Beautiful Soup).

The problem: after reading the content for the first time (and writing it into a file), I can't read the content a second time in order to do something with it (e.g. construct a parser on it). The second read just returns an empty result, and I can't move the cursor back to the beginning with seek(0).

import urllib.request

response = urllib.request.urlopen("http://finance.yahoo.com")

file = open("myTestFile.html", "wb")   # binary mode: response.read() returns bytes
file.write(response.read())            # Tried response.readlines(), but that did not help me
# Tried: response.seek(), but that did not work
print(response.read())                 # Actually, I want something done here, e.g. construct a parser:
                                       # BeautifulSoup(response).
                                       # Anyway, this is an empty result

file.close()

How can I fix it?

1 Answer

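You cannot read the response twice: it is a stream, and read() consumes it, so a second read() returns an empty result. But you can read it once into a variable and reuse the saved content: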

content = response.read()
file.write(content)
print(content)
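
For completeness, here is a minimal end-to-end sketch of the same idea. The helper names save_and_return and fetch_and_save are made up for illustration and are not part of urllib. The file is opened in binary mode on the assumption that this is Python 3, where read() returns bytes. The Beautiful Soup line is left as a comment because it assumes the third-party bs4 package is installed.

```python
import urllib.request


def save_and_return(stream, path):
    """Read a stream exactly once, write the bytes to `path`, and return them."""
    content = stream.read()        # bytes; the stream is exhausted after this call
    with open(path, "wb") as f:    # binary mode, because content is bytes
        f.write(content)
    return content


def fetch_and_save(url, path):
    """Download `url`, save it to `path`, and return the raw bytes."""
    with urllib.request.urlopen(url) as response:
        return save_and_return(response, path)


# Usage (requires network access, so it is left commented out):
# content = fetch_and_save("http://finance.yahoo.com", "myTestFile.html")
# print(content[:200])
# soup = BeautifulSoup(content, "html.parser")   # assumes: from bs4 import BeautifulSoup
```

Everything after the download works on the content variable, so the response never has to be read again.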
1 Comment

Is there a way to make it response.write(content)?
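
As far as I know, the response object is read-only, so you cannot write the content back into it. One possible workaround is to copy the bytes into an io.BytesIO buffer, which behaves like a file and supports seek(0). The sketch below uses a hypothetical helper, make_rereadable, and an in-memory stand-in instead of a real HTTP response so that it runs without network access.

```python
import io


def make_rereadable(stream):
    """Copy a one-shot stream into an in-memory, seekable buffer."""
    return io.BytesIO(stream.read())


# Stand-in for the HTTP response; with urllib you would pass the
# object returned by urllib.request.urlopen(...) instead.
buffer = make_rereadable(io.BytesIO(b"<html>example</html>"))

first = buffer.read()    # consumes the buffer, like response.read()
buffer.seek(0)           # rewinding works on BytesIO
second = buffer.read()   # same bytes again
```

The buffer can then be handed to anything that expects a file-like object, and rewound as often as needed.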
