Python - How to read the content of a URL twice?

Question

I am using urllib.request.urlopen to read the content of an HTML page. Afterwards, I want to write the content to a local file and then perform some operation on it (e.g. construct a parser on that page, such as Beautiful Soup).

The problem: after reading the content for the first time (and writing it to a file), I can't read it a second time in order to do something with it (e.g. construct a parser on it). The second read is just empty, and I can't move the cursor back to the beginning with seek(0).

import urllib.request

response = urllib.request.urlopen("http://finance.yahoo.com")

file = open("myTestFile.html", "wb")   # "wb" because response.read() returns bytes
file.write(response.read())            # Tried response.readlines(), but that did not help me
# Tried response.seek(0), but that did not work
print(response.read())                 # Actually, I want something done here, e.g. construct a parser:
                                       # BeautifulSoup(response)
                                       # Either way, this result is empty

file.close()

How can I fix it?

wim · Accepted Answer · 2017-08-22 16:21:51Z

8

You cannot read the response twice: it is a stream, and the first read() consumes it. But you can easily store the content in a variable and reuse it:

content = response.read()
file.write(content)
print(content)
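
For completeness, here is a slightly fuller sketch of the same idea. It is only an illustration, not part of the original answer: the helper name save_and_reuse is made up, and an io.BytesIO object stands in for the real urlopen() response so the snippet runs without network access. The file is opened in "wb" mode on the assumption that read() returns bytes, as it does for urllib responses in Python 3.

```python
import io
import os
import tempfile


def save_and_reuse(response, path):
    """Read a file-like response once, write it to path, and return the bytes.

    Hypothetical helper for illustration only.
    """
    content = response.read()            # the stream is consumed here, exactly once
    with open(path, "wb") as out_file:   # "wb": read() returns bytes, not str
        out_file.write(content)
    return content


# Stand-in for urllib.request.urlopen(...), so the sketch runs offline.
fake_response = io.BytesIO(b"<html><title>demo</title></html>")
target = os.path.join(tempfile.mkdtemp(), "myTestFile.html")

content = save_and_reuse(fake_response, target)
leftover = fake_response.read()   # stream already consumed, nothing left to read
text = content.decode("utf-8")    # the saved bytes can be reused as often as needed
```

With a real response, you would pass urllib.request.urlopen(url) in place of fake_response. The saved content can then be handed to a parser, for example BeautifulSoup(content, "html.parser") if bs4 is installed.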

edited Aug 22, 2017 at 16:21

answered Aug 22, 2017 at 16:03


1 Comment

Antonio López Ruiz Over a year ago

Is there a way to make it response.write(content) ?
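
As far as I know, the response object is read-only, so content cannot be written back into it. If the goal is an object that can be read more than once, one possible workaround is to copy the bytes into an io.BytesIO buffer, which does support seek(0). The sketch below assumes that approach; make_rereadable is a hypothetical name, and a BytesIO object again stands in for the real response.

```python
import io


def make_rereadable(response):
    """Copy a one-shot response into a seekable in-memory buffer.

    Hypothetical helper for illustration only.
    """
    return io.BytesIO(response.read())


# Stand-in for urllib.request.urlopen(...), so the sketch runs offline.
fake_response = io.BytesIO(b"<html>hello</html>")

buffer = make_rereadable(fake_response)
first = buffer.read()    # first pass over the content
buffer.seek(0)           # rewind; this works on BytesIO, unlike on the response
second = buffer.read()   # second pass returns the same bytes
```

The buffer holds the whole body in memory, so this is best suited to pages of modest size.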
