5

I am trying to use "requests" package and retrieve info from Github, like the Requests doc page explains:

import requests
r = requests.get('https://api.github.com/events')

And this:

with open(filename, 'wb') as fd:
    for chunk in r.iter_content(chunk_size):
        fd.write(chunk)

I have to say I don't understand the second code block.

  • filename - in what form do I provide the path to the file if created? where will it be saved if not?
  • 'wb' - what is this variable? (shouldn't second parameter be 'mode'?)
  • following two lines probably iterate over data retrieved with request and write to the file

Python docs explanation also not helping much.

EDIT: What I am trying to do:

  • use Requests to connect to an API (Github and later Facebook GraphAPI)
  • retrieve data into a variable
  • write this into a file (later, as I get more familiar with Python, into my local MySQL database)
5
  • Your second question is confusing. "wb" is a string representing the mode, yes. Commented Feb 15, 2016 at 19:51
  • But official documentation says it can be 'r', 'w', 'r+' and 'a'? :) They are all new to me so could be that I am mistaken. Commented Feb 15, 2016 at 19:52
  • Thank you Daniel! Do you have another solution how I can go about this? Commented Feb 15, 2016 at 19:53
  • 1
    You should link to the places you think you're reading these things. The full list of mode parameters to open is here, where you can see what the b stands for. Commented Feb 15, 2016 at 19:54
  • 1
    You haven't described what you are trying to do. Almost certainly, the GitHub api responds with JSON, so you should use the JSON methods described above the bit you read. Commented Feb 15, 2016 at 19:54

2 Answers 2

10

Filename

When using open the path is relative to your current directory. So if you said open('file.txt','w') it would create a new file named file.txt in whatever folder your python script is in. You can also specify an absolute path, for example /home/user/file.txt in linux. If a file by the name 'file.txt' already exists, the contents will be completely overwritten.

Mode

The 'wb' option is indeed the mode. The 'w' means write and the 'b' means bytes. You use 'w' when you want to write (rather than read) froma file, and you use 'b' for binary files (rather than text files). It is actually a little odd to use 'b' in this case, as the content you are writing is a text file. Specifying 'w' would work just as well here. Read more on the modes in the docs for open.

The Loop

This part is using the iter_content method from requests, which is intended for use with large files that you may not want in memory all at once. This is unnecessary in this case, since the page in question is only 89 KB. See the requests library docs for more info.

Conclusion

The example you are looking at is meant to handle the most general case, in which the remote file might be binary and too big to be in memory. However, we can make your code more readable and easy to understand if you are only accessing small webpages containing text:

import requests
r = requests.get('https://api.github.com/events')

with open('events.txt','w') as fd:
    fd.write(r.text)
Sign up to request clarification or add additional context in comments.

Comments

3

filename is a string of the path you want to save it at. It accepts either local or absolute path, so you can just have filename = 'example.html'

wb stands for WRITE & BYTES, learn more here

The for loop goes over the entire returned content (in chunks incase it is too large for proper memory handling), and then writes them until there are no more. Useful for large files, but for a single webpage you could just do:

# just W becase we are not writing as bytes anymore, just text.
with open(filename, 'w') as fd: 
    fd.write(r.content)

2 Comments

Demon, how does Python know where the file is (if using a previously created file)? I suppose it has to be in the same directory like my .py script files in that case?
for a local file it is based on the current working directory, so if you run your python script your home folder then change directories and run again, it will go into that new directory. The safe way to implement it is a full path such as "/home/Alex/example.html" (unix example) or "C:\\example.html" (windows example)

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.