
I'm replacing requests.get() with pd.read_csv() and would like to write some exception logic if pandas does not get the equivalent of a status code 200.

With requests, I can write:

response = requests.get(report_url)
if response.status_code != 200:

How can I apply the same logic to pd.read_csv()? Are there any status codes I can check on?

  • Don't you get an error if read_csv fails with a URL? Commented Jul 18, 2022 at 20:07
  • I'm not sure how to test this outside of passing an incorrect URL, which doesn't test exactly what I want to check against. Commented Jul 18, 2022 at 20:09
  • You can't get a status code with read_csv() - it simply raises an error when it can't read the URL. You have to use requests.get() to check the status and fetch the data, and later use read_csv( io.StringIO( text ) ). Or you can use try/except to catch the error when it can't read the data. Commented Jul 18, 2022 at 20:11
  • Hmm, that's odd. I can pass an external URL to read_csv(), so I'd assume their goal with this feature was to replace any need for requests. Commented Jul 18, 2022 at 20:14
  • You can use a URL in read_csv(), but this function doesn't have a method that gives you the status code. It simply raises an error when it can't read the URL. Commented Jul 18, 2022 at 20:15

2 Answers


You can use a URL in read_csv(), but it has no method that gives you the status code. It simply raises an error when the response has a non-200 status code, and you have to use try/except to catch it. There is an example in the other answer.

But if you have to use requests, then you can later use io.StringIO to create a file-like object (a file in memory) and pass it to read_csv().

import io
import requests
import pandas as pd

response = requests.get("https://people.sc.fsu.edu/~jburkardt/data/csv/addresses.csv")

print('status_code:', response.status_code)

#if response.status_code == 200:
if response.ok:
    df = pd.read_csv( io.StringIO(response.text) )
else:
    df = None

print(df)

In the same way, you can use io.StringIO when you build a web page that receives a csv file uploaded through an HTML <form>.
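For example, a minimal sketch of that idea - the variable uploaded_bytes and the sample CSV content are made up here, standing in for whatever raw bytes a <form> upload handler gives you:

```python
import io
import pandas as pd

# hypothetical stand-in for the raw bytes received from an HTML <form> upload
uploaded_bytes = b"Name,Age\nAlice,30\nBob,25\n"

# decode to text and wrap in a file-like object for read_csv()
df = pd.read_csv(io.StringIO(uploaded_bytes.decode("utf-8")))

print(df)
```

io.BytesIO(uploaded_bytes) works too if you'd rather skip the explicit decode.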


As far as I know, read_csv(url) works in a similar way - it downloads the file data from the server (using urllib under the hood rather than requests) and then parses it.
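For completeness, a sketch of the try/except route mentioned in the comments - the helper name read_csv_or_none is made up, and it assumes pandas' default urllib-based fetching, so failures arrive as urllib.error exceptions rather than requests ones:

```python
import urllib.error
import pandas as pd

def read_csv_or_none(url):
    """Hypothetical helper: return a DataFrame, or None if the URL can't be read."""
    try:
        return pd.read_csv(url)
    except urllib.error.HTTPError as err:
        # err.code carries the HTTP status (e.g. 404) - the closest thing
        # to response.status_code you can get out of read_csv()
        print("HTTP error:", err.code)
        return None
    except urllib.error.URLError as err:
        # unreachable host / DNS failure - no status code available
        print("connection error:", err.reason)
        return None
```

Calling it with a valid URL (or any file-like object) returns the DataFrame as usual; a failing URL returns None instead of crashing.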


2 Comments

I had forgotten to add io to my answer, but you already did; I deleted mine because yours is a lot more complete.
Actually looks like this is the route I'm going to take. Thank you.

My suggestion is to write a custom reader that makes it possible to check that a URL is valid before reading it, although this somewhat defeats the purpose.

import urllib.error
import pandas as pd

def custom_read(url):
    try:
        return_file = pd.read_csv(url)
    except (urllib.error.HTTPError, urllib.error.URLError):
        # pandas fetches http(s) URLs with urllib, so failures surface as
        # urllib errors (HTTPError for a bad status code, URLError for an
        # unreachable host), not as requests exceptions
        raise
    else:
        return return_file

A valid URL will work

my_file = custom_read("https://people.sc.fsu.edu/~jburkardt/data/csv/addresses.csv")

This fails and raises an error

my_file1 = custom_read("https://uhoh.com")

Otherwise, there is no way to access the status code of a URL from a DataFrame object once it has been read.

7 Comments

Aw, that just seems inefficient having to run two http requests.
@Bonteq you can use read_csv( io.StringIO( response.text ) ) instead of running read_csv(url)
Sorry, I have edited; you only need to check that an error was raised. @Bonteq
@furas I think you're right, that's probably the best route here.
@Bonteq when read_csv gets a non-200 status code it raises an error, and you have to use try/except to catch it
