I'm quite new to programming in general and was hoping someone could help me with this issue:

I have a Python script that was working fine until today. The script reads a Google Sheets spreadsheet into a pandas DataFrame and then continues processing. The code I use to read the spreadsheet is as follows:

import pandas as pd

id = 'sheets ID'
df = pd.read_csv(f'https://docs.google.com/spreadsheets/d/{id}/export?format=csv')

This code used to work without any issues. However, when I ran it today, I received the following error:

HTTPError                                 Traceback (most recent call last)
...
raise HTTPError(req.full_url, code, msg, hdrs, fp)
HTTPError: HTTP Error 400: Bad Request

Does anyone know what could have caused this sudden change? Could it be related to changes in Google Sheets permissions or a problem with my code? Any suggestions on how to troubleshoot or fix this would be greatly appreciated!

Thank you in advance!

I tried using both pd.read_csv and pd.read_excel to read the data from the Google Sheets spreadsheet, expecting to get the data loaded into a pandas DataFrame as before. Here's what I did:

I first ran the original code that uses pd.read_csv:

id = '1jMHxJcfKd5kwakxa3gSovZOTpCFHZP367iQehShATWk'
df = pd.read_csv(f'https://docs.google.com/spreadsheets/d/{id}/export?format=csv')

This had always worked correctly until today, and I expected it to continue loading the data without issues.

After receiving the HTTP Error 400: Bad Request, I tried switching to pd.read_excel to see if using the .xlsx format would make any difference:

df = pd.read_excel(f'https://docs.google.com/spreadsheets/d/{id}/export?format=xlsx')

However, I encountered a similar error message.

I also checked the spreadsheet permissions and confirmed that it is publicly accessible (as it was before) and ensured the URL is correct.

Additionally, I searched for posts with similar issues on Stack Overflow and other forums to see if anyone else had encountered a recent change with Google Sheets or pandas. Most of the solutions I found recommended using the Google Sheets API instead of directly accessing the spreadsheet through the URL. While I understand that the API is a robust approach, it feels like an overcomplication for this simple project, especially since my script was working fine without it until today.

I'm still unsure what might have changed or why the HTTP Error 400 is appearing now, so any guidance on resolving this issue or troubleshooting further would be greatly appreciated!

  • The Google Sheets API should be the correct way to interact with Sheets. It is not that cumbersome. I'm looking into the situation. Hang on. Commented Oct 3, 2024 at 15:03
  • This works for me no problem. Commented Oct 3, 2024 at 15:04
  • Try checking your internet connection. If you are using a VM, check if the connection is set to "bridged" (if applicable). Try updating your pandas module. Try using a different venv. A bad request usually means a malformed header or link address. Before opening the link, print it to the console to check what it looks like. Manually set header information if required. If the problem persists, I'm afraid I'm out of ideas. Commented Oct 3, 2024 at 15:07
  • Welcome to Stack Overflow! I agree with @Amparo Walter. Have you checked their recommendations? Do these resolve your problem? Commented Oct 4, 2024 at 18:42
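
The suggestions in the comments above (print the URL before opening it, set header information manually) could be tried with something like the sketch below. It is a diagnostic aid, not a confirmed fix. The helper names `build_export_url`, `looks_malformed`, and `fetch_export`, the `YOUR_SHEET_ID` placeholder, and the browser-like `User-Agent` value are all assumptions made for illustration; nothing in this thread confirms that headers are the cause of the 400.

```python
import urllib.error
import urllib.request


def build_export_url(sheet_id: str, fmt: str = "csv") -> str:
    """Build the export URL from the question for a given sheet ID."""
    return f"https://docs.google.com/spreadsheets/d/{sheet_id}/export?format={fmt}"


def looks_malformed(url: str) -> bool:
    """Flag URLs with an unfilled '{id}' placeholder or stray whitespace."""
    suspicious = ("{", "}", "%7B", "%7D", " ")
    return any(token in url for token in suspicious)


def fetch_export(url: str, timeout: float = 30.0) -> tuple[int, str]:
    """Request the URL and return (status_code, body_text), even on HTTP errors."""
    # Assumption: the User-Agent is set only as an experiment.
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, response.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        # HTTPError carries the response body, which may explain the 400.
        return err.code, err.read().decode("utf-8", errors="replace")


# 'YOUR_SHEET_ID' is a placeholder; substitute the real spreadsheet ID.
url = build_export_url("YOUR_SHEET_ID")
print(url)                   # inspect the URL before requesting it
print(looks_malformed(url))  # True would point to a badly formed link
```

Calling `status, body = fetch_export(url)` and printing both shows what the server actually returned. If the status is 200, the text can be handed to pandas with `pd.read_csv(io.StringIO(body))`. If it is still 400, the first few hundred characters of `body` are worth adding to the question.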
