Reading integer and float numerical data using the Google Sheets API

Question

I'm using the Google Sheets API to fetch data from a sheet where numbers use a European locale. The input Google Sheet looks like this:

product_price   impressions   clicks    ctr    avg_click_price   total_spent   orders    
2296,00         2184          117      5,36        12,63             1478,20       3

However, when I fetch the data using worksheet.get_all_records() and process it in Python, the numbers are misinterpreted. For example, I get:

product_price  impressions   clicks  ctr    avg_click_price  total_spent  orders   
229600         2184          117     536    1263             147820       3

Here’s the part of my code for processing the data:

sheet_data = worksheet.get_all_records()

# Processing integer columns
for col in integer_columns:
    if col in df.columns:
        logger.info(f"Processing integer column {col}...")
        df[col] = (
            df[col]
            .astype(str)  # Convert to string
            .str.replace("\u00A0", "")  # Remove non-breaking spaces
            .str.replace(" ", "")  # Remove regular spaces
            .str.extract(r"(\d+,.-)")  # Keep only digits
            .apply(locale.atof)  # Convert to float based on locale
        )

I suspect the issue is related to how numbers with a decimal comma (e.g., 2296,00) are parsed. The Google Sheet uses the pl_PL locale. The locale seems to be ignored, and values with a decimal part come out multiplied by 100.

How can I correctly parse and handle float and integer numbers in this format using Python, so the output matches the original values without creating two loops for integer and float numbers?

Which 'European locale'? Also, show all the code related to this, not just the bit you think is relevant.
– ticktalk, Commented Nov 30, 2024 at 17:10
The .astype(str) bit will probably get values as displayed in cells, which would be text strings with commas instead of numeric data. You should get raw values instead.
– doubleunary, Commented Nov 30, 2024 at 17:17
@Anton How is this r"(\d+,.-)" supposed to keep only digits? What does the - do?
– TheMaster, Commented Nov 30, 2024 at 23:08

TheMaster · Accepted Answer · 2024-11-30 22:05:07Z

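By default, the Sheets API's Values.get uses the ValueRenderOption FORMATTED_VALUE, which returns each cell as the text displayed in the sheet (for example "2296,00" in a pl_PL sheet). Choose UNFORMATTED_VALUE in gspread to get the raw values instead: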

worksheet.get_all_records(value_render_option="UNFORMATTED_VALUE")
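
To handle integer and float columns in one pass, as the question asks, the call above could be wrapped along the following lines. This is only a sketch: `fetch_numeric_records` and its `float_columns` parameter are hypothetical names, `worksheet` is assumed to be a gspread worksheet, and numeric cells are assumed to arrive as Python `int` or `float` once unformatted values are requested.

```python
def fetch_numeric_records(worksheet, float_columns=()):
    """Fetch rows as dicts of raw values and normalise chosen columns to float.

    `worksheet` is assumed to be a gspread worksheet; `float_columns` is a
    hypothetical parameter naming the columns that should always be float.
    """
    # Ask for raw values so numeric cells are not returned as
    # locale-formatted text such as "2296,00".
    records = worksheet.get_all_records(value_render_option="UNFORMATTED_VALUE")
    for row in records:
        for col in float_columns:
            value = row.get(col)
            # A cell displayed as 2296,00 holds the whole number 2296,
            # which may arrive as an int; bool is excluded on purpose.
            if isinstance(value, int) and not isinstance(value, bool):
                row[col] = float(value)
    return records
```

The result can then be passed to `pd.DataFrame(...)` directly, with no string cleaning or locale-dependent parsing.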

samhita · Accepted Answer · 2024-11-30 21:48:04Z

I tried Babel's parse_decimal to convert the numbers to Decimal values. Babel has other options as well, such as currency handling, which you can try as required.

Here is a sample:

from babel.numbers import parse_decimal
import pandas as pd

# Sample row: decimal values arrive as pl_PL-formatted strings, counts as ints
data = {
    'product_price': ['2296,00'],
    'impressions': [2184],
    'clicks': [117],
    'ctr': ['5,36'],
    'avg_click_price': ['12,63'],
    'total_spent': ['1478,20'],
    'orders': [3]
}

df = pd.DataFrame(data)


def convert_european_with_babel(df, locale='pl_PL'):
    for col in df.columns:
        # Only text columns need parsing; numeric columns are left as they are
        if not pd.api.types.is_numeric_dtype(df[col]):
            df[col] = df[col].apply(lambda x: parse_decimal(x, locale=locale))
    return df


df_cleaned_babel = convert_european_with_babel(df)
print(df_cleaned_babel)

Output

  product_price  impressions  clicks   ctr avg_click_price total_spent  orders
0       2296.00         2184     117  5.36           12.63     1478.20       3
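
If adding Babel as a dependency is not an option, the same idea can be approximated with the standard library. The sketch below is an assumption-laden alternative, not part of the Babel approach above: `parse_pl_number` and `convert_records` are hypothetical helper names, and the parser only handles the pl_PL conventions seen in the question (decimal comma, regular or non-breaking spaces as thousands separators).

```python
from decimal import Decimal, InvalidOperation


def parse_pl_number(value):
    """Convert a pl_PL-formatted string such as '1 478,20' to int or Decimal."""
    if not isinstance(value, str):
        return value  # already numeric (or None): leave untouched
    # Drop regular and non-breaking spaces used as thousands separators,
    # then switch the decimal comma to a dot.
    cleaned = value.replace("\u00a0", "").replace(" ", "").replace(",", ".")
    try:
        number = Decimal(cleaned)
    except InvalidOperation:
        return value  # not a number (e.g. a product name): keep the text
    if not number.is_finite():
        return value  # strings like "NaN" or "Infinity" stay as text
    # Values written without a decimal part become int, the rest stay Decimal.
    return int(number) if "." not in cleaned else number


def convert_records(records):
    """Apply parse_pl_number to every cell in a list of row dicts."""
    return [{key: parse_pl_number(val) for key, val in row.items()} for row in records]


rows = [{"product_price": "2296,00", "impressions": 2184, "ctr": "5,36", "total_spent": "1 478,20"}]
print(convert_records(rows))
```

Integers and decimals go through the same function, so no separate loops per column type are needed.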
