I connected to a table in a database that has two date columns. I had no problem parsing the column whose values are formatted like this: 2017-11-03
But I can't find a way to parse the other column, whose dates are formatted like this: 2017-10-03 05:06:52.840 +02:00

My attempts
If I parse a single value with the strptime method

dt.datetime.strptime("2017-12-14 22:16:24.037 +02:00", "%Y-%m-%d %H:%M:%S.%f %z")

I get the correct output
datetime.datetime(2017, 12, 14, 22, 16, 24, 37000, tzinfo=datetime.timezone(datetime.timedelta(seconds=7200)))

but if I use the same format string when reading the table into the dataframe, the column dtype is object:

Licenze_FromGY = pd.read_sql(query, cnxn, parse_dates={"EndDate":"%Y-%m-%d", "LastUpd":"%Y-%m-%d %H:%M:%S.%f %z"})


<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1000 entries, 0 to 999
Data columns (total 10 columns):
 #   Column     Non-Null Count  Dtype         
---  ------     --------------  -----         
 0   Tenant     1000 non-null   int64         
 1   IdService  1000 non-null   object        
 2   Code       1000 non-null   object        
 3   Aggregate  1000 non-null   object        
 4   Bundle     991 non-null    object        
 5   Status     1000 non-null   object        
 6   Value      1000 non-null   int64         
 7   EndDate    258 non-null    datetime64[ns]
 8   Trial      1000 non-null   bool          
 9   LastUpd    1000 non-null   object 

I also tried changing the format string, either in the read_sql method or in the pd.to_datetime() method, but then all the values become NaT:

Licenze_FromGY["LastUpd"] = pd.to_datetime(Licenze_FromGY["LastUpd"], format="%Y-%m-%d %H:%M:%S.%fZ", errors="coerce") 

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1000 entries, 0 to 999
Data columns (total 10 columns):
 #   Column     Non-Null Count  Dtype         
---  ------     --------------  -----         
 0   Tenant     1000 non-null   int64         
 1   IdService  1000 non-null   object        
 2   Code       1000 non-null   object        
 3   Aggregate  1000 non-null   object        
 4   Bundle     991 non-null    object        
 5   Status     1000 non-null   object        
 6   Value      1000 non-null   int64         
 7   EndDate    258 non-null    datetime64[ns]
 8   Trial      1000 non-null   bool          
 9   LastUpd    0 non-null      datetime64[ns]
dtypes: bool(1), datetime64[ns](2), int64(2), object(5)
memory usage: 71.4+ KB
None 

Can anyone help?

  • You have a UTC offset in your date/time input ("+02:00"). If that is not the same for all elements in the column, pandas will handle the column as 'object' dtype, not 'datetime64[ns]'. You could convert to UTC to get the pandas datetime dtype. Commented Sep 6, 2022 at 16:58
  • Thanks. Do you mean in the original table or in the dataframe? In the latter case, could you help me understand how to do it? Commented Sep 7, 2022 at 11:02
  • I mean the dataframe. I've added a small example; I hope that clarifies things. Commented Sep 7, 2022 at 11:54

1 Answer

pandas cannot represent mixed UTC offsets in a single Series (column) with its own datetime dtype. Assuming you have data like this

import pandas as pd

df = pd.DataFrame({"datetime": ["2017-12-14 22:16:24.037 +02:00", 
                                "2018-08-14 22:16:24.037 +03:00"]})

if you just parse to datetime,

df["datetime"] = pd.to_datetime(df["datetime"])

df["datetime"]
0    2017-12-14 22:16:24.037000+02:00
1    2018-08-14 22:16:24.037000+03:00
Name: datetime, dtype: object

you get dtype object. The elements of the Series are Python datetime.datetime objects. That limits the datetime functionality (for example, the .dt accessor is not available), compared to the pandas datetime dtype.

You can get the pandas datetime dtype, e.g., by parsing to UTC:

df["datetime"] = pd.to_datetime(df["datetime"], utc=True)

df["datetime"]
0   2017-12-14 20:16:24.037000+00:00
1   2018-08-14 19:16:24.037000+00:00
Name: datetime, dtype: datetime64[ns, UTC]
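
As a minimal sketch of the practical difference between the two dtypes, the snippet below builds the object-dtype Series by hand from the same two strings (so it does not depend on how a particular pandas version treats mixed offsets) and then converts it to UTC. The variable names are only illustrative.

```python
import datetime as dt
import pandas as pd

fmt = "%Y-%m-%d %H:%M:%S.%f %z"
raw = ["2017-12-14 22:16:24.037 +02:00", "2018-08-14 22:16:24.037 +03:00"]

# Object-dtype Series holding plain Python datetime objects with mixed offsets.
s_obj = pd.Series([dt.datetime.strptime(x, fmt) for x in raw], dtype=object)

# The .dt accessor only works on pandas datetime dtypes, not on object dtype.
try:
    s_obj.dt.year
    has_dt_accessor = True
except AttributeError:
    has_dt_accessor = False

# Converting to UTC yields a timezone-aware pandas datetime dtype,
# and the .dt accessor becomes available.
s_utc = pd.to_datetime(s_obj, utc=True)
years = s_utc.dt.year.tolist()

print(has_dt_accessor, s_utc.dtype, years)
```

The instants are unchanged by the conversion; only the representation moves to a common offset (UTC).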

If the offsets all come from a single time zone, you can convert to that zone to re-create the original UTC offsets:

df["datetime"] = pd.to_datetime(df["datetime"], utc=True).dt.tz_convert("Europe/Athens")

df["datetime"]
0   2017-12-14 22:16:24.037000+02:00
1   2018-08-14 22:16:24.037000+03:00
Name: datetime, dtype: datetime64[ns, Europe/Athens]
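
Applied to the situation in the question, one possible approach is to let read_sql parse only the offset-free column and convert the other one afterwards with utc=True. The sketch below is an assumption-laden stand-in, not the asker's actual setup: it uses an in-memory SQLite table (the table name licenses is hypothetical) with only the two date columns, filled with sample values taken from this page.

```python
import sqlite3
import pandas as pd

# Hypothetical stand-in for the real database connection and table.
cnxn = sqlite3.connect(":memory:")
cnxn.execute("CREATE TABLE licenses (EndDate TEXT, LastUpd TEXT)")
cnxn.executemany(
    "INSERT INTO licenses VALUES (?, ?)",
    [
        ("2017-11-03", "2017-10-03 05:06:52.840 +02:00"),
        (None, "2018-08-14 22:16:24.037 +03:00"),
    ],
)

# Let read_sql parse only the column that has no UTC offset.
df = pd.read_sql(
    "SELECT EndDate, LastUpd FROM licenses ORDER BY rowid",
    cnxn,
    parse_dates={"EndDate": "%Y-%m-%d"},
)
cnxn.close()

# Parse the offset-bearing column separately; utc=True normalizes the
# mixed offsets so the column gets a pandas datetime dtype.
df["LastUpd"] = pd.to_datetime(
    df["LastUpd"], format="%Y-%m-%d %H:%M:%S.%f %z", utc=True
)

print(df.dtypes)
```

Passing the format explicitly is optional here, but it avoids relying on format inference for the millisecond and offset parts.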