I am trying to insert a pandas DataFrame with a date column into a Postgres database so that the column's data type in Postgres is also a date ('YYYY-MM-DD'), but I can only get it to insert as a timestamp without time zone rather than a date. How can I do this?

Here's some starter code (though you'll need Postgres credentials to connect and test for real):

import pandas as pd
import sqlalchemy

# create toy data
df = pd.DataFrame({'date': ['2022-02-01', '2022-03-11']})
df['date'] = pd.to_datetime(df['date'], format='%Y-%m-%d')

# connect to postgres (secrets and SECRETS are placeholders for your own credential lookup)
engine = sqlalchemy.create_engine(secrets.get(**SECRETS))

# insert df into postgres
df.to_sql(
    "toy_table",
    engine,
    schema="toy_schema",
    index=False)

  • What is the data type for the field in the Postgres table? If it is timestamp then you will get a timestamp. If you want only a date stored then use the date type per Datetime. Commented May 9, 2022 at 15:56
  • @AdrianKlaver as described, the pandas dates get auto-converted to text in postgres despite their python data type being a date Commented Aug 9, 2022 at 0:30
  • The question remains, what is the data type for the field in toy_table? Commented Aug 9, 2022 at 4:21
  • @AdrianKlaver and the answer remains, it is a pandas datetime64 (I've stripped the time component so that it's just YYYY-MM-DD, but it's still a pandas datetime) Commented Aug 9, 2022 at 15:09
  • Yes, but the issue is when you insert it into the Postgres database. What you see is going to depend on the data type of the field you are inserting into. Until you provide that information an answer to your question is not possible. Commented Aug 9, 2022 at 17:34

1 Answer

To illustrate the point from my comments, that the Postgres column itself needs to be of type date:

\d dt_test 
                         Table "public.dt_test"
  Column  |            Type             | Collation | Nullable | Default 
----------+-----------------------------+-----------+----------+---------
 id       | integer                     |           |          | 
 ts_fld   | timestamp without time zone |           |          | 
 tsz_fld  | timestamp with time zone    |           |          | 
 date_fld | date                        |           |          | 
Indexes:
    "t_idx" btree ((ts_fld::date))

insert into dt_test values 
    (1, '2022-02-01', '2022-02-01', '2022-02-01'), 
    (2, '2022-03-11', '2022-03-11', '2022-03-11');

select * from dt_test ;
 id |       ts_fld        |        tsz_fld         |  date_fld  
----+---------------------+------------------------+------------
  1 | 2022-02-01 00:00:00 | 2022-02-01 00:00:00-08 | 2022-02-01
  2 | 2022-03-11 00:00:00 | 2022-03-11 00:00:00-08 | 2022-03-11

A timestamp(tz) field is always going to have a time component. When you supply just a date the value will have 00:00:00 added to it to make it a timestamp.
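
The same distinction shows up on the Python side. As a small sketch using only the standard library (the variable names are just for illustration), a value parsed from a date-only string still carries a midnight time until it is explicitly reduced to a date:

```python
from datetime import datetime

# Parsing a date-only string yields a datetime with the time set to midnight.
parsed = datetime.strptime("2022-02-01", "%Y-%m-%d")

# Rendered as a timestamp, the 00:00:00 time component is present.
as_timestamp = parsed.isoformat(sep=" ")

# Calling .date() drops the time component and leaves a pure date.
as_date = parsed.date().isoformat()

print(as_timestamp)  # 2022-02-01 00:00:00
print(as_date)       # 2022-02-01
```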

4 Comments

Right, and what I was trying to communicate is how to insert a Python 'date'-like data type directly into Postgres so it is automatically interpreted as a Postgres date. I can get it to be a timestamp (with or without TZ).
The value 2022-02-01 00:00:00 becomes a date when you cast it: select '2022-02-01 00:00:00'::date; returns 2022-02-01. If you want to store it directly as a date only, then the Postgres field/column needs to be of type date.
Yeah, that's basically the solution I found. Create the table first, setting constraints/data types, then insert from pandas and it will be coerced correctly to match. If you add that info to your answer, I'll accept your answer.
Answer updated per request.
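
For anyone landing here later, here is a rough sketch of one way to make the column a date from the pandas side. It is not taken from the answer itself: it assumes the `dtype` argument of `DataFrame.to_sql` with SQLAlchemy's `Date` type, and it uses an in-memory SQLite engine purely as a stand-in so the snippet runs without Postgres credentials (so the `schema="toy_schema"` argument from the question is left out). With a real Postgres engine the column should come out as `date`, but verify that against your own setup.

```python
import pandas as pd
import sqlalchemy
from sqlalchemy.types import Date

# Toy data, as in the question.
df = pd.DataFrame({"date": ["2022-02-01", "2022-03-11"]})

# Parse the strings, then keep plain Python date objects (no time component).
df["date"] = pd.to_datetime(df["date"], format="%Y-%m-%d").dt.date

# Assumption: in-memory SQLite stands in for the Postgres engine here.
engine = sqlalchemy.create_engine("sqlite://")

# Ask for a DATE column explicitly instead of relying on type inference.
df.to_sql("toy_table", engine, index=False, dtype={"date": Date})

# Inspect the column types of the table that was just created.
columns = sqlalchemy.inspect(engine).get_columns("toy_table")
column_types = {col["name"]: str(col["type"]) for col in columns}
print(column_types)
```

If the table already exists with a date column, as described in the comments above, passing `if_exists="append"` to `to_sql` inserts into that table instead of trying to create it.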
