
I'm trying to load a Dask dataframe with SQLAlchemy using dd.read_sql_query. I define a table in which the column balance_date is declared as DateTime (in the database its type is DATE):

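# Assumed setup (SQLAlchemy 1.4+)
from sqlalchemy import Column, DateTime, Float, Integer, PrimaryKeyConstraint, String
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class test_loans(Base):
    __tablename__ = 'test_loans'
    annual_income = Column(Float)
    balance = Column(Float)
    balance_date = Column(DateTime)  # declared as DateTime; the database column is DATE
    cust_segment = Column(String)
    total_amount_paid = Column(Float)
    the_key = Column(Integer)
    __table_args__ = (PrimaryKeyConstraint(the_key),)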

The problem is that dd.read_sql_query fails, reporting that the index column (index_col) has dtype object rather than a numeric or datetime type:

stmt = select([test_loans.balance_date, test_loans.total_amount_paid])
ddf = dd.read_sql_query(stmt, con=con, index_col='balance_date', npartitions=3)

I get

TypeError: Provided index column is of type "object".  If divisions is
not provided the index column type must be numeric or datetime.

How can I fix this? Is this a defect?

    Please can you try and see what pd.read_sql gives for your query, with the dtypes? You will perhaps want to limit your query to the first few rows. Commented Aug 13, 2022 at 2:21

2 Answers


The problem is solved by casting the column as DateTime in the SQLAlchemy select statement.
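As a rough sketch of what such a cast might look like (assuming SQLAlchemy 1.4 or newer; the model is trimmed to the relevant columns, and the class name TestLoans and the label are illustrative, not taken from the original code):

```python
from sqlalchemy import Column, DateTime, Float, Integer, cast, select
from sqlalchemy.orm import declarative_base

Base = declarative_base()


class TestLoans(Base):
    # Hypothetical, trimmed-down version of the table from the question.
    __tablename__ = "test_loans"
    the_key = Column(Integer, primary_key=True)
    balance_date = Column(DateTime)
    total_amount_paid = Column(Float)


# Cast the index column explicitly and keep its name with a label,
# so the result column is still called "balance_date".
stmt = select(
    cast(TestLoans.balance_date, DateTime).label("balance_date"),
    TestLoans.total_amount_paid,
)

# Render the statement with the default dialect to inspect the generated SQL.
sql_text = str(stmt)
print(sql_text)
```

The statement can then be passed to dd.read_sql_query with index_col='balance_date', as in the question. Whether the cast makes the driver return real datetime values depends on the database and driver in use.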


This is a bug in dask.dataframe. When no limits are given, it fetches the min and max values of the index column with pandas.read_sql, which does not parse dates automatically. The resulting min/max dataframe therefore has object dtypes, and that dtype is reused for the divisions, which cannot accept it.

Here is the culprit code: https://github.com/dask/dask/blob/8b95f983c232c1bd628e9cba0695d3ef229d290b/dask/dataframe/io/sql.py#L130

NB. I filed a GitHub issue: https://github.com/dask/dask/issues/9383
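The dtype behaviour described above can be reproduced with pandas alone. This is only an illustration, assuming an in-memory SQLite table with made-up rows; the aliases min_1 and max_1 are hypothetical names for the two result columns, and other databases or drivers may behave differently:

```python
import sqlite3

import pandas as pd

# In-memory SQLite database with a DATE column (sample rows are invented).
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE test_loans (balance_date DATE, total_amount_paid REAL)")
con.executemany(
    "INSERT INTO test_loans VALUES (?, ?)",
    [("2022-01-31", 100.0), ("2022-06-30", 250.0)],
)

# A min/max query over the index column, similar in spirit to the one
# used to work out partition boundaries.
query = "SELECT MIN(balance_date) AS min_1, MAX(balance_date) AS max_1 FROM test_loans"

# Without parse_dates the values come back as plain strings.
raw = pd.read_sql(query, con)

# With parse_dates the same columns are converted to datetimes.
parsed = pd.read_sql(query, con, parse_dates=["min_1", "max_1"])

print(raw.dtypes)
print(parsed.dtypes)
```

Here the first frame holds strings and the second holds datetime columns, which is the difference that matters when the dtype is reused for the divisions.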

1 Comment

pandas.read_sql has a keyword (parse_dates) for stating explicitly which columns are dates. You cannot specify those columns in Dask. I solved this by casting the columns to Date.
