12
df.filter(pl.col("MyDate") >= "2020-01-01")

does not work like it does in pandas.

I found a workaround:

df.filter(pl.col("MyDate") >= pl.datetime(2020,1,1))

but this does not solve the problem when the date comes from a string variable.

3 Answers

13

You can convert the string to a date type, e.g. with .str.to_date().

Building on the example above:

import polars as pl
from datetime import datetime

df = pl.DataFrame({
    "dates": [datetime(2021, 1, 1), datetime(2021, 1, 2), datetime(2021, 1, 3)],
    "vals": range(3)
})

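# The date arrives as a string variable
my_date_str = "2021-01-02"

# Parse the string literal into a Date before comparing
df.filter(pl.col('dates') >= pl.lit(my_date_str).str.to_date())
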
shape: (2, 2)
┌─────────────────────┬──────┐
│ dates               ┆ vals │
│ ---                 ┆ ---  │
│ datetime[μs]        ┆ i64  │
╞═════════════════════╪══════╡
│ 2021-01-02 00:00:00 ┆ 1    │
│ 2021-01-03 00:00:00 ┆ 2    │
└─────────────────────┴──────┘

13

You can use Python datetime objects. They are converted to Polars literal expressions.

import polars as pl
from datetime import datetime

pl.DataFrame({
    "dates": [datetime(2021, 1, 1), datetime(2021, 1, 2), datetime(2021, 1, 3)],
    "vals": range(3)
}).filter(pl.col("dates") > datetime(2021, 1, 2))

Or in explicit syntax: pl.col("dates") > pl.lit(datetime(2021, 1, 2))


0

Hacky workaround for slightly neater code: Just use pandas!

pd.to_datetime accepts a single string, and in my testing (with my own data and with your example) polars is very happy to work with the pandas Timestamp object it returns.

If importing from pandas just isn't possible for you then this is useless, but if you want unfussy string to date conversion ... why not use pandas for what it's good at? :P

import polars as pl
from datetime import datetime
from pandas import to_datetime # or just import pandas as pd

df = pl.DataFrame({
    "dates": [datetime(2021, 1, 1), datetime(2021, 1, 2), datetime(2021, 1, 3)],
    "vals": range(3)
})

my_date_str = "2021-01-02"
my_date = to_datetime(my_date_str) # or use pd.to_datetime
print(df.filter(pl.col('dates') >= my_date))

which produces:

shape: (2, 2)
┌─────────────────────┬──────┐
│ dates               ┆ vals │
│ ---                 ┆ ---  │
│ datetime[μs]        ┆ i64  │
╞═════════════════════╪══════╡
│ 2021-01-02 00:00:00 ┆ 1    │
│ 2021-01-03 00:00:00 ┆ 2    │
└─────────────────────┴──────┘

