
I want to filter a Polars DataFrame, but each element of the filter is optional, depending on whether a function parameter was supplied.

I have a function that takes the following 3 parameters: ticker: str = '', strategy: str = '', iteration: int = -1

If these parameters are not the default value then I want to apply that filter to the dataframe. My logic is as follows:

self.configs_df = self.configs_df.filter(
                (ticker != '' & pl.col('Ticker') == ticker) &
                (strategy != '' & pl.col('Strategy') == strategy) &
                (iteration >= 0 & pl.col('Iteration') == iteration)
        )

But I get an error: TypeError: the truth value of an Expr is ambiguous. I understand the error, but I don't know how to build the expression up from its logical parts. Any help or guidance would be much appreciated, as I am an absolute beginner with Polars, having come from Pandas, where I have similar logic working.

Regards, Stuart

  • I would rather use a normal if to either run the code with a filter or skip it. Alternatively, use ... if (ticker != '') else ... to get pl.col('Ticker') == ticker, or otherwise some other expression that selects all rows. But I'm not experienced with Polars. Commented Apr 17 at 20:36

3 Answers

One possible way is to do it as such:

self.configs_df.filter(
    ((ticker == '') | (pl.col('Ticker') == ticker)) &
    ((strategy == '') | (pl.col('Strategy') == strategy)) &
    ((iteration < 0) | (pl.col('Iteration') == iteration))
)

1 Comment

@Rodalm I fixed the typo. I don't follow your other complaint however, can you give a concrete example which goes wrong?

Either put them in a list first, or use ... if ... else True for each of them

# Setup
import polars as pl
df = pl.DataFrame({'x': [1, 2, 3], 'y': [11, 22, 33], 'z': [100, 200, 300]})

x = 2
y = None
z = 150

# Solution I recommend: Put in a list first
filters = []

if x is not None:
    filters.append(pl.col('x') <= x)

if y is not None:
    filters.append(pl.col('y') != y)

if z is not None:
    filters.append(pl.col('z') >= z)

# You can use `*` to expand the filters, and polars will take the intersection of them
# This is the same as calling `df.filter(pl.col('x') <= x, pl.col('z') >= z)`
result = df.filter(*filters)
print(result)

# Alternative: Use ternary if/else to turn unused filters into `AND True`

result = df.filter(
    (pl.col('x') <= x if x is not None else True)
    & (pl.col('y') != y if y is not None else True)
    & (pl.col('z') >= z if z is not None else True)
)
print(result)


Thank you to all who responded. I did try doing all the filtering outside of Polars, but it turned out to be really unwieldy, which is why I was trying to do it in Polars. When I thought about it some more, I realised that some of the filtering was not required. In the end I came up with the following solution:

self.configs_df = self.configs_df.filter(
       ((pl.col('Ticker') == ticker) | (pl.lit(ticker) == pl.lit('')))
       & ((pl.col('Iteration') == iteration) | (pl.lit(iteration) == pl.lit(-1)))
)

I think the original problem was that I was trying to do something like this:

(ticker != '' & ...

which is what other people suggested. I think the trick was to use

pl.lit(ticker) == pl.lit('')

instead. So, I managed to resolve my problem, and I hope this helps anyone else that may have a similar problem. Thanks all that contributed.
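
A possible explanation for the original TypeError, offered as a sketch rather than a definitive diagnosis: in Python, & binds more tightly than != and ==, so the un-parenthesised condition is grouped as a chained comparison, and chained comparisons need the truth value of their first result. The snippet below uses only the standard-library ast module; the name col is a stand-in for pl.col('Ticker') and is an assumption for illustration.

```python
import ast

# Parse the un-parenthesised condition from the question and inspect
# how Python groups it. `col` stands in for pl.col('Ticker').
node = ast.parse("ticker != '' & col == ticker", mode="eval").body

op_names = [type(op).__name__ for op in node.ops]
first_comparator = node.comparators[0]

print(type(node).__name__)                 # Compare
print(op_names)                            # ['NotEq', 'Eq']
print(type(first_comparator).__name__)     # BinOp
print(type(first_comparator.op).__name__)  # BitAnd
```

In other words, the expression is read as ticker != ('' & col) == ticker. Wrapping each comparison in its own parentheses avoids this grouping, whether or not pl.lit is used.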

Regards, Stuart
