How to drop row in polars-python [closed]

Question

Closed. This question needs details or clarity. It is not currently accepting answers.

How can I add a new column holding the length of a DataFrame, and drop rows by index? I want to add a new column that counts the number of rows in the DataFrame, and then use indexing to drop rows.

for i in range(len(df)):
    if (df['col1'][i] == df['col2'][i]) and (df['col4'][i] == df['col3'][i]):
        pass
    elif (df['col1'][i] == df['col3'][i]) and (df['col4'][i] == df['col2'][i]): 
        df['col1'][i] = df['col2'][i]
        df['col4'][i] = df['col3'][i]
    else:
        df = df.drop(i)

Please provide enough code so others can better understand or reproduce the problem. — Community
– Community Bot, Commented Mar 16, 2022 at 8:22

jqurious · Accepted Answer · 2024-07-20 11:12:52Z

18

Polars doesn't allow much mutation and favors pure data handling, meaning that you create a new DataFrame instead of modifying an existing one.

So it helps to think of the data you want to keep instead of the row you want to remove.

Below I have written an example that keeps all data except for the 2nd row, once with a filter and once by stacking two slices. Note that the slice approach will be the faster of the two and will have zero data copy.

import polars as pl

df = pl.DataFrame({
    "a": [1, 2, 3],
    "b": [True, False, None]
}).with_row_index()

print(df)

# filter on condition
df_a = df.filter(pl.col("index") != 1)

# stack two slices
df_b = df[:1].vstack(df[2:])

# or via explicit slice syntax (omitting the length slices to the end)
# df_b = df.slice(0, 1).vstack(df.slice(2))

assert df_a.equals(df_b)

print(df_a)

Outputs:

shape: (3, 3)
┌───────┬─────┬───────┐
│ index ┆ a   ┆ b     │
│ ---   ┆ --- ┆ ---   │
│ u32   ┆ i64 ┆ bool  │
╞═══════╪═════╪═══════╡
│ 0     ┆ 1   ┆ true  │
│ 1     ┆ 2   ┆ false │
│ 2     ┆ 3   ┆ null  │
└───────┴─────┴───────┘

shape: (2, 3)
┌───────┬─────┬──────┐
│ index ┆ a   ┆ b    │
│ ---   ┆ --- ┆ ---  │
│ u32   ┆ i64 ┆ bool │
╞═══════╪═════╪══════╡
│ 0     ┆ 1   ┆ true │
│ 2     ┆ 3   ┆ null │
└───────┴─────┴──────┘

edited Jul 20, 2024 at 11:12

jqurious

answered Mar 16, 2022 at 10:13

ritchie46

4 Comments

ramslök Over a year ago

It says slice is faster but do you mean slice? Since there's no method call for slice here. :)

ritchie46 Over a year ago

[:3] is syntactic sugar for slice(0, 3)

ramslök Over a year ago

Right. I was looking at your answer and trying to understand what part of the code you said was better, so hence the question.

ritchie46 Over a year ago

I shall add an example with an explicit slice as well. :+1:

L Tyrone · Accepted Answer · 2024-05-24 06:36:10Z

0

I think you want to add an index column, which also lets you get the length of the DataFrame. To remove certain rows, add the index and then drop those rows using a mask:

df.with_row_index().filter(~pl.col("index").is_in(your_index_points))

edited May 24, 2024 at 6:36

L Tyrone

answered May 22, 2024 at 21:18

ravi kumar
