python pandas clean up empty rows after last row of data

Question

I have a df like this:

     t1      t2     t3
0    a       b      c
1            b      
2 
3    
4    a       b      c
5            b      
6
7

I want to drop all rows after index 5 because they have no values, but keep the empty rows at index 2 and 3. I will not know in advance which columns will have data.

All values are strings.

unutbu · Accepted Answer · 2014-09-19 00:41:19Z


In [74]: df.iloc[:np.where(df.any(axis=1))[0][-1]+1]
Out[74]: 
   t1 t2 t3
10  a  b  c
11  b      
12         
13         
14  a  b  c
15  b

Explanation: First find which rows contain something other than empty strings:

In [37]: df.any(axis=1)
Out[37]: 
0     True
1     True
2    False
3    False
4     True
5     True
6    False
7    False
dtype: bool

Find the location of the rows which are True:

In [71]: np.where(df.any(axis=1))
Out[71]: (array([0, 1, 4, 5]),)

Find the largest index (which will also be the last):

In [72]: np.where(df.any(axis=1))[0][-1]
Out[72]: 5

Then you can use df.iloc to select all rows up to and including position 5. Because an iloc slice excludes its end point, the expression adds 1 so that the slice ends at 6.
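The steps above can be put together as a small self-contained sketch. The frame is a reconstruction of the one in the question, and the helper name trim_trailing_empty_rows is hypothetical. It assumes that empty cells are empty strings, and it compares against "" explicitly rather than relying on df.any(axis=1) treating empty strings as falsy, so NaN cells would count as data here.

```python
import numpy as np
import pandas as pd

# Reconstruction of the question's frame; empty cells are empty strings.
df = pd.DataFrame(
    {
        "t1": ["a", "", "", "", "a", "", "", ""],
        "t2": ["b", "b", "", "", "b", "b", "", ""],
        "t3": ["c", "", "", "", "c", "", "", ""],
    }
)


def trim_trailing_empty_rows(frame):
    """Drop the rows that follow the last row holding a non-empty string."""
    # True for every row with at least one non-empty cell.
    has_data = (frame != "").any(axis=1).to_numpy()
    # Integer positions of the rows that hold data.
    positions = np.where(has_data)[0]
    if positions.size == 0:
        # No row holds data, so return an empty frame with the same columns.
        return frame.iloc[:0]
    # iloc slices exclude the end point, hence the + 1.
    return frame.iloc[: positions[-1] + 1]


trimmed = trim_trailing_empty_rows(df)
print(trimmed)
```

The guard for an all-empty frame is an addition: without it, indexing the last element of an empty position array would raise an IndexError.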

Note that the first method I suggested (df.loc with cumsum().argmax(), shown in the timing below) is not as robust: if your DataFrame has an index with repeated values, selecting the rows by label with df.loc is problematic.
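To illustrate the problem with repeated index values, here is a minimal sketch with a made-up frame whose index contains the label "y" twice. It uses idxmax() instead of argmax() to get a label, on the assumption that you are running a recent pandas where Series.argmax() returns an integer position rather than a label.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: the label "y" appears twice, and the second "y" row is empty.
df_dup = pd.DataFrame(
    {"t1": ["a", "b", "", ""]},
    index=["x", "y", "y", "z"],
)

has_data = (df_dup != "").any(axis=1)

# Label-based approach: the label of the last row with data is "y",
# but slicing up to "y" also picks up the second, empty "y" row.
last_label = has_data.cumsum().idxmax()
by_label = df_dup.loc[:last_label]

# Position-based approach: unaffected by the repeated labels.
last_pos = np.where(has_data.to_numpy())[0][-1]
by_position = df_dup.iloc[: last_pos + 1]

print(len(by_label), len(by_position))
```

The label-based slice keeps one row too many, while the position-based slice stops at the last row that actually holds data.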

The new method (df.iloc with np.where) is also a bit faster:

In [75]: %timeit df.iloc[:np.where(df.any(axis=1))[0][-1]+1]
1000 loops, best of 3: 203 µs per loop

In [76]: %timeit df.loc[:df.any(axis=1).cumsum().argmax()]
1000 loops, best of 3: 296 µs per loop

edited Sep 19, 2014 at 0:41

answered Sep 19, 2014 at 0:03

unutbu


9 Comments

jason Over a year ago

I could use that, but then I would need to get rid of all of the NaNs

jason Over a year ago

the data in there is actually datetime.datetime objects.

unutbu Over a year ago

and the empty ones are...?

jason Over a year ago

it is a <type 'str'>. Sorry it was inherited.

jason Over a year ago

actually, all the data is strings. I misspoke
