Dataframe slicing with string values

Question

I have a DataFrame of strings that I would like to modify. I need to cut off each row at a given value, say A4, and either replace the values after A4 with -- or remove them. In other words, I would like to create a new DataFrame that keeps values only up to the string "A4". How would I do this?

import pandas as pd
columns = ['c1','c2','c3','c4','c5','c6']
values = [['A1','A2','A3','A4','A5','A6'],['A1','A3','A2','A5','A4','A6'],['A1','A2','A4','A3','A6','A5'],['A2','A1','A3','A4','A5','A6'],['A2','A1','A3','A4','A6','A5'],['A1','A2','A4','A3','A5','A6']]
# Note: the second positional argument of pd.DataFrame is `index`, so these labels become row labels.
input = pd.DataFrame(values, columns)

columns = ['c1','c2','c3','c4','c5','c6']
values = [['A1','A2','A3','A4','--','--'],['A1','A3','A2','A5','A4','--'],['A1','A2','A4','--','--','--'],['A2','A1','A3','A4','--','--'],['A2','A1','A3','A4','--','--'],['A1','A2','A4','--','--','--']]
output = pd.DataFrame(values, columns)

I don't see the string 'A4' anywhere. Can you be more specific about what exactly you're trying to do? Looking at the accepted answer, I'm thinking that a DataFrame might not even be the best data structure for this.
– AMC, Commented Jan 17, 2020 at 22:43

jeremycg · Accepted Answer · 2020-01-17 21:28:56Z

1

You can write a small function that takes an array and overwrites every value after your desired value:

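def myfunc(x, val):
    # Scan the row for the first occurrence of val.
    for i in range(len(x)):
        if x[i] == val:
            # Overwrite everything after the match; rows without val stay unchanged.
            x[(i+1):] = '--'
            break
    return x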

Then apply the function to the dataframe row-wise (axis=1):

input.apply(lambda x: myfunc(x, 'A4'), axis=1)


     0   1   2   3   4   5
c1  A1  A2  A3  A4  --  --
c2  A1  A3  A2  A5  A4  --
c3  A1  A2  A4  --  --  --
c4  A2  A1  A3  A4  --  --
c5  A2  A1  A3  A4  --  --
c6  A1  A2  A4  --  --  --
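
For larger frames, the same cut-off can be sketched without a Python-level loop. This is an alternative technique, not the method from this answer: it counts how many times the target value has appeared strictly before each cell and masks every cell where that count is positive. The helper name `truncate_after` and the three-row sample frame are assumptions made for illustration.

```python
import pandas as pd

# Small sample taken from the first three rows of the question's data.
values = [
    ['A1', 'A2', 'A3', 'A4', 'A5', 'A6'],
    ['A1', 'A3', 'A2', 'A5', 'A4', 'A6'],
    ['A1', 'A2', 'A4', 'A3', 'A6', 'A5'],
]
df = pd.DataFrame(values, index=['c1', 'c2', 'c3'])


def truncate_after(frame, val, fill='--'):
    """Hypothetical helper: replace every cell after the first `val` in each row."""
    # 1 where the cell equals val, 0 elsewhere.
    hits = frame.eq(val).astype(int)
    # Running count of hits along each row, minus the current cell's own hit,
    # gives the number of hits strictly to the left of each cell.
    hits_before = hits.cumsum(axis=1) - hits
    # Replace cells that have at least one hit to their left.
    return frame.mask(hits_before > 0, fill)


result = truncate_after(df, 'A4')
print(result)
```

Like the loop version, rows that never contain the target value come back unchanged, and the original frame is not modified because `mask` returns a new DataFrame.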

answered Jan 17, 2020 at 21:28


Kenan · Accepted Answer · 2020-01-17 21:19:22Z

0

I assume the values you want to replace are those greater than A4 (A5 through A9):

df.replace('A([5-9])', '--', regex=True)

     0   1   2   3   4   5
c1  A1  A2  A3  A4  --  --
c2  A1  A3  A2  --  A4  --
c3  A1  A2  A4  A3  --  --
c4  A2  A1  A3  A4  --  --
c5  A2  A1  A3  A4  --  --
c6  A1  A2  A4  A3  --  --
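
One thing to watch with a regex replacement is that it substitutes the matching part of each string, so a pattern without anchors can also alter longer labels. The sketch below anchors the pattern with `^` and `$` so that only whole cells A5 through A9 are replaced. The value 'A50' is a made-up label added purely to show the difference. Note that this replaces by value and does not cut the row at the position of A4.

```python
import pandas as pd

# 'A50' is a hypothetical label, included only to show the effect of anchoring.
df = pd.DataFrame([['A1', 'A5', 'A4', 'A50']], index=['c1'])

# ^ and $ restrict the match to cells that are exactly A5..A9.
result = df.replace(r'^A[5-9]$', '--', regex=True)
print(result)
```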

answered Jan 17, 2020 at 21:19
