3

I want to replace a string if I don't have a specific delimiter/string in a column. If the delimiter already exists in that row then I don't want to touch that row. I have around 3.5million records.

Below is the sample set. I want to replace is with :

One:1
Two:2 
Three is 3 
Four is IV:4

Output should be like this

One:1
Two:2
Three:3
Four is IV:4

4 Answers 4

2

Try it without loop and a one-liner using loc

df = pd.DataFrame(["One:1", "Two:2", "Three is 3", "Four is IV:4", "Five is V"], columns=["myValues"])


df.loc[~df['myValues'].str.contains(':'), 'myValues'] = df.loc[~df['myValues'].str.contains(':'), 'myValues'].str.replace('is', ':')

print(df)

    myValues
0   One:1
1   Two:2
2   Three : 3
3   Four is IV:4
4   Five : V
Sign up to request clarification or add additional context in comments.

1 Comment

Thanks @Vaishali. Could you please also help me with this problem stackoverflow.com/questions/47984764/…
2

Option 1
Inplace with update

df.update(
    df.myValues.loc[
        lambda x: ~x.str.contains(':')
    ].str.replace('\s+is\s+', ':'))

       myValues
0         One:1
1         Two:2
2       Three:3
3  Four is IV:4

Option 2
Inline and using map

f = lambda x: x if ':' in x else x.replace(' is ', ':')
df.assign(myValues=list(map(f, v)))

       myValues
0         One:1
1         Two:2
2       Three:3
3  Four is IV:4

Comments

1

Other than .contains(), you can also use simple string operations:

df = pd.DataFrame(["One:1", "Two:2", "Three is 3", "Four is IV:4"], columns=["myValues"])
target = [":" not in e for e in df.myValues]
df.myValues[target] = df.myValues[target].str.replace(" is ",":")

Result:

       myValues
0         One:1
1         Two:2
2       Three:3
3  Four is IV:4

Comments

1

First, filter out all strings containing :. Then, replace " is " with ":" for all rows left. (In your example, the space around "is" is also deleted. Thus, I replace " is " with ":".)

df = pd.DataFrame(["One:1", "Two:2", "Three is 3", "Four is IV:4"], columns=["myValues"])
for idx, v in df[~df.myValues.str.contains(":")].iterrows():
    df.loc[idx].myValues = df.iloc[idx].myValues.replace(" is ", ":")

References

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.