I read my raw data with pandas:

import pandas as pd
df = pd.read_csv(file)

Here's what part of my dataframe looks like:

37280  7092|156|Laboratory Data|A648C751-A4DD-4CZ2-85                               
47981  7092|156|Laboratory Data|Z22CD01C-8Z4B-4ZCB-8B                               
57982  7092|156|Laboratory Data|C12CE01C-8F4B-4CZB-8B

I'd like to replace every pipe ('|') with a tab ('\t'), so I tried:

df.replace('|','\t')

But it doesn't work. How can I do this? Many thanks!


2 Answers


By default, the replace method on a DataFrame only replaces values that exactly match the string provided. To replace a pattern inside the values, you need to specify regex=True, and since | is a special character in regex (alternation), it has to be escaped:

df1 = df.replace(r"\|", "\t", regex=True)  # raw string keeps the backslash for the regex engine
df1
#       0                                                   1
#0  37280   7092\t156\tLaboratory Data\tA648C751-A4DD-4CZ2-85
#1  47981   7092\t156\tLaboratory Data\tZ22CD01C-8Z4B-4ZCB-8B
#2  57982   7092\t156\tLaboratory Data\tC12CE01C-8F4B-4CZB-8B

If we print a cell, the tabs are rendered as expected:

print(df1[1].iat[0])
# 7092  156 Laboratory Data A648C751-A4DD-4CZ2-85
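
As a small self-contained sketch of the difference (the sample frame and the column name `raw` are made up for illustration, not taken from the question's file), exact-match replacement leaves the cell untouched, while the regex form substitutes inside the string. `Series.str.replace` with `regex=False` is shown as an alternative that needs no escaping:

```python
import pandas as pd

# Hypothetical sample data modelled on the question's rows.
df = pd.DataFrame({"raw": ["7092|156|Laboratory Data|A648C751", "|"]})

# Default behaviour: only cells whose whole value equals "|" are replaced.
exact = df.replace("|", "\t")

# regex=True: the escaped pipe is matched anywhere inside each string.
by_regex = df.replace(r"\|", "\t", regex=True)

# Series.str.replace with regex=False treats "|" literally, so no escape is needed.
literal = df["raw"].str.replace("|", "\t", regex=False)

print(repr(exact["raw"].iat[0]))
print(repr(by_regex["raw"].iat[0]))
print(repr(literal.iat[0]))
```

Using `repr` makes the tabs visible as `\t` instead of whitespace, which is easier to check by eye.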

2 Comments

Yes, it works! The problem was the missing escape for the special regex character. Thanks!
Glad it helps!

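You also need to assign the result back to the variable, because replace returns a new DataFrame instead of modifying df in place. Note that a plain '|' only matches cells whose entire value is '|', so the escaped pattern with regex=True is still required: df = df.replace(r'\|', '\t', regex=True)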
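
A quick check, using a made-up one-row frame (the column name `raw` is hypothetical), that `replace` returns a new DataFrame and leaves the original alone unless the result is assigned back:

```python
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({"raw": ["7092|156|Laboratory Data"]})

# The result is discarded here, so df itself is unchanged.
df.replace(r"\|", "\t", regex=True)
unchanged = df["raw"].iat[0]

# Assigning the result back keeps the replacement.
df = df.replace(r"\|", "\t", regex=True)
changed = df["raw"].iat[0]

print(repr(unchanged))
print(repr(changed))
```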
