How to remove newline characters from string in pandas python

Question

For some reason, none of the solutions previously posted about this seem to answer my question.

I am reading in an Excel workbook with 150+ sheets. I am looping through them and preparing the data to be concatenated (deleting unneeded or blank columns, transforming some data, and so on). However, I cannot get rid of any of the newline characters, no matter what I try. Here are some variations I've tried, so you can see what DIDN'T work.

import pandas as pd
import os

os.chdir(r'C:\Users\agray\Downloads')

sheets_dict = pd.read_excel('2022_Advanced_Control.xlsx', sheet_name=None)
df_list = list(sheets_dict.values())
df_list_clean = []

The top part stays the same; the loop portion is what changes.

for df in df_list:
    df.columns = [c.replace(' ', '_') for c in df.columns]

    df.drop(df.columns.difference(['Prescription_Drug_Name','Drug_Tier', 'Drug_Notes']), 1, inplace=True)
    df.drop(df.tail(3).index, inplace=True)
    df.loc[:, 'Prescription_Drug_Name'] = df.loc[:, 'Prescription_Drug_Name'].replace("\n", "", inplace=True)
    df_list_clean.append(df)

This gives me a column that has nothing but blank values.

Here's another way I tried:

for df in df_list:
    df.columns = [c.replace(' ', '_') for c in df.columns]

    df.drop(df.columns.difference(['Prescription_Drug_Name','Drug_Tier', 'Drug_Notes']), 1, inplace=True)
    df.drop(df.tail(3).index, inplace=True)
    df['Prescription_Drug_Name'] = df['Prescription_Drug_Name'].replace(r'\n','', regex=True, inplace=True)
    df_list_clean.append(df)

This version is only applying to a copy, so none of the changes it says it's making are actually being made to my df. Any ideas how to get rid of all these "\n" characters in my column? Thanks!

@Corralien That only removes newline characters from the beginning and end of the string. I have strings that look like this: AAAAAA\nBBBBBBBB\nCCCCCCCCCCCCC
– Angus Gray, Commented Nov 11, 2021 at 21:00

If you use inplace=True, you must not also assign the result with df['Prescription_Drug_Name'] =
– Corralien, Commented Nov 11, 2021 at 21:02
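
A small sketch of a second trap in the first attempt, independent of inplace: with its default regex=False, Series.replace compares whole cell values, so a newline embedded inside a longer string is left alone. The sample strings are made up for illustration. (Separately, in pandas 1.x and 2.x a call with inplace=True returns None, which is why assigning that result back blanks the column.)

```python
import pandas as pd

# Made-up sample data standing in for the real column.
s = pd.Series(["AAA\nBBB", "CCC"])

# Default (regex=False): only cells that are exactly "\n" are replaced,
# so the embedded newline survives.
whole_value_only = s.replace("\n", "")

# regex=True switches to substring matching, which removes it.
cleaned = s.replace(r"\n", "", regex=True)

print(whole_value_only.tolist())  # ['AAA\nBBB', 'CCC']
print(cleaned.tolist())           # ['AAABBB', 'CCC']
```

Either way, the result is assigned back without inplace.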

Corralien · Accepted Answer · 2021-11-11 21:11:15Z


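Use str.replace() (str.strip() only removes newlines at the start and end of each string). Pass regex explicitly, because its default changed from True to False in pandas 2.0:

df['Prescription_Drug_Name'] = df['Prescription_Drug_Name'].str.replace('\n', '', regex=False)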
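
To make the difference concrete, here is a minimal sketch comparing the two string methods on made-up values. The last line is an assumption about the data: Excel cells sometimes contain "\r\n" rather than a bare "\n", and a regex character class would cover both.

```python
import pandas as pd

# Made-up values: newlines at both ends and in the middle.
names = pd.Series(["\nAAAAAA\nBBBBBBBB\n", "CCCC"])

# str.strip() trims whitespace only at the start and end of each string.
stripped = names.str.strip()

# str.replace() with a literal pattern removes every newline.
replaced = names.str.replace("\n", "", regex=False)

# Assumption: if cells may hold Windows line endings, match both characters.
crlf_cleaned = pd.Series(["A\r\nB"]).str.replace(r"[\r\n]+", "", regex=True)

print(stripped.tolist())      # ['AAAAAA\nBBBBBBBB', 'CCCC']
print(replaced.tolist())      # ['AAAAAABBBBBBBB', 'CCCC']
print(crlf_cleaned.tolist())  # ['AB']
```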


Quang Hoang · Accepted Answer · 2021-11-11 21:16:11Z


I always advise against inplace=True; make an explicit copy where you mean to. You wrote: "This version is only applying to a copy, so none of the changes... being made to my df." Why not copy your data explicitly, like this:

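for df in df_list:
    clean = df.copy()
    clean.columns = [c.replace(' ', '_') for c in clean.columns]

    # keep only the three needed columns; compare against the renamed
    # columns of `clean`, and pass `columns=` (positional axis was removed in pandas 2.0)
    clean = clean.drop(columns=clean.columns.difference(['Prescription_Drug_Name','Drug_Tier', 'Drug_Notes']))

    # drop last three rows; .copy() avoids SettingWithCopyWarning on the assignment below
    clean = clean.iloc[:-3].copy()

    # modify column, remove `inplace` here
    clean['Prescription_Drug_Name'] = clean['Prescription_Drug_Name'].replace(r'\n','', regex=True)
    df_list_clean.append(clean)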

That being said, all of the above can be chained, so you can do something like this:

for df in df_list:
    clean = (df.rename(columns=lambda x: x.replace(' ', '_'))
               .reindex(['Prescription_Drug_Name','Drug_Tier', 'Drug_Notes'], axis=1)   # select only the columns
               .dropna(axis=0, how='all')   # drop rows that are entirely empty
               .iloc[:-3]   # drop last three rows
               # the lambda receives the whole DataFrame, so pick the column first
               .assign(Prescription_Drug_Name=lambda x: x['Prescription_Drug_Name'].replace(r'\n', '', regex=True))
            )
    df_list_clean.append(clean)
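
Since the stated goal is to concatenate the cleaned sheets, the loop could also be wrapped in a function and fed to pd.concat. This is only a sketch: clean_sheet and KEEP are hypothetical names, and the two toy DataFrames stand in for the dict that pd.read_excel(..., sheet_name=None) would return.

```python
import pandas as pd

KEEP = ["Prescription_Drug_Name", "Drug_Tier", "Drug_Notes"]

def clean_sheet(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: return a cleaned copy, leaving `df` untouched."""
    out = df.rename(columns=lambda c: c.replace(" ", "_"))
    out = out.reindex(columns=KEEP)   # keep only the needed columns
    out = out.iloc[:-3]               # drop the last three rows
    return out.assign(
        Prescription_Drug_Name=lambda d: d["Prescription_Drug_Name"]
        .str.replace("\n", "", regex=False)
    )

# Toy stand-in for sheets_dict; the last three rows mimic footer lines.
sheets_dict = {
    "Sheet1": pd.DataFrame({
        "Prescription Drug Name": ["AAA\nBBB", "CCC", "f1", "f2", "f3"],
        "Drug Tier": [1, 2, 0, 0, 0],
        "Drug Notes": ["x", "y", "", "", ""],
        "Unused": [0, 0, 0, 0, 0],
    }),
    "Sheet2": pd.DataFrame({
        "Prescription Drug Name": ["DDD\nEEE\nFFF", "f1", "f2", "f3"],
        "Drug Tier": [3, 0, 0, 0],
        "Drug Notes": ["z", "", "", ""],
    }),
}

combined = pd.concat(
    [clean_sheet(df) for df in sheets_dict.values()],
    ignore_index=True,
)
print(combined["Prescription_Drug_Name"].tolist())  # ['AAABBB', 'CCC', 'DDDEEEFFF']
```

Because each step returns a new object, the original sheets are never modified, so there is no question of whether a change landed on the original or on a copy.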


Thank you, this is actually unexpectedly useful. I really like the idea of using a copy within my loop. That'll save me some headaches figuring out whether a change was made to the original or a copy.
– Angus Gray
