
I am trying to combine strings in my data frame. The dataframe looks like:

0          code   text1
1        507489   text2
2        507489   text3
3        506141   text4
4        506141   text5
5        504273   text6

My current code:

import pandas as pd

df = pd.read_csv("location.csv", header=None, delimiter=';', dtype='unicode', nrows=100)
new_header = df.iloc[0] 
df = df[1:] 
df.columns = new_header

df.groupby('code').agg('->'.join).reset_index()

df.to_csv (r'new_location\export_dataframe.csv', index = False, header=True)
print(df)

But I am not getting the expected results. The output looks the same as the input while I was expecting:

0          code   text1
1        507489   text2->text3
2        506141   text4->text5
3        504273   text6

Quite new to this so I must be making some easy mistake.

A DataFrame that produces the same result:

testf = {'code': ['1','2','2','4'],
        'text': [22000,25000,27000,35000]
        }

df = pd.DataFrame(testf, columns = ['code', 'text'])
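
One caveat with this sample: the `text` values are integers, and `str.join` only accepts strings. A minimal sketch of the same grouping on this frame, assuming the column is cast to strings first (the name `result` and the `sort=False` choice are illustrative, not part of the original code):

```python
import pandas as pd

testf = {'code': ['1', '2', '2', '4'],
         'text': [22000, 25000, 27000, 35000]}
df = pd.DataFrame(testf, columns=['code', 'text'])

# str.join only accepts strings, so cast the integer column first.
df['text'] = df['text'].astype(str)

# Store the grouped result under a name; groupby does not modify df in place.
# sort=False keeps the codes in order of first appearance.
result = df.groupby('code', sort=False)['text'].agg('->'.join).reset_index()
print(result)
```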
  • Hi, please add a DataFrame that we can run. We don't have your CSV, and rebuilding your DataFrame by hand takes time. Please follow stackoverflow.com/help/minimal-reproducible-example before posting. Commented Apr 30, 2020 at 8:08
  • Added a df that produces the same result. Commented Apr 30, 2020 at 8:18

1 Answer


It seems you forgot to assign the result back to df. I also removed header=None from read_csv, because the file already has a header row that supplies the DataFrame's column names:

import pandas as pd

df = pd.read_csv("location.csv", sep=';', dtype='unicode', nrows=100)

df = df.groupby('code').agg('->'.join).reset_index()
print (df)
     code         text1
0  504273         text6
1  506141  text4->text5
2  507489  text2->text3

df.to_csv (r'new_location\export_dataframe.csv', index = False)

5 Comments

Tried this but still producing the same result.
@Elmer - I simplified the solution, can you test it?
Removed header=None but it still doesn't work. What do you mean by "forgot to assign back"? The .reset_index()?
@Elmer - I think df.groupby('code').agg('->'.join).reset_index() vs df = df.groupby('code').agg('->'.join).reset_index()
Solved. It was indeed df.groupby('code').agg('->'.join).reset_index() vs df = df.groupby('code').agg('->'.join).reset_index()
