I need to split one column in csv file into two columns using python

Question

Hello everyone I am learning python I am new I have a column in a csv file with this example of value:

I want to divide the column programme based on that semi column into two columns for example

program 1: H2020-EU.3.1.
program 2: H2020-EU.3.1.7.

This is what I wrote initially

import csv
import os
with open('IMI.csv', 'r') as csv_file:
    csv_reader = csv.reader(csv_file)
    
    with open('new_IMI.csv', 'w') as new_file:
              csv_writer = csv.writer(new_file, delimiter='\t')
              
    #for line in csv_reader:
       # csv_writer.writerow(line)

please note that after i do the split of columns I need to write the file again as a csv and save it to my computer

Please guide me

pandas provides a very easy to use .to_csv method. Perhaps do some research on this. This can be used with the .read_csv method which will read the CSV into a DataFrame object where the column can easily be split, then written back out. — s3dev
– s3dev, Commented Dec 7, 2021 at 12:37
I did this but I was blocked : import pandas as pd df = pd.read_csv('IMI2.csv', sep=';',encoding='cp1252' it will bring parser error and if I do this it won't put data in the right format: import pandas as pd df = pd.read_csv('IMI2.csv', sep=';',encoding='cp1252', names=['colA', 'colB', 'colC', 'colD','colE', 'colG','colH', 'colI', 'colJ', 'colK' , 'colL', 'colM', 'colN', 'colO', 'colP', 'colQ', 'colR', 'colS', 'colT', 'colU']) df — AhlemMustapha
– AhlemMustapha, Commented Dec 7, 2021 at 12:41
Does this answer your question? How to split a column in csv file into multiple column in python jupyter? Please avoid asking the same question twice — Riccardo Bucco
– Riccardo Bucco, Commented Dec 7, 2021 at 12:53

chitown88 · Accepted Answer · 2021-12-07 13:09:45Z

Using .loc to iterate through each row of a dataframe is somewhat inefficient. Better to split an entire column, with the expand=True to assign to the new columns. Also as stated, easy to use pandas here:

Code:

import pandas as pd

df = pd.read_csv('IMI.csv')
df[['programme1','programme2']] = df['programme'].str.split(';', expand=True)
df.drop(['programme'], axis=1, inplace=True)

df.to_csv('IMI.csv', index=False)

Example of output:

Before:

print(df)
       id     acronym  status                   programme           topics
0  945358  BIGPICTURE  SIGNED  H2020-EU.3.1.;H2020-EU3.1.7  IMI2-2019-18-01
1  821362      EBiSC2  SIGNED  H2020-EU.3.1.;H2020-EU3.1.7  IMI2-2017-13-06
2  116026     HARMONY  SIGNED                H202-EU.3.1.  IMI2-2015-06-04

After:

print(df)
       id     acronym  status           topics    programme1     programme2
0  945358  BIGPICTURE  SIGNED  IMI2-2019-18-01  H2020-EU.3.1.  H2020-EU3.1.7
1  821362      EBiSC2  SIGNED  IMI2-2017-13-06  H2020-EU.3.1.  H2020-EU3.1.7
2  116026     HARMONY  SIGNED  IMI2-2015-06-04  H2020-EU.3.1.           None

Mohammad Khoshbin · Accepted Answer · 2021-12-07 13:13:21Z

0

You can use pandas library instead of csv.

import pandas as pd
df = pd.read_csv('IMI.csv')
p1 = {}
p2 = {}

for i in range(len(df)):
    if ';' in df['programme'].loc[i]:
        p1[df['id'].loc[i]] = df['programme'].loc[i].split(';')[0]
        p2[df['id'].loc[i]] = df['programme'].loc[i].split(';')[1]

df['programme1'] = df['id'].map(p1)
df['programme2'] = df['id'].map(p2)

and if you want to delete programme column:

df.drop('programme', axis=1)

To save new csv file:

df.to_csv('new_file.csv', inplace=True)

edited Dec 7, 2021 at 13:13

answered Dec 7, 2021 at 12:51

Mohammad Khoshbin

6264 silver badges12 bronze badges

3 Comments

chitown88 Over a year ago

if you want to iterate through each row, probably better to use for i, row in df.iterrows(): as the way you have it is dependant on if the index is numerical sequence from 0 to n.

chitown88 Over a year ago

secondly, if you want to drop the column, you'll need to reassign the df or use inplace=True

Mohammad Khoshbin Over a year ago

About first comment, i wrote this for data shown in the picture. And for second, yes, my bad. I forgot to write `inplace=True'.

Collectives™ on Stack Overflow

I need to split one column in csv file into two columns using python

2 Answers 2

Comments

3 Comments

Your Answer

Linked

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

Comments

3 Comments

Your Answer

Sign up or log in

Post as a guest

Linked

Related