Mapping CSV Header using a Dictionary

Question

I have a reference file that looks like this:

Experiment,Array,Drug
8983,Genechip,Famotidine
8878,Microarray,Dicyclomine
8988,Genechip,Etidronate
8981,Microarray,Flunarizine

I successfully created a dictionary mapping the Experiment numbers to the Drug name using the following:

reader = csv.reader(open('C:\Users\Troy\Documents\ExPSRef.txt'))
#Configure dictionary
result = {}
for row in reader:
    key = row[0]
    result[key] = row[2]
di = result

I want to map this dictionary to the header of another file which consists of the experiment number. It currently looks like this:

Gene,8988,8981,8878,8983
Vcp,0.011,-0.018,-0.032,-0.034
Ube2d2,0.034,0.225,-0.402,0.418
Becn1,0.145,-0.108,-0.421,-0.048
Lypla2,-0.146,-0.026,-0.101,-0.011

But it should look like this:

Gene,Etidronate,Flunarizine,Dicyclomine,Famotidine
Vcp,0.011,-0.018,-0.032,-0.034
Ube2d2,0.034,0.225,-0.402,0.418
Becn1,0.145,-0.108,-0.421,-0.048
Lypla2,-0.146,-0.026,-0.101,-0.011

I tried using:

import csv
import pandas as pd

reader = csv.reader(open('C:\Users\Troy\Documents\ExPSRef.txt'))
result = {}
    for row in reader:
    key = row[0]
    result[key] = row[2]
    di = result
df = pd.read_csv('C:\Users\Troy\Documents\ExPS2.txt')
df['row[0]'].replace(di, inplace=True)

but it returned a KeyError: 'row[0]'.

I tried the following as well, even transposing in order to merge:

import pandas as pd

df1 = pd.read_csv('C:\Users\Troy\Documents\ExPS2.txt',).transpose()
df2 = pd.read_csv('C:\Users\Troy\Documents\ExPSRef.txt', delimiter=',', engine='python')
df3 = df1.merge(df2)
df4 = df3.set_index('Drug').drop(['Experiment', 'Array'], axis=1)
df4.index.name = 'Drug'
print df4

and this time received MergeError('No common columns to perform merge on').

Is there a simpler way to map my dictionary to the header that would work?

can you show the code you used when you attempted to replace the headers, other than just the one line — Navidad20
– Navidad20, Commented Jan 17, 2017 at 13:22

Nickil Maveli · Accepted Answer · 2017-01-17 14:13:20Z

2

One of the things to keep in mind would be to making sure that both the keys corresponding to the mapper dictionary as well as the header which it is mapped to are of the same data type.

Here, one is a string and the other of integer type. So while reading itself, we'll let it not interpret dtype by setting it to str for the reference DF.

df1 = pd.read_csv('C:\Users\Troy\Documents\ExPS2.txt')               # Original
df2 = pd.read_csv('C:\Users\Troy\Documents\ExPSRef.txt', dtype=str)  # Reference

Convert the columns of the original DF to it's series representation and then replace the old value which were Experiment Nos. with the new Drug name retrieved from the reference DF.

df1.columns = df1.columns.to_series().replace(df2.set_index('Experiment').Drug)
df1

answered Jan 17, 2017 at 14:13

Nickil Maveli

29.8k10 gold badges86 silver badges88 bronze badges

Sign up to request clarification or add additional context in comments.

Comments

Navidad20 · Accepted Answer · 2017-01-17 15:25:20Z

1

I used csv for the whole script. This fixes the header you wanted and saves into a new file. The new filename can be replaced with the same one if that's what you prefer. This program is written with python3.

import csv 

with open('sample.txt', 'r') as ref:
    reader = csv.reader(ref)
    # skip header line
    next(reader)
    # make dictionary
    di = dict([(row[0], row[2]) for row in reader])

data = []
with open('sample1.txt', 'r') as df:
    reader = csv.reader(df)
    header = next(reader)
    new_header = [header[0]] + [di[i] for i in header if i in di]
    data = list(reader)

# used to make new file, can also replace with the same file name
with open('new_sample1.txt', 'w') as df_new:
    writer = csv.writer(df_new)
    writer.writerow(new_header)
    writer.writerows(data)

answered Jan 17, 2017 at 15:25

Navidad20

8327 silver badges11 bronze badges

1 Comment

TroyPGoff Over a year ago

I was able to successfully use @Nickil Maveli's script to solely use pandas. This though, does work as well as a csv script. Thanks!!

Collectives™ on Stack Overflow

Mapping CSV Header using a Dictionary

2 Answers 2

Comments

1 Comment

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

Comments

1 Comment

Your Answer

Sign up or log in

Post as a guest

Related