0

I am trying to extract data from columns in a tab-separated text file. I also need to take the header of one of the columns and add a whole new column in which every row repeats that header, i.e. turn:

col1 col2 col3
1     1     1
2     2     2
3     3     3

into:

col1 col2 col3  col3
1     1     1   col3
2     2     2   col3
3     3     3   col3

I am struggling to isolate the header.

for line in my_file:
    line = line.split("\t")
    column = line[0:3] #col1-3

How do I get the header of col3 and then repeat it in every row? Do I have to split the file by "\n" first, then by "\t"?

I tried that but got an error message.

3
  • Is your file a csv file separated by tabs? Commented Dec 7, 2015 at 10:45
  • It's a text file separated by tabs. Commented Dec 7, 2015 at 10:47
  • Can you post that error as an edit? Commented Dec 7, 2015 at 10:54

3 Answers

1

Why don't you use pandas?

     import pandas as pd
     df = pd.read_csv("filename.tsv",sep="\t")

To get the third column together with its header, you can use positional indexing (.ix was removed in pandas 1.0, so .iloc is used here):

      df.iloc[:, 2:]
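
A minimal sketch of how the repeated-header column could then be added with pandas. The sample string is a hypothetical stand-in for the tab-separated file, and using insert() with allow_duplicates=True is just one way to allow a second column labelled col3; treat it as an illustration rather than the only approach.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for the tab-separated file.
sample = "col1\tcol2\tcol3\n1\t1\t1\n2\t2\t2\n3\t3\t3\n"

df = pd.read_csv(io.StringIO(sample), sep="\t")

# Name of the third column (position 2), here "col3".
header = df.columns[2]

# Append a new last column whose label and every value are that header.
# allow_duplicates=True is needed because the label "col3" already exists.
df.insert(len(df.columns), header, header, allow_duplicates=True)

# Serialise back to tab-separated text without the index column.
result = df.to_csv(sep="\t", index=False)
print(result)
```

If duplicate column labels are not wanted, the new column could instead be given a different name with a plain assignment such as df["label"] = header.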

1
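with open('/home/prashant/Desktop/data.txt') as f:
    for l in f:
        # Iterating over the file already yields one line at a time,
        # so only the tab split is needed.
        print(l.strip().split("\t"))

This might solve your problem. The results I'm getting are: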

[col1 col2 col3]

[1 1 1]

[2 2 2]

[3 3 3]
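
Building on that loop, here is a rough sketch of how the header of col3 could be captured from the first line and appended to every row. The function name append_header_column and the in-memory sample are made up for illustration; with a real file you would pass the open file object instead. There is no need to split on "\n" first, because iterating over a file yields one line at a time.

```python
import io


def append_header_column(lines, index=2, sep="\t"):
    """Yield each row as a list, with the header of column `index` appended."""
    rows = (line.rstrip("\r\n").split(sep) for line in lines)
    header = next(rows, None)
    if header is None:
        return  # empty input: nothing to yield
    label = header[index]
    yield header + [label]  # header row gets the label as its new last cell
    for row in rows:
        yield row + [label]  # every data row repeats the same label


# Hypothetical sample standing in for the open text file.
sample = io.StringIO("col1\tcol2\tcol3\n1\t1\t1\n2\t2\t2\n3\t3\t3\n")
for row in append_header_column(sample):
    print("\t".join(row))
```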


0

You could use Python's csv module as follows. It handles splitting each line into columns for you. By default it assumes columns are separated by commas, but this can be switched to a tab by specifying which delimiter to use:

import csv

with open('input.csv', 'rb') as f_input, open('output.csv', 'wb') as f_output:
    csv_input = csv.reader(f_input, delimiter='\t')
    csv_output = csv.writer(f_output, delimiter='\t')
    header = next(csv_input)
    csv_output.writerow(header + [header[-1]])

    for cols in csv_input:
        print cols
        csv_output.writerow(cols + [header[-1]])

For your given input, you will get the following output (columns are tab delimited):

col1    col2    col3    col3
1   1   1   col3
2   2   2   col3
3   3   3   col3

Tested using Python 2.7.9
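
For Python 3, a roughly equivalent sketch might look like the following. The main differences are that files are opened in text mode with newline='' rather than 'rb'/'wb', and print is a function. The helper name add_header_column, the lineterminator="\n" setting, and the in-memory buffers are assumptions made so the example is self-contained; they are not part of the original answer.

```python
import csv
import io


def add_header_column(f_input, f_output):
    """Copy tab-separated rows, appending the last header name to each row."""
    csv_input = csv.reader(f_input, delimiter="\t")
    # lineterminator="\n" is an assumption; the csv default is "\r\n".
    csv_output = csv.writer(f_output, delimiter="\t", lineterminator="\n")
    header = next(csv_input)
    csv_output.writerow(header + [header[-1]])
    for cols in csv_input:
        csv_output.writerow(cols + [header[-1]])


# In-memory stand-ins for the input and output files.
f_in = io.StringIO("col1\tcol2\tcol3\n1\t1\t1\n2\t2\t2\n3\t3\t3\n")
f_out = io.StringIO()
add_header_column(f_in, f_out)
print(f_out.getvalue())

# With real files under Python 3, open them in text mode with newline="":
# with open("input.csv", newline="") as f_in, \
#         open("output.csv", "w", newline="") as f_out:
#     add_header_column(f_in, f_out)
```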

