0

I am new to Python programming. I am reading a tab-separated file and would like to replace multiple consecutive tabs (separating two columns) with a single tab.

import csv

with open('file.tsv','r') as fin:
    cr = csv.reader(fin, delimiter='\t')
    filecontents = [line for line in cr]

I tried doing it with the join function:

with open('file.tsv','r') as fin:
    cr = csv.reader(fin, delimiter='\t')
    filecontents = ''.join([line.replace('\t\t', '\t') for line in cr])

I get the following error:

AttributeError: 'list' object has no attribute 'replace'

How can I do it?

1
  • 1
    If your file is a well-formatted TSV, two consecutive tabs mean that there is a column with a blank value; removing one of them could make your TSV file lose consistency. Commented Jan 17, 2020 at 15:03
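
The point in this comment can be checked with a small sketch. The sample data below is hypothetical (it is not from the question), and io.StringIO stands in for the real file; it only illustrates how csv.reader treats two tabs in a row and what collapsing them would do to the columns.

```python
import csv
import io

# Hypothetical sample: the second row has an empty middle column.
sample = "name\tcity\tage\nAda\t\t36\n"

# csv.reader reports the empty column as an empty string.
rows = list(csv.reader(io.StringIO(sample), delimiter="\t"))
print(rows)  # [['name', 'city', 'age'], ['Ada', '', '36']]

# Collapsing the double tab shifts '36' into the 'city' column.
collapsed = sample.replace("\t\t", "\t")
collapsed_rows = list(csv.reader(io.StringIO(collapsed), delimiter="\t"))
print(collapsed_rows)  # [['name', 'city', 'age'], ['Ada', '36']]
```

So collapsing tabs is only safe when the file is known to have no intentionally empty columns.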

2 Answers

2

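You can use re.sub.

Giving it the pattern r"\t+" tells it to find every run of one or more tabs and replace it with a single tab.
Note that "\t" is the escape sequence for a tab character. In the raw string r"\t+" the backslash reaches the regex engine unchanged, and the re module itself interprets \t as a tab.

import re
s = "a\t\t\t\t\ta\t\t"
# repr() makes the remaining tabs visible in the printed result
print(repr(re.sub(r"\t+", "\t", s)))

Output:
'a\ta\t'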
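
To apply re.sub to the file from the question, one option is to collapse the tab runs in each raw line before the lines reach csv.reader, which accepts any iterable of strings. The following is a minimal sketch, not a definitive implementation: the names TAB_RUN and read_collapsed are hypothetical, and io.StringIO stands in for open('file.tsv', newline=''). It assumes the file has no intentionally empty columns.

```python
import csv
import io
import re

# Matches one or more consecutive tab characters.
TAB_RUN = re.compile(r"\t+")

def read_collapsed(fin):
    """Return the rows of a tab-separated text stream, treating a run of tabs as one delimiter."""
    # Collapse tab runs line by line, then let csv.reader split the cleaned lines.
    cleaned = (TAB_RUN.sub("\t", line) for line in fin)
    return list(csv.reader(cleaned, delimiter="\t"))

# Stand-in for the real file (assumption for the sake of a runnable example).
fin = io.StringIO("a\t\t\tb\tc\n1\t2\t\t3\n")
filecontents = read_collapsed(fin)
print(filecontents)  # [['a', 'b', 'c'], ['1', '2', '3']]
```

Working on the raw text lines avoids the AttributeError from the question, because the substitution runs on strings rather than on the lists that csv.reader produces.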

0

One might use the str.replace method to ensure each line contains only a single tab in a row:

filecontents = [line.replace('\t\t', '\t') for line in cr]

2 Comments

The above only replaces two tabs. Maybe use it to match multiple tabs?
line.replace('\t\t', '\t') gives an error: AttributeError: 'list' object has no attribute 'replace'
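
Both comments can be reproduced with a short sketch. A single str.replace pass turns each non-overlapping pair of tabs into one, so three tabs still leave two; and csv.reader yields lists of fields, not strings, which is where the AttributeError comes from. One possible alternative, assuming empty fields are never meaningful in the file, is to drop the empty fields from each parsed row. The sample data is hypothetical and io.StringIO stands in for the real file.

```python
import csv
import io

# One replace pass over three tabs leaves two tabs behind.
line = "a\t\t\tb"
once = line.replace("\t\t", "\t")
print(repr(once))  # 'a\t\tb'

# csv.reader yields lists of fields, so filter out the empty ones
# instead of calling str.replace on each row.
fin = io.StringIO("a\t\t\tb\n1\t\t2\n")
filecontents = [
    [field for field in row if field != ""]
    for row in csv.reader(fin, delimiter="\t")
]
print(filecontents)  # [['a', 'b'], ['1', '2']]
```

Note that this filter also removes empty fields at the start or end of a row, so it is only suitable when blank values carry no information.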
