IndexError: list index out of range CSV parser

Question

I am attempting to use this code to parse a CSV file, but I cannot get past this error:

File "(file location)", line 438, in parser_42
    position = tmp2[1]
IndexError: list index out of range

My CSV file is structured like so:

mutant  coefficient  Score
Q41V    -0.19        0.05
Q41L    -0.08        0.26
Q41T    -0.21        0.43
I23V    -0.02        0.45
I61V     0.01        1.12

I want to take the mutants and separate them into 'Q', '41' and 'V', for example. I then want to create lists of positions and wild-type residues (wt) and put them in numerical order.

The goal is to write the string "seq" to a new CSV file.

Obviously, I am a beginner in Python and data manipulation. I imagine that I am just overlooking something silly. Can anyone steer me in the right direction?

def parser_42(csv_in, fasta_in, *args):

    with open(csv_in, 'r') as tsv_in:
        tsv_in = csv.reader(tsv_in, delimiter='\t')
        next(tsv_in) # data starts on line 7
        next(tsv_in)
        next(tsv_in)
        next(tsv_in)
        next(tsv_in)
        next(tsv_in)

        for row in tsv_in:
            tmp = row[0].split(',')
            tmp2 = re.split('(\d+)', tmp[0])
            wt = tmp2[0]
            position = tmp2[1]
            substitution = tmp[2]

            seq = ""
            current_positions = []


            if position not in current_positions:
                current_positions += [position]
                print(current_positions)
                seq += wt
            else:
                continue

        print(seq)

It looks like your CSV only has one value per row and you're trying to access a second value with tmp2[1].
– Craicerjack, Commented Feb 19, 2017 at 21:47
You probably have an empty line somewhere, possibly at the end of the file.
– TigerhawkT3, Commented Feb 19, 2017 at 22:10
After you split, you can check the length of the result before you proceed to access indexes which do not exist.
– Kenny Ostrom, Commented Feb 19, 2017 at 23:50
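
The comments above point at the same likely cause: when a cell contains no digits (an empty string, or a header such as 'mutant'), re.split returns a one-element list, so index 1 does not exist. A minimal sketch of the length check Kenny Ostrom describes follows; the helper name split_mutant is hypothetical and not part of the original code.

```python
import re


def split_mutant(label):
    """Return (wt, position, substitution), or None if label has no digits.

    split_mutant is a hypothetical helper used only for illustration.
    """
    # The capturing group keeps the digits in the result: 'Q41V' -> ['Q', '41', 'V']
    parts = re.split(r'(\d+)', label)
    if len(parts) < 3:
        # No digits were found, so parts has a single element and parts[1] would fail
        return None
    return parts[0], int(parts[1]), parts[2]


print(split_mutant('Q41V'))    # a well-formed mutant label
print(split_mutant(''))        # an empty cell
print(split_mutant('mutant'))  # a header cell
```

Rows for which the helper returns None can be skipped with continue instead of being indexed.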

Dan Vitale · Accepted Answer · 2017-02-20 19:44:51Z

For anyone who may be interested, this is how I solved my problem. If anyone has suggestions on how to make this more concise, the advice would be appreciated. I know this probably seems like a roundabout way to fix a small issue, but I learned a fair amount in the process, so I am not overly concerned :). I basically replaced the .split() calls with a regular expression, which seems a bit cleaner.

import csv
import os
import re

import pandas as pd


def parser_42(csv_in, fasta_in, *args):
    # get_column_names() is assumed to be defined elsewhere; dataset is not used below
    dataset = pd.DataFrame(columns=get_column_names())
    save_path = '(directory path)'
    complete_fasta_filename = os.path.join(save_path, 'dataset_42_seq.fasta.txt')

    seq = []
    current_positions = []

    with open(csv_in) as tsv_file:
        tsv_in = csv.reader(tsv_file, delimiter='\t')
        for _ in range(6):  # skip six header lines; data starts on row 7
            next(tsv_in)

        for row in tsv_in:
            if not row:
                continue  # csv.reader yields an empty list for a blank line

            # regular expression to split letters and numbers in a single cell, e.g. 'Q41V'
            regex_match = re.match(r'([A-Z])([0-9]+)([A-Z,*])', row[0], re.M | re.I)
            if regex_match is None:
                continue  # skip cells that do not look like a mutant label
            wt = regex_match.group(1)
            position = int(regex_match.group(2))
            substitution = regex_match.group(3)  # parsed but not used below

            # keep the wild-type residue only the first time a position is seen
            if position not in current_positions:
                current_positions.append(position)
                seq.append(wt)

    # order the wild-type residues by their positions
    sorted_idx = sorted(range(len(current_positions)), key=lambda i: current_positions[i])
    Xs = [seq[i] for i in sorted_idx]

    seq1 = '>dataset_42 fasta\n' + ''.join(Xs)  # join to string

    with open(complete_fasta_filename, 'w') as output_fasta_file:
        output_fasta_file.write(seq1)
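
On the request for something more concise: one possible sketch, not the author's method, keeps a dictionary from position to wild-type residue and sorts its keys at the end, which replaces the two parallel lists and the index sort. The names MUTANT_RE and wild_type_sequence are hypothetical, the pattern assumes labels like 'Q41V' with '*' allowed as the substitution, and file reading and writing are left out.

```python
import re

# Assumed label format: one letter, digits, then a letter or '*', e.g. 'Q41V'
MUTANT_RE = re.compile(r'([A-Z])(\d+)([A-Z*])', re.I)


def wild_type_sequence(labels):
    """Build the wild-type string, ordered by position, from mutant labels."""
    wt_by_position = {}
    for label in labels:
        match = MUTANT_RE.match(label)
        if match is None:
            continue  # skip blank or malformed cells instead of raising
        # setdefault keeps the residue from the first time a position is seen
        wt_by_position.setdefault(int(match.group(2)), match.group(1))
    return ''.join(wt_by_position[pos] for pos in sorted(wt_by_position))


labels = ['Q41V', 'Q41L', 'Q41T', 'I23V', 'I61V', '']
print(wild_type_sequence(labels))  # prints IQI (positions 23, 41, 61)
```

The dictionary lookup also avoids the linear "position not in current_positions" scan, which only matters for large files.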
