modify a string python

Question

I have a csv file structured in the following way:

num  mut
36    L
45    P
  ...

where num indicates the position of a mutation and mut indicates the mutation. I have to modify a string at position num with the letter mut. I wrote the following code in Python:

import pandas as pd
import os
df = pd.read_csv(r'file.csv')
df_tmp=df.astype(str)
df_tmp["folder"]=df_tmp["num"]+df_tmp["mut"] #add a third column
f = open("sequence.txt", 'r')
content = f.read()
for i in range(len(df)):
     num=df_tmp.num.loc[[i]]-13
     num=num.astype(int)
     prev=num-1
     prev=prev.astype(int)
     mut=df_tmp.mut.loc[[i]]
     mut=mut.astype(str)
     new="".join((content[:prev],mut,content[num:])) #this should modify the file

But it raises:

TypeError: slice indices must be integers or None or have an __index__ method

How can I solve this?

Edit: maybe it is now clearer what I want to do. I have to insert only the first mutation into my sequence, save it to a file, and copy the file into a folder named after the third column (the one I added in the code). Then I do the same with the second mutation, then the third, and so on. I have to insert only one mutation at a time.

Your approach is really inefficient: you're looping and recreating the full string on each iteration. The worst-case complexity, assuming you change all characters, would be O(n**2), while you can do it in O(n).
– mozway, Commented May 31, 2022 at 11:50
I edited the question; maybe now it is clearer why I use the loop @mozway
– Tia Cava, Commented May 31, 2022 at 11:56

mozway · Accepted Answer · 2022-05-31 12:29:57Z

1

multiple mutations:

IIUC, you'd be better off converting your dataframe to a dictionary, then iterating and joining:

import pandas as pd

# input DataFrame
df = pd.DataFrame({'num': [36, 45], 'mut': ['L', 'P']})

# input string
string = '-'*50
# '--------------------------------------------------'

# get the positions to modify
pos = df.set_index('num')['mut'].to_dict()
# {36: 'L', 45: 'P'}

# iterate over the string, replace the characters if in the dictionary
# NB. define start=1 if you want the first position to be 1
new_string = ''.join([pos.get(i, c) for i,c in enumerate(string, start=0)])
# '------------------------------------L--------P----'

single mutations:

string = '-'*50
# '--------------------------------------------------'

for idx, r in df.iterrows():
    new_string = string[:r['num']-1]+r['mut']+string[r['num']:]
    # or
    # new_string = ''.join([string[:r['num']-1], r['mut'], string[r['num']:]])
    
    with open(f'file_{idx}.txt', 'w') as f:
        f.write(new_string)

output:

file_0.txt
-----------------------------------L--------------

file_1.txt
--------------------------------------------P-----
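
The edit to the question also asks for each single mutant to land in a folder named after num + mut. A minimal sketch of that step, assuming 1-based positions and a hypothetical output file name `sequence.txt`; the helper names `substitute` and `write_single_mutants` are made up for illustration, and the temporary directory only stands in for a real output location:

```python
import tempfile
from pathlib import Path


def substitute(sequence, position, letter):
    """Return a copy of sequence with the 1-based position replaced by letter."""
    index = int(position) - 1  # convert the 1-based position to a 0-based index
    if not 0 <= index < len(sequence):
        raise IndexError(f"position {position} is outside the sequence")
    return sequence[:index] + letter + sequence[index + 1:]


def write_single_mutants(rows, sequence, out_dir, offset=0):
    """Write one mutated copy per (num, mut) pair into a folder named num + mut."""
    written = []
    for num, mut in rows:
        # always start from the original sequence, so mutations never accumulate
        mutated = substitute(sequence, int(num) - offset, mut)
        folder = Path(out_dir) / f"{num}{mut}"  # e.g. "36L"
        folder.mkdir(parents=True, exist_ok=True)
        target = folder / "sequence.txt"  # hypothetical file name
        target.write_text(mutated)
        written.append(target)
    return written


rows = [(36, "L"), (45, "P")]
paths = write_single_mutants(rows, "-" * 50, tempfile.mkdtemp())
```

With a DataFrame, `df[["num", "mut"]].itertuples(index=False, name=None)` yields pairs of this shape. Passing `offset=13` would mimic the shift in the question's code, on the assumption that the CSV numbering is ahead of the string's positions by that amount.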

edited May 31, 2022 at 12:29

answered May 31, 2022 at 11:38

mozway


1 Comment

Tia Cava Over a year ago

In this case it adds the mutation instead of substituting it. So, for example, your string of 50 dashes becomes 51 characters long @mozway
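
A quick length check separates substitution from insertion. A minimal sketch, reusing the 50-dash string and the first mutation from the answer and assuming 1-based positions; the only difference between the two is where the right-hand slice starts:

```python
string = "-" * 50
num, mut = 36, "L"

# Substitution: the character at 1-based position num is dropped and mut takes its place.
substituted = string[:num - 1] + mut + string[num:]

# Insertion: every original character is kept and mut is squeezed in before position num.
inserted = string[:num - 1] + mut + string[num - 1:]

print(len(substituted), len(inserted))  # 50 51
```

If the result comes out one character longer than the input, the right-hand slice most likely starts at the same index where the left-hand slice ends.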

Cranchian · Answer · 2022-05-31 11:46:42Z

0

I tried your code with a sample file.csv and an empty sequence.txt file.

The first line inside your for loop:

num=df_tmp.num.loc[[i]]-13
# gives an error since num at that location is a str; to correct that:

num=df_tmp.num.loc[[i]].astype(int)-13
# I used astype to convert it into int first

After this, the next error is in the last line: the slice indices TypeError. This happens because the prev and num you use to slice content are not ints (they are one-element Series). To get the int value, add a [0] to each, like this:

content="".join((content[:prev[0]],mut,content[num[0]:]))

There shouldn't be an error now.

answered May 31, 2022 at 11:46

Cranchian


2 Comments

Tia Cava Over a year ago

now it gives me this: TypeError: sequence item 1: expected str instance, Series found @Cranchian

Cranchian Over a year ago

at which line does this error pop up exactly?
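
The "expected str instance, Series found" error most likely comes from mut, which is still a one-element Series when it reaches join. Note also that `.loc[[i]]` keeps the index label i, so `prev[0]` would only find a value on the first row. A minimal sketch that pulls plain scalars with `df.loc[i, column]` instead; the DataFrame and the dash string are stand-ins for file.csv and sequence.txt, and the constant name OFFSET is made up for the question's 13:

```python
import pandas as pd

df = pd.DataFrame({"num": [36, 45], "mut": ["L", "P"]})  # stand-in for file.csv
content = "-" * 50                                       # stand-in for sequence.txt
OFFSET = 13  # the shift used in the question's code

results = {}
for i in range(len(df)):
    num = int(df.loc[i, "num"]) - OFFSET  # scalar int, not a one-element Series
    mut = str(df.loc[i, "mut"])           # scalar str
    prev = num - 1
    # slice the untouched content each time, so only one mutation is applied
    results[f"{df.loc[i, 'num']}{mut}"] = content[:prev] + mut + content[num:]
```

Each value in `results` has the same length as `content`, and the keys ("36L", "45P") match the folder names built in the question.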
