Python find and replace upon condition / with a function

Question

String = n76a+q80a+l83a+i153a+l203f+r207a+s211a+s215w+f216a+e283l

I want the script to look at one pair at a time, meaning:

Evaluate n76a+q80a. If abs(76-80) < 10, replace the '+' between them with '_'; otherwise don't change anything. Then evaluate q80a+l83a and do the same thing, and so on for each adjacent pair.

The desired output should be:

n76a_q80a_l83a+i153a+l203f_r207a_s211a_s215w_f216a+e283l

What I tried is:

import re

def aa_dist(x):
    if abs(int(x[1:3]) - int(x[6:8])) < 10:
        print(re.sub(r'\+', '_', x))

with open(input_file, 'r') as alex:
    oligos_list = alex.read()
    aa_dist(oligos_list)

This is what I have up to this point. I know that my code will just replace every '+' with '_', because it only evaluates the first pair and then replaces them all. How should I do this?

I think the index values would change in the case of i153a+l203f.
– Avinash Raj, Commented Feb 11, 2015 at 23:59
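To illustrate the point in the comment above: the fixed slices x[1:3] and x[6:8] assume every entry has exactly two digits, so they misread a three-digit entry such as i153a. The following is a minimal sketch, assuming each part contains one run of digits; the helper name `position` is hypothetical and not from the question.

```python
import re

def position(part):
    """Return the integer embedded in a part such as 'i153a' (hypothetical helper)."""
    # Assumes the part contains at least one run of digits.
    return int(re.search(r"\d+", part).group())

part = "i153a"
sliced = int(part[1:3])   # fixed-width slice keeps only the first two digits
parsed = position(part)   # the regex picks up the whole run of digits
print(sliced, parsed)
```

Extracting the digits with a regular expression keeps the comparison correct regardless of how many digits each entry has.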

Joran Beasley · Accepted Answer · 2015-02-12 00:03:09Z


import itertools
import re

my_string = "n76a+q80a+l83a+i153a+l203f+r207a+s211a+s215w+f216a+e283l"
# first extract the numbers (list() so the result can also be sliced on Python 3)
my_numbers = list(map(int, re.findall("[0-9]+", my_string)))
# split the string on + (useless comment)
parts = my_string.split("+")

def get_filler(pair):
    '''this function decides on the joiner'''
    a, b = pair  # tuple parameters such as (a, b) only work on Python 2
    return "_" if abs(a - b) < 10 else "+"

# figure out what fillers we need
fillers = map(get_filler, zip(my_numbers, my_numbers[1:]))
# zip stops at the shorter input, so the last part has to be added back
print("".join(itertools.chain.from_iterable(zip(parts, fillers))) + parts[-1])

This is one way you might accomplish it (and it is also an example of worthless comments).

2 Comments

Wilson Mak Over a year ago

Thanks! I am really new to programming in general and I'm not that familiar with the itertools module. Could you please explain a little more what the following two lines are doing exactly?
fillers = map(get_filler,zip(my_numbers,my_numbers[1:])) #figure out what fillers we need
print "".join(itertools.chain.from_iterable(zip(parts,fillers)))+parts[-1]

Joran Beasley Over a year ago

itertools.chain.from_iterable simply takes a 2D list and flattens it; it is one of many ways to do that. The line above it zips the list of numbers with itself, offset by one, to get pairs of adjacent numbers, and maps each pair to a function that decides on + or _.
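The two lines asked about can be unpacked step by step. The sketch below is written for Python 3 and uses a shortened input; the variable names `pairs` and `interleaved` are illustrative and do not appear in the answer.

```python
import itertools

numbers = [76, 80, 83, 153]
parts = ["n76a", "q80a", "l83a", "i153a"]

# Zipping a list with itself shifted by one yields the adjacent pairs.
pairs = list(zip(numbers, numbers[1:]))

# Each pair maps to a joiner: "_" if the numbers are closer than 10, else "+".
fillers = ["_" if abs(a - b) < 10 else "+" for a, b in pairs]

# zip(parts, fillers) stops at the shorter list, so the last part is dropped,
# and chain.from_iterable flattens the (part, filler) tuples into one sequence.
interleaved = list(itertools.chain.from_iterable(zip(parts, fillers)))

# The dropped last part is appended again before joining.
result = "".join(interleaved) + parts[-1]
print(pairs)
print(fillers)
print(interleaved)
print(result)
```

There is always one fewer filler than there are parts, which is why the final part has to be added by hand.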

Avinash Raj · Accepted Answer · 2015-02-12 00:22:26Z

Using only the re module:

>>> import re
>>> s = 'n76a+q80a+l83a+i153a+l203f+r207a+s211a+s215w+f216a+e283l'
>>> # The lookahead makes the matches overlap, so every adjacent pair is captured.
>>> # See the demo (https://regex101.com/r/jO6zT2/13)
>>> m = re.findall(r'(?=\b([^+]+\+[^+]+))', s)
>>> m
['n76a+q80a', 'q80a+l83a', 'l83a+i153a', 'i153a+l203f', 'l203f+r207a', 'r207a+s211a', 's211a+s215w', 's215w+f216a', 'f216a+e283l']
>>> l = []
>>> for i in m:
...     if abs(int(re.search(r'^\D*(\d+)', i).group(1)) - int(re.search(r'^\D*\d+\D*(\d+)', i).group(1))) < 10:
...         l.append(i.replace('+', '_'))
...     else:
...         l.append(i)
...
>>> re.sub(r'([a-z0-9]+)\1', r'\1', ''.join(l))
'n76a_q80a_l83a+i153a+l203f_r207a_s211a_s215w_f216a+e283l'

By defining a separate function.

import re

def aa_dist(x):
    l = []
    # overlapping match: every adjacent pair, such as 'n76a+q80a'
    m = re.findall(r'(?=\b([^+]+\+[^+]+))', x)
    for i in m:
        first = int(re.search(r'^\D*(\d+)', i).group(1))
        second = int(re.search(r'^\D*\d+\D*(\d+)', i).group(1))
        if abs(first - second) < 10:
            l.append(i.replace('+', '_'))
        else:
            l.append(i)
    # each inner part now appears twice in a row; collapse the duplicates
    return re.sub(r'([a-z0-9]+)\1', r'\1', ''.join(l))

string = 'n76a+q80a+l83a+i153a+l203f+r207a+s211a+s215w+f216a+e283l'
print(aa_dist(string))

Output:

n76a_q80a_l83a+i153a+l203f_r207a_s211a_s215w_f216a+e283l
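Since the title asks about replacing with a function, another option is to pass a callback to re.sub so that each '+' is decided individually. This is a sketch of that alternative, not the approach from either answer; it assumes every part looks like lowercase letters, digits, lowercase letters, and the names `PAIR` and `join_close` are hypothetical.

```python
import re

# Digits and trailing letters of the current part, the '+', and a lookahead
# that captures the next part's digits without consuming them.
PAIR = re.compile(r"(\d+)([a-z]+)\+(?=[a-z]+(\d+))")

def join_close(match):
    """Hypothetical callback: choose '_' or '+' for one adjacent pair."""
    left, right = int(match.group(1)), int(match.group(3))
    joiner = "_" if abs(left - right) < 10 else "+"
    return match.group(1) + match.group(2) + joiner

s = "n76a+q80a+l83a+i153a+l203f+r207a+s211a+s215w+f216a+e283l"
result = PAIR.sub(join_close, s)
print(result)
```

Because the next part is only inspected through a lookahead, it remains available as the left-hand side of the following match, so no duplicated text has to be collapsed afterwards.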
