Replace characters with particular format with a variable value in python

Question

I have filenames in the following format:

II.NIL.10.BHZ.M.2058.190.160877
II.NIL.10.BHA.M.2008.190.168857   
II.NIL.10.BHB.M.2078.198.160857
.
.
.

I want to replace the ?.M part after BH (where ? is any letter) with the corresponding value from the list name.

name=['T','D','FG'.....]

Expected output:

II.NIL.10.BHT.2058.190.160877
II.NIL.10.BHD.2008.190.168857   
II.NIL.10.BHFG.2078.198.160857
.
.
.

Is it possible with str.replace()?

str.replace() doesn't work with wildcards. You could use re.sub() instead.
– Wups, Commented Oct 22, 2020 at 10:49
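
To illustrate the comment above, here is a minimal sketch of the difference. It assumes only the first filename from the question; the variable names are illustrative. str.replace() treats '?' as a literal character, so it finds nothing to replace, while a regex can express "any single character".

```python
import re

filename = 'II.NIL.10.BHZ.M.2058.190.160877'

# str.replace() searches for the literal text 'BH?.M', which does not occur,
# so the string comes back unchanged.
unchanged = filename.replace('BH?.M', 'BHT')

# In a regex, '.' matches any single character and '\.' matches a literal dot.
changed = re.sub(r'BH.\.M', 'BHT', filename)

print(unchanged)  # II.NIL.10.BHZ.M.2058.190.160877
print(changed)    # II.NIL.10.BHT.2058.190.160877
```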

JPI93 · Accepted Answer · 2020-10-22 11:03:01Z

You could use the built-in regex module (re) alongside the following pattern to effectively replace the content in your strings.

Pattern

'(?<=BH)[A-Z]+\.M'

This pattern starts with a lookbehind, (?<=BH), which asserts that the match is immediately preceded by the substring 'BH' without including 'BH' in the match. It then matches one or more uppercase letters ([A-Z]+) followed by the literal substring '.M' (\.M).
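
As a quick check of that description, the sketch below uses re.search() to show exactly which part of a filename the pattern matches. The variable names are illustrative and not part of the original answer.

```python
import re

pattern = re.compile(r'(?<=BH)[A-Z]+\.M')
filename = 'II.NIL.10.BHZ.M.2058.190.160877'

match = pattern.search(filename)
matched_text = match.group(0)
start, end = match.span()

print(matched_text)      # Z.M
print(filename[:start])  # II.NIL.10.BH
print(filename[end:])    # .2058.190.160877
```

Because 'BH' sits outside the match, it survives the substitution and only 'Z.M' is replaced.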

Solution

The solution below passes the pattern above to re.sub(), which returns a copy of the original string with the matched substring replaced by the value of replacement.

import re

original = 'II.NIL.10.BHB.M.2078.198.160857'
replacement = 'FG'
output = re.sub(r'(?<=BH)[A-Z]+\.M', replacement, original)

print(output)

Output

II.NIL.10.BHFG.2078.198.160857

Processing multiple files

To repeat this process for multiple files, you could apply the same logic in a loop or comprehension, calling re.sub() on each original/replacement pair and storing or processing the results as needed.

The example below uses the data from your question. It builds a dictionary that maps each original filename to the substring to insert, then calls re.sub() on each pair and collects the results in a list.

import re

originals = [
    'II.NIL.10.BHZ.M.2058.190.160877',
    'II.NIL.10.BHA.M.2008.190.168857',   
    'II.NIL.10.BHB.M.2078.198.160857'
]

replacements = ['T','D','FG']

# Map each original filename to its replacement, pairing them by position
mapping = dict(zip(originals, replacements))

results = [re.sub(r'(?<=BH)[A-Z]+\.M', v, k) for k, v in mapping.items()]

for r in results:
    print(r)

Output

II.NIL.10.BHT.2058.190.160877
II.NIL.10.BHD.2008.190.168857
II.NIL.10.BHFG.2078.198.160857
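
One caveat with a dictionary keyed by filename is that it keeps a single entry per distinct filename, so a repeated filename would lose all but its last replacement. A possible alternative, sketched below, pairs the two lists directly with zip() and compiles the pattern once. The names CHANNEL_SUFFIX and rename_all are hypothetical, and the duplicate entry is made up to show the difference.

```python
import re

# Compile once and reuse for every filename.
CHANNEL_SUFFIX = re.compile(r'(?<=BH)[A-Z]+\.M')


def rename_all(originals, replacements):
    """Pair filenames with replacements by position and substitute each one."""
    if len(originals) != len(replacements):
        raise ValueError('originals and replacements must have the same length')
    return [
        CHANNEL_SUFFIX.sub(new, old, count=1)
        for old, new in zip(originals, replacements)
    ]


originals = [
    'II.NIL.10.BHZ.M.2058.190.160877',
    'II.NIL.10.BHZ.M.2058.190.160877',  # deliberate duplicate (not in the question)
    'II.NIL.10.BHB.M.2078.198.160857',
]
replacements = ['T', 'D', 'FG']

results = rename_all(originals, replacements)
for r in results:
    print(r)
```

With a dictionary, the two identical filenames above would collapse into one key and only two results would come back; the zip-based version returns three.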

Matthew Knill · Accepted Answer · 2020-10-22 10:51:59Z

No, you cannot use str.replace() with a wildcard. You will have to use a regex, with something such as the following:

import re

filenames = ['II.NIL.10.BHA.M.2008.190.168857', 'II.NIL.10.BHB.M.2078.198.160857',
'II.NIL.10.BHC.M.2078.198.160857']
name = ['T','D','FG']

newfilenames = []

# The pattern matches 'BH' as well, so 'BH' is added back in the replacement
for filename, suffix in zip(filenames, name):
    newfilenames.append(re.sub(r'BH.?\.M', 'BH' + suffix, filename))

print(' '.join(newfilenames)) # outputs II.NIL.10.BHT.2008.190.168857 II.NIL.10.BHD.2078.198.160857 II.NIL.10.BHFG.2078.198.160857
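
This pattern differs from the lookbehind pattern in the first answer in two ways: it matches 'BH' itself, and .? accepts at most one character before '.M'. The sketch below compares the two. The filename containing 'BHZZ' is a made-up case that does not appear in the question; it is included only to show where the patterns diverge.

```python
import re

single = 'II.NIL.10.BHZ.M.2058.190.160877'
double = 'II.NIL.10.BHZZ.M.2058.190.160877'  # hypothetical two-letter code

consuming = r'BH.?\.M'            # matches 'BH' plus at most one character
lookbehind = r'(?<=BH)[A-Z]+\.M'  # matches one or more letters after 'BH'

# For the filenames in the question, both patterns give the same result.
a = re.sub(consuming, 'BH' + 'T', single)
b = re.sub(lookbehind, 'T', single)
print(a == b)  # True

# With two letters before '.M', only the lookbehind pattern matches.
c = re.sub(consuming, 'BH' + 'T', double)
d = re.sub(lookbehind, 'T', double)
print(c)  # II.NIL.10.BHZZ.M.2058.190.160877
print(d)  # II.NIL.10.BHT.2058.190.160877
```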


Ajax1234 · Accepted Answer · 2020-10-22 16:36:37Z

You can use iter with next in the replacement lambda of re.sub:

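import re

# Each match pulls the next replacement from this iterator, in order
name = iter(['T','D','FG'])
s = """
II.NIL.10.BHZ.M.2058.190.160877
II.NIL.10.BHA.M.2008.190.168857
II.NIL.10.BHB.M.2078.198.160857
"""
# A raw string avoids invalid-escape warnings for \w and \.
result = re.sub(r'(?<=BH)\w\.\w', lambda x: next(name), s)
print(result.strip())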

Output:

II.NIL.10.BHT.2058.190.160877
II.NIL.10.BHD.2008.190.168857   
II.NIL.10.BHFG.2078.198.160857
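
One thing to watch with this approach is that next() raises StopIteration if the text contains more matches than there are names. A possible guard, sketched below, counts the matches first. The function name replace_in_order and the ValueError are assumptions for illustration and are not part of the answer.

```python
import re

PATTERN = re.compile(r'(?<=BH)\w\.\w')


def replace_in_order(text, names):
    """Replace each match with the next name, checking the counts agree first."""
    matches = PATTERN.findall(text)
    if len(matches) != len(names):
        raise ValueError(
            f'found {len(matches)} matches but got {len(names)} names'
        )
    remaining = iter(names)
    return PATTERN.sub(lambda m: next(remaining), text)


text = (
    'II.NIL.10.BHZ.M.2058.190.160877\n'
    'II.NIL.10.BHA.M.2008.190.168857\n'
)
print(replace_in_order(text, ['T', 'D']))
```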

