I'm reading a file and I need to replace certain empty tags ([[Image:]]).

The problem is every replacement has to be unique.

Here's the code:

import re
import codecs

re_imagematch = re.compile(r'(\[\[Image:([^\]]+)?\]\])')

wf = codecs.open('converted.wiki', "r", "utf-8")
wikilines = wf.readlines()
wf.close()

imgidx = 0
for i in range(0,len(wikilines)):
 if re_imagematch.search(wikilines[i]):
  print 'MATCH #######################################################'
  print wikilines[i]
  wikilines[i] = re_imagematch.sub('[[Image:%s_%s.%s]]' % ('outname', imgidx, 'extension'), wikilines[i])
  print wikilines[i]
  imgidx += 1

This does not work, because there can be several tags on one line.

Here's the input file.

[[Image:]][[Image:]]
[[Image:]]

This is what the output should look like:

[[Image:outname_0.extension]][[Image:outname_1.extension]]
[[Image:outname_2.extension]]

This is what it currently looks like:

[[Image:outname_0.extension]][[Image:outname_0.extension]]
[[Image:outname_1.extension]]

I tried using a replacement function, but the problem is that this function only seems to get called once per line when using re.sub.

2 Answers

You can use itertools.count here and take advantage of the fact that default arguments are evaluated once, when the function is created, so the value of a mutable default argument persists between function calls.

from itertools import count

def rep(m, cnt=count()):
    return '[[Image:%s_%s.%s]]' % ('outname', next(cnt), 'extension')

This function will be invoked once for each match found, so every replacement gets a new value, even when several matches are on the same line.

So, you simply need to change this line in your code:

wikilines[i] = re_imagematch.sub(rep, wikilines[i])
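Put together, the change might look roughly like the sketch below. It uses an in-memory list in place of the lines read from converted.wiki, and a slightly simplified pattern without the capture groups; both are assumptions made only to keep the example self-contained.

```python
import re
from itertools import count

# Simplified version of the question's pattern (capture groups dropped).
re_imagematch = re.compile(r'\[\[Image:[^\]]*\]\]')

def rep(m, cnt=count()):
    # cnt is created once, when rep is defined, and shared by all calls.
    return '[[Image:%s_%s.%s]]' % ('outname', next(cnt), 'extension')

# Stand-in for wf.readlines(); the real code reads converted.wiki.
wikilines = ['[[Image:]][[Image:]]\n', '[[Image:]]\n']

# rep is called once per match, so the numbering continues across lines.
wikilines = [re_imagematch.sub(rep, line) for line in wikilines]
```

Because the counter lives in the default argument, it keeps counting for as long as the function object exists; re-defining rep resets it.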

Demo:

def rep(m, count=count()):
    return str(next(count))

>>> re.sub(r'a', rep, 'aaa')
'012'

To get the current counter value:

>>> from copy import copy
>>> next(copy(rep.__defaults__[0])) - 1
2

1 Comment

@AshwiniChaudhary although the current counter value works, it might be easier to wrap in a class that exposes a property of the previously yielded value... Although - it's quite a bit more work :p
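A minimal sketch of what such a class could look like is below. ImageNamer and its last property are hypothetical names invented for this illustration, and the pattern is a simplified version of the one in the question.

```python
import re

class ImageNamer(object):
    """Callable for re.sub that hands out a new index per match (hypothetical helper)."""

    def __init__(self, outname='outname', extension='extension'):
        self.outname = outname
        self.extension = extension
        self._next = 0  # index the next match will receive

    def __call__(self, match):
        idx = self._next
        self._next += 1
        return '[[Image:%s_%s.%s]]' % (self.outname, idx, self.extension)

    @property
    def last(self):
        # Index most recently handed out, or None if nothing has matched yet.
        return self._next - 1 if self._next else None

namer = ImageNamer()
text = '[[Image:]][[Image:]]\n[[Image:]]'
result = re.sub(r'\[\[Image:[^\]]*\]\]', namer, text)
```

After the call, namer.last holds the index of the last replacement, without having to copy the iterator out of `__defaults__`.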

I'd use a simple string replacement wrapped in a while loop:

s = '[[Image:]][[Image:]]\n[[Image:]]'
pattern = '[[Image:]]'
i = 0
while s.find(pattern) >= 0:
    s = s.replace(pattern, '[[Image:outname_' + str(i) + '.extension]]', 1)
    i += 1
print s
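The loop above searches from the start of the string on every pass. If that matters for a large file, one possible variant (a sketch only; number_tags is a made-up name) resumes the search just after the text that was inserted:

```python
def number_tags(s, pattern='[[Image:]]'):
    """Replace each occurrence of pattern with a numbered tag (hypothetical helper)."""
    i = 0
    pos = s.find(pattern)
    while pos >= 0:
        replacement = '[[Image:outname_%d.extension]]' % i
        # Splice the replacement in at the position of the current match.
        s = s[:pos] + replacement + s[pos + len(pattern):]
        # Continue searching after the inserted text instead of from index 0.
        pos = s.find(pattern, pos + len(replacement))
        i += 1
    return s

numbered = number_tags('[[Image:]][[Image:]]\n[[Image:]]')
```

Like the original loop, this only handles the exact empty tag, not tags that already contain a name.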

1 Comment

Since I'm not a big expert in Python, and neither are the people here that have to understand this too, I am accepting your answer. Thanks.
