
I have a long string (a "template") containing "replacement points" of the form %MARK% (a given marker may occur more than once in the string). I want to replace these markers using a Python dictionary whose keys do not include the % signs, for example:

rep_dict = { "TITLE": "This is my title", "CONTENT": "Here it is the content" }

The problem: calling the replace() method once per marker is not a good solution, because a replacement inserted earlier may itself contain one of these markers, which must then not be replaced!
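To make the problem concrete, here is a minimal sketch of the chained replace() approach. The function name `naive_replace` and the sample values are made up for illustration; the TITLE value deliberately contains another marker.

```python
# Hypothetical sample data: the TITLE replacement itself contains a marker.
rep_dict = {"TITLE": "See %CONTENT% below", "CONTENT": "Here it is the content"}
template = "<h1>%TITLE%</h1><p>%CONTENT%</p>"


def naive_replace(template, rep_dict):
    """Replace markers one key at a time (the approach that goes wrong)."""
    result = template
    for key, value in rep_dict.items():
        # Each pass scans the whole string, including text inserted earlier.
        result = result.replace("%" + key + "%", value)
    return result


print(naive_replace(template, rep_dict))
# The %CONTENT% that arrived via the title has been replaced as well.
```

With dictionaries that keep insertion order (Python 3.7+), TITLE is handled first, so the %CONTENT% inside the inserted title is rewritten by the second pass.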

The solution should be fast enough, since I have large templates and need to process many of them in a big loop. I have a very ugly and long implementation with many find() calls, counting offsets in the original string during the replacement process, etc. I hope there is a much nicer, more compact, and quicker solution.

  • Pity you used %MARK% instead of {MARK}, because then you could have used conventional string formatting with your dictionary. With ${MARK} or $MARK you could also have used string templates. Commented Oct 12, 2011 at 10:52
  • @joaquin did you mean %(MARK) or {MARK}? The %... notation is deprecated, and the {...} one requires him to double plain-text curly braces: {{these braces make it to the output string}}, {these do not}. Commented Oct 12, 2011 at 10:53
  • No, I was not referring to interpolation with % (by the way, it has no removal date yet, despite what was said) but to string.Template. I re-edited and completed my comment. Commented Oct 12, 2011 at 10:58
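For reference, the string.Template route mentioned in these comments could look roughly like the sketch below. It assumes the template can be rewritten to use $MARK or ${MARK} instead of %MARK%; the sample template and the FOOTER marker are invented for illustration.

```python
from string import Template

rep_dict = {"TITLE": "This is my title", "CONTENT": "Here it is the content"}

# Hypothetical template using $NAME / ${NAME} placeholders instead of %NAME%.
template = Template("<h1>$TITLE</h1><p>${CONTENT}</p><i>$FOOTER</i>")

# safe_substitute() leaves placeholders without a dictionary entry untouched,
# whereas substitute() would raise KeyError for $FOOTER.
result = template.safe_substitute(rep_dict)
print(result)
```

A literal dollar sign in such a template has to be written as $$.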

2 Answers


The easiest solution is

import re
re.sub(r'%(.+?)%', lambda m: rep_dict[m.group(1)], YOUR_TEMPLATE)

Not fast enough? Someone said 'do not use regex' and you obey? Parsing your template by hand in Python code would be even more complex and slower (don't forget, re is written in C).
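As a runnable sketch of the one-liner above: the helper name `replace_markers`, the precompiled `MARKER` pattern, and the sample data are assumptions added for illustration. Precompiling is optional, but it avoids repeated pattern lookups when the function is called in a big loop.

```python
import re

# Hypothetical sample data: the TITLE replacement itself contains a marker.
rep_dict = {"TITLE": "See %CONTENT% below", "CONTENT": "Here it is the content"}
template = "<h1>%TITLE%</h1><p>%CONTENT%</p>"

# Same pattern as in the answer, compiled once for reuse.
MARKER = re.compile(r"%(.+?)%")


def replace_markers(template, rep_dict):
    """Replace every %NAME% in a single pass over the template."""
    # The lambda is called once per match; its return value is inserted
    # into the output and is never scanned for markers again.
    return MARKER.sub(lambda m: rep_dict[m.group(1)], template)


result = replace_markers(template, rep_dict)
print(result)
```

Because the substitution is a single left-to-right pass, the %CONTENT% that arrives inside the title text stays as it is.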


6 Comments

Thanks. I wasn't avoiding regexps on purpose (it's more that I am something of a beginner in Python). This is a nice and "elegant" solution, but there is a small problem with it: the template may contain a marker that is not in rep_dict, and in that case this solution raises an exception. If there is no replacement for a marker in rep_dict, I need the unmodified %...% marker in the returned string.
If you want to ignore invalid markers, you should use rep_dict.get(m.group(1), m.group()) instead of rep_dict[m.group(1)]. docs.python.org/library/stdtypes.html#dict.get
Till now, this is my best solution (with more implementation details inside):

    def _replace_by_dict(rep, s):
        s = re.split("(%[A-Z0-9]{1,32}%)", s)
        for a, b in rep.items():
            for c in range(len(s)):
                if s[c] == "%" + a + "%":
                    s[c] = b
        return "".join(s)

I wrote it after asking the question and before reading your answer. Your solution is much more elegant; I just have a problem with the "unhandled markers", as I mentioned.
You could just catch the KeyError and pass
Nice, your comment helped me to understand this now. Thank you for your answer/comments!
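Putting the comments above together, a minimal sketch that leaves unknown markers untouched might look like this. The function name `replace_known_markers` and the sample input are made up; the stricter marker pattern is borrowed from the asker's own `re.split` call, on the assumption that marker names really are 1 to 32 uppercase letters or digits.

```python
import re

# Marker syntax taken from the asker's comment: 1-32 uppercase letters/digits.
MARKER = re.compile(r"%([A-Z0-9]{1,32})%")


def replace_known_markers(template, rep_dict):
    """Replace known markers in one pass and keep unknown ones verbatim."""
    # m.group(1) is the bare name; m.group() is the whole match, e.g. "%NAME%",
    # which dict.get() returns as the default when the name is not a key.
    return MARKER.sub(lambda m: rep_dict.get(m.group(1), m.group()), template)


rep_dict = {"TITLE": "This is my title"}
result = replace_known_markers("%TITLE% / %UNKNOWN% / 100%", rep_dict)
print(result)
```

Using dict.get() with the full match as the default keeps this a single expression, so no try/except around KeyError is needed. The stricter pattern also means a stray percent sign in ordinary text is less likely to be read as the start of a marker than with `%(.+?)%`.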

This was excellent. I have always used the excuse of not having time to learn regex, but always respected it. This post gave me what I needed to get started. This was my solution, though; I found that the group call was mixed up in the dictionary parameters:

retVal          = re.sub(r'%title', theTitle, template)
retVal          = re.sub(r'%([a-z]+?)+', \
                    lambda m: myDict.get(m.group(0)[1:], ''), retVal)

title was not in the dictionary, which is why I replaced it first. Requirements of the others on the team.
