I would like to do some text conversion: reading lines like these from a text file:
CONTENTS
1. INTRODUCTION
1.1 The Linear Programming Problem 2
1.2 Examples of Linear Problems 7
and writing to another text file:
("CONTENTS" "#")
("1. INTRODUCTION" "#")
("1.1 The Linear Programming Problem 2" "#11")
("1.2 Examples of Linear Problems 7" "#16")
The current Python code I use for such conversion is:
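import re

infile = open(infilename)
outfile = open(outfilename, "w")

# Capture the whole line (group 1) and its trailing page number (group 2).
pat = re.compile(r'^(.+?(\d+)) *$', re.M)

def zaa(mat):
    # Build one bookmark entry; the target is the page number plus 9.
    return '("%s" "#%s")' % (mat.group(1), str(int(mat.group(2)) + 9))

outfile.write('(bookmarks \n')
for line in infile:
    outfile.write(pat.sub(zaa, line))
outfile.write(')')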
It will convert the original text to
CONTENTS
1. INTRODUCTION
("1.1 The Linear Programming Problem 2" "#11")
("1.2 Examples of Linear Problems 7" "#16")
The last two lines are correct, but the first two are not. How can I handle the first two lines as well, either by modifying the current code or by using different code?
The code was not written by me, but I would like to understand the usage of re.sub() here. As I found on a Python website, re.sub(regex, replacement, subject) performs a search-and-replace across subject, replacing all matches of regex in subject with replacement. The result is returned by the sub() function. The subject string you pass is not modified.
But in my code the call is `pat.sub(zaa, line)`, which does not seem consistent with the quoted description. How should I understand the usage in my code?
Thanks!
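For the first part of the question, one possible approach is to treat the trailing page number as optional and fall back to a bare "#" when a line has none. The following is only a minimal sketch under that assumption: the names `to_bookmark`, `convert`, `TRAILING_PAGE` and `PAGE_OFFSET` are made up for illustration, and the offset of 9 is taken from the `+9` in the code above.

```python
import re

PAGE_OFFSET = 9  # assumption: same "+9" offset as in the original code

# A run of digits at the very end of the line, optionally followed by spaces.
TRAILING_PAGE = re.compile(r"(\d+)\s*$")


def to_bookmark(line, offset=PAGE_OFFSET):
    """Turn one contents line into a ("title" "#page") entry."""
    title = line.strip()
    match = TRAILING_PAGE.search(title)
    if match:
        # The line ends in a number: treat it as the printed page number.
        target = "#%d" % (int(match.group(1)) + offset)
    else:
        # No trailing number (e.g. "CONTENTS"): emit an empty target.
        target = "#"
    return '("%s" "%s")' % (title, target)


def convert(lines):
    """Convert every non-blank line; blank lines are skipped."""
    return [to_bookmark(line) for line in lines if line.strip()]


sample = [
    "CONTENTS\n",
    "1. INTRODUCTION\n",
    "1.1 The Linear Programming Problem 2\n",
    "1.2 Examples of Linear Problems 7\n",
]

for entry in convert(sample):
    print(entry)
```

To use it with files, the loop in the original code could write `to_bookmark(line) + "\n"` for each line instead of calling `pat.sub`. One limitation of this sketch: any heading that happens to end in a number is read as having a page number.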
re.sub() thing too. Turns out there are two sub functions: re.sub(pattern, repl, string[, count]), and another to be used with a compiled regex object: RegexObject.sub(repl, string[, count=0]). Your code is using the latter syntax.
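A small sketch may make the two forms concrete. It shows the module-level function and the compiled-pattern method giving the same result, and a function used as the replacement. The helper `add_nine` and the sample string are made up for illustration. In current Python 3 documentation the compiled object is called `re.Pattern` rather than RegexObject.

```python
import re

pat = re.compile(r"(\d+)")


def add_nine(mat):
    # Called once per match; the returned string replaces the matched text.
    return str(int(mat.group(1)) + 9)


line = "Problem 2"

# Module-level form: the pattern is passed as the first argument.
via_module = re.sub(r"(\d+)", add_nine, line)

# Compiled-pattern form: the pattern is the object itself, so only the
# replacement and the subject string are passed.
via_pattern = pat.sub(add_nine, line)

print(via_module)   # Problem 11
print(via_pattern)  # Problem 11
print(line)         # Problem 2 (the subject string is not modified)
```

So `pat.sub(zaa, line)` does the same job as `re.sub(pattern, zaa, line)` would with the same pattern, with `zaa` playing the role of the replacement.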