I am using Python 3.6 and have several thousand text documents, scanned from PDF files, stored in a Python 3 dictionary. Each document is a separate dictionary entry holding a single string. I am trying to use a regular expression search to extract the name and address information from each page. I have identified that the last name is always preceded by “Room #______” and followed by “Last/”. I have tried the code below, but it doesn’t seem to work. I am not at all familiar with lookaround constructs. Can anyone tell me what I’m doing wrong? My final code will have several of these searches; this is only the first.
memberRecord = memberData[1]
memberRegex = re.compile(r'''(
(?<=Room #______)\w+(?=Last)
$
)''', re.VERBOSE)
mo = memberRegex.search(memberRecord)
You are looking for Room #____, then your word, then Last. Try Room #______(.*?)Last and, when a match is found, grab mo.group(1).
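A minimal sketch of the capture-group approach, under some assumptions: the sample string is hypothetical (the real page layout is not shown in the question), and the optional whitespace around the name is a guess. Two details of the original pattern are worth noting. With re.VERBOSE, an unescaped # starts a comment and unescaped whitespace is ignored, so the literal text "Room #______" is not matched as written. The $ placed after the lookahead also asks for the end of the string at the exact spot where Last must begin, so both conditions cannot hold at once.

```python
import re

# Hypothetical sample text; the real layout of the scanned pages is an assumption.
sample = "Room #______Smith Last/ John First/"

# No re.VERBOSE here: in verbose mode an unescaped "#" starts a comment and
# unescaped whitespace is ignored, which would mangle the literal "Room #______".
# The lazy group (.*?) captures everything between the marker and "Last";
# the \s* on each side (an assumption) drops padding around the name.
member_regex = re.compile(r"Room #______\s*(.*?)\s*Last")

mo = member_regex.search(sample)

# search() returns None when nothing matches, so guard before calling group().
last_name = mo.group(1) if mo else None
print(last_name)
```

If verbose mode is still wanted for readability across several searches, the # and the space would need escaping (for example Room\ \#______) so they are treated as literal characters.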