I'm using Python regex to check a log file that contains the output of the Windows command tasklist for anything ending with .exe. The log file contains output from multiple calls to tasklist. After I get a list of strings with .exe in them, I want to write each one to a text file, but only after checking whether that string already exists in the output file. Instead of the desired output, the script writes out duplicates of strings already present in the text file (svchost.exe shows up several times, for example). The goal is a text file that lists each unique process enumerated by tasklist, with no duplicates of processes already written to the file.

import re

file1 = open('taskinfo.txt', 'r')
strings = re.findall(r'.*.exe', file1.read())
file1.close()
file2 = open('exes.txt', 'w+')
for item in strings:
    line_to_write = re.match(item, file2.read())
    if line_to_write == None:
        file2.write(item)
        file2.write('\n')
    else:
        pass

I used print statements to debug and made sure that item is the desired output.

1 Answer

There are some problems with your regex. Try this:

strings = re.findall(r'\b\S*\.exe\b', file1.read())

This matches only the text attached to the .exe: it starts at a word boundary (\b) and grabs the run of non-space characters (\S*) that precedes the literal .exe. Additionally, when you had .exe instead of \.exe, the . was acting as a wildcard that matches any character, rather than as a literal period.
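The regex fix alone may not remove the duplicates. In the original loop, file2.read() is called while the file position is already at the end of what was just written, so it returns an empty string and re.match never finds an existing entry. Also, re.match only matches at the start of the string and treats item as a pattern rather than as literal text. A minimal sketch of one way around this is to deduplicate in memory with a set before writing anything. The helper name unique_exes and the sample text are hypothetical, used only for illustration:

```python
import re

# Same pattern as above: a run of non-space characters ending in a literal ".exe".
EXE_PATTERN = re.compile(r'\b\S*\.exe\b')

def unique_exes(text):
    """Return each .exe name found in text once, in order of first appearance."""
    seen = set()
    ordered = []
    for name in EXE_PATTERN.findall(text):
        if name not in seen:
            seen.add(name)
            ordered.append(name)
    return ordered

# Hypothetical sample standing in for the contents of taskinfo.txt.
sample = (
    "svchost.exe                    912 Services    0     10,000 K\n"
    "explorer.exe                  4321 Console     1     50,000 K\n"
    "svchost.exe                   1010 Services    0      8,000 K\n"
)

names = unique_exes(sample)
print(names)
```

To apply this to the real files, read taskinfo.txt into a string, pass it to unique_exes, and write the returned names to exes.txt one per line. Because every name is checked against the set before it is kept, the output file never needs to be read back.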
