I have a regex matcher in Python that inspects web traffic. The problem is that when a page is larger than 1MB and consists of a single string, a regex can take forever to complete. Is there a way to set a maximum execution time for a match?
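For reference, the kind of timeout I have in mind is something like this signal-based sketch (Unix only, and it must run in the main thread; `RegexTimeout` and `findall_with_timeout` are names I made up for illustration):

```python
import re
import signal

class RegexTimeout(Exception):
    """Raised when a regex search exceeds its time budget."""

def _on_alarm(signum, frame):
    raise RegexTimeout()

def findall_with_timeout(pattern, text, seconds=2):
    # SIGALRM is Unix-only and only works in the main thread.
    old_handler = signal.signal(signal.SIGALRM, _on_alarm)
    signal.alarm(seconds)
    try:
        return pattern.findall(text)
    except RegexTimeout:
        return None  # treat a timed-out search as "no match"
    finally:
        signal.alarm(0)  # cancel any pending alarm
        signal.signal(signal.SIGALRM, old_handler)
```

I don't know whether this is considered safe around other signal users, which is part of why I'm asking.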
My script reads the regexes from a file and then processes them one at a time:
def match_keywords_add_to_es(pastes):
    match_list = get_re_match_list()
    for paste in pastes:
        log.info("matching %s" % paste["key"])
        for match in match_list:
            matched = match[1].findall(paste["raw"].lower())
            if matched:
                try:
                    paste["keywords"] = match[0]
                    res_paste = Paste(dictionary=paste)
                    Paste.add_paste(res_paste)
                except Exception as e:
                    log.error("Failed to add the paste %s with error %s" % (paste, e))
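One aside on the code above: `paste["raw"].lower()` builds a lowered copy of the full page for every pattern in the inner loop. Compiling with `re.IGNORECASE` would avoid that; a minimal sketch (the pattern here is just illustrative):

```python
import re

# Compiling with re.IGNORECASE matches case-insensitively without
# building a lowered copy of a >1MB page for every pattern.
pattern = re.compile(r"secret_?key\s*=\s*[A-Za-z0-9/]{40}", re.IGNORECASE)

match = pattern.search("SECRET_KEY = " + "A" * 40)
```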
Example regexes:
secret_key:
match: .*secret(_)?key\s*=\s*(.*)?[A-Za-z0-9\/]{40}(.*)?.*
access_key:
match: .*access(_)?key\s*=\s*(.*)?[A-Z0-9]{20}(.*)?.*
example.com:
match: .*example\.com.*
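Worth noting about the patterns above: the leading and trailing `.*` and the optional `(.*)?` groups add nothing to a yes/no match, since `search`/`findall` already scan the whole input, and they give the engine extra backtracking work on a 1MB string. A stripped-down equivalent for detection (my simplification, not verified against real data):

```python
import re

# Original style: wrapped in .* and optional catch-all groups.
wrapped = re.compile(r".*secret(_)?key\s*=\s*(.*)?[A-Za-z0-9/]{40}(.*)?.*")
# Same detection decision without the wrappers.
plain = re.compile(r"secret_?key\s*=\s*[A-Za-z0-9/]{40}")

sample = "padding " * 100 + "secret_key = " + "A" * 40

# Both patterns agree on whether the string contains a match.
assert bool(wrapped.search(sample)) == bool(plain.search(sample))
```

I suspect the wrappers are a large part of why big single-string pages are slow, but a timeout would still be useful as a safety net.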