I am a beginner in Python, so the task I want to perform is probably relatively simple. I have log files in which the first column is a timestamp and the second column describes the action, and I need to get the time difference between two actions. The log file, called log.txt, looks like this:
2017-05-11T12:22:12.760 End step: first action
2017-05-11T12:22:13.724 Start step: other action
2017-05-11T12:22:15.069 End step: other action
2017-05-11T12:22:15.933 Start step: first action
I wrote a basic script that walks through the directories, searches for specific keywords, and calculates the time difference. However, the code is really basic and I would like to improve it a little, for example by defining a config function that raises an error but keeps running when the keywords are missing. Any suggestions would be really appreciated. The code I use looks like this:
from datetime import datetime
from datetime import timedelta
import re
import os
import numpy as np

# Walk every directory below the one that contains this script
inputDir = os.path.dirname(os.path.realpath(__file__))
for subdir, dirs, files in os.walk(inputDir):
    for file in files:
        filepath = os.path.join(subdir, file)
        if 'log.txt' in filepath:
            # Both files are closed automatically when the block ends
            with open(filepath) as logfile, open("OutputTime.csv", "a") as outfile:
                for line in logfile:
                    line = line.rstrip()
                    # The timestamp is the first 23 characters of each line
                    if re.search('Start step: first action', line):
                        start_first_step = line[:23]
                        start_step = datetime.strptime(start_first_step, "%Y-%m-%dT%H:%M:%S.%f")
                    if re.search('End step: first action', line):
                        end_first_step = line[:23]
                        end_step = datetime.strptime(end_first_step, "%Y-%m-%dT%H:%M:%S.%f")
                        # Fails with a NameError if no 'Start step' line came first
                        minutes = end_step - start_step
                        minutes = minutes.total_seconds() / 60
                        print(minutes)
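To make the "report the problem but keep running" idea concrete, here is a minimal sketch of what I have in mind, not a finished solution. The function name `step_durations`, its `step` parameter, and the use of the `logging` module are my own assumptions. It also assumes every line has the form `<timestamp> <action>` separated by a single space, and the last line of the sample data is invented so that there is one complete Start/End pair.

```python
import logging
from datetime import datetime

logging.basicConfig(level=logging.WARNING)
log = logging.getLogger(__name__)

TIME_FORMAT = "%Y-%m-%dT%H:%M:%S.%f"


def step_durations(lines, step="first action"):
    """Return the duration in minutes of each Start/End pair of `step`.

    An End line without a preceding Start, or a Start that never gets an
    End, is logged as a warning instead of raising, so processing continues.
    """
    start_key = "Start step: " + step
    end_key = "End step: " + step
    start_time = None
    durations = []
    for number, line in enumerate(lines, 1):
        # Assumed layout: timestamp, one space, then the action text
        stamp, _, action = line.rstrip().partition(" ")
        if action == start_key:
            start_time = datetime.strptime(stamp, TIME_FORMAT)
        elif action == end_key:
            if start_time is None:
                log.warning("line %d: %r has no matching start", number, end_key)
                continue
            end_time = datetime.strptime(stamp, TIME_FORMAT)
            durations.append((end_time - start_time).total_seconds() / 60)
            start_time = None  # wait for the next Start line
    if start_time is not None:
        log.warning("%r has no matching end", start_key)
    return durations


sample = [
    "2017-05-11T12:22:12.760 End step: first action",
    "2017-05-11T12:22:13.724 Start step: other action",
    "2017-05-11T12:22:15.069 End step: other action",
    "2017-05-11T12:22:15.933 Start step: first action",
    "2017-05-11T12:22:45.933 End step: first action",  # invented line, not in my real log
]
print(step_durations(sample))  # [0.5]
```

With the sample above, the first line only triggers a warning because no start has been seen yet, and the loop carries on to the complete pair. An open file object could be passed instead of the list, since the function only iterates over lines.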