I have a text log file that looks like this:
Line 1 - Date/User Information
Line 2 - Type of LogEvent
Line 3-X, variable number of lines with additional information,
could be 1, could be hundreds
Then the sequence repeats.
There are around 20K lines of log, 50+ types of log events, approx. 15K separate user/date events. I would like to parse this in Python and make this information queryable.
So I thought I'd create a class LogEvent that records user, date (which I extract and convert to datetime), action, description... something like:
    class LogEvent:
        def __init__(self, date, user):
            self.date = date  # string converted to datetime object
            self.user = user
            self.content = ""
Such an event is created each time a line of text with user/date information is parsed.
To add the log event information and any descriptive content, there could be something like:
    def classify(self, logevent):
        self.logevent = logevent

    def addContent(self, lineoftext):
        self.content += lineoftext
To process the text file, I would use readline() and proceed one line at a time. If the line is user/date, I instantiate a new object and add it to a list...
    newevent = LogEvent(date, user)
    eventlist.append(newevent)
and start adding action/content until I encounter the next user/date line.
    eventlist[-1].classify(logevent)
    eventlist[-1].addContent(line)
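The steps above can be sketched as a single loop. This is only a rough sketch under assumptions: the post does not show what a user/date line actually looks like, so the `HEADER_RE` pattern, the timestamp format, and the sample lines are all hypothetical and would need adjusting to the real log. It iterates over the lines directly instead of calling readline(), and collects the descriptive lines in a list rather than concatenating strings.

```python
import re
from datetime import datetime

# ASSUMPTION: a header line looks like "2021-03-04 10:15:00 alice".
# The real format is not shown in the post; adjust the pattern to match.
HEADER_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (\S+)$")


class LogEvent:
    def __init__(self, date, user):
        self.date = date        # datetime object
        self.user = user
        self.logevent = None    # filled in from the line after the header
        self.lines = []         # descriptive lines, one entry per line

    @property
    def content(self):
        # Join on demand instead of growing a string line by line.
        return "\n".join(self.lines)


def parse_log(lines):
    """Group an iterable of text lines into a list of LogEvent objects."""
    events = []
    current = None
    for raw in lines:
        line = raw.rstrip("\n")
        match = HEADER_RE.match(line)
        if match:
            date = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
            current = LogEvent(date, match.group(2))
            events.append(current)
        elif current is None:
            continue                      # ignore anything before the first header
        elif current.logevent is None:
            current.logevent = line       # second line: type of log event
        else:
            current.lines.append(line)    # remaining lines: additional information
    return events


# Hypothetical sample data, invented for illustration only.
sample = [
    "2021-03-04 10:15:00 alice\n",
    "ABC something happened\n",
    "detail one\n",
    "detail two\n",
    "2021-03-04 10:16:30 bob\n",
    "XYZ something else\n",
    "only detail\n",
]
events = parse_log(sample)
```

With a real file, the same function could be called as `parse_log(f)` inside a `with open(...) as f:` block, since a file object yields its lines one at a time. For roughly 15K events, a plain list is probably queryable enough, e.g. `[e for e in events if e.user == "alice"]`.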
All this makes sense (unless you convince me there is a smarter way to do it or a useful Python module I am not aware of). What I'm trying to decide is how best to classify the log event type. There is a fixed list of possible types, probably more than 50, and I don't want to simply accept the entire line of text as the log event type. Instead I need to compare the start of the line against a list of known values...
What I don't want to do is have 50 of these:
    if line.startswith("ABC"):
        logevent = "foo"
    if line.startswith("XYZ"):
        logevent = "boo"
I thought about using a dict as a lookup table, but I am not sure how to implement that with "startswith"... Any suggestions would be appreciated, and my apologies if I was way too long-winded.
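One possible way to combine a dict with startswith is to keep the prefixes as keys and loop over them, returning the mapped value for the first prefix that matches. The sketch below is an illustration, not a definitive answer: `PREFIX_TO_EVENT` and `classify_line` are made-up names, the "ABC"/"foo" and "XYZ"/"boo" pairs come from the snippet above, and the "ABCD" entry is invented purely to show why longer prefixes should be tried first.

```python
# Hypothetical lookup table: line prefix -> log event type.
# "ABC" and "XYZ" are from the question; "ABCD" is an invented example.
PREFIX_TO_EVENT = {
    "ABC": "foo",
    "XYZ": "boo",
    "ABCD": "foo-extended",
}

# Try longer prefixes first so that "ABCD" wins over "ABC".
_PREFIXES = sorted(PREFIX_TO_EVENT, key=len, reverse=True)


def classify_line(line, default=None):
    """Return the event type whose prefix starts the line, else default."""
    for prefix in _PREFIXES:
        if line.startswith(prefix):
            return PREFIX_TO_EVENT[prefix]
    return default
```

With 50 or so prefixes, a linear scan per line should be cheap for a log of this size. If only a yes/no check is needed, str.startswith also accepts a tuple of prefixes, as in `line.startswith(tuple(PREFIX_TO_EVENT))`.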
logeventattribute. Also, do you have the various types of log events in a list or better yet, a set?