Python: reading a datetime from a log file using regex

Question

I have a log file which has text that looks like this.

Jul  1 03:27:12 syslog: [m_java][ 1/Jul/2013 03:27:12.818][j:[SessionThread <]^Iat com/avc/abc/magr/service/find.something(abc/1235/locator/abc;Ljava/lang/String;)Labc/abc/abcd/abcd;(bytecode:7)

There are two time formats in the file. I need to sort this log file by the date and time enclosed in [].

This is the regex I am trying to use, but it does not return anything.

t_pat = re.compile(r".*\[\d+/\D+/.*\]")

I want to go over each line in the file, apply this pattern, and sort the lines by date and time.

Can someone help me with this? Thanks!

Might it not be easier to use the date and time at the start of the line?
– Ronnie, Commented Jul 5, 2013 at 15:41
The time inside [] has sub-second precision, and I get quite a few log lines within the same second that need to be sorted.
– Supriya K, Commented Jul 5, 2013 at 15:42
@MartijnPieters The day is a two-character field, so a single-digit day is padded with a space. It would also fit '28' or any other two-digit day.
– Supriya K, Commented Jul 5, 2013 at 15:44

beiller · Accepted Answer · 2013-07-05 15:50:11Z

2

There is a space after the opening bracket that the regular expression needs to allow for:

import re

text = "Jul  1 03:27:12 syslog: [m_java][ 1/Jul/2013 03:27:12.818][j:[SessionThread <]^Iat com/avc/abc/magr/service/find.something(abc/1235/locator/abc;Ljava/lang/String;)Labc/abc/abcd/abcd;(bytecode:7)"
matches = re.findall(r"\[\s*(\d+/\D+/.*?)\]", text)
print(matches)
# ['1/Jul/2013 03:27:12.818']

Next, parse the extracted timestamp string with time.strptime:

http://docs.python.org/2/library/time.html#time.strptime

Finally, use the parsed time as a key in a dict with the line as the value, and sort the entries by key.
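A minimal sketch of those two steps, assuming every line of interest carries a bracketed timestamp in the format shown in the question and that month names are in English. It swaps in datetime.datetime.strptime for time.strptime, because the struct_time that time.strptime returns has no sub-second field and the milliseconds matter here. It also sorts a list of (timestamp, line) pairs instead of a dict, so that two lines sharing a timestamp are both kept. The names TIMESTAMP_RE and sort_log_lines are illustrative.

```python
import re
from datetime import datetime

# Same pattern as above, compiled once (the name is illustrative).
TIMESTAMP_RE = re.compile(r"\[\s*(\d+/\D+/.*?)\]")

def sort_log_lines(lines):
    """Return the lines that carry a bracketed timestamp, oldest first."""
    keyed = []
    for line in lines:
        match = TIMESTAMP_RE.search(line)
        if match is None:
            continue  # assumption: lines without a timestamp are skipped
        stamp = datetime.strptime(match.group(1), "%d/%b/%Y %H:%M:%S.%f")
        keyed.append((stamp, line))
    # Sort on the timestamp only; the sort is stable, so ties keep file order.
    keyed.sort(key=lambda pair: pair[0])
    return [line for _, line in keyed]

sample = [
    "Jul  1 03:27:12 syslog: [m_java][ 1/Jul/2013 03:27:12.818][j:[b]",
    "Jul  1 03:27:12 syslog: [m_java][ 1/Jul/2013 03:27:12.100][j:[a]",
]
for line in sort_log_lines(sample):
    print(line)
```

With the two sample lines, the one stamped 03:27:12.100 is printed before the one stamped 03:27:12.818, even though both share the same whole second.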

edited Jul 5, 2013 at 15:50

answered Jul 5, 2013 at 15:44

beiller


Martijn Pieters · Accepted Answer · 2013-07-05 15:50:19Z

1

You are not matching the initial space; you also want to group the date for easy extraction, and make the \D+ and .* patterns non-greedy:

t_pat = re.compile(r".*\[\s?(\d+/\D+?/.*?)\]")

Demo:

>>> re.compile(r".*\[\s?(\d+/\D+?/.*?)\]").search(line).group(1)
'1/Jul/2013 03:27:12.818'
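To see why the non-greedy `.*?` matters, this small sketch compares the greedy and lazy forms on a shortened copy of the sample line. The leading `.*` is dropped for brevity and the variable names are illustrative.

```python
import re

line = "syslog: [m_java][ 1/Jul/2013 03:27:12.818][j:[SessionThread <]^Iat com/avc"

# Greedy: .* runs to the last ']' on the line.
greedy = re.search(r"\[\s?(\d+/\D+/.*)\]", line).group(1)
# Lazy: .*? stops at the first ']' after the date.
lazy = re.search(r"\[\s?(\d+/\D+/.*?)\]", line).group(1)

print(greedy)  # 1/Jul/2013 03:27:12.818][j:[SessionThread <
print(lazy)    # 1/Jul/2013 03:27:12.818
```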

You can narrow down the pattern some more; you only need to match 3 letters for the month for example:

t_pat = re.compile(r".*\[\s?(\d{1,2}/[A-Z][a-z]{2}/\d{4} \d{2}:\d{2}:[\d.]{2,})\]")
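As a quick check of the narrower pattern, here is a sketch run against a shortened copy of the sample line and against a hypothetical line whose bracketed fields hold no date. STRICT_PAT and extract_timestamp are illustrative names.

```python
import re

STRICT_PAT = re.compile(
    r".*\[\s?(\d{1,2}/[A-Z][a-z]{2}/\d{4} \d{2}:\d{2}:[\d.]{2,})\]"
)

def extract_timestamp(line):
    """Return the bracketed timestamp text, or None if the line has none."""
    match = STRICT_PAT.search(line)
    return match.group(1) if match else None

with_date = "Jul  1 03:27:12 syslog: [m_java][ 1/Jul/2013 03:27:12.818][j:[SessionThread <]"
without_date = "Jul  1 03:27:12 syslog: [m_java][j:[SessionThread <]"  # hypothetical line

print(extract_timestamp(with_date))     # 1/Jul/2013 03:27:12.818
print(extract_timestamp(without_date))  # None
```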

edited Jul 5, 2013 at 15:50

answered Jul 5, 2013 at 15:44

Martijn Pieters


1 Comment

Ronnie Over a year ago

I also think you need to make the last quantifier lazy: [\s?\d+/\D+/.*?]

Community · Accepted Answer · 2017-05-23 12:11:17Z

1

Read all the lines of the file, then sort them with a key function that parses the date out of each line:

import re
import datetime

# Compile the pattern once instead of on every call.
t_pat = re.compile(r".*\[\s?(\d+/\D+?/.*?)\]")

def parse_date_from_log_line(line):
    date_string = t_pat.search(line).group(1)
    date_format = '%d/%b/%Y %H:%M:%S.%f'  # avoids shadowing the built-in format()
    return datetime.datetime.strptime(date_string, date_format)

log_path = 'mylog.txt'
with open(log_path) as log_file:
    lines = log_file.readlines()

# Sort in place, using the parsed datetime as the key.
lines.sort(key=parse_date_from_log_line)

edited May 23, 2017 at 12:11

CommunityBot


answered Jul 5, 2013 at 16:20

user9903

2 Comments

Supriya K Over a year ago

I get the error below: date_string = t_pat.search(line).group(1) AttributeError: 'NoneType' object has no attribute 'group'

user9903 Over a year ago

@SupriyaK The code assumes every line contains a bracketed timestamp; there is no error checking. When a line has none, t_pat.search(line) returns None, so the code would have to handle the None case and decide whether to skip that line or not.
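One way to handle the None case, sketched under an assumption the thread does not settle: a line with no parsable timestamp (for example a continuation line of a stack trace) inherits the timestamp of the line before it, so it stays attached to that line after sorting. The helper names parse_date_or_none and sort_keeping_undated are illustrative.

```python
import re
from datetime import datetime

T_PAT = re.compile(r".*\[\s?(\d+/\D+?/.*?)\]")
DATE_FORMAT = "%d/%b/%Y %H:%M:%S.%f"

def parse_date_or_none(line):
    """Return the bracketed timestamp as a datetime, or None if absent or malformed."""
    match = T_PAT.search(line)
    if match is None:
        return None
    try:
        return datetime.strptime(match.group(1), DATE_FORMAT)
    except ValueError:
        return None

def sort_keeping_undated(lines):
    """Sort by timestamp; undated lines inherit the previous line's timestamp."""
    keyed = []
    last_seen = datetime.min  # undated lines at the top of the file sort first
    for line in lines:
        stamp = parse_date_or_none(line)
        if stamp is not None:
            last_seen = stamp
        keyed.append((last_seen, line))
    # The sort is stable, so an undated line stays right after its predecessor.
    keyed.sort(key=lambda pair: pair[0])
    return [line for _, line in keyed]
```

If undated lines should be dropped instead, filtering on `parse_date_or_none(line) is not None` before sorting would do that.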
