
I am trying to search for a particular string in every line of a log file and, when a line matches, I need to extract the Host information for that particular error.

Consider the log entries below:

05-05-2014 00:02:02,771 [HttpProxyServer-thread-1314] ERROR fd - Empty user name specified in NTLM authentication. Prompting for auth again.
Host=tools.google.com, Port=80, Client ip=/10.253.168.128, port=37271, User-Agent: Google Update/1.3.23.9;winhttp;cup-ecdsa
05-05-2014 00:02:02,771 [HttpProxyServer-thread-2156] ERROR fd - Empty user name specified in NTLM authentication. Prompting for auth again.
Host=tools.google.com, Port=80, Client ip=/10.253.168.148, port=37273, User-Agent: Google Update/1.3.23.9;winhttp;cup-ecdsa
05-05-2014 00:02:02,802 [HttpProxyServer-thread-604] ERROR fd - Empty user name specified in NTLM authentication. Prompting for auth again.
Host=tools.google.com, Port=80, Client ip=/10.253.168.92, port=37280, User-Agent: Google Update/1.3.23.9;winhttp;cup

This is my code:

import re

for line in log_file:
    if re.search(r'Empty user name specified in NTLM authentication. Prompting for auth again.', line):
        host = re.search(r'Host=(\D+.\D+.\D+,)', line).group(1)

The problem is that the Host information is not on the same line as the error; it is on the next line. How do I get re.search(r'Host=(\D+.\D+.\D+,)', line).group(1) to search the line after the one that "line" currently holds?

2
  • What's wrong with reading the whole file? Commented Dec 24, 2014 at 7:15
  • @AvinashRaj, perhaps because huge log files may not fit comfortably in memory... Commented Dec 24, 2014 at 7:19

3 Answers

2

Just insert a

line = next(log_file)

between the two statements you currently have in the for loop.
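
A minimal sketch of the loop with that line added, assuming `log_file` is an open file object or any other iterable of lines. The function name `find_hosts`, the simplified pattern `Host=([^,]+)` (which drops the trailing comma), and the `''` default passed to `next` are illustrative choices, not part of the original code. The default avoids a `StopIteration` if the error happens to be the last line of the file.

```python
import io
import re

ERROR_TEXT = 'Empty user name specified in NTLM authentication. Prompting for auth again.'
# Assumed pattern: capture everything after "Host=" up to the first comma.
HOST_RE = re.compile(r'Host=([^,]+)')

SAMPLE_LOG = (
    '05-05-2014 00:02:02,771 [HttpProxyServer-thread-1314] ERROR fd - ' + ERROR_TEXT + '\n'
    'Host=tools.google.com, Port=80, Client ip=/10.253.168.128, port=37271\n'
    '05-05-2014 00:02:02,771 [HttpProxyServer-thread-2156] ERROR fd - ' + ERROR_TEXT + '\n'
    'Host=tools.google.com, Port=80, Client ip=/10.253.168.148, port=37273\n'
)


def find_hosts(log_file):
    """Return the Host value found on the line after each matching error line."""
    lines = iter(log_file)  # a file object is already its own iterator
    hosts = []
    for line in lines:
        if ERROR_TEXT in line:
            # Advance the same iterator to fetch the following line;
            # '' is used if the input ends right after the error line.
            line = next(lines, '')
            match = HOST_RE.search(line)
            if match:
                hosts.append(match.group(1))
    return hosts


# io.StringIO stands in for the opened log file.
print(find_hosts(io.StringIO(SAMPLE_LOG)))
# prints ['tools.google.com', 'tools.google.com']
```

Because `next` pulls from the same iterator the `for` loop uses, the Host line is consumed there and the loop resumes with the line after it.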


0

Try this:

>>> import re
>>> fp = open('log_file')
>>> line = fp.readline()
>>> while line:
...    if 'Empty user name specified in NTLM authentication. Prompting for auth again.' in line:
...        host = re.search(r'Host=(\D+.\D+.\D+,)', fp.readline()).group(1)
...        #                                        ^^^^^^^^^^^^^^  
...        #                              this makes re search in the next line 
...        print(host)
...    line = fp.readline()
... 
tools.google.com,
tools.google.com,
tools.google.com,

2 Comments

Irshad! Worked like a freakin charm! But how did the contents of "line" get changed to the next line? Does that happen with the while statement?
The first line is read outside the while loop. If that line contains "Empty user .... auth again.", re.search is applied to the next line, which fp.readline() returns inside host = re.search(r'Host=(\D+.\D+.\D+,)', fp.readline()).group(1). After that, the last statement in the loop, line = fp.readline(), reads the following line and the while loop continues.
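
To illustrate the explanation above: a file object keeps a single read position, so every `readline()` call, wherever it appears, consumes one line. A small sketch, using `io.StringIO` as a stand-in for the opened log file (the sample text is made up for illustration):

```python
import io

# io.StringIO behaves like an open text file; the contents are a made-up stand-in.
fp = io.StringIO('error line\nHost=example.com, Port=80\nnext entry\n')

first = fp.readline()   # reads 'error line\n'
second = fp.readline()  # same object, so this returns the following line
third = fp.readline()   # and this returns the line after that
end = fp.readline()     # '' signals end of file, which stops a `while line:` loop

print(repr(first), repr(second), repr(third), repr(end))
```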
0

Either write a regex that matches two successive lines, from which you can extract the Host info, and loop over the matches instead of reading the file line by line; or add a flag that is set when a line matches the error. When the flag is set for a given line, extract the host info and reset the flag instead of testing for the error.
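
Both options could look roughly like the sketch below. The function names `hosts_by_flag` and `hosts_by_regex`, the flag name `expect_host`, and the pattern `Host=([^,]+)` are hypothetical choices made for illustration. Note that the regex variant assumes the whole log has been read into one string, which may not suit very large files.

```python
import re

ERROR_TEXT = 'Empty user name specified in NTLM authentication. Prompting for auth again.'
# Assumed pattern: capture everything after "Host=" up to the first comma.
HOST_RE = re.compile(r'Host=([^,]+)')
# An error line followed directly by a line that starts with "Host=".
PAIR_RE = re.compile(re.escape(ERROR_TEXT) + r'\r?\nHost=([^,]+)')

SAMPLE_LOG = (
    '05-05-2014 00:02:02,771 [HttpProxyServer-thread-1314] ERROR fd - ' + ERROR_TEXT + '\n'
    'Host=tools.google.com, Port=80, Client ip=/10.253.168.128, port=37271\n'
    '05-05-2014 00:02:02,802 [HttpProxyServer-thread-604] ERROR fd - ' + ERROR_TEXT + '\n'
    'Host=tools.google.com, Port=80, Client ip=/10.253.168.92, port=37280\n'
)


def hosts_by_flag(lines):
    """Flag approach: remember that the previous line was an error line."""
    hosts = []
    expect_host = False
    for line in lines:
        if expect_host:
            # The previous line was the error, so this line should hold the Host.
            match = HOST_RE.search(line)
            if match:
                hosts.append(match.group(1))
            expect_host = False  # reset instead of testing this line for the error
        elif ERROR_TEXT in line:
            expect_host = True
    return hosts


def hosts_by_regex(text):
    """Regex approach: match the error line and the Host line together."""
    return PAIR_RE.findall(text)


print(hosts_by_flag(SAMPLE_LOG.splitlines()))
print(hosts_by_regex(SAMPLE_LOG))
# both print ['tools.google.com', 'tools.google.com']
```

The flag version works on any iterable of lines, including an open file, so it keeps the line-by-line memory profile of the original loop.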
