3

I am using Python to read a large log file that looks like this:

Thu Oct  4 23:14:40 2012 [pid 16901] CONNECT: Client "66.249.74.228"
Thu Oct  4 23:14:40 2012 [pid 16900] [ftp] OK LOGIN: Client "66.249.74.228", anon     password "[email protected]"
Thu Oct  4 23:17:42 2012 [pid 16902] [ftp] FAIL DOWNLOAD: Client "66.249.74.228",   "/pub/10.5524/100001_101000/100039/Assembly-2011/Pa9a_assembly_config4.scafSeq.gz",  14811136 bytes, 79.99Kbyte/sec
Fri Oct  5 00:04:13 2012 [pid 25809] CONNECT: Client "66.249.74.228"
Fri Oct  5 00:04:14 2012 [pid 25808] [ftp] OK LOGIN: Client "66.249.74.228", anon password "[email protected]"
Fri Oct  5 00:07:16 2012 [pid 25810] [ftp] FAIL DOWNLOAD: Client "66.249.74.228", "/pub/10.5524/100001_101000/100027/Raw_data/PHOlcpDABDWABPE/090715_I80_FC427DJAAXX_L8_PHOlcpDABDWABPE_1.fq.gz", 14811136 bytes, 79.99Kbyte/sec
Fri Oct  5 00:13:19 2012 [pid 27354] CONNECT: Client "1.202.186.53"
Fri Oct  5 00:13:19 2012 [pid 27353] [ftp] OK LOGIN: Client "1.202.186.53", anon password "[email protected]"

I want to read lines from the end of the file, like the tail command does, to get the records from the most recent 7 days.

Here is my code. How can I change it?

import time
f = open("/opt/CLiMB/Storage1/log/vsftp.log")
def OnlyRecent(line):
    # Keep a line only if its timestamp is newer than seven days ago
    if time.strptime(line.split("[")[0].strip(), "%a %b %d %H:%M:%S %Y") > time.gmtime(time.time() - (60*60*24*7)):
        return True
    return False
filename = time.strftime('%Y%m%d') + '.log'
f1 = open(filename, 'w')
for line in f:
    if OnlyRecent(line):
        print line
        f1.write(line)
f.close()
f1.close()

3 Answers

3

Use file.seek() to jump to some offset from the end of a file. For example, to print the last 1Kb of a file without reading the beginning of a file, do this:

import os

with open("/opt/CLiMB/Storage1/log/vsftp.log") as f:
    f.seek(-1000, os.SEEK_END)  # jump to 1000 bytes before the end of the file
    print f.read()
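
A slightly more defensive variant of the same idea is sketched below. It is written for Python 3 (unlike the Python 2 snippet above) and rests on a few assumptions that are not in the original answer: the helper name read_tail_lines is made up, the file is opened in binary mode, and the text is decoded as UTF-8. It clamps the offset so that a file shorter than the requested size does not raise an error, and it drops the first piece of the result because the seek may have landed in the middle of a line.

```python
import io
import os

def read_tail_lines(f, nbytes=1000):
    """Return the complete lines found in the last `nbytes` bytes of `f`.

    `f` must be a seekable file object opened in binary mode.
    (Hypothetical helper, not part of the original answer.)
    """
    size = f.seek(0, os.SEEK_END)      # seek() returns the new position, i.e. the size
    start = max(size - nbytes, 0)      # clamp so a short file does not raise an error
    f.seek(start)
    lines = f.read().splitlines(True)  # keep the line endings
    if start > 0 and lines:
        # The seek may have landed mid-line, so drop the first piece.
        lines = lines[1:]
    return [line.decode("utf-8", errors="replace") for line in lines]

# In-memory stand-in for a log file, used only for illustration.
sample = io.BytesIO(b"aaaa\nbbbb\ncccc\n")
print(read_tail_lines(sample, 7))
```

With the real log, pass open("/opt/CLiMB/Storage1/log/vsftp.log", "rb") instead of the in-memory sample.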

3 Comments

Hi, it says NameError: name 'os' is not defined
Can I add f.seek(-1000, os.SEEK_END) directly after the f= open("/opt/CLiMB/Storage1/log/vsftp.log") line in my program?
@JesseSiu Add import os. I think you can add f.seek(...) directly, though the with statement is preferred.
0

I didn't test this; I just reformatted the code:

  1. less verbose import from time module
  2. dropwhile instead of for..if
  3. with context to open/close files
  4. PEP8
  5. miscellaneous cleanups

from time import time, gmtime, strptime, strftime
from itertools import dropwhile

deadline = gmtime(time() - (60*60*24*7))  # seven days ago
formatting = "%a %b %d %H:%M:%S %Y"

def not_recent(line):
    # True while the line's timestamp is at or before the deadline
    return strptime(line.split("[")[0].strip(), formatting) <= deadline

with open("/opt/CLiMB/Storage1/log/vsftp.log") as f:
    filename = strftime('%Y%m%d') + '.log'
    with open(filename, 'w') as f1:
        # Skip old lines, then copy everything from the first recent line on
        for line in dropwhile(not_recent, f):
            print line
            f1.write(line)
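
Note that dropwhile stops testing lines after the first recent one, so this approach relies on the log being in chronological order, and a line without a parseable timestamp would raise ValueError. A rough Python 3 sketch that tolerates such lines is below. The names parse_timestamp and recent_lines are hypothetical, the sample lines are taken from the question, and parsing %a and %b assumes an English (C) locale.

```python
from datetime import datetime, timedelta
from itertools import dropwhile

FORMAT = "%a %b %d %H:%M:%S %Y"

def parse_timestamp(line):
    """Return the line's timestamp, or None if it cannot be parsed."""
    try:
        return datetime.strptime(line.split("[")[0].strip(), FORMAT)
    except ValueError:
        return None

def recent_lines(lines, now, days=7):
    """Return an iterator that starts at the first line newer than now - days."""
    cutoff = now - timedelta(days=days)

    def too_old(line):
        stamp = parse_timestamp(line)
        # Unparseable lines before the first recent one are skipped too.
        return stamp is None or stamp <= cutoff

    return dropwhile(too_old, lines)

# Two lines from the question's log, used only for illustration.
sample = [
    'Thu Oct  4 23:14:40 2012 [pid 16901] CONNECT: Client "66.249.74.228"\n',
    'Fri Oct  5 00:13:19 2012 [pid 27354] CONNECT: Client "1.202.186.53"\n',
]
for line in recent_lines(sample, now=datetime(2012, 10, 11, 23, 30, 0)):
    print(line, end="")
```

Passing now explicitly keeps the function easy to test; in a real script it would typically be datetime.now().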

0

Another implementation, considering that you are dealing with huge log files:

import os

def tail(fname, n):
    fin = os.open(fname, os.O_RDONLY)  # Get an open file descriptor
    size = os.fstat(fin).st_size  # Get the size from the stat
    fin = os.fdopen(fin)  # Convert the fd to a file object
    count = 0
    fin.seek(size)  # Seek to the end of the file
    try:
        while count < n:  # Loop until n newlines have been counted
            pos = fin.tell() - 2  # Step backward one character
            if pos < 0:  # Reached the beginning of the file
                fin.seek(0)  # Make sure the first line is returned whole
                raise StopIteration  # End the loop
            fin.seek(pos)
            if fin.read(1) == '\n':  # Check whether this character is a newline
                count += 1  # Maintain the count
    except StopIteration:
        pass

    return fin  # Positioned at the start of the last n lines; the caller closes it

Usage

for e in tail("Test.log",10):
    print e
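
Reading one character at a time can be slow on very large files. The sketch below shows the same backward scan done in blocks. It is written for Python 3, and the name tail_lines, the 4096-byte block size, and the binary-mode file object are all assumptions of this example rather than part of the answer above.

```python
import io
import os

def tail_lines(f, n, block_size=4096):
    """Return the last `n` lines of binary file `f`, reading blocks from the end."""
    if n <= 0:
        return []
    pos = f.seek(0, os.SEEK_END)  # start at the end of the file
    data = b""
    # Keep reading until more than n newlines are buffered, so that the
    # first of the last n lines is known to be complete.
    while pos > 0 and data.count(b"\n") <= n:
        step = min(block_size, pos)
        pos -= step
        f.seek(pos)
        data = f.read(step) + data
    return data.splitlines(True)[-n:]

# In-memory stand-in for a log file, used only for illustration.
sample = io.BytesIO(b"one\ntwo\nthree\nfour\n")
for line in tail_lines(sample, 2, block_size=4):
    print(line.decode("utf-8"), end="")
```

Unlike tail above, this returns the lines as a list of bytes objects rather than a positioned file object.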
