
I want to tail multiple files concurrently and push the logs to scribe. I am reading the file names from a config file, then I want to tail each file and send its lines to scribe. What I have tried only sends logs for the first file and never gets to the others.

I want the tailing to run concurrently for every file, sending each one's logs at the same time.

import os
import time

from thrift.transport import TSocket, TTransport
from thrift.protocol import TBinaryProtocol
from scribe import scribe

for l in Config.items('files'):
  print l[0]  # category
  print l[1]  # file path
  filename = l[1]
  file = open(filename, 'r')
  st_size = os.stat(filename).st_size
  file.seek(st_size)  # start from the current end of the file
  while 1:  # this loop never exits, so the second file is never reached
    where = file.tell()
    line = file.readline()
    if not line:
      time.sleep(1)
      file.seek(where)
    else:
      print line, # already has newline
      category = l[0]
      message = line
      log_entry = scribe.LogEntry(category, message)
      socket = TSocket.TSocket(host='localhost', port=1463)
      transport = TTransport.TFramedTransport(socket)
      protocol = TBinaryProtocol.TBinaryProtocol(trans=transport, strictRead=False, strictWrite=False)
      client = scribe.Client(iprot=protocol, oprot=protocol)
      transport.open()
      result = client.Log(messages=[log_entry])
      transport.close()
  • Your problem seems very similar to an example given towards the end of this presentation about using python generators for system programming. The whole presentation is really quite interesting, but in Part 7 the author gives an example of using threading and queues to tail multiple logs. This isn't a complete answer (sorry!) but you may want to take a look at it. Commented Dec 21, 2011 at 12:55
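
The thread-plus-queue pattern from Part 7 of that presentation looks roughly like this, written in Python 3 syntax: each source runs in its own thread and feeds one shared queue, so a single consumer can forward lines to scribe over one connection. The finite lists here are stand-ins for real file tails so the sketch terminates; in real use each `feed` loop would iterate over a `follow(filename)`-style generator instead.

```python
import queue
import threading

def feed(category, lines, q):
    # Stand-in for tailing one file: push each line onto the shared queue.
    for line in lines:
        q.put((category, line))
    q.put((category, None))  # sentinel: this source is finished

sources = {'app': ['a1\n', 'a2\n'], 'web': ['w1\n']}
q = queue.Queue()
threads = [threading.Thread(target=feed, args=(cat, lines, q))
           for cat, lines in sources.items()]
for t in threads:
    t.start()

received = []
finished = 0
while finished < len(sources):
    category, line = q.get()  # blocks until any source produces a line
    if line is None:
        finished += 1
    else:
        received.append((category, line))  # single place to call scribe

for t in threads:
    t.join()
print(sorted(received))
```

A nice property of this shape is that all the scribe traffic goes through one consumer, so you open one transport instead of one per line.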

2 Answers


Try something like this (inspired by this):

import threading

def monitor_file(l):
    print l[0]  # category
    print l[1]  # file path
    filename = l[1]
    file = open(filename, 'r')
    st_size = os.stat(filename).st_size
    file.seek(st_size)  # start from the current end of the file
    while 1:
        where = file.tell()
        line = file.readline()
        if not line:
            time.sleep(1)
            file.seek(where)
        else:
            print line, # already has newline
            category = l[0]
            message = line
            log_entry = scribe.LogEntry(category, message)
            socket = TSocket.TSocket(host='localhost', port=1463)
            transport = TTransport.TFramedTransport(socket)
            protocol = TBinaryProtocol.TBinaryProtocol(trans=transport, strictRead=False, strictWrite=False)
            client = scribe.Client(iprot=protocol, oprot=protocol)
            transport.open()
            result = client.Log(messages=[log_entry])
            transport.close()


for l in Config.items('files'):
    # args must be a one-element tuple, and the thread must be started
    thread = threading.Thread(target=monitor_file, args=(l,))
    thread.start()

1 Comment

This won't tail all the files at the same time unless each thread is actually started (the snippet as originally posted never called thread.start())

A different implementation of @Pengman's idea:

#!/usr/bin/env python
import os
import time
from threading import Thread

def follow(filename):
    with open(filename) as file:
        file.seek(0, os.SEEK_END) # goto EOF
        while True:
            for line in iter(file.readline, ''):
                yield line
            time.sleep(1)

def logtail(category, filename):
    print category
    print filename
    for line in follow(filename):
        print line,
        log_entry(category, line)

for args in Config.items('files'):
    Thread(target=logtail, args=args).start()

Where log_entry() is a copy of the code from the question:

def log_entry(category, message):
    entry = scribe.LogEntry(category, message)
    socket = TSocket.TSocket(host='localhost', port=1463)
    transport = TTransport.TFramedTransport(socket)
    protocol = TBinaryProtocol.TBinaryProtocol(trans=transport,strictRead=False,
                                               strictWrite=False)
    client = scribe.Client(iprot=protocol, oprot=protocol)
    transport.open()
    result = client.Log(messages=[entry])
    transport.close()
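
As a sanity check, the polling follow() can be exercised end to end without scribe at all. Here is a self-contained run in Python 3 syntax; the temp file and writer thread are scaffolding for the demo, not part of the answer:

```python
import os
import tempfile
import threading
import time

def follow(filename):
    # Same logic as the answer's follow(): seek to EOF, then poll for new lines.
    with open(filename) as file:
        file.seek(0, os.SEEK_END)
        while True:
            for line in iter(file.readline, ''):
                yield line
            time.sleep(0.1)

fd, path = tempfile.mkstemp()  # scratch file standing in for a real log
os.close(fd)

def writer():
    time.sleep(0.3)  # let follow() reach EOF before anything is appended
    with open(path, 'a') as f:
        f.write('hello\n')
        f.write('world\n')

threading.Thread(target=writer).start()
gen = follow(path)
lines = [next(gen), next(gen)]  # blocks until the two lines are appended
print(lines)
os.remove(path)
```

Because follow() seeks to the end before polling, only lines appended after the tail starts are yielded, which matches tail -f semantics.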

follow() could be implemented using FS monitoring tools, see tail -f in python with no time.sleep.

3 Comments

Using this, only the first log entry is sent to scribe; subsequent entries are printed but not sent.
@Rishabh: Is log_entry() called concurrently for multiple files (add debug output if you are not sure)? Does log_entry() work if you call it in a loop from a single thread (for i in range(10): log_entry('test', str(i)))? multiple threads (replace follow() with a fake data to test it)?
When I tailed the logs that scribe itself wrote, they were all fine.
