3

I'm writing a script that parses a file of HTTP traffic lines, extracts the domains, and for now just prints them to the screen. I'm using httpry to continuously write the traffic to a file. Here is the script I'm using to extract the domain names:

#!/usr/bin/python

infile = open("results.txt", "r")

for line in infile:
    domain = line.split()[6]  # seventh whitespace-separated field
    if domain != "-":  # skip the "-" placeholder
        print(domain)

While this script works great, I'd like a way to run it continuously, so that as new traffic is appended to the input file, the script extracts the new domains as well. I can't just run awk on the output of httpry, as I'm eventually going to be inserting these domains into a Mongo database, and I'll need the script to do that too. If anyone could give me some ideas on how to run this Python script constantly against the output without reprinting previous entries, it would be much appreciated. Thanks.

2 Answers

6

Try this tail -f implementation as found at http://code.activestate.com/recipes/157035-tail-f-in-python/

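import sys
import time

f = open("results.txt", "r")  # the file httpry is writing to

while True:
    where = f.tell()
    line = f.readline()
    if not line:
        # nothing new yet: wait, then go back to the last read position
        time.sleep(1)
        f.seek(where)
    else:
        sys.stdout.write(line)  # line already has a newline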
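To tie this back to the question, here is a rough sketch of how the tail loop and the domain extraction might be combined. It is written for Python 3, and the names follow, extract_domain and main are made up for illustration; they are not part of the recipe. It assumes the domain is the seventh whitespace-separated field and that "-" means no domain, as in the question's script. Incomplete lines (when the writer is caught mid-line) are buffered until their newline arrives.

```python
import time


def extract_domain(line, field=6):
    """Return the domain field from one traffic line, or None if it is missing."""
    parts = line.split()
    if len(parts) <= field or parts[field] == "-":
        return None
    return parts[field]


def follow(f, poll_interval=1.0):
    """Yield complete lines from f as they are appended, similar to tail -f."""
    buffered = ""
    while True:
        chunk = f.readline()
        if not chunk:
            # nothing new yet: wait and try again
            time.sleep(poll_interval)
            continue
        buffered += chunk
        if buffered.endswith("\n"):
            yield buffered
            buffered = ""


def main(path="results.txt"):
    """Print each new domain as it shows up in the file (runs until interrupted)."""
    with open(path, "r") as f:
        for line in follow(f):
            domain = extract_domain(line)
            if domain is not None:
                print(domain)  # placeholder: a database insert could go here instead
```

Calling main("results.txt") starts from the beginning of the file, prints every domain once, and then keeps waiting for new lines, so earlier entries are not reprinted within a single run.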

0

Node.js has a nice readline module that should handle this nicely:

var readline = require('readline')
  , fs = require('fs')

var input = process.stdin; // or: fs.createReadStream('input.txt');
var output = process.stdout; // or: fs.createWriteStream('output.txt')

var reader = readline.createInterface({
  input: input,
  output: output
});

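reader.on('line', function(line) {
  var domain = line.split(/[ ]+/)[6];
  // skip short lines and the "-" placeholder
  if (domain && domain !== '-') {
    output.write(domain + '\n');
  }
});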

Save this in a .js file and run node domains.js (or whatever you named it), or pipe a file into it: cat file | node domains.js.

It should integrate nicely with mongodb in the future, too :)
