Tuesday, 7 August 2012

Apache Logging to MongoDB Using a Named Pipe

Following on from the last article on remote logging, where I collected all the logs into one place, I wanted to be able to query them, so that I can ask questions like "which pages take longer than 5 seconds to produce?", "how many of our redirects fail?" and the like.

I believe that there are already MySQL modules for rsyslogd, but I had written a script in Python to interpret logs and put them into MongoDB, so I wanted to use that.

The obvious way to join the two programs together seemed to be a named pipe, a very basic type of interprocess communication where one program chucks bits down the pipe and the other pulls them out.

Chucking them in is easy enough: first create your pipe with mkfifo, then just change the line in the rsyslog server config from:

:programname, isequal, "apache2" /var/log/oneGiantHeapOfLogs.log

to:

:programname, isequal, "apache2" |/tmp/logger_pipe
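
If you would rather have the reading script create the pipe itself than run mkfifo by hand, something like the following should do it. This is only a sketch: the helper name ensure_fifo and the 0o600 default mode are my own assumptions, and the mode in particular will need loosening if rsyslog and the reader run as different users.

```python
import os
import stat


def ensure_fifo(path, mode=0o600):
    """Create a named pipe at `path` unless one is already there.

    `ensure_fifo` and the default `mode` are illustrative assumptions,
    not part of the original setup.
    """
    try:
        os.mkfifo(path, mode)
    except FileExistsError:
        # Something already exists at this path; only accept it if it is a FIFO.
        if not stat.S_ISFIFO(os.stat(path).st_mode):
            raise
    return path
```

Calling ensure_fifo("/tmp/logger_pipe") before opening the pipe for reading would then be the equivalent of running mkfifo /tmp/logger_pipe once from the shell.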

Reading from the pipe was pretty easy too, as this Python snippet shows:

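import sys

# Open the pipe; its path is the first command-line argument
in_pipe = open(sys.argv[1], "r")
# Loop forever
while True:
    line = in_pipe.readline()[:-1]    # readline() blocks while a writer is attached; [:-1] drops the newline
    if len(line) == 0:                # Happens in test: blank line, or the writer closed the pipe
        continue                      # Assumed handling: nothing to parse, so read again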

In short, open the pipe, loop forever reading lines. We want this to keep going so the program is wrapped in an exception handler to catch any log parsing errors.
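
The exception handling described above might look roughly like this. It is a sketch rather than the actual script: process_lines and handle_line are hypothetical names, with handle_line standing in for whatever parses a log line and stores it.

```python
import sys


def process_lines(stream, handle_line, log=sys.stderr):
    """Pass each non-empty line from `stream` to `handle_line`.

    A line that makes `handle_line` raise is reported and skipped, so one
    bad log entry does not stop the reader. Returns (handled, failed).
    """
    handled = failed = 0
    # iter(readline, "") keeps calling readline() until it returns "" (end of file)
    for raw in iter(stream.readline, ""):
        line = raw.rstrip("\n")
        if not line:
            continue
        try:
            handle_line(line)
            handled += 1
        except Exception as exc:
            failed += 1
            print("skipping line %r: %s" % (line, exc), file=log)
    return handled, failed
```

With the pipe opened as in the snippet above, process_lines(in_pipe, handle_line) would run until the writing end is closed. Catching Exception broadly is deliberate here, since the aim is to keep reading whatever the parser trips over.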

Having got the input to the program, it's just a problem of parsing that input; we can take a look at that in another post.