I have multiple websites running as Apache2 virtual hosts, and I wanted to use GoAccess to process the access.log for each site.
The directory structure is as follows:
/home/www/site1/html
/home/www/site1/log
/home/www/site1/stats
/home/www/site2/html
/home/www/site2/log
/home/www/site2/stats
Some sites have two different access logs:
ssl.access.log for SSL
access.log for non-SSL
These are located in the log directory of each site.
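To check which files a glob picks up under this layout, one option is to rebuild the tree in a scratch directory and expand the pattern there. This is only a sketch: the find_logs function name and the mktemp scratch directory are assumptions for illustration, site2 is given only a non-SSL log as an example, and the pattern here is *access.log rather than the shorter *ess.log.

```bash
#!/bin/bash
# find_logs is a hypothetical helper: print every access log under a web root
find_logs() {
    local root=$1 f
    for f in "$root"/*/log/*access.log; do
        # an unmatched glob is left as literal text, so only print real files
        [ -f "$f" ] && printf '%s\n' "$f"
    done
    return 0
}

# build a throwaway copy of the layout (assumed example data)
root=$(mktemp -d)
trap 'rm -rf "$root"' EXIT
mkdir -p "$root/site1/log" "$root/site2/log"
touch "$root/site1/log/access.log" "$root/site1/log/ssl.access.log" \
      "$root/site2/log/access.log"

# lists both logs for site1 and the single log for site2
find_logs "$root"
```

Because the function skips anything that is not a regular file, a web root with no matching logs simply produces no output.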
I wanted a cron job to run every night to process the stats with GoAccess, but I didn't want to write multiple lines of nearly duplicate commands.
I've never written a bash script before, so I don't know whether this is the most efficient way of doing things.
Each generated report has the year and month in its filename, so each night's run overwrites it with that month's latest stats.
The reports are written to the stats directory of each site, in the following format:
yyyy-mm.html
sslyyyy-mm.html
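The naming rule above can be sketched as a small function. This is a minimal illustration, not a fixed part of the setup: report_name is a hypothetical helper name, and it assumes the log is called either access.log or ssl.access.log.

```bash
#!/bin/bash
# report_name is a hypothetical helper, not part of GoAccess.
# Usage: report_name <log filename> <yyyy-mm>
report_name() {
    local filename=$1 month=$2
    # keep everything before the first dot: "ssl" or "access"
    local prefix=${filename%%.*}
    # reports for the plain access.log carry no prefix
    if [ "$prefix" = "access" ]; then
        prefix=""
    fi
    printf '%s%s.html\n' "$prefix" "$month"
}

# report names for the current month
report_name "access.log" "$(date +%Y-%m)"
report_name "ssl.access.log" "$(date +%Y-%m)"
```

Keeping the rule in one place makes it easy to try with a fixed month instead of today's date.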
The Script
#!/bin/bash
# match all log files ending in ess.log (access.log and ssl.access.log)
LOG_FILES="/home/www/*/log/*ess.log"
# current year and month (yyyy-mm), used in the report filename
NOW=$(date +"%Y-%m")
# loop through each log file
for f in $LOG_FILES
do
# skip the literal pattern if the glob matched nothing
[ -e "$f" ] || continue
# drop back from /home/www/site/log to /home/www/site
path=$(dirname "$f")
path=$(dirname "$path")
# get the current log filename
filename=$(basename "$f")
# create /home/www/site/stats if it does not exist
mkdir -p "$path/stats"
# get the first part of the log filename (everything before the first dot)
prefix=${filename%%.*}
# if it's equal to access, then it's not an ssl log, so remove the prefix
if [ "$prefix" = 'access' ]; then
prefix=''
fi
# run the goaccess process
goaccess -f "$f" --date-format=%d/%b/%Y --log-format='%h %^[%d:%^] "%r" %s %b "%R" "%u"' -a > "$path/stats/$prefix$NOW.html"
done
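A possible alternative to the two dirname calls is bash parameter expansion, which strips path components without starting a subprocess. The sketch below assumes the path always has the form site/log/file; site_dir is a hypothetical helper name used only for illustration.

```bash
#!/bin/bash
# site_dir is a hypothetical helper:
# /home/www/site1/log/access.log -> /home/www/site1
site_dir() {
    local log=$1
    local dir=${log%/*}         # strip the filename -> /home/www/site1/log
    printf '%s\n' "${dir%/*}"   # strip the log dir  -> /home/www/site1
}

site_dir "/home/www/site1/log/ssl.access.log"   # prints /home/www/site1
```

Unlike dirname, this does not normalise trailing slashes or handle paths without a slash, so it only suits paths produced by the glob in the script.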
I know this is a fairly simple task, but as I have no specific experience with this, it would be great to know where I could improve it.