My connections log file is structured as follows:
hostname direction timestamp bps
Here's a fragment of my log file:
www.youtube.com DOWNLOAD 1479897661131903 23508910
www.youtube.com UPLOAD 1479897661131922 735
fonts.gstatic.com DOWNLOAD 1479897660289990 527
ssl.gstatic.com UPLOAD 1479897660152435 2094
fonts.gstatic.com DOWNLOAD 1479897660290973 6662177
I want to sort it by both timestamp and hostname. I tried:
sort -k 3 -o sortedTimestamps.log connectionLog.txt
and the result is
ssl.gstatic.com UPLOAD 1479897660152435 2094
fonts.gstatic.com DOWNLOAD 1479897660289990 527
fonts.gstatic.com DOWNLOAD 1479897660290973 6662177
www.youtube.com DOWNLOAD 1479897661131903 23508910
www.youtube.com UPLOAD 1479897661131922 735
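For reference, sort accepts several -k keys, so hostname and timestamp can be combined in one pass. Here is a minimal sketch on the sample rows above, assuming whitespace-separated fields; the file names sample.txt and sorted_by_host.txt are placeholders, not part of my setup:

```shell
# Recreate the sample fragment (sample.txt is a placeholder name).
cat > sample.txt <<'EOF'
www.youtube.com DOWNLOAD 1479897661131903 23508910
www.youtube.com UPLOAD 1479897661131922 735
fonts.gstatic.com DOWNLOAD 1479897660289990 527
ssl.gstatic.com UPLOAD 1479897660152435 2094
fonts.gstatic.com DOWNLOAD 1479897660290973 6662177
EOF

# -k1,1 restricts the first key to the hostname field only;
# -k3,3n then orders rows within a hostname by timestamp, compared numerically.
sort -k1,1 -k3,3n -o sorted_by_host.txt sample.txt
cat sorted_by_host.txt
```

Writing the key as -k1,1 rather than -k1 matters: a bare -k1 extends the key from field 1 to the end of the line, so later keys would rarely get a chance to break ties.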
This is just a sample: the real file has many more rows, and with the sort above it is only sorted by timestamp. Since I need to plot this, I'd like to split it into separate log files by hostname and direction, each containing only timestamp and bps.
The final result would be one log file for each hostname-direction pair:
www.youtube.com_DOWNLOAD_log,
www.youtube.com_UPLOAD_log,
fonts.gstatic.com_DOWNLOAD_log,
fonts.gstatic.com_UPLOAD_log
and so on; each log file should contain just two columns: the timestamp, in sorted order, and its corresponding bps.
E.g.: www.youtube.com_DOWNLOAD_log contains:
timestamp1 bps1
timestamp2 bps2
timestamp3 bps3
...
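To make the target concrete, here is a rough sketch of the split on the sample rows, assuming whitespace-separated fields and the HOSTNAME_DIRECTION_log naming described above; sample.txt is a placeholder file name:

```shell
# Recreate the sample fragment (sample.txt is a placeholder name).
cat > sample.txt <<'EOF'
www.youtube.com DOWNLOAD 1479897661131903 23508910
www.youtube.com UPLOAD 1479897661131922 735
fonts.gstatic.com DOWNLOAD 1479897660289990 527
ssl.gstatic.com UPLOAD 1479897660152435 2094
fonts.gstatic.com DOWNLOAD 1479897660290973 6662177
EOF

# Sort by hostname, then direction, then numeric timestamp, and let awk
# route each row to <hostname>_<direction>_log, keeping only columns 3 and 4.
sort -k1,1 -k2,2 -k3,3n sample.txt |
awk '{
    out = $1 "_" $2 "_log"
    if (out != prev) {            # input is grouped, so a new name means the old file is finished
        if (prev != "") close(prev)
        prev = out
    }
    print $3, $4 > out            # ">" truncates only when the file is first opened
}'

cat fonts.gstatic.com_DOWNLOAD_log
```

Closing each file once its group ends keeps the number of open files at one, which may matter when the log contains many distinct hostnames.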
Plotted on a graph, the X-axis would be the timestamp and the Y-axis the bps. I will plot them all together to see how bps changes over time for the various connections.
P.S.: this is my first attempt at visualizing data, so there may be a smarter way to plot a log file structured like mine, but since questions here should be answered and not discussed, please help me split my log file into multiple log files, one for each hostname-direction pair.
Edit(2): thanks to Kalavan. Oh, the pipe! Oh, the power of Bash! I love it! Here's my full script:
#!/bin/bash
echo -e "\nCleaning previous log files...\n"
rm -f -- *.log
# File name: HOSTNAME_DIRECTION.log
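# Sort by hostname, direction, then numeric timestamp; each output file gets "timestamp bps" (fields 3 and 4)
sort -k1,1 -k2,2 -k3,3n connectionLog.txt | awk '{print $3 " " $4 >> ($1 "_" $2 ".log")}'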
# Build one gnuplot "plot" command per direction
to_plot_upload_files="plot "
to_plot_download_files="plot "
# The separator goes between entries, so the command does not end with a dangling comma
sep=""
for file in *UPLOAD.log; do
    to_plot_upload_files="$to_plot_upload_files$sep\"$file\" using 1:2 with lines"
    sep=", "
done
sep=""
for file in *DOWNLOAD.log; do
    to_plot_download_files="$to_plot_download_files$sep\"$file\" using 1:2 with lines"
    sep=", "
done
echo "$to_plot_upload_files" | gnuplot -persist
echo "$to_plot_download_files" | gnuplot -persist