I have a file like this:
Date Time Level Gridsize Miss Temp Parameter ID
1988-05-16 12:00:00 -3 1 0 27.060 -1
1988-05-16 12:00:00 -3 1 0 9.0300 -2
1988-05-16 12:00:00 -3 1 0 1.2000 -3
1988-05-17 12:00:00 -3 1 0 27.100 -1
1988-05-17 12:00:00 -3 1 0 9.0200 -2
1988-05-17 12:00:00 -3 1 0 1.2300 -3
1988-05-18 12:00:00 -3 1 0 27.190 -1
1988-05-18 12:00:00 -3 1 0 9.0400 -2
1988-05-18 12:00:00 -3 1 0 1.2200 -3
These are temperature data from sensors in different locations. The field Parameter ID indicates whether the data came from sensor -1, -2 or -3. I'm exporting this data to a CSV file to analyze it. The desired format is:
Date Time -1 -2 -3
1988-05-16 12:00:00 27.060 9.0300 1.2000
1988-05-17 12:00:00 27.100 9.0200 1.2300
1988-05-18 12:00:00 27.190 9.0400 1.2200
It groups the rows by date and time, and spreads the values into columns according to the value of Parameter ID.
I'm not sure how doable this is with AWK, but since I mostly work over SSH to prepare this data, a solution with AWK would be very interesting to me. I'm also open to other tools available in bash (sed, whatever), but it has to be native Linux commands because I'm not allowed to install anything on the server.
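To show what I'm after, here's a rough attempt at doing the whole pivot in a single awk program (not tested against the real data set; the sample rows are inlined via a here-doc just for illustration, and the three sensor IDs are hard-coded):

```shell
awk 'NR > 1 {                       # skip the header line
    key = $1 ";" $2                 # group on Date + Time
    if (!(key in seen)) { seen[key] = 1; order[++n] = key }
    val[key, $NF] = $6              # store Temp under its Parameter ID
}
END {
    OFS = ";"
    print "Date;Time;-1;-2;-3"
    for (i = 1; i <= n; i++) {      # emit rows in first-seen order
        k = order[i]
        print k, val[k, "-1"], val[k, "-2"], val[k, "-3"]
    }
}' <<'EOF' > netcdf_pivot.csv
Date Time Level Gridsize Miss Temp Parameter ID
1988-05-16 12:00:00 -3 1 0 27.060 -1
1988-05-16 12:00:00 -3 1 0 9.0300 -2
1988-05-16 12:00:00 -3 1 0 1.2000 -3
1988-05-17 12:00:00 -3 1 0 27.100 -1
1988-05-17 12:00:00 -3 1 0 9.0200 -2
1988-05-17 12:00:00 -3 1 0 1.2300 -3
EOF
```

In practice the here-doc would be replaced by the real input file. I don't know if this is robust, though (e.g. when a sensor is missing a reading for some timestamp, that cell comes out empty).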
What I'm doing nowadays
Currently I run one command per sensor (the input file name here is just illustrative):
$ awk '$NF == "-1" {print $1";"$2";"$6}' data.txt > netcdf_extract_1.csv
$ awk '$NF == "-2" {print $1";"$2";"$6}' data.txt > netcdf_extract_2.csv
$ awk '$NF == "-3" {print $1";"$2";"$6}' data.txt > netcdf_extract_3.csv
Then I import them in Python, group them by date and time, and get my table.
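The Python side is roughly the following (a stdlib-only sketch of the merge step, wrapped in a hypothetical helper `merge_sensor_csvs`; the file names are the ones produced by the awk commands above):

```python
import csv

def merge_sensor_csvs(paths, out_path):
    """Merge per-sensor CSVs of the form Date;Time;Temp on (Date, Time).

    `paths` maps a sensor id like "-1" to its CSV file; the output
    columns follow the order of `paths`, and missing readings are left
    as empty cells."""
    rows = {}       # (date, time) -> {sensor_id: temp}
    order = []      # keep first-seen row order
    for sensor, path in paths.items():
        with open(path, newline="") as f:
            for date, time, temp in csv.reader(f, delimiter=";"):
                key = (date, time)
                if key not in rows:
                    rows[key] = {}
                    order.append(key)
                rows[key][sensor] = temp
    with open(out_path, "w", newline="") as f:
        w = csv.writer(f, delimiter=";")
        w.writerow(["Date", "Time", *paths])
        for key in order:
            w.writerow([*key, *(rows[key].get(s, "") for s in paths)])
```

Called as, e.g., `merge_sensor_csvs({"-1": "netcdf_extract_1.csv", "-2": "netcdf_extract_2.csv", "-3": "netcdf_extract_3.csv"}, "netcdf_pivot.csv")`. It works, but I'd rather do the whole thing on the server in one pass.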