2

I have a csv file like this(date values just 7 days, but not static):

DATE,PATH,%
20.05.2015,RAM,0
20.05.2015,SWAP,00
20.05.2015, /, 000
20.05.2015, /dev, 0000
21.05.2015,RAM,1
21.05.2015,SWAP,11
21.05.2015, /, 111
21.05.2015, /dev, 1111
22.05.2015,RAM,2
22.05.2015,SWAP,22
22.05.2015, /, 222
22.05.2015, /dev, 2222
22.05.2015, /root, 22222
22.05.2015, /root2, 222222
23.05.2015, /root2, 333333
24.05.2015, /root2, 444444
25.05.2015, /root2, 555555
26.05.2015, /root2, 666666

I need to print date values of rows to columns like this:

PATH,20.05.2015,21.05.2015,22.05.2015,23.05.2015,24.05.2015,25.05.2015,26.05.2015
RAM,0,1,2,,,,
SWAP,00,11,22,,,,
/,000,111,222,,,,
/dev,0000,1111,2222,,,,
/root,,,22222,,,,
/root2,,,222222,333333,444444,555555,666666

Any suggestion for that with awk or other way?

I tried solutions on this topic:

An efficient way to transpose a file in Bash

But it's a different question, I can't fix for my question.

Excel view for easy understanding:

enter image description here

3
  • first: any suggestion on how to handle the question? you should provide some indication on what is the logic of this. There is no point in showing two files and ask a script to go from one to another one. Instead, indicate what occurs with the data, what it means and what you tried so far Commented May 26, 2015 at 14:02
  • Edited my question. Date values have just 7 different dates, but not static. But lines not static too, some days just 2 lines, some days more. Thats why I use ... because its not static. Commented May 26, 2015 at 14:06
  • possible duplicate of Transpose a file in bash Commented May 26, 2015 at 14:17

2 Answers 2

1
$ cat tst.awk
BEGIN { FS="[[:space:]]*,[[:space:]]*"; OFS="," }
NR==1 { cell[++numDates,++numPaths] = $2; next }
{ date = $1; path = $2 }
!(date in date2nr) { date2nr[date] = ++numDates; cell[numDates,1] = date }
!(path in path2nr) { path2nr[path] = ++numPaths; cell[1,numPaths] = path }
{ cell[date2nr[date],path2nr[path]] = $3 }
END {
    for (pathNr=1; pathNr<=numPaths; pathNr++) {
        for (dateNr=1; dateNr<=numDates; dateNr++) {
            printf "%s%s", cell[dateNr,pathNr], (dateNr<numDates?OFS:ORS)
        }
    }
}

$ awk -f tst.awk file
PATH,20.05.2015,21.05.2015,22.05.2015,23.05.2015,24.05.2015,25.05.2015,26.05.2015
RAM,0,1,2,,,,
SWAP,00,11,22,,,,
/,000,111,222,,,,
/dev,0000,1111,2222,,,,
/root,,,22222,,,,
/root2,,,222222,333333,444444,555555,666666
Sign up to request clarification or add additional context in comments.

2 Comments

Thank you, added excel example for easy understanding. I'm working on your example, I think I need to select distinct values of dates and print after PATH,
The excel images helped clarify a lot, thanks for providing that. I've updated my answer.
1

You desired output is not a pure transposition of the input, nor is it clear if there is a pure mapping of the number of fields to blank fields in the desired output. (I think that is true.)

If that is correct, this Python program does what you want:

import csv
from collections import OrderedDict

data=OrderedDict()

with open(fn) as f:
    reader=csv.reader(f, skipinitialspace=True)
    header=next(reader)
    data[header[1]]=[]
    for line in reader:
        if line[1] not in data:
            data[line[1]]=[]

with open(fn) as f:
    reader=csv.reader(f, skipinitialspace=True)
    header=next(reader)
    for line in reader:
        data[header[1]].append(line[0])
        data[line[1]].append(line[2])
        for e in set(data.keys())-set([header[1],line[1]]):
            data[e].append('')

for k, v in data.items():
    print k, ','.join(v)    

Prints:

PATH 20.05.2015,20.05.2015,20.05.2015,20.05.2015,21.05.2015,21.05.2015,21.05.2015,21.05.2015,22.05.2015,22.05.2015,22.05.2015,22.05.2015,22.05.2015,22.05.2015,23.05.2015,24.05.2015,25.05.2015,26.05.2015
RAM 0,,,,1,,,,2,,,,,,,,,
SWAP ,00,,,,11,,,,22,,,,,,,,
/ ,,000,,,,111,,,,222,,,,,,,
/dev ,,,0000,,,,1111,,,,2222,,,,,,
/root ,,,,,,,,,,,,22222,,,,,
/root2 ,,,,,,,,,,,,,222222,333333,444444,555555,666666

1 Comment

Edited my question, sorry for misunderstanding. I don't want to multiple column for same date values. Added excel example. Thank you.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.