UNIX - Adding header to the first column of every line

Question

I have a question about processing files in UNIX line by line. What I have right now is this -

Source file:

header-1 header-sub1
field1|field2|field3|field4
field5|field6|field7|field8
header-2
field9|field0|fieldA|fieldB

Now I want to process this file line by line and generate an output file. Each header should be prepended as the first column of every line that follows it, until the next header is found. In essence, the output file should look like this:

Output:

header-1 header-sub1|field1|field2|field3|field4
header-1 header-sub1|field5|field6|field7|field8
header-2|field9|field0|fieldA|fieldB

The shell script loop that I have with me is this -

while read line 
do
    echo "Line ---> ${line}"
    if [ $line = "header-1" -o $line = "header-2" ]
    then
        first_col=$line
    else
        complete_line=`echo $first_col"|"$line`
        echo "$complete_line" >> out.csv
    fi
done < input.txt

Shouldn't the loop read the input file line by line and build the combined "complete line"? The problem is that the test treats header-1 and header-sub1 as two distinct words, so it never matches the complete first header line. They are on the same line, though, so they should be handled as a single string. Or am I missing something in the logic and/or syntax?

Also is there any way I can use sed or awk to create such a file? Thanks in advance for any suggestions.

fedorqui · Accepted Answer · 2014-03-03 13:12:34Z

4

You can use this awk:

$ awk 'BEGIN{OFS="|"} /^header/ {h=$0; next} {print h, $0}' file
header-1 header-sub1|field1|field2|field3|field4
header-1 header-sub1|field5|field6|field7|field8
header-2|field9|field0|fieldA|fieldB

Explanation

BEGIN{OFS="|"} sets the output field separator to |.
/^header/ {h=$0; next} if the line starts with header, stores the whole line in h and skips to the next line without printing.
{print h, $0} on all other lines, prints the stored header, then the separator, then the line itself.
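
As for the loop in the question: the unquoted $line inside [ ... ] is split into several words, and the test only compares against the exact strings header-1 and header-2, so the line "header-1 header-sub1" never matches. A minimal sketch of a repaired loop follows, assuming (as the awk above does) that every header line starts with "header". The function name prefix_headers is made up for illustration.

```shell
#!/bin/sh
# Sketch: prefix each data line with the most recent header line.
# Assumption: header lines are exactly those starting with "header".
# prefix_headers is a hypothetical helper name, not from the question.
prefix_headers() {
    first_col=""
    # IFS= and -r keep each line intact (no trimming, no backslash processing);
    # the || clause also handles a last line without a trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        case $line in
            header*) first_col=$line ;;                       # remember header, print nothing
            *) printf '%s|%s\n' "$first_col" "$line" ;;       # header|data line
        esac
    done
}
```

It reads standard input and writes standard output, so with the file names from the question it could be called as prefix_headers < input.txt > out.csv.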


9 Comments

KG3 Over a year ago

Thanks. I will try this and post the results here.

jaypal singh Over a year ago

+1 though instead of using header as regex, you can use NF to store the lines where NF==1 as headers.

jaypal singh Over a year ago

Since you split the lines on | it will still consider it as header. But it all depends on OP's data.

fedorqui Over a year ago

Aaaaah sorry, @jaypal I misunderstood your previous comment. I thought you were saying NR==1, while now I see you said NF==1. That makes a lot of sense, yes. I will leave it like it is because, as you said, we don't know what OP's data looks like. Thanks master :)

fedorqui Over a year ago

@jaypal I also cannot help checking SO from my phone :) Yes, it initially had FS set but then I saw it is not necessary. Regarding OFS, it is needed in the print h, $0. Otherwise it would print a space instead of |.
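
The NF==1 idea from the comments above could look roughly like this. It is only a sketch, and it assumes that header lines never contain a | while every data line contains at least one. The wrapper name prefix_headers_nf is invented for the example.

```shell
#!/bin/sh
# Sketch of the NF==1 variant discussed in the comments.
# Assumption: a line with no "|" (exactly one field when splitting on "|") is a header.
# prefix_headers_nf is a hypothetical helper name.
prefix_headers_nf() {
    # -F'|' splits input on "|"; OFS='|' is what print puts between h and $0.
    awk -F'|' -v OFS='|' '
        NF == 1 { h = $0; next }   # single field: remember it as the current header
        { print h, $0 }            # otherwise: header, separator, unchanged line
    ' "$@"
}
```

With FS set to |, the line "header-1 header-sub1" counts as one field even though it contains a space, so it is still detected as a header without relying on the word "header".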


potong · Answer · 2014-03-03 15:51:14Z

1

This might work for you (GNU sed):

sed -r '/^header/{h;d};G;s/(.*)\n(.*)/\2|\1/' file

This stores each header in the hold space and inserts it, followed by |, before every non-header line.
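
Spelled out one command per line, the same script reads as in the sketch below. It uses -E instead of -r for extended regular expressions (GNU and BSD sed both accept -E), and the wrapper name prefix_headers_sed is made up for illustration.

```shell
#!/bin/sh
# Sketch: the one-liner above, one sed command per line.
# prefix_headers_sed is a hypothetical helper name.
# Steps:
#   /^header/ { h; d }   header line: copy it to the hold space, delete it,
#                        and start the next cycle without printing
#   G                    data line: append a newline plus the held header
#   s/.../\2|\1/         swap the two parts, so the result is header|data
prefix_headers_sed() {
    sed -E '
        /^header/ {
            h
            d
        }
        G
        s/(.*)\n(.*)/\2|\1/
    ' "$@"
}
```

It can be used as prefix_headers_sed file, or at the end of a pipeline when no file argument is given.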
