3

I am having a single file in which it contains 1st and 2nd column with item code and name, then from 3rd to 12th column which contains its 10 days consumption quantity continuously. Now i need to convert that into 10 different files. In each the 1st and 2nd column should be the same item code and item name and the 3rd column will contain the consumption quantity of one day in each..

input file:

Code  | Name | Day1 | Day2 | Day3 |... 

10001 | abcd | 5 | 1 | 9 |...    
10002 | degg | 3 | 9 | 6 |...    
10003 | gxyz | 4 | 8 | 7 |...

I need the Output in different file as

file 1:

Code  | Name | Day1

10001 | abcd | 5   
10002 | degg | 3   
10003 | gxyz | 4   

file 2:

Code  | Name | Day2

10001 | abcd | 1   
10002 | degg | 9   
10003 | gxyz | 8  

file 3:

Code  | Name | Day3

10001 | abcd | 9   
10002 | degg | 6   
10003 | gxyz | 7 

and so on....

I wrote a code like this

awk 'BEGIN { FS = "\t" } ; {print $1,$2,$3}' FILE_NAME > file1;
awk 'BEGIN { FS = "\t" } ; {print $1,$2,$4}' FILE_NAME > file2;
awk 'BEGIN { FS = "\t" } ; {print $1,$2,$5}' FILE_NAME > file3;

and so on...

Now i need to write it with in a 'for' or 'while' loop which would be faster...

I dont know the exact code, may be like this..

for (( i=3; i<=NF; i++)) ; do awk 'BEGIN { FS = "\t" } ; {print $1,$2,$i}' input.tsv > $i.tsv; done

kindly help me to get the output as i explained.

3
  • you are mixing shell and awk... use awk alone.. gnu.org/software/gawk/manual/html_node/For-Statement.html Commented May 14, 2017 at 12:06
  • Sorry, Im unaware to differentiate awk and shell. If possible, Kindly tell me the code directly to get that output. @Sundeep Commented May 14, 2017 at 12:17
  • have a look at syntax from the doc link in earlier comment... you just need to move that for loop inside awk... give it a try Commented May 14, 2017 at 12:18

3 Answers 3

2

If you absolutely need to to use a loop in Bash, then your loop can be fixed like this:

for ((i = 3; i <= 10; i++)); do awk -v field=$i 'BEGIN { FS = "\t" } { print $1, $2, $field }' input.tsv > file$i.tsv; done

But it would be really better to solve this using pure awk, without shell at all:

awk -v FS='\t' '
  NR == 1 {
    for (i = 3; i < NF; i++) {
      fn = "file" (i - 2) ".txt";
      print $1, $2, $i > fn;
      print "" >> fn;
    }
  }
  NR > 2 {
    for (i = 3; i < NF; i++) {
      fn = "file" (i - 2) ".txt";
      print $1, $2, $i >> fn;
    }
  }' inputfile

That is, when you're on the first record, create the output files by writing the header line and a blank line (as in specified in your question).

For the 3rd and later records, append to the files.

Note that the code in your question suggests that the fields in the file are separated by tabs, but the example files seem to use | padded with variable number of spaces. It's not clear which one is your actual case. If it's really tab-separated, then the above code will work. If in fact it's as the example inputs, then change the first line to this:

awk -v OFS=' | ' -v FS='[ |]+' '
Sign up to request clarification or add additional context in comments.

3 Comments

Hi Janos, Can you give your mail id. I want to show u something about my original requirement. @janos
Hi @ArunVenkitusamy, I would rather not. If your real requirement is different from what is in your question, that's very unfortunate, and I wish you would have written that first. It's not fair to ask something, get an answer, and then change the question to something else. If some minor clarification is needed, edit your question and maybe we can help. If you need something different, it would be better to ask a new question.
Hi @janos, sorry for wasting your time.. I have created a new question. Kindly have a look [stackoverflow.com/questions/43965359/…
2

bash + cut solution:

input.tsv test content:

Code | Name | Day1 | Day2 | Day3
10001 | abcd | 5 | 1 | 9
10002 | degg | 3 | 9 | 6
10003 | gxyz | 4 | 8 | 7

day_splitter.sh script:

#!/bin/bash

n=$(cat $1 | head -1 | awk -F'|' '{print NF}') # total number of fields
for ((i=3; i<=$n; i++))
do
    fn="Day"$(($i-2))  # file name containing `Day` number 
    $(cut -d'|' -f1,2,$i $1 > $fn".txt")
done

Usage:

bash day_splitter.sh input.tsv

Results:

$cat Day1.txt
Code | Name | Day1 
10001 | abcd | 5 
10002 | degg | 3 
10003 | gxyz | 4 

$cat Day2.txt
Code | Name | Day2 
10001 | abcd | 1 
10002 | degg | 9 
10003 | gxyz | 8

$cat Day3.txt
Code | Name | Day3
10001 | abcd | 9
10002 | degg | 6
10003 | gxyz | 7

Comments

1

In pure awk:

$ awk 'BEGIN{FS=OFS="|"}{for(i=3;i<=NF;i++) {f="file" (i-2); print $1,$2,$i >> f; close(f)}}' file

Explained:

$ awk '
BEGIN {
    FS=OFS="|" }             # set delimiters
{
    for(i=3;i<=NF;i++) {     # loop the consumption fields
        f="file" (i-2)       # create the filename
        print $1,$2,$i >> f  # append to target file
        close(f) }           # close the target file
}' file

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.