I have a number of CSV files that all use the following format:

file_1.csv
line 1 -- header row
line 2 -- header row
line 3 -- data row

file_2.csv
line 1 -- header row
line 2 -- header row
line 3 -- data row
...
file_n.csv
line 1 -- header row
line 2 -- header row
line 3 -- data row

I would like a script that merges them all into a single file, with the 2 header lines copied only once, as follows:

fileMerged.csv
line 1 -- header row
line 2 -- header row
line 3 -- data row from file_1
line 4 -- data row from file_2
...
line n+2 -- data row from file_n

What is the best way to achieve this on a Linux server?

2 Answers

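Use awk. The condition FNR==NR holds only while the first file is being read, so all of its lines (headers included) are printed; FNR>2 then prints everything from line 3 onward in each remaining file: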

awk 'FNR==NR||FNR>2' file_*.csv > fileMerged.csv
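
To see the effect of the condition, here is a small worked run in a scratch directory. The directory variable demo_dir and the sample file contents are made up for illustration; only the awk command comes from the answer above.

```shell
# Scratch directory with three tiny sample files (contents are illustrative).
demo_dir=$(mktemp -d)
printf 'id,name\nint,str\n1,alpha\n' > "$demo_dir/file_1.csv"
printf 'id,name\nint,str\n2,beta\n'  > "$demo_dir/file_2.csv"
printf 'id,name\nint,str\n3,gamma\n' > "$demo_dir/file_3.csv"

# All lines of the first file, then only lines 3 and later of the others.
awk 'FNR==NR||FNR>2' "$demo_dir"/file_*.csv > "$demo_dir/fileMerged.csv"

cat "$demo_dir/fileMerged.csv"
# id,name
# int,str
# 1,alpha
# 2,beta
# 3,gamma
```

Note that the shell expands file_*.csv in lexicographic order, so file_10.csv would be read before file_2.csv; if the row order matters, zero-padded names are one way to avoid that.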

#!/usr/bin/env bash

files=( file_*.csv ) # collect input filenames in an array

{
  head -n 2 "${files[0]}" # output the header lines (using the 1st file)
  tail -q -n +3 "${files[@]}" # append the data lines from all files, in sequence
}  > out.csv
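
The same head/tail approach can be wrapped in a function so that the number of header lines and the output name are not hard-coded. This is only a sketch: merge_csv and its argument order are hypothetical names chosen here, and tail -q is assumed to be available (it is in GNU coreutils).

```shell
# merge_csv HEADER_LINES OUTFILE FILE...   (hypothetical helper, not a standard tool)
merge_csv() {
  header_lines=$1
  outfile=$2
  shift 2
  # Refuse to run without input files, e.g. when a glob matched nothing.
  if [ "$#" -eq 0 ]; then
    echo "merge_csv: no input files" >&2
    return 1
  fi
  {
    head -n "$header_lines" "$1"                      # header lines from the first file
    tail -q -n +"$(( header_lines + 1 ))" "$@"        # data lines from every file, in order
  } > "$outfile"
}
```

For the layout in the question this would be called as merge_csv 2 out.csv file_*.csv. The output name must not itself match the input glob, otherwise the output file would be read as an input.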

lihao's elegant answer offers a simpler solution that clearly satisfied the OP's requirements.


If you're interested in a variation of the problem in which the data lines are interleaved cyclically (the first data line from each input file, followed by the second data line from each, and so on):

#!/usr/bin/env bash

files=( file_*.csv ) # collect input filenames in an array

{
  head -n 2 "${files[0]}" # output the header lines (using the 1st file)
  paste -d'\n' "${files[@]}" | tail -n +"$(( 1 + 2 * ${#files[@]} ))"
}  > out.csv
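
A small worked run of the interleaving variant, assuming two sample files with two data rows each. The variable cyc_dir and the file contents are made up for illustration, and positional parameters stand in for the array used above.

```shell
# Scratch directory with two sample files: 2 header lines + 2 data lines each.
cyc_dir=$(mktemp -d)
printf 'h1\nh2\na1\na2\n' > "$cyc_dir/file_1.csv"
printf 'h1\nh2\nb1\nb2\n' > "$cyc_dir/file_2.csv"

set -- "$cyc_dir"/file_*.csv   # input files as positional parameters
{
  head -n 2 "$1"               # header lines from the first file
  # paste emits h1 h1 h2 h2 a1 b1 a2 b2; skip the first 2 * $# interleaved header lines.
  paste -d'\n' "$@" | tail -n +"$(( 1 + 2 * $# ))"
} > "$cyc_dir/out.csv"

cat "$cyc_dir/out.csv"
# h1
# h2
# a1
# b1
# a2
# b2
```

If the input files have different numbers of lines, paste emits an empty line in place of each missing one, so this variant works cleanly only when all files are the same length.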
