
I need to sort many files and then dump the sorted output into multiple files named 1.csv, 2.csv, 3.csv, and so on, with each output file of equal size.

The following pipe sorts and then dumps everything into a single huge file:

cat input_files | sort > one_huge_file

How do I dump into multiple files?

  • Note that you should be using sort input_files > one_huge_file, or even sort -o one_huge_file input_files, which has the additional (possible) benefit that one_huge_file could be one of the input files, though in this case it probably wouldn't be. The cat | sort notation is a candidate for the UUOC award.
  • How do you define 'equal size'? The same number of lines? Do you know how many lines are in the source files? How many output files do you need?
  • @Jonathan Leffler: good advice on keeping one_huge_file. Equal size means the same number of lines; each source file can contain a different number of lines, and I can do a count on each source file. I want, for example, 10000 lines per output file.
  • @Jonathan Leffler: what's wrong with the cat | sort notation? Is there a better way?
  • @user121196 There is nothing wrong with cat | sort per se, but why fork another process when sort can read the input file on its own? You are only calling cat so that sort can read the stream of data through the pipe, aren't you?

1 Answer


Take a look at this useful tool:

$ man split
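
For the asker's stated goal (10000 sorted lines per output file, per the comments), a minimal sketch could look like the following. It assumes GNU split for the -d (numeric suffixes) option; the part_ prefix and the rename loop are only illustrative, since split by itself cannot name its outputs 1.csv, 2.csv, ... directly.

# Sort all inputs and cut the sorted stream into 10000-line chunks.
# -l 10000: lines per chunk; -d: numeric suffixes (GNU split);
# '-': read from stdin; part_: output-name prefix (part_00, part_01, ...).
sort input_files | split -l 10000 -d - part_

# Rename the chunks to 1.csv, 2.csv, ... (naming taken from the question).
n=1
for f in part_*; do
    mv "$f" "$((n++)).csv"
done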