How can I split a file into multiple files based on the value of the second comma-separated field ($2)?

Example input:

123,hello,world
124,hello,planet
125,universe,hello
126,hello,universe

Desired output:

hello.txt >

123,hello,world
124,hello,planet
126,hello,universe

universe.txt >

125,universe,hello

1 Answer

With GNU awk:

awk '{name=$2 ".txt"; print >>name; close(name)}' FS=',' file

A very similar question at stackoverflow.com: split file by lines and keep the first string as a header for output files
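As a quick check, the command can be run against the sample input from the question. This is only a sketch: the scratch directory is an assumption added so the run is repeatable, and the separator is passed with -F instead of a trailing FS=',' assignment, which should behave the same here. Because the command uses >>, running it a second time in the same directory appends to the existing hello.txt and universe.txt rather than replacing them.

```shell
# Work in a scratch directory so that no existing *.txt files get appended to.
tmpdir=$(mktemp -d) && cd "$tmpdir" || exit 1

# Recreate the sample input from the question ("file" is the name used in the answer).
cat > file <<'EOF'
123,hello,world
124,hello,planet
125,universe,hello
126,hello,universe
EOF

# Same logic as the answer: append each line to <second field>.txt,
# closing the output file after every line.
awk -F, '{name = $2 ".txt"; print >> name; close(name)}' file

# Inspect the results.
cat hello.txt
cat universe.txt
```

Note that the second field is used as a file name without any checking, so a value containing a slash would end up as a path.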

  • Opening and closing the output file for every line is seriously inefficient. GNU awk will handle about 1024 open files concurrently. (It actually handles any number, but behind the scenes it closes one of them to keep the number currently open below 1024.) The best safe strategy is probably to keep a hash of the names in use and close all the files when the number open gets too large. The fastest approach I timed is to load a million input lines at once, then write all the lines for one file at a time, and repeat. Commented Aug 22, 2020 at 16:22
  • Here's an earlier question about pretty much exactly the same thing, along with a few answers and some discussion of their ups and downs: awk how to separate in different files all the lines with the same content in a given column Commented Aug 22, 2020 at 16:25
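The "hash of names" strategy from the first comment could be sketched roughly as follows. This is one possible reading of it, not a tested optimum: the function name split_by_field2, the array is_open, the counter n_open and the default limit of 100 are all made up for illustration, and the limit would need tuning for a real system. Output is still written with >>, so a file that was closed and is reopened later gets appended to instead of truncated.

```shell
# Hypothetical helper: split_by_field2 FILE [MAX_OPEN]
# Appends each line of FILE to "<second field>.txt", keeping files open
# between lines and closing them all once MAX_OPEN distinct files are open.
split_by_field2() {
    awk -F, -v max_open="${2:-100}" '
    {
        name = $2 ".txt"
        if (!(name in is_open)) {
            # Limit reached: close every open file and start counting again.
            if (n_open >= max_open) {
                for (f in is_open) close(f)
                split("", is_open)   # portable way to empty an array
                n_open = 0
            }
            is_open[name] = 1
            n_open++
        }
        # >> appends, so reopening a previously closed file keeps its contents.
        print >> name
    }' "$1"
}

# Try it on the sample input in a scratch directory (an assumption for the demo).
tmpdir=$(mktemp -d) && cd "$tmpdir" || exit 1
printf '%s\n' 123,hello,world 124,hello,planet 125,universe,hello 126,hello,universe > file
split_by_field2 file
cat hello.txt
cat universe.txt
```

Compared with closing after every line, this only pays the open/close cost when a new name appears while the limit has been reached. As with the original command, leftover output files from an earlier run are appended to.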