I'm trying to merge and deduplicate several versions of the same kind of plain-text file using the GNU sort that ships with Ubuntu 18 LTS. I use sort almost daily and have had no issues sorting files of 1 GB or more.
However, the following command still had not completed after I left it running in the background for 10 hours (around 600 MB of data in total):
find backups -type f -iname 'file0.txt' -o -iname 'file1.txt' -o -iname 'file2.txt' -o -iname 'file3.txt' -exec sort -u {} + > "combined.txt"
The sort part is what causes the problem; from my testing, the rest of the command is irrelevant. I have concatenated all the files into a single file of about 600 MB, and when I run sort -u on that file, it still hangs forever, even with the memory buffer set to 80% and around 6 GB of RAM free. Disk space is not an issue either.
While it was still running, I copied in an unsorted 3 GB text file and successfully ran sort -u on it. I'm doing this in a virtual machine, in case that matters.
What could cause this behavior?
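A side note on the find expression, as a minimal sketch rather than a diagnosis of the hang: in find, -o binds more loosely than the implicit AND between tests, so without parentheses the -exec action attaches only to the last -iname test. The temporary directory, the two file names, and their contents below are made up for illustration; only the shape of the expression follows the command above.

```shell
# Hypothetical sample tree standing in for the real backups directory.
workdir=$(mktemp -d)
mkdir -p "$workdir/backups/a" "$workdir/backups/b"
printf 'x\ny\n' > "$workdir/backups/a/file0.txt"
printf 'y\nz\n' > "$workdir/backups/b/file3.txt"

# Ungrouped: parsed as (-type f AND file0.txt) OR (file3.txt AND -exec ...),
# so only file3.txt is handed to sort.
find "$workdir/backups" -type f -iname 'file0.txt' -o -iname 'file3.txt' \
    -exec sort -u {} + > "$workdir/ungrouped.txt"

# Grouped: the escaped parentheses make -type f and -exec apply to every name.
find "$workdir/backups" -type f \( -iname 'file0.txt' -o -iname 'file3.txt' \) \
    -exec sort -u {} + > "$workdir/grouped.txt"

cat "$workdir/ungrouped.txt"
cat "$workdir/grouped.txt"
```

With this sample data, the ungrouped form sorts only the contents of file3.txt, while the grouped form merges both files.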
Check with top whether it is a CPU issue or whether it is waiting for I/O (in which case this may be a broken or slow disk). Try running your sort with strace to see what's going on. Also, there is a + sign in your command, which looks weird?
[...] -exec action.
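The top and strace suggestions can be tried roughly as follows. This is only a sketch: the sample file is a tiny stand-in for the real 600 MB input, proc_state is a helper name invented here, and the ps call assumes the procps version of ps found on Ubuntu. strace is used only if it is installed and allowed to trace.

```shell
# Print the state of a process: R = running on the CPU,
# D = uninterruptible sleep (usually waiting for disk I/O), S = sleeping.
proc_state() {
    ps -o stat= -p "$1" | tr -d ' '
}

workdir=$(mktemp -d)
printf 'b\na\nb\nc\n' > "$workdir/sample.txt"   # hypothetical stand-in input

# Run sort under strace and collect a per-syscall summary (-c) across
# all threads (-f); fall back to a plain sort if strace is unavailable.
if command -v strace >/dev/null 2>&1 &&
   strace -f -c -o "$workdir/trace.txt" \
       sort -u "$workdir/sample.txt" > "$workdir/sorted.txt"
then
    cat "$workdir/trace.txt"
else
    sort -u "$workdir/sample.txt" > "$workdir/sorted.txt"
fi

# Demo of the helper on the current shell; for a real run, pass the PID
# of the long-running sort (for example from pgrep sort).
proc_state "$$"
cat "$workdir/sorted.txt"
```

For a sort that is already running, top -p with its PID shows whether CPU usage is high, and strace -p with the same PID attaches to it; a state of D together with low CPU usage would point toward I/O rather than the CPU.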