I'm attempting to merge and dedupe several different versions of the same kind of plain-text file using the GNU sort that comes with Ubuntu 18 LTS. I use sort almost daily and have had no issues sorting files of 1 GB+ in size.

However, the following command still had not completed after I left it running in the background for 10 hours (around 600 MB of data in total):

find backups -type f -iname 'file0.txt' -o -iname 'file1.txt' -o -iname 'file2.txt' -o -iname 'file3.txt' -exec sort -u {} + > "combined.txt"

The sort part is what is causing the issue; from my testing, the rest of the command is irrelevant. I have concatenated all the files with cat into a single file of ~600 MB, and when I try to sort -u this file, it still hangs forever, even with the memory buffer set to 80% and around 6 GB of free RAM. I also have no issues with disk space.

While it was still running, I copied in an unsorted 3 GB text file and successfully ran sort -u on it. I'm doing this in a virtual machine, in case that matters.

What could cause this behavior?

  • Check with top whether it is a CPU issue or whether it's waiting for I/O (in which case this may be a broken/slow disk issue). Try running your sort with strace to see what's going on. Also, there is a + sign in your command which looks weird? Commented Oct 30, 2019 at 9:44
  • @MartinWickman The + is part of the -exec action. Commented Oct 30, 2019 at 10:12
  • Out of curiosity, are you doing this as root? Commented Nov 3, 2019 at 4:58
  • @DavidRankin-ReinstateMonica I double-checked it, and my original command is correctly sorting all files with the specified names, not just the first file found. Commented Nov 10, 2019 at 2:12
  • @j58765436 I apologize for the confusion; your or'ed conditions relate to filename selection, not sort results. (Coffee must have run out...) Commented Nov 10, 2019 at 2:16

1 Answer

Setting LC_ALL=C before issuing the sort command, or adding export LC_ALL=C below the shebang at the beginning of a script file, solved it. I'm not sure why the particular text added to the files in their last update caused the command to get stuck forever without LC_ALL=C, though.
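For illustration, here is a minimal sketch of the per-command form, assuming a couple of small throwaway input files (the names part1.txt and part2.txt and the temporary directory are hypothetical, not from the question). With LC_ALL=C, sort compares raw bytes instead of applying the locale's collation rules, which is generally much cheaper; whether that fully explains the hang seen here is not confirmed.

```shell
# Hypothetical sample data; file names are illustrative only.
workdir=$(mktemp -d)
printf 'b\na\nB\na\n' > "$workdir/part1.txt"
printf 'c\nb\n' > "$workdir/part2.txt"

# The LC_ALL=C prefix applies to this single command only:
# sort orders lines by byte value rather than by locale collation.
LC_ALL=C sort -u "$workdir/part1.txt" "$workdir/part2.txt" > "$workdir/combined.txt"

cat "$workdir/combined.txt"
```

Note that byte order puts uppercase ASCII letters before lowercase ones, so the result can be ordered differently from a locale-aware sort. For merging and deduplicating, that usually does not matter as long as every later step uses the same locale.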
