9

I have split a large text file into several sets of smaller files for performance testing that I'm doing. There are a number of directories like this:

/home/brianly/output-02 (contains 2 files myfile.chunk.00 and myfile.chunk.01)
/home/brianly/output-04 (contains 4 files...)
/home/brianly/output-06 (contains 6 files...)

Note that the number of files increases from one directory to the next. What I need to do is run an executable against each of the text files in the output directories. Against a single file, the command looks something like this:

./myexecutable -i /home/brianly/output-02/myfile.chunk.00 -o /home/brianly/output-02/myfile.chunk.00.processed

Here, the -i parameter is the input file and the -o parameter is the output location.

In C# I'd loop over the directories, get the list of files in each folder, then loop over them to run the command lines. How do I traverse a directory structure like this using bash, and execute the command with the correct parameters based on the location and the files in that location?

6 Answers

9

For this kind of thing I always use find together with xargs:

$ find output-* -name "*.chunk.??" | xargs -I{} ./myexecutable -i {} -o {}.processed

Now, since your executable processes only one file at a time, using -exec (or -execdir) directly with find, as already suggested, is just as efficient. I'm used to xargs, though, because it is generally much more efficient when feeding a command that accepts many arguments at once. It's a very useful tool to keep in one's utility belt, so I thought it ought to be mentioned.
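
If the file names might ever contain spaces or quotes, a NUL-delimited variant of the same pipeline is a little safer. This is only a sketch: the function name process_chunks and its two parameters (exe, root) are made up for illustration, and it assumes a find and xargs that support -print0 and -0 (GNU and BSD both do).

```shell
#!/usr/bin/env bash
# Sketch: run an executable on every chunk file under the output-* directories.
# process_chunks, exe and root are hypothetical names, not part of the question.
process_chunks() {
    local exe=$1 root=$2
    # -print0 and -0 pass file names NUL-delimited, so spaces and quotes survive.
    # The pattern '*.chunk.??' does not match the '*.processed' outputs.
    find "$root"/output-* -type f -name '*.chunk.??' -print0 |
        xargs -0 -I{} "$exe" -i {} -o {}.processed
}
```

With the layout from the question, this would be called as: process_chunks ./myexecutable /home/brianly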

8

Something like:

for x in `find /home/brianly -type f`
do
    ./yourexecutable -i "$x" -o "$x.processed"
done
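
If the loop may be run more than once, or the paths may contain whitespace, something along these lines might hold up better. It is a rough sketch, not a tested drop-in: run_on_unprocessed, exe and root are names invented here, and it relies on bash (process substitution and read -d '') plus a find that supports -print0.

```shell
#!/usr/bin/env bash
# Sketch: loop over find results, skipping files that already end in .processed.
# run_on_unprocessed, exe and root are hypothetical names.
run_on_unprocessed() {
    local exe=$1 root=$2 x
    # Read NUL-delimited names so whitespace in paths is preserved.
    while IFS= read -r -d '' x; do
        # </dev/null keeps the command from consuming the list of file names.
        "$exe" -i "$x" -o "$x.processed" </dev/null
    done < <(find "$root" -type f ! -name '*.processed' -print0)
}
```

Because of the ! -name '*.processed' test, a second run picks up only the original chunk files again and never creates *.processed.processed files.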

4 Comments

there are back-ticks before find and after f
for x in $(find /home/brianly -type f); do ... done
$() is so much more readable than back-ticks. My $.02.
I think yours would try to process the processed files.
Dennis, you're right. I am standing by my disclaimer of: something like :)
2

As others have suggested, use find(1):

# Find all files named 'myfile.chunk.*' but NOT named 'myfile.chunk.*.processed'
# under the directory tree rooted at base-directory, and execute a command on
# them:
find base-directory -name 'myfile.chunk.*' '!' -name 'myfile.chunk.*.processed' -exec ./myexecutable -i '{}' -o '{}'.processed ';'

1

From the information provided, it sounds like this would be a completely straightforward translation of your C# idea.

for i in /home/brianly/output-*; do
    for j in "$i/"*.[0-9][0-9]; do
        ./myexecutable -i "$j" -o "$j.processed"
    done
done
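
One caveat with plain globs: if a directory has no matching files, bash leaves the pattern unexpanded and the command gets called with the literal pattern as its input. A possible way around that, sketched here with made-up names (process_tree, exe, root), is to enable nullglob for the duration of the loops:

```shell
#!/usr/bin/env bash
# Sketch: nested glob loops that skip directories with no matching chunk files.
# process_tree, exe and root are hypothetical names.
process_tree() {
    local exe=$1 root=$2 dir file
    # Run in a subshell so the nullglob setting does not leak to the caller.
    (
        shopt -s nullglob   # unmatched globs expand to nothing, not to themselves
        for dir in "$root"/output-*/; do
            # $dir already ends with a slash because of the output-*/ pattern.
            for file in "$dir"*.[0-9][0-9]; do
                "$exe" -i "$file" -o "$file.processed"
            done
        done
    )
}
```

For the layout in the question, the call would look like: process_tree ./myexecutable /home/brianly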

0

That's what the find command is for.

http://linux.die.net/man/1/find

0

Use find with -exec. Have a look at the following:

http://tldp.org/LDP/abs/html/moreadv.html
