I have the following use case: I need to process a large number of files. Processing each file looks more or less like this:

1) read file

2) perform operation (a) on the content

3) perform operation (b) on the content

4) perform operation (c) on the content

    ... 

n) delete file

Spring Batch seems like a good solution to this problem with one exception: I don't want to read all the files in step 1), pass all of them to step 2) etc. because it would use up a lot of memory.

EDIT: I load the file contents into memory (not into the DB). This is why I'd prefer to process the files one by one or in batches. I mean: run all steps on a single file/batch (the file/batch gets removed in the last step and the memory gets cleaned up), then proceed to the next file/batch, and so on.

Does Spring Batch have a mechanism for executing all steps repeatedly, once per file or batch? Or should I just run the same job multiple times until I run out of files?

Thanks and best regards, Peter

2 Answers

The Spring Batch documentation covers this under multi-file input.

It works with a single step, which does the following:

  • create a list of file locations
  • open the first file, read/process it to the end, close the file
  • open the next file, and so on
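
To make the memory behaviour concrete, here is a minimal plain-Java sketch of the idea behind the list above. It is not Spring Batch code: `ChunkedMultiFileReader`, `readInChunks`, and `chunkSize` are hypothetical names, and `chunkSize` only plays the role that the commit-interval plays in a chunk-oriented step. The point is that one file is open at a time and at most `chunkSize` lines are buffered before they are handed on.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Illustration only: a hypothetical helper, not part of Spring Batch.
final class ChunkedMultiFileReader {

    /**
     * Reads the files one after another, line by line, and passes the lines
     * to chunkWriter in groups of at most chunkSize.
     * Returns the total number of lines read.
     */
    static int readInChunks(List<Path> files, int chunkSize,
                            Consumer<List<String>> chunkWriter) throws IOException {
        if (chunkSize < 1) {
            throw new IllegalArgumentException("chunkSize must be >= 1");
        }
        List<String> chunk = new ArrayList<>(chunkSize);
        int total = 0;
        for (Path file : files) {
            // Only one file is open at a time; it is closed before the next one is opened.
            try (BufferedReader reader = Files.newBufferedReader(file)) {
                String line;
                while ((line = reader.readLine()) != null) {
                    chunk.add(line);
                    total++;
                    if (chunk.size() == chunkSize) {
                        // Hand over a copy, then free the buffer for the next chunk.
                        chunkWriter.accept(new ArrayList<>(chunk));
                        chunk.clear();
                    }
                }
            }
        }
        // Flush the final, possibly smaller chunk. In this sketch a chunk may span two files.
        if (!chunk.isEmpty()) {
            chunkWriter.accept(new ArrayList<>(chunk));
        }
        return total;
    }
}
```

With a `chunkSize` of 1000, no more than 1000 lines are buffered at any moment, regardless of how many files there are or how large they are.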
3 Comments

Hmm... Thanks, this will help me a little in some cases, but if I need to run multiple steps, this will still cause me to load all my files at once, right? That is something I'd like to avoid, so as not to clutter up my memory. But thanks for the tip :)
No, the files are handled sequentially, and an individual file is never held in memory completely. The implementation uses standard Java mechanisms (a file pointer and buffered streams), so only the items read for the current chunk are held in memory (e.g. a commit-interval of 1000 means 1000 lines read and converted to items).
Oh, sorry, it seems I did not explain my case completely. I keep the content of each file I read in memory. This is why I'd prefer to execute all my steps on a single file, then execute all steps on the next file, and so on (because in the last step I remove the file).

In your simple case, for N files you need to execute N jobs, each of which is passed a file name as a JobParameter. Each of your processing operations cannot be expressed as a Spring Batch step on its own, but you can use CompositeItemProcessor to chain your processors.
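
As a rough illustration of that shape, here is a plain-Java sketch. It is not Spring Batch code and every name in it (`PerFileRunner`, `chain`, `runOne`, `runAll`) is hypothetical. `chain` stands in for what CompositeItemProcessor does (each operation receives the previous one's output), and each `runOne` call stands in for one job execution that receives a single file name. It assumes the operations (a), (b), (c) can each be modelled as a function from String to String.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.UnaryOperator;

// Illustration only: a hypothetical stand-in for "one job per file".
final class PerFileRunner {

    /** Chains the operations so the output of one becomes the input of the next. */
    static UnaryOperator<String> chain(List<UnaryOperator<String>> operations) {
        return content -> {
            String current = content;
            for (UnaryOperator<String> operation : operations) {
                current = operation.apply(current);
            }
            return current;
        };
    }

    /** Runs the whole pipeline on a single file, then deletes that file. */
    static String runOne(Path file, UnaryOperator<String> pipeline) throws IOException {
        String content = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        String result = pipeline.apply(content);
        Files.delete(file); // last step: the file is removed before the next one is read
        return result;
    }

    /** Processes the files strictly one after another; returns how many were processed. */
    static int runAll(List<Path> files, UnaryOperator<String> pipeline,
                      Consumer<String> resultSink) throws IOException {
        int processed = 0;
        for (Path file : files) {
            // Only the current file's content is referenced here, so the previous
            // file's content becomes eligible for garbage collection.
            resultSink.accept(runOne(file, pipeline));
            processed++;
        }
        return processed;
    }
}
```

In a real setup the loop in `runAll` would be replaced by launching the job once per file with the file name as a job parameter, and `chain` by a CompositeItemProcessor configured with your own processors.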

1 Comment

Sorry, this question is a little outdated, but what you're writing here is exactly what I eventually did :) Thanks