
I have a directory that contains only empty files, and I want to iterate over their names. However, there are nearly 20 million of these files, and loading all of their names into memory with fs.readdir or fs.readdirSync would both take needlessly long and consume all the memory on my system.

What would be a way to go about this?

Ideally, I am looking for something that reads the directory file by file in an async fashion, with code that would resemble the following:

readdirfilebyfile((filename)=>{....}) so that at no point would I keep the entire list of files in memory.

The current solution I am using is dumping all the file names into a single file which I then read as a data stream. However, this is just running away from a problem that I should know how to solve without resorting to this.
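For reference, the workaround described above might look roughly like the following. This is only a minimal sketch: the eachNameInList helper and its listPath argument are made-up names, and the list file is assumed to hold one file name per line.

```javascript
const fs = require('fs');
const readline = require('readline');

// Hypothetical helper: streams a newline-separated list of file names
// and calls onName once per line, so the whole list is never held in memory.
async function eachNameInList(listPath, onName) {
  const rl = readline.createInterface({
    input: fs.createReadStream(listPath),
    crlfDelay: Infinity, // treat \r\n as a single line break
  });
  let count = 0;
  for await (const line of rl) {
    if (line === '') continue; // skip blank lines
    await onName(line);
    count += 1;
  }
  return count; // number of names passed to onName
}
```

Because the callback is awaited before the next line is consumed, a slow consumer does not cause the names to pile up.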

  • Duplicate? stackoverflow.com/questions/25757293/… There also is groups.google.com/forum/#!topic/nodejs/t0ziBVsPRqw There are some indirect solutions at least in the second link. There is no direct solution purely within node.js I think. But the answer in the linked SO thread says it's no problem (and he tried), you just have to ensure not to start processing them all at once, just reading that big list seems to be fine. Okay - 600 MB of memory if I use his calculations as a basis for your scenario.... Commented May 27, 2017 at 18:26
  • @Mörre I upvoted the question just now because OP didn't get a functional answer. The answer still wants to read all file names into memory. I want something that works by accessing the nth file without reading the ones that came before or after it. Commented May 27, 2017 at 18:46
  • What are you trying to accomplish? What action do you want to perform on those filenames? Commented May 27, 2017 at 19:37
  • That is why I included the 2nd link. Did you have a look at it? There is no "pure" node.js solution. Commented May 27, 2017 at 20:21
  • @robertklep Pass the names to an arbitrary function that will do things irrelevant to filesystem I/O Commented May 27, 2017 at 20:32

1 Answer


What about this one? pv (Pipe Viewer) monitors data flowing through a shell pipe and can also rate-limit it.

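const { spawn, exec } = require('child_process');
const readline = require('readline');
// A pipeline needs a shell; tail -F waits for the file to appear, pv throttles the stream.
const tail = spawn('tail -n +1 -F /tmp/filelist | pv -l -L 10 -q', { shell: true });
// stdout arrives in arbitrary Buffer chunks, so split it into lines first.
const rl = readline.createInterface({ input: tail.stdout });
rl.on('line', fileName => {
  // process each file name here
  console.log(fileName);
});
// Write the listing to the file that tail is following.
exec('ls > /tmp/filelist');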
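On Node.js 12.12 or later there is also fs.opendir, which returns a directory handle that can be iterated entry by entry instead of returning one big array. A minimal sketch, assuming such a Node.js version; the readdirFileByFile name is hypothetical and simply mirrors the function asked for in the question:

```javascript
const fs = require('fs');

// Hypothetical helper named after the function sketched in the question.
// Calls onName once per regular file without building the full list.
async function readdirFileByFile(dirPath, onName) {
  // opendir yields a Dir handle; iterating it reads entries in small
  // batches and closes the handle automatically when the loop finishes.
  const dir = await fs.promises.opendir(dirPath);
  let count = 0;
  for await (const entry of dir) {
    if (entry.isFile()) {
      await onName(entry.name); // wait for the consumer before reading on
      count += 1;
    }
  }
  return count; // number of file names passed to onName
}
```

It could be called as readdirFileByFile('/path/to/dir', name => console.log(name)), where the path is a placeholder. Entries arrive in whatever order the file system returns them, so nothing is sorted.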