
I have used recursion to search for a particular type of file (.pdf files in this example). My recursive algorithm visits every sub-folder, but I found that it lacks performance when there are too many sub-folders, sub-sub-folders, sub-sub-sub-folders, and so on. I want to know if there is a better algorithm for file searching.

Below is my recursive code for file searching. I have used .pdf files as an example:

import java.io.File;
public class FInd {
    public static void main(String[] args) {
        File f = new File("D:/");
        find(f);    
    }
    public static void find(File f){
        File[] list = f.listFiles();
        if (list == null) return;   // listFiles() returns null for folders that cannot be read
        for (int i = 0; i < list.length; i++) {
            // parentheses matter here: both the .pdf and .PDF checks should apply only to files
            if (list[i].isFile() && (list[i].getName().contains(".pdf")
                    || list[i].getName().contains(".PDF")))
                System.out.println(list[i].getAbsolutePath());
            if (list[i].isDirectory()) find(list[i]);
        }
    }
}

This code is about as fast as, or somewhat faster than, the search option in the file explorer. I want to know if there is any faster algorithm than this.

  • Recursion is not an algorithm, it's an implementation choice. It seems that you have a search space and you need to explore it to find the file. So, unless there is a name-wise relationship between folders, you need to explore the whole space. Commented Apr 5, 2017 at 16:32
  • stackoverflow.com/questions/4852531/… Commented Apr 5, 2017 at 16:32
  • Use Files.walkFileTree if you are using JDK 7 or later (a sketch of that API follows these comments) docs.oracle.com/javase/tutorial/essential/io/find.html Commented Apr 5, 2017 at 16:38
  • Possible duplicate of Better file search algorithm than creating a list of files Commented Apr 5, 2017 at 16:40
  • I don't think changing the search pattern (aka algorithm) will make a difference to the overall performance of scanning the entire directory hierarchy. Performance is entirely constrained by disk performance for actually reading the directory structure. There is no way to improve that, other than replacing the drive with an SSD. Commented Apr 5, 2017 at 16:41
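
For reference, a minimal sketch of the Files.walkFileTree approach mentioned in the comments above (the class name WalkFind is an assumption; the D:/ root and the .pdf filter are carried over from the question):

import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

public class WalkFind {
    public static void main(String[] args) throws IOException {
        Files.walkFileTree(Paths.get("D:/"), new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                // case-insensitive extension check, matching the .pdf/.PDF test in the question
                if (file.getFileName().toString().toLowerCase().endsWith(".pdf")) {
                    System.out.println(file.toAbsolutePath());
                }
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult visitFileFailed(Path file, IOException exc) {
                return FileVisitResult.CONTINUE;   // skip unreadable entries instead of aborting the walk
            }
        });
    }
}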

4 Answers


Try the iterative way:

import java.io.File;
import java.util.Stack;

public class Find {
    public static void main(String[] args) {

        File f = new File("D:/");

        Stack stack = new Stack<File>();

        stack.push(f);

        while (!stack.empty())
        {
            f = (File) stack.pop();
            File[] list = f.listFiles();
            if (list == null) continue;   // folder could not be read
            for (int i = 0; i < list.length; i++) {
                if (list[i].isFile() && (list[i].getName().contains(".pdf")
                        || list[i].getName().contains(".PDF")))
                    System.out.println(list[i].getAbsolutePath());
                if (list[i].isDirectory()) stack.push(list[i]);   // visit the sub-folder later instead of recursing
            }
        }
    }
}

2 Comments

Thank you. This was helpful. Just a small correction: f = stack.pop(); File []list = f2.listFiles(); should be replaced with f = (File) stack.pop(); File []list = f.listFiles();
I'm glad it was helpful. I will edit the answer (I have no idea why I put f2 instead of f :p).

The problem with threading is that launching threads has a cost, so the gain in file browsing speed has to outweigh the additional cost of spawning one thread per folder.

This is a simple method that uses a loop (the classical replacement for recursion):

static boolean avoidRecursion(String target){
    File currentDir = new File(System.getProperty("user.home"));
    Stack<File> dirs = new Stack<File>();
    dirs.push(currentDir);

    do{
        File[] entries = dirs.pop().listFiles();
        if (entries == null)          // folder could not be read
            continue;
        for(File f : entries){
            if (f.isDirectory())
                dirs.push(f);         // visit later instead of recursing into it
            else{
                if (f.getName().equals(target))
                    return true;
            }
        }
    }while(!dirs.isEmpty());
    return false;
}

Measure both approaches and choose the option that is faster.
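
For example, a rough way to time the two (assuming the recursive find from the question and the avoidRecursion method above are available in the same class; a proper benchmark harness such as JMH would give more reliable numbers):

// Not a rigorous benchmark: the first run is usually slower because the OS disk
// cache is cold, so run each approach a few times, and make sure both methods
// scan the same root folder for a fair comparison.
long start = System.nanoTime();
find(new File("D:/"));                     // recursive version from the question
long recursiveMs = (System.nanoTime() - start) / 1_000_000;

start = System.nanoTime();
avoidRecursion("example.pdf");             // iterative version above ("example.pdf" is a placeholder)
long iterativeMs = (System.nanoTime() - start) / 1_000_000;

System.out.println("recursive: " + recursiveMs + " ms, iterative: " + iterativeMs + " ms");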

1 Comment

This looks interesting. I will give it a try

Probably you could use multithreading...

For each folder you enter, you start a new thread... Even if you have more threads than your CPU has cores, it is not a problem, since Windows can run many more threads...
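
One sketch of that idea, using a bounded ForkJoinPool so each sub-folder becomes a small task rather than a raw thread (the class name ParallelFind, the pool size of 4, and the D:/ root are assumptions):

import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

public class ParallelFind extends RecursiveAction {
    private final File dir;

    ParallelFind(File dir) { this.dir = dir; }

    @Override
    protected void compute() {
        File[] list = dir.listFiles();
        if (list == null) return;                        // folder could not be read
        List<ParallelFind> subTasks = new ArrayList<ParallelFind>();
        for (File f : list) {
            if (f.isDirectory()) {
                ParallelFind task = new ParallelFind(f);
                task.fork();                             // hand the sub-folder to the pool
                subTasks.add(task);
            } else if (f.getName().toLowerCase().endsWith(".pdf")) {
                System.out.println(f.getAbsolutePath());
            }
        }
        for (ParallelFind task : subTasks) task.join();  // wait for the sub-folder tasks
    }

    public static void main(String[] args) {
        ForkJoinPool pool = new ForkJoinPool(4);         // bounded worker count, not one thread per folder
        pool.invoke(new ParallelFind(new File("D:/")));
    }
}

As the comments below point out, on a spinning disk this may well end up slower than the single-threaded scan, because the disk itself is the bottleneck.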

4 Comments

Unlikely to improve performance. More likely to reduce performance, since the performance bottleneck is reading from disk, and trying to read from two places on the disk at the same time will just kill performance. (disclaimer: assuming a non-SSD disk)
Also, you should be very careful and limit the number of threads. Creating thousands of threads is likely to hang up your entire OS.
@JaroslawPawlak you are right, but keep in mind that each thread stops after finishing a folder :)... Mostly, folders do not contain that many files, which is why the threads finish really quickly too...
But he could also create a specific number of threads and manage those...

Use the Files.walk() method, which returns a Java 8 Stream. You can parallelize that calculation quite easily by using a parallel stream.

Use the following convenient idiom in a try-with-resources statement:

try(Stream<Path> vals = Files.walk(rootPath)){ .... }

For rootPath, you could use Paths.get("root location") to actually get a Path to the root location.
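
Putting that together, a minimal sketch (the class name StreamFind is an assumption; the D:/ root and the .pdf filter are carried over from the question):

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

public class StreamFind {
    public static void main(String[] args) throws IOException {
        Path rootPath = Paths.get("D:/");              // root location from the question
        try (Stream<Path> vals = Files.walk(rootPath)) {
            vals.parallel()                            // parallel stream, as described above
                .filter(Files::isRegularFile)
                .filter(p -> p.getFileName().toString().toLowerCase().endsWith(".pdf"))
                .forEach(p -> System.out.println(p.toAbsolutePath()));
        }
        // note: Files.walk() reports unreadable folders by throwing an UncheckedIOException
        // while the stream is consumed, so you may want to catch that as well
    }
}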

Comments
