I'd like to use OpenMP to parallelize a C program that recursively calculates the size of a directory and its sub-directories.
My issue is that once I open a directory with opendir and iterate through its sub-directories with readdir, I can only access them one at a time until I've reached the last one. This works fine sequentially.
When parallelizing the program, however, I think it would make sense to split the sub-directories into halves (or even smaller partitions) and recurse through them using OpenMP tasks.
Obviously I can't simply split the problem size (= number of sub-directories) in half, because the number of entries isn't known up front: the for loop just calls readdir until it returns NULL, and a loop like this is not in the canonical form that #pragma omp for requires.
Does anybody have an idea on how to split this function into tasks? Any help would be greatly appreciated.
This is some of my code (I've removed the parts I don't consider relevant to this question):
int calculate_folder_size(const char *path) {
    struct stat sb;
    if (stat(path, &sb) != 0) { // path could not be examined
        return 0;
    }
    if (S_ISREG(sb.st_mode)) { // if it's a file, not a directory (base case)
        return sb.st_size;
    }
    DIR *folder = opendir(path);
    struct dirent *element;
    size_t size = 4096;
    for (element = readdir(folder); element != NULL; element = readdir(folder)) {
        //(...)
        if (element->d_type == DT_DIR) {
            // recursive call of calculate_folder_size
            size += calculate_folder_size(name);
        } else {
            //(...)
        }
    }
    closedir(folder);
    return size;
}
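
For reference, one way the loop above might be turned into tasks is sketched below. It is a minimal sketch under a few assumptions, not a definitive solution: the names folder_size_task and folder_size_parallel are hypothetical, child paths are built with snprintf, and lstat is used in place of d_type to tell directories from files. Each sub-directory becomes one OpenMP task, so the entries never have to be counted or split up front, and taskwait makes the parent wait for its children before it returns.

```c
#define _POSIX_C_SOURCE 200809L /* lstat, opendir, PATH_MAX under strict -std */
#include <dirent.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical task-based variant of calculate_folder_size. */
static long long folder_size_task(const char *path) {
    struct stat sb;
    if (lstat(path, &sb) != 0) return 0;         /* cannot examine the path */
    if (S_ISREG(sb.st_mode)) return sb.st_size;  /* base case: regular file */
    if (!S_ISDIR(sb.st_mode)) return 0;          /* symlinks, devices, ... */

    DIR *folder = opendir(path);
    if (folder == NULL) return 0;

    long long own = 4096;     /* same per-directory constant as in the question */
    long long children = 0;   /* updated by child tasks, hence atomic below */
    struct dirent *element;

    while ((element = readdir(folder)) != NULL) {
        if (strcmp(element->d_name, ".") == 0 ||
            strcmp(element->d_name, "..") == 0) {
            continue;
        }
        char child[PATH_MAX];
        int n = snprintf(child, sizeof child, "%s/%s", path, element->d_name);
        if (n < 0 || (size_t)n >= sizeof child) continue; /* path too long */
        if (lstat(child, &sb) != 0) continue;

        if (S_ISDIR(sb.st_mode)) {
            /* firstprivate copies the whole array, so the task keeps its own
               path even after this loop iteration has finished. */
            #pragma omp task shared(children) firstprivate(child)
            {
                long long sub = folder_size_task(child);
                #pragma omp atomic
                children += sub;
            }
        } else if (S_ISREG(sb.st_mode)) {
            own += sb.st_size; /* only this thread touches own */
        }
    }

    #pragma omp taskwait /* wait for every task created in this call */
    closedir(folder);
    return own + children;
}

/* Hypothetical entry point: one thread starts the recursion, the rest of
   the team picks up the tasks it generates. */
long long folder_size_parallel(const char *path) {
    long long total = 0;
    #pragma omp parallel
    #pragma omp single
    total = folder_size_task(path);
    return total;
}
```

Compiled with an OpenMP flag such as gcc -fopenmp, folder_size_parallel would run the traversal on a team of threads; without the flag the pragmas are ignored and the same code runs sequentially. For deep trees the task-creation overhead may dominate, so a cutoff (for example the if or final clause on the task) might be worth trying.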