0

I have multiple subfolders that need to be created however many levels deep, is there a better/cleaner way to do this than the below?

whenever we get to "l" only then should it creater the 5th level of folders "p" & "p"

import os

base_path = fr"C:\Some_Location"

level_1 = ["a", "b", "c"]
level_2 = ["d", "e", "f", "g", "h"]
level_3 = ["i", "j", "k", "l"]
level_4 = ["m", "n"]
level_5 = ["o", "p"]

for i in level_1:
    for j in level_2:
        for k in level_3:
            for l in level_4:
                if k == "l":
                    for m in level_5:
                        newpath = os.path.join(base_path,i,j,k,l,m)
                        if not os.path.exists(newpath):
                            os.makedirs(newpath)
                else:
                    newpath = os.path.join(base_path,i,j,k,l)
                    if not os.path.exists(newpath):
                        os.makedirs(newpath)
7
  • This question is similar to: How to create new folder?. If you believe it’s different, please edit the question, make it clear how it’s different and/or how the answers on that question are not helpful for your problem. Commented Mar 13 at 22:11
  • check also the documentation: os.makedirs and the argument exist_ok Commented Mar 13 at 22:11
  • Use a 2-dimensional list of lists, rather than 5 separate variables. Then you can use itertools.product() to loop over all the combinations. Commented Mar 13 at 22:23
  • @cards it seems like OP knows how to create folders, they are just asking for a more efficient way of creating a large number? Commented Mar 13 at 22:27
  • Since you have a special condition based upon a value at a specific level about another specific level, it will be hard to propose something generic. It would be possible to write something shorter, but it wouldn't be much faster or readable (since creating the folders will be the slow part anyway). The only easy optimisation here seems to be using the exist_ok as @cards points out. Commented Mar 13 at 22:32

1 Answer 1

0

Depending on your definition of cleaner/better, here's a solution that's a bit more generic and pythonic:

from typing import TypeAlias
import os

t: TypeAlias = dict[tuple[str, ...], "t"] | None


def create_structure(fs: t, parents: list):
    for ds, sub_fs in fs.items():
        for d in ds:
            if sub_fs is None:
                os.makedirs(os.path.join(*parents, d), exist_ok=True)
            else:
                create_structure(sub_fs, parents + [d])


def main():
    structure = {
        ("a", "b", "c"): {
            ("d", "e", "f", "g", "h"): {
                ("i", "j", "k"): {
                    ("m", "n"): None
                },
                ("l"): {
                    ("m", "n"): {
                        ("o", "p"): None
                    }
                }
            }
        }
    }

    create_structure(structure, [fr"C:\Some_Location"])


if __name__ == '__main__':
    main()

Changes:

  • uses a nested data structure representing exactly what you need, including the requirement to only have "level 5" directories created for directories that have l at "level 3"
  • works for an arbitrary number of levels, and could work for more complex requirements
  • code and data moved out of the global namespace and into a function
  • types declared (note the use of tuples instead of lists, to allow their use as dictionary keys)
  • more efficient use of os.makedirs with exist_ok

If you need this to be faster overall as well, you could experiment with collecting the directories to create from the function (by turning it into a generator for example) and batching those calls out to a mkdir -p statement, if you're on Linux. I'm not sure there's going to be a much faster way of doing this on Windows.

It's possible that creating folders from multiple threads could be faster, but it seems likely that the file system and the OS will turn out to be the bottleneck, just serialising what you are trying to parallelise.

Sign up to request clarification or add additional context in comments.

2 Comments

Amazing thanks so much, appreciate you taking the time, I'm fairly inexperienced so I can already learn a lot from this short snippet. Would another "improvement" be to have the levels still as list variables for easy editing in case of reuse, as "m" and "n" are, e.g.: level_1 = ["a", "b", "c"] structure = { (*level_1) } @cards - thank you for letting me know about exist_ok
I don't think that's the kind of reuse you'd get a lot of mileage out of, since it's just reuse for you. If you're thinking you may use this a lot, consider making it configurable through a .json file an create a read and write method to allow for a roundtrip. You can then manipulate the .json (or the vanilla dictionary after reading it) before turning it back into this data type.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.