3

I was wondering if someone could point me in the right direction. I'm trying to create a nested dictionary from a list of file paths that would resemble the structure below. The list will change depending on the user's input, so I imagine the solution would need to be recursive. Any pointers on where to begin?

EDIT: Also, the dictionary will be converted to JSON and used to create graphs using D3.js.

fileDict = [
    {
        "name": "BaseLevel",
        "children": [
            {
                "name": "/etc/",
                "children": [
                    {
                        "name": "/etc/passwd"
                    },
                    {
                        "name": "/etc/group"
                    }
                ]
            },
            {
                "name": "/root/",
                "children": [
                    {
                        "name": "/root/test"
                    }
                ]
            }
        ]
    }
]

The closest I've been able to get is this:

import json

records = ["base/images/graphs/one.png", "base/images/tikz/two.png",
           "base/refs/images/three.png", "base/one.txt", "base/chapters/two.txt"]

recordsSplit = [x.split("/") for x in records]

result = {}
for record in recordsSplit:
    here = result
    # Walk (and create as needed) one nested dict per directory level.
    for item in record[:-1]:
        if item not in here:
            here[item] = {}
        here = here[item]
    # Collect the file name under a marker key in its directory's dict.
    if "###content###" not in here:
        here["###content###"] = []
    here["###content###"].append(record[-1])

print(json.dumps(result, indent=4))
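
The nested dict built by the loop above is not yet in the "name"/"children" shape shown earlier. A minimal sketch of one possible conversion, assuming Python 3 and that every path contains at least one directory. The helpers `build_tree` and `to_d3` are hypothetical names, not from the original post, and node names here are single path components rather than the full paths used in the target structure:

```python
import json

CONTENT_KEY = "###content###"  # same marker key as in the attempt above


def build_tree(records):
    """Build the intermediate nested dict: directories map to dicts,
    file names are collected in a list under CONTENT_KEY."""
    result = {}
    for record in records:
        parts = record.split("/")
        here = result
        for item in parts[:-1]:
            here = here.setdefault(item, {})
        here.setdefault(CONTENT_KEY, []).append(parts[-1])
    return result


def to_d3(name, node):
    """Hypothetical helper: turn one directory of the intermediate dict
    into a {"name": ..., "children": [...]} node."""
    children = [to_d3(key, value) for key, value in node.items()
                if key != CONTENT_KEY]
    children += [{"name": filename} for filename in node.get(CONTENT_KEY, [])]
    return {"name": name, "children": children}


records = ["base/images/graphs/one.png", "base/one.txt"]
tree = build_tree(records)
d3_tree = [to_d3(name, node) for name, node in tree.items()]
print(json.dumps(d3_tree, indent=4))
```

Subdirectories are listed before files in each "children" list; that ordering is a choice made for this sketch, not a requirement of the target format.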
2
  • Your example shows only a single level. Are you trying to classify paths only by the first level of directory, or build a "tree" of nested dictionaries, with a separate dictionary for each level of the path? Commented Jan 29, 2016 at 1:02
  • Trying to build a tree of nested dictionaries. I aim to end up with a d3.js graph, with a tree structure representing the file hierarchy. Commented Jan 29, 2016 at 10:08

2 Answers

2

Could it be worth making a class rather than a dict? I wrote up a short one that should do what you want:

class FileSystem():

    def __init__(self, filePath=None):
        self.children = []
        if filePath is not None:
            try:
                # Split off the first path component; the rest becomes a child.
                self.name, child = filePath.split("/", 1)
                self.children.append(FileSystem(child))
            except ValueError:
                # No "/" left: this node is the last component of the path.
                self.name = filePath

    def addChild(self, filePath):
        self.children.append(FileSystem(filePath))

    def getChildren(self):
        return self.children

    def printAllChildren(self):
        print("Name: " + self.name)
        print("{ Children:")
        for child in self.children:
            child.printAllChildren()
        print("}")

You could then enter the first path and save a reference to it like this:

myFileSystem = FileSystem("base/pictures/whatever.png")

This myFileSystem will be your reference to the "base" level, and using it and its methods you should be able to do what you want.

When you have a second path to add, you would have to find the correct node for it by calling getChildren() on myFileSystem, level by level, until you find a discrepancy, then use addChild() to add the rest of the file path to that node. Calling myFileSystem.printAllChildren() will then print out the whole file system.
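
The walk described above could be sketched roughly as follows, assuming Python 3. `Node` is a hypothetical minimal stand-in for the class above, and its `add_child` takes a single name rather than a whole path, which differs from the `addChild(filePath)` shown earlier; `insert_path` is likewise an illustrative name:

```python
class Node:
    """Hypothetical minimal stand-in for the FileSystem class above."""

    def __init__(self, name):
        self.name = name
        self.children = []

    def get_children(self):
        return self.children

    def add_child(self, name):
        # Unlike addChild(filePath) above, this takes one path component.
        child = Node(name)
        self.children.append(child)
        return child


def insert_path(root, path):
    """Walk down from root while path components match existing children;
    from the first component that does not match, create new nodes."""
    parts = path.split("/")
    if parts[0] != root.name:
        raise ValueError("path does not start at the root: " + path)
    node = root
    for part in parts[1:]:
        match = next((c for c in node.get_children() if c.name == part), None)
        node = match if match is not None else node.add_child(part)


root = Node("base")
insert_path(root, "base/pictures/whatever.png")
insert_path(root, "base/pictures/other.png")
insert_path(root, "base/one.txt")
```

After these calls, "pictures" exists once under "base" and holds both image files.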

-------EDIT-------

I wasn't too happy with my half-written code and liked the challenge, so here is an easy-to-use class:

class FileSystem():

    def __init__(self,filePath=None):
        self.children = []
        if filePath != None:
            try:
                self.name, child = filePath.split("/", 1)
                self.children.append(FileSystem(child))
            except (ValueError):
                self.name = filePath
            
    def addChild(self, filePath):
        try:
            thisLevel, nextLevel = filePath.split("/", 1)
            try:
                if thisLevel == self.name:
                    thisLevel, nextLevel = nextLevel.split("/", 1)
            except (ValueError):
                self.children.append(FileSystem(nextLevel))
                return
            for child in self.children:
                if thisLevel == child.name:
                    child.addChild(nextLevel)
                    return
            self.children.append(FileSystem(nextLevel))
        except (ValueError):
            self.children.append(FileSystem(filePath))

    def getChildren(self):
        return self.children
        
    def printAllChildren(self, depth=-1):
        depth += 1
        print("\t" * depth + "Name: " + self.name)
        if len(self.children) > 0:
            print("\t" * depth + "{ Children:")
            for child in self.children:
                child.printAllChildren(depth)
            print("\t" * depth + "}")
        
records = ["base/images/graphs/one.png", "base/images/tikz/two.png",
"base/refs/images/three.png", "base/one.txt", "base/chapters/two.txt"]

myFiles = FileSystem(records[0])
for record in records[1:]:
    myFiles.addChild(record)

myFiles.printAllChildren()

As you can see at the end, when I simply call myFiles.addChild(record), the addChild function now takes care of finding the right place in the tree for the record. printAllChildren() gives the correct output, at least for those parameters.

Let me know if any of it doesn't make sense. Like I said, it's not fully tested, so some corner cases (e.g. trying to add another base?) might make it go weird.

EDIT2

class FileSystem():

    def __init__(self,filePath=None):
        self.children = []
        if filePath != None:
            try:
                self.name, child = filePath.split("/", 1)
                self.children.append(FileSystem(child))
            except (ValueError):
                self.name = filePath

    def addChild(self, filePath):
        try:
            thisLevel, nextLevel = filePath.split("/", 1)
            try:
                if thisLevel == self.name:
                    thisLevel, nextLevel = nextLevel.split("/", 1)
            except (ValueError):
                self.children.append(FileSystem(nextLevel))
                return
            for child in self.children:
                if thisLevel == child.name:
                    child.addChild(nextLevel)
                    return
            self.children.append(FileSystem(nextLevel))
        except (ValueError):
            self.children.append(FileSystem(filePath))

    def getChildren(self):
        return self.children

    def printAllChildren(self, depth=-1):
        depth += 1
        print("\t" * depth + "Name: " + self.name)
        if len(self.children) > 0:
            print("\t" * depth + "{ Children:")
            for child in self.children:
                child.printAllChildren(depth)
            print("\t" * depth + "}")
            
    def makeDict(self):
        if len(self.children) > 0:
            dictionary = {self.name:[]}
            for child in self.children:
                dictionary[self.name].append(child.makeDict())
            return dictionary
        else:
            return self.name
                

records = ["base/images/graphs/one.png", "base/images/tikz/two.png",
"base/refs/images/three.png", "base/one.txt", "base/chapters/two.txt"]

myFiles = FileSystem(records[0])
for record in records[1:]:
    myFiles.addChild(record)

print(myFiles.makeDict())
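
makeDict() returns nested {name: [children]} dicts with bare strings for files, while the structure in the question uses "name" and "children" keys. A minimal sketch of a converter between the two, assuming Python 3. `to_name_children` is a hypothetical helper, and the sample input is hand-written in the shape makeDict() returns rather than captured program output:

```python
import json


def to_name_children(node):
    """Convert makeDict()-style data ({name: [children]} for a directory,
    a bare string for a leaf) into {"name": ..., "children": [...]} nodes."""
    if isinstance(node, str):
        return {"name": node}
    # Each dict built by makeDict() has exactly one key: the node's name.
    (name, children), = node.items()
    return {"name": name,
            "children": [to_name_children(child) for child in children]}


# Hand-written sample in the shape makeDict() returns (an assumption,
# not actual output of the class above).
sample = {"base": [{"images": [{"graphs": ["one.png"]}]}, "one.txt"]}
converted = to_name_children(sample)
print(json.dumps(converted, indent=4))
```

The result can be passed to json.dumps() directly, since it contains only dicts, lists, and strings.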

5 Comments

This answer is fantastic, and definitely helpful, so thank you. Unfortunately (my mistake for not mentioning it), it must be a dictionary, as it will need to be converted to JSON and displayed using D3.js.
Oh I see, sorry, I should've read your attempt; you have a json call in there. I don't know much about JSON unfortunately, but you could definitely mould my solution into an actual dict. One way would be to extend the dict class (literally declare the class as class FileSystem(dict):) and then edit the current methods so that self.name is set as the key and self.children as the value. Alternatively, add a makeIntoDict() method which builds and returns a dict (in a similar way to how printAllChildren() does currently), with self.name as the key and self.children as the value.
I can have a go at it later if no one else has come up with anything
See edit 2, it was a little rushed but I think it works.
Apologies for the late reply, I was away for the weekend. Your second edit has worked beautifully so thank you!
0

When you have files like:

['testdata/hhohoho.mdf', 'testdata/dvojka/rerere.bdf', 'testdata/jedna/sss.txt']

you get an output structure like this, with the intermediate directories missing:

Name: testdata
{ Children:
    Name: hhohoho.mdf
    Name: rerere.bdf
    Name: sss.txt
}

The mistake is in addChild, here:

            self.children.append(FileSystem(nextLevel))
        except (ValueError):
            self.children.append(FileSystem(filePath))

When no existing child matches thisLevel, only nextLevel is appended, so thisLevel is dropped. It can be solved by appending thisLevel first and then adding the rest of the path to it:

            self.children.append(FileSystem(thisLevel))
            for child in self.children:
                if thisLevel == child.name:
                    child.addChild(nextLevel)
                    return

The output is then:

Name: testdata
{ Children:
    Name: hhohoho.mdf
    Name: dvojka
    { Children:
        Name: rerere.bdf
    }
    Name: jedna
    { Children:
        Name: sss.txt
    }
}
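
For reference, a compact sketch of a class that keeps every directory level, assuming Python 3. It reuses the names FileSystem, addChild and printAllChildren from the first answer but is a rewrite, not the original code: here addChild expects a path relative to the node it is called on (an assumption of this sketch), so the root component is stripped by the caller:

```python
class FileSystem:
    def __init__(self, filePath):
        # Split off the first component; any remainder becomes a child chain.
        self.name, _, rest = filePath.partition("/")
        self.children = [FileSystem(rest)] if rest else []

    def addChild(self, filePath):
        """Add a path given relative to this node."""
        thisLevel, _, nextLevel = filePath.partition("/")
        for child in self.children:
            if child.name == thisLevel:
                if nextLevel:
                    child.addChild(nextLevel)
                return
        # No matching child: build from the whole path so thisLevel is kept.
        self.children.append(FileSystem(filePath))

    def printAllChildren(self, depth=0):
        print("\t" * depth + "Name: " + self.name)
        if self.children:
            print("\t" * depth + "{ Children:")
            for child in self.children:
                child.printAllChildren(depth + 1)
            print("\t" * depth + "}")


files = ['testdata/hhohoho.mdf', 'testdata/dvojka/rerere.bdf',
         'testdata/jedna/sss.txt']
root = FileSystem(files[0])
for path in files[1:]:
    # Drop the leading "testdata/" since addChild takes relative paths.
    root.addChild(path.partition("/")[2])
root.printAllChildren()
```

Using str.partition avoids the try/except around split, because it returns an empty remainder when no "/" is left.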
