1

I'm trying to write code that will take an object that can contain an arbitrarily large number of nested objects, each of which can contain nested objects, and so on. I'm trying to write a function that will find the first object that matches a given criteria. I don't care about matching all objects.

Python 2 is the language here, but this theory should reach into any language.

class MyObject:
    myType = ""
    myChildren = list([])
    def __init__(self, t):
        self.myType = t

a = MyObject("top")
a.myChildren.append(MyObject("obj1"))
a.myChildren.append(MyObject("obj2"))
a.myChildren.append(MyObject("obj3"))
a.myChildren[1].myChildren.append(MyObject("objA"))
a.myChildren[1].myChildren.append(MyObject("objB"))
a.myChildren[1].myChildren.append(MyObject("objC"))
a.myChildren[1].myChildren[0].myChildren.append(MyObject("Sam"))
a.myChildren[1].myChildren[0].myChildren.append(MyObject("Max"))
a.myChildren[1].myChildren[0].myChildren.append(MyObject("Waldo"))
a.myChildren[1].myChildren[1].myChildren.append(MyObject("Adam"))
a.myChildren[1].myChildren[1].myChildren.append(MyObject("Waldo"))
a.myChildren[2].append("Waldo"))

At this point we should have a recursion tree that looks something like this:

[top]
 |-- [obj1]
 |-- [obj2]
 |    |-- [objA]
 |    |    |-- [Sam]
 |    |    |-- [Max]
 |    |    \-- [Waldo]
 |    |-- [objB]
 |    |    |-- [Adam]
 |    |    \-- [Waldo]
 |    \-- [objC]
 \-- [obj3]
      \-- [Waldo]

So now, I want to write a function that will find and return a reference to the first instance of a Waldo. As you can see, there might be more than one Waldo. I'd only be interested in the top/obj2/objA/Waldo instance.

The only recursive code I've done in the past uses recursive returns to return progressively more content. For example: a function that will create a recursion tree:

def printTree(rootObj,indentLevel=0):
    output = "%s%s\n" % (" "*indentLevel, rootObj.myType)
    for obj in rootObj:
        output += printTree(obj,indentLevel+1)
    return output

The thing is, as I said, I only want the first instance, and I want to stop walking the tree at that point. Walking the entire tree would be HUGELY inefficient once the trees get large and the desired object is somewhere early in the tree.

I can't seem to figure out how the logic to make this happen would work. My initial thought was something like this:

def findFirst(rootObj, desiredType):
    # test this object first
    if (rootObj.myType == desiredType): 
        return rootObj
    # we are not the correct object, so now test all sub-objects.
    for (obj in rootObj):
        if (findFirst(obj,desiredType) is not None):
            return obj
    return None # nothing was found

However this doesn't quite work. As is it's just returning the first object even though it doesn't match.

Could someone help shed some light on the logic that would be needed to accomplish this?

EDIT: some clarification. The method of recursion is that we will descend as deep as possible into the tree and then work our way back out for each object. The list is also ordered. So for example, the path of recursion for the above tree would be:

top -> obj1 -> obj2 -> objA -> Sam -> Max -> Waldo -> objB -> Adam -> Waldo -> objC -> obj3 -> Waldo

Additionally, I did give an example of a self-created object, however, the objects in the application I want to use this code in are dynamically created; thus, serializing or converting the object tree would end up walking the entire tree anyway, which slows things down since we're generating dynamic objects based on state.

A good way of looking at what I'm looking for is a recursive file search function. Let's say we have a folder on disk and we want the first file, anywhere in that folder, that matches a given filename based on a regular expression and has a size within a given criteria. (This means we have to do computations on each object, we can't simply just check variables and filter based on them.) However, we don't want to parse the entire drive's directory structure if that file is in the root directory or in the first directory we come to. The real spirit of my question is "how can you escape out of this recursive loop as soon as you find something that matches your desired criteria?"

2
  • So you want the first occurrence with respect to the children's index values? I would think the first occurrence would be the the waldo that falls 2 layers deep into the tree instead of the waldo that's on layer 3 (but attached at index 1) Commented Feb 9, 2015 at 18:02
  • No, the first object that we come to when walking the tree via the method of going in as deep as possible first. In other words, I don't want to check all of depth 1, and then check all of depth 2, and so on. Instead, I want to check depth 1 objects, but when one is encountered with another layer, descend into that layer first before going onto the next level 1 object. Commented Feb 9, 2015 at 18:11

2 Answers 2

0

A good approach would be to use JSON (link) to represent the desired structure, then parse it and create a python object based on this. Take a look here for more.

Sign up to request clarification or add additional context in comments.

2 Comments

I did not mention in my original post, but the code I am writing will deal with pre-existing objects, not ones that I explicitly create and can parse later. Imagine that we were doing this with an XML DOM. (It's not an XML DOM, but the same idea - the objects are provided to me.)
First of all, include this in your question, I think it is important. Also, are you sure your objects are not in a standard-parsing format (like JSON)?
0

You can make the recursive function return a value (the object you're looking for) when it finds it. That way you just run the recursive function like normal and end when the function returns something.

Pseudo code:

function findFirst(rootObj, desiredType){

   var objectToReturn

   //check current object, return if found
   if (rootObj.myType == desiredType)
       return rootObj

    // Return null if no children and this wasnt desired type
    if(rootObj.hasChildren == false)
        return null

    // Has children, not yet found anything
    // return the recursive result
    // this will return null or a value
    // if we get a null, keep looking 
    // otherwise return the value and be done.
    for(obj in rootObj){
        objectToReturn = findFirst(obj, desiredType)

        // if the result is a value, return it
        if(objectToReturn != null)
            return objectToReturn 
        // otherwise, keep looking            
    }               
}

Something like this. The point being that as soon as you find a value you return it. If you return any value the entire recursive function stops because a value was returned. You just need to make sure that you're checking for a returned value before sending the recursive method back out on a memory intensive mission.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.