
I'm searching through a list of scripts. As I parse each script I find, among other things, the subscripts it references.

Whenever I find a subscript, I want to add it to the list of scripts I'm searching through.

I came up with this while loop:

while keep_checking == True:
    TMP = deepcopy(FILE_LIST)
    for fname in TMP:
        if not fname in processed:
            SCL_FILE = fname
            break
    handleSCL(SCL_FILE)
    processed.add(SCL_FILE)
    if processed == FILE_LIST:
        keep_checking = False
        break

The code above does the job, but it feels dirty. handleSCL() searches the file and adds any new subscripts to FILE_LIST.

Is there a cleaner way of doing this?

  • what's the purpose of deepcopy here? How is handleSCL updating FILE_LIST? Through global data? It's pretty difficult to tell what everything is doing here. I think a little more context is necessary ... Commented Sep 13, 2012 at 18:25

4 Answers


I would use a method similar to the A* pathfinding algorithm (just without the pathfinding part).

  • Open list: files not yet examined.
  • Closed list: files already examined.

Start by adding your first file to openlist, then iterate over every element in openlist. For each element, find all the files it references, and check whether each one is already in either list. If it's in neither, add it to openlist. When you're finished with the element, move it to closedlist.

This is a pretty effective and clean way of going through all of the elements without duplication.
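The open/closed bookkeeping described above can be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions: `find_subscripts` and the `SUBSCRIPTS` table are hypothetical stand-ins for whatever parsing the real code does, and the file names are made up.

```python
from collections import deque

# Hypothetical stand-in for parsing: maps each script to the subscripts
# it references. Real code would read and parse the file instead.
SUBSCRIPTS = {
    "main.scl": ["a.scl", "b.scl"],
    "a.scl": ["b.scl", "c.scl"],
    "b.scl": ["main.scl"],  # a cycle; it must not cause reprocessing
    "c.scl": [],
}


def find_subscripts(name):
    """Return the subscripts referenced by `name` (assumed helper)."""
    return SUBSCRIPTS.get(name, [])


def crawl(start):
    """Examine every script reachable from `start` exactly once."""
    open_list = deque([start])  # discovered but not yet examined
    queued = {start}            # everything ever placed on the open list
    closed_list = []            # examined, in processing order
    while open_list:
        name = open_list.popleft()
        for sub in find_subscripts(name):
            # Anything on the closed list was once on the open list, so
            # one set answers "is it in either list?".
            if sub not in queued:
                queued.add(sub)
                open_list.append(sub)
        closed_list.append(name)
    return closed_list


order = crawl("main.scl")
print(order)
```

The extra `queued` set is a design choice rather than part of the description: it keeps the membership test cheap instead of scanning both lists for every subscript.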

EDIT: upon further consideration, you could use one ordered list and iterate through it, adding new files to the end of the list. [beginning-current] is the closedlist, and [current-end] is the openlist. A* requires two lists because of sorting and path cost calculations, but you are doing a full search, so you don't need that feature. Then you just need an "add if not already present" check for the single list.
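A sketch of the single-list variant, assuming the list is walked by index so that items appended during the walk are still reached. The `find_subscripts` callable and the `deps` table are hypothetical placeholders for the real parsing step.

```python
def crawl_single_list(start, find_subscripts):
    """Walk one growing list; items before index i are done, the rest are pending."""
    files = [start]  # ordered list that grows while it is walked
    seen = {start}   # backs the "add if not already present" check
    i = 0
    while i < len(files):  # len() is re-read, so appended items are visited
        for sub in find_subscripts(files[i]):
            if sub not in seen:
                seen.add(sub)
                files.append(sub)
        i += 1
    return files


# Hypothetical dependency table standing in for real script parsing.
deps = {
    "main.scl": ["a.scl", "b.scl"],
    "a.scl": ["c.scl"],
    "b.scl": ["a.scl"],
}
result = crawl_single_list("main.scl", lambda name: deps.get(name, []))
print(result)
```

No copy of the list is needed here, because the loop only ever appends to the end and never removes or reorders entries.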



Your loop needs some cleanups!

The final break already exits the while loop, so keep_checking is unnecessary. TMP isn't needed either; put the deepcopy call directly in the for loop.

while processed != FILE_LIST:
    for fname in deepcopy(FILE_LIST):
        if not fname in processed:
            SCL_FILE = fname
            break

    handleSCL(SCL_FILE)
    processed.add(SCL_FILE)

will do the same work in less code.

1 Comment

Wow that is much cleaner! So is this approach efficient or is there a better way to do it?

After much thinking, I ended up writing a quick custom queue.

class PerQueue(object):

    def __init__(self):
        self._reset()                # creates self.files, the pending set
        self.all_files  = set()      # every file popped so far
        self.current    = None       # file most recently popped
        self.cur_files  = set()      # special-case subset (see _setflag, pop)
        self._init_flag = False      # backing field for the init property

    def _setflag(self, value):
        self._init_flag = value
        # Copy already-popped files whose names start with 'ss' into cur_files.
        for item in self.all_files:
            if item.startswith('ss'):
                self.cur_files.add(item)

    def _getflag(self):
        return self._init_flag

    def empty(self):
        return self._empty()

    def pushMany(self, itemList):
        for item in itemList:
            self.push(item)

    def push(self, item):
        # Ignore files that were already popped or are already queued.
        if item not in self.all_files and item not in self.files:
            self._put(item)

    def pop(self):
        # Return None instead of raising when the queue is empty.
        if not self.empty():
            self.current = self._get()
            self.all_files.add(self.current)
            if self.init:
                self.cur_files.add(self.current)
        else:
            self.current = None
        return self.current

    # Renamed from _init so it no longer collides with the flag attribute.
    def _reset(self):
        self.files = set()

    def _empty(self):
        return not self.files

    def _get(self):
        return self.files.pop()

    def _put(self, item):
        self.files.add(item)

    init = property(_getflag, _setflag)

This allowed me to handle a couple of special conditions (using all_files and cur_files) along with the init flag. At most we have a couple of hundred files to process at any time, so I wasn't worried about size constraints.


This would be way cleaner... and probably unnecessary at this point:

for fname in deepcopy(FILE_LIST):
    handleSCL(fname)
