I've got a function which parses a sentence by building up a chart. But Python holds on to whatever memory was allocated during that function call. That is, I do

best = translate(sentence, grammar)

and somehow my memory goes up and stays up. Here is the function:

from string import join
from heapq import nsmallest, heappush
from collections import defaultdict

MAX_TRANSLATIONS=4 # or choose something else

def translate(f, g):
    words = f.split()
    chart = {}
    for col in range(len(words)):
        for row in reversed(range(0,col+1)):
            # get rules for this subspan                                        
            rules = g[join(words[row:col+1], ' ')]
            # ensure there's at least one rule on the diagonal                  
            if not rules and row==col:
                rules=[(0.0, join(words[row:col+1]))]
            # pick up rules below & to the left                                 
            for k in range(row,col):
                if (row,k) in chart and (k+1,col) in chart:
                    for (w1, e1) in chart[row, k]:
                        for (w2, e2) in chart[k+1,col]:
                            heappush(rules, (w1+w2, e1+' '+e2))
            # add all rules to chart                                            
            chart[row,col] = nsmallest(MAX_TRANSLATIONS, rules)
    (w, best) = chart[0, len(words)-1][0]
    return best

g = defaultdict(list)
g['cela'] = [(8.28, 'this'), (11.21, 'it'), (11.57, 'that'), (15.26, 'this ,')]
g['est'] = [(2.69, 'is'), (10.21, 'is ,'), (11.15, 'has'), (11.28, ', is')]
g['difficile'] = [(2.01, 'difficult'), (10.08, 'hard'), (10.19, 'difficult ,'), (10.57, 'a difficult')]

sentence = "cela est difficile"
best = translate(sentence, g)

I'm using Python 2.7 on OS X.

  • Which version of Python are you using? The sample code isn't complete; there is no free join function in standard Python, nor is there an nsmallest, nor is any sample data given. This last is particularly important, as g could be storing additional data with each run of translate, depending on the type of g. Commented Mar 24, 2012 at 22:56
  • See also: Python memory profiler, How do I profile memory usage in Python? Commented Mar 24, 2012 at 23:27
  • Still needs complete sample data in order to be reproducible. Commented Mar 24, 2012 at 23:40
  • Did you read the SSCCE link? By "complete", we mean "something we can copy and paste which runs and shows the problem". I attempted to run what you posted, and chose a value for MAX_TRANSLATIONS, set a sentence, built a grammar dictionary, and heapified the values. The result? "KeyError: 'cela est'". Commented Mar 24, 2012 at 23:51
  • @Dan: it wasn't copy, paste & run, as per SSCCE and Writing the Perfect Question. Commented Mar 25, 2012 at 2:28

2 Answers

1

Within the function, you set rules to an element of the grammar g; rules then refers to that very list, not to a copy of it. You then add items to rules with heappush, and because lists are mutable, the grammar holds on to every pushed value through that list. If you don't want this to happen, copy the list when assigning rules (for example with copy.copy), or deepcopy the grammar at the start of translate. Note that even if you copy the list into rules, the grammar, being a defaultdict, will still record a new empty list every time you look up a missing key.
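
To make the aliasing concrete, here is a minimal sketch of the difference. The helper names leaky_lookup and safe_lookup are made up for illustration and are not part of the question's code; the sketch also assumes that a shallow copy is enough, since the rules are immutable tuples. It avoids string.join so that it runs unchanged on Python 2.7 and Python 3.

```python
from collections import defaultdict
from heapq import heappush


def leaky_lookup(g, key):
    # Aliasing: rules IS the list stored in g, so every push stays in g.
    # On a defaultdict, g[key] also inserts an empty list for a missing key.
    rules = g[key]
    heappush(rules, (99.0, 'extra'))
    return rules


def safe_lookup(g, key):
    # g.get() does not trigger the default factory, so missing keys are
    # not recorded; list(...) makes a shallow copy to push onto.
    rules = list(g.get(key, []))
    heappush(rules, (99.0, 'extra'))
    return rules


g = defaultdict(list)
g['est'] = [(2.69, 'is'), (10.21, 'is ,')]

safe_lookup(g, 'est')
safe_lookup(g, 'cela est')
size_after_safe = (len(g['est']), 'cela est' in g)
print(size_after_safe)    # (2, False): the grammar is untouched

leaky_lookup(g, 'est')
leaky_lookup(g, 'cela est')
size_after_leaky = (len(g['est']), 'cela est' in g)
print(size_after_leaky)   # (3, True): the grammar grew
```

In translate, the equivalent change would be to build rules from a copy of the grammar entry instead of from the entry itself.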

1 Comment

There we go. :) Oh me, not reminding myself of the peculiarities of a new language before I start playing around with it...
0

Try running gc.collect after you run the function.

1 Comment

Hi jordoex. I tried that, both at the end of the function and after running it. No effect.
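
That lack of effect is consistent with how the collector works: gc.collect() only reclaims objects that are no longer reachable, and values pushed into a list that the grammar still references remain reachable. A small sketch, assuming a defaultdict grammar like the one in the question, with append standing in for heappush:

```python
import gc
from collections import defaultdict

g = defaultdict(list)
for i in range(1000):
    # Stand-in for heappush onto a list that is still owned by g.
    g['key'].append((float(i), 'x'))

gc.collect()                  # runs a full collection
still_held = len(g['key'])    # the list is reachable through g, so it survives

g.clear()                     # dropping the references is what releases the entries
entries_left = len(g)
```

Only once the references themselves are gone can the memory be released, so the fix has to be in how rules is built, not in the collector.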
