
I would like to have a @cached decorator, akin to @memoized, that stores the cached values of a function as an attribute of the function. Something like this:

import copy

def cached(fcn):
    def cached_fcn(*args, **kwargs):
        # Build a string key from the positional args and the (sorted) keyword args.
        call_signature = ",".join([repr(a) for a in args] +
                                  [repr(name) + "=" + repr(value)
                                   for name, value in sorted(kwargs.items())])
        if call_signature not in cached_fcn.cache:
            cached_fcn.cache[call_signature] = fcn(*args, **kwargs)
        # Return a deep copy so callers cannot mutate the cached value.
        return copy.deepcopy(cached_fcn.cache[call_signature])
    cached_fcn.__name__ = fcn.__name__
    cached_fcn.__doc__ = fcn.__doc__
    cached_fcn.__annotations__ = fcn.__annotations__
    cached_fcn.cache = dict()
    return cached_fcn

@cached
def fib(n):
    if n in (0, 1): return 1
    return fib(n-1) + fib(n-2)

Assuming that the function does not access anything global, is it safe to do that? What if threading is used?

  • I don't see a problem with your implementation; just want to mention that functools.lru_cache can achieve the same thing. (Commented Aug 31, 2020)

1 Answer


There is one pitfall that may be relevant to your implementation. Observe:

def pf(*args, **kwargs):
    print(args)
    print(kwargs)

and call this with

pf(1, k="a")
pf(1, "a")
pf(k="a", x=1)

All three calls are valid for a function with signature f(x, k) (with or without defaults), so inside the wrapper you cannot recover the order of the arguments or their names from args and kwargs alone, and sorting kwargs is definitely not enough in the general case: kwargs is empty in the second call, args is empty in the third, and the keyword order there is reversed. Defaults make this worse: if the definition is f(x, k=3), then f(2, 3), f(2), f(x=2), f(2, k=3) and f(x=2, k=3) (and the reversed keyword order) are all the same call, yet the wrapper sees different args and kwargs for each, so your decorator would cache them under different keys.
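
To see this concretely, here is a small sketch using the same key-building logic as in the question (the helper name naive_key is just for this illustration); calls that are equivalent for f(x, k=3) produce different keys:

def naive_key(*args, **kwargs):
    # Same key construction as in the question's decorator.
    return ",".join([repr(a) for a in args] +
                    [repr(name) + "=" + repr(value)
                     for name, value in sorted(kwargs.items())])

print(naive_key(2, 3))       # 2,3
print(naive_key(2))          # 2
print(naive_key(2, k=3))     # 2,'k'=3
print(naive_key(x=2, k=3))   # 'k'=3,'x'=2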

A more robust solution will use inspect.getfullargspec(your_function) (the older inspect.getargspec is deprecated and was removed in Python 3.11). This uses reflection to get the actual argument names of the function as they were defined. You then have to "fill in" the arguments you are given in *args and **kwargs, and use that to generate your call signature:

import inspect

def f(x, k=3): pass

argspec = inspect.getfullargspec(f)
# FullArgSpec(args=['x', 'k'], varargs=None, varkw=None, defaults=(3,),
#             kwonlyargs=[], kwonlydefaults=None, annotations={})

Now, inside the wrapper, you can generate a call signature from the *args and **kwargs it receives:

# Start with the default values; argspec.defaults aligns with the *last* arguments.
signature = {}
if argspec.defaults:
    for arg, default in zip(reversed(argspec.args), reversed(argspec.defaults)):
        signature[arg] = default

# Overwrite with the positional arguments, matched to their names.
set_args = set()
for arg, val in zip(argspec.args, args):
    set_args.add(arg)
    signature[arg] = val

# Finally apply the keyword arguments.
for arg, val in kwargs.items():
    # if arg in set_args:
    #     raise TypeError(f'{arg} set both in kwargs and in args!')
    # if arg not in argspec.args:
    #     raise TypeError(f'{arg} is not a valid argument for function!')
    signature[arg] = val

# if len(signature) != len(argspec.args):
#     raise TypeError(f'Received {len(signature)} arguments but expected {len(argspec.args)} arguments!')

Then you can use the dictionary signature itself as the call signature (for example, as tuple(sorted(signature.items())), since a dict cannot be used as a dictionary key). I showed some "correctness" checks above, though you may just want to let the call to the function itself detect the problem and fail. I did not handle functions that themselves take *args and **kwargs (the actual names used are given in the argspec); I think handling them may just involve adding args and kwargs keys to signature. I am still not sure how robust the above is.
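
Putting it together, a minimal sketch of such a decorator might look like this (the names cached and normalized are just for illustration; functions taking *args/**kwargs of their own are still not handled, and argument values are assumed to be hashable):

import inspect
from functools import wraps

def cached(fcn):
    argspec = inspect.getfullargspec(fcn)

    @wraps(fcn)
    def cached_fcn(*args, **kwargs):
        # Normalize the call: defaults first, then positionals, then keywords.
        normalized = {}
        if argspec.defaults:
            for arg, default in zip(reversed(argspec.args), reversed(argspec.defaults)):
                normalized[arg] = default
        for arg, val in zip(argspec.args, args):
            normalized[arg] = val
        normalized.update(kwargs)

        # A dict is not hashable, so use a sorted tuple of its items as the key.
        key = tuple(sorted(normalized.items()))
        if key not in cached_fcn.cache:
            cached_fcn.cache[key] = fcn(*args, **kwargs)
        return cached_fcn.cache[key]

    cached_fcn.cache = {}
    return cached_fcn

@cached
def f(x, k=3):
    return x + k

f(2, 3); f(2); f(x=2); f(2, k=3)   # all hit the same cache entry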

Even better, use the built-in functools.lru_cache, which does what you want.
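
For example, with the Fibonacci function from the question (a minimal sketch; note the documented caveat that distinct argument patterns such as fib(10) and fib(n=10) may get separate cache entries):

from functools import lru_cache

@lru_cache(maxsize=None)   # maxsize=None means the cache can grow without bound
def fib(n):
    if n in (0, 1):
        return 1
    return fib(n - 1) + fib(n - 2)

print(fib(100))
print(fib.cache_info())    # hits, misses, maxsize and currsize of the cache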

Regarding threading, you have the same dangers as any time multiple threads access the same mutable object (here, the cache dict); there is nothing special about function attributes. lru_cache should be safe (there was one bug that has since been resolved), with the one caveat:

To help measure the effectiveness of the cache and tune the maxsize parameter, the wrapped function is instrumented with a cache_info() function that returns a named tuple showing hits, misses, maxsize and currsize. In a multi-threaded environment, the hits and misses are approximate.
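
If you do roll your own decorator and need it to be thread-safe, the usual approach is to guard the shared cache with a lock. A minimal sketch (the name cached_threadsafe is just for illustration, and argument values are assumed to be hashable); the value is computed outside the lock so one slow call does not block all others, at the cost of two threads possibly computing the same value:

import threading
from functools import wraps

def cached_threadsafe(fcn):
    lock = threading.Lock()

    @wraps(fcn)
    def wrapper(*args, **kwargs):
        key = (args, tuple(sorted(kwargs.items())))
        with lock:
            if key in wrapper.cache:
                return wrapper.cache[key]
        # Compute outside the lock; duplicate work is possible but harmless.
        value = fcn(*args, **kwargs)
        with lock:
            # setdefault keeps whichever value was stored first.
            return wrapper.cache.setdefault(key, value)

    wrapper.cache = {}
    return wrapper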


6 Comments

Yes, I see this pitfall now. Thanks. I know that implementations exist. My question really is: is it safe/robust to use custom function attributes for this sort of purpose? Another example would be if I wanted to log all function calls together with some additional info, e.g. time spent in the call. It's more a question about using function attributes to collect info at runtime.
@R.Matveev In Python you can (and it's safe to) save whatever you want on any object you want. That's a very different question than what you asked.
@R.Matveev Regarding threading, you have the same pitfalls as when multiple threads are setting the same variable; there is nothing special about attributes on a function.
Indeed, I did not formulate the question well. Your last two comments answer the intended question. Should one edit the question in this situation?
@R.Matveev No, that would be very disrespectful to all the work the answerers put in (in this case only me). Better to ask a new question, which should be much more concise (no need to mention caches, for example, or how you implemented them). Thanks for asking and not just editing; that's not a given for new users at all.
