
I would like to have a @cached decorator, akin to @memoized, that stores the cached values of a function as an attribute of the function. Something like this:

import copy

def cached(fcn):
    def cached_fcn(*args, **kwargs):
        # Build a string key from the positional args and the (sorted) keyword args.
        call_signature = ",".join([repr(a) for a in args] +
                                  [repr(name) + "=" + repr(value)
                                   for name, value in sorted(kwargs.items())])
        if call_signature not in cached_fcn.cache:
            cached_fcn.cache[call_signature] = fcn(*args, **kwargs)
        # Return a deep copy so callers cannot mutate the cached value.
        return copy.deepcopy(cached_fcn.cache[call_signature])
    cached_fcn.__name__ = fcn.__name__
    cached_fcn.__doc__ = fcn.__doc__
    cached_fcn.__annotations__ = fcn.__annotations__
    cached_fcn.cache = dict()
    return cached_fcn

@cached
def fib(n):
    if n in (0, 1): return 1
    return fib(n-1) + fib(n-2)

Assuming that the function does not access anything global, is it safe to do that? What if threading is used?

  • I don't see a problem with your implementation; just want to mention that functools.lru_cache can achieve the same thing. (Commented Aug 31, 2020)

1 Answer


There is one pitfall that may be relevant to your implementation. Observe:

def pf(*args, **kwargs):
    print(args)
    print(kwargs)

and call this with

pf(1, k="a")
pf(1, "a")
pf(k="a", x=1)

All three calls are valid for a function with signature f(x, k) (with or without defaults), so inside the wrapper you cannot recover the order of the arguments or their names from args and kwargs alone, and sorting kwargs is definitely not enough in the general case: kwargs is empty in the second call, args is empty in the third, and the keyword order there is reversed. Defaults make this worse: if the definition is f(x, k=3), then f(2, 3), f(2), f(x=2), f(2, k=3) and f(x=2, k=3) (and the reversed keyword order) are all the same call, yet the wrapper sees different args and kwargs for each, so your decorator would cache them under different keys.
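
To see this concretely, here is a small sketch using the same key-building logic as in the question (the helper name naive_key is just for this illustration); calls that are equivalent for f(x, k=3) produce different keys:

def naive_key(*args, **kwargs):
    # Same key construction as in the question's decorator.
    return ",".join([repr(a) for a in args] +
                    [repr(name) + "=" + repr(value)
                     for name, value in sorted(kwargs.items())])

print(naive_key(2, 3))       # 2,3
print(naive_key(2))          # 2
print(naive_key(2, k=3))     # 2,'k'=3
print(naive_key(x=2, k=3))   # 'k'=3,'x'=2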

A more robust solution will use inspect.getfullargspec(your_function) (the older inspect.getargspec is deprecated and was removed in Python 3.11). This uses reflection to get the actual argument names of the function as they were defined. You then have to "fill in" the arguments you are given in *args and **kwargs, and use that to generate your call signature:

import inspect

def f(x, k=3): pass

argspec = inspect.getfullargspec(f)
# FullArgSpec(args=['x', 'k'], varargs=None, varkw=None, defaults=(3,),
#             kwonlyargs=[], kwonlydefaults=None, annotations={})

Now, inside the wrapper, you can generate a call signature from the *args and **kwargs it receives:

# Start with the default values; argspec.defaults aligns with the *last* arguments.
signature = {}
if argspec.defaults:
    for arg, default in zip(reversed(argspec.args), reversed(argspec.defaults)):
        signature[arg] = default

# Overwrite with the positional arguments, matched to their names.
set_args = set()
for arg, val in zip(argspec.args, args):
    set_args.add(arg)
    signature[arg] = val

# Finally apply the keyword arguments.
for arg, val in kwargs.items():
    # if arg in set_args:
    #     raise TypeError(f'{arg} set both in kwargs and in args!')
    # if arg not in argspec.args:
    #     raise TypeError(f'{arg} is not a valid argument for function!')
    signature[arg] = val

# if len(signature) != len(argspec.args):
#     raise TypeError(f'Received {len(signature)} arguments but expected {len(argspec.args)} arguments!')

Then you can use the dictionary signature itself as the call signature (for example, as tuple(sorted(signature.items())), since a dict cannot be used as a dictionary key). I showed some "correctness" checks above, though you may just want to let the call to the function itself detect the problem and fail. I did not handle functions that themselves take *args and **kwargs (the actual names used are given in the argspec); I think handling them may just involve adding args and kwargs keys to signature. I am still not sure how robust the above is.
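
Putting it together, a minimal sketch of such a decorator might look like this (the names cached and normalized are just for illustration; functions taking *args/**kwargs of their own are still not handled, and argument values are assumed to be hashable):

import inspect
from functools import wraps

def cached(fcn):
    argspec = inspect.getfullargspec(fcn)

    @wraps(fcn)
    def cached_fcn(*args, **kwargs):
        # Normalize the call: defaults first, then positionals, then keywords.
        normalized = {}
        if argspec.defaults:
            for arg, default in zip(reversed(argspec.args), reversed(argspec.defaults)):
                normalized[arg] = default
        for arg, val in zip(argspec.args, args):
            normalized[arg] = val
        normalized.update(kwargs)

        # A dict is not hashable, so use a sorted tuple of its items as the key.
        key = tuple(sorted(normalized.items()))
        if key not in cached_fcn.cache:
            cached_fcn.cache[key] = fcn(*args, **kwargs)
        return cached_fcn.cache[key]

    cached_fcn.cache = {}
    return cached_fcn

@cached
def f(x, k=3):
    return x + k

f(2, 3); f(2); f(x=2); f(2, k=3)   # all hit the same cache entry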

Even better, use the built-in functools.lru_cache, which does what you want.
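
For example, with the Fibonacci function from the question (a minimal sketch; note the documented caveat that distinct argument patterns such as fib(10) and fib(n=10) may get separate cache entries):

from functools import lru_cache

@lru_cache(maxsize=None)   # maxsize=None means the cache can grow without bound
def fib(n):
    if n in (0, 1):
        return 1
    return fib(n - 1) + fib(n - 2)

print(fib(100))
print(fib.cache_info())    # hits, misses, maxsize and currsize of the cache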

Regarding threading, you have the same dangers as any time multiple threads access the same mutable object (here, the cache dict); there is nothing special about function attributes. lru_cache should be safe (there was one bug that has since been resolved), with the one caveat:

To help measure the effectiveness of the cache and tune the maxsize parameter, the wrapped function is instrumented with a cache_info() function that returns a named tuple showing hits, misses, maxsize and currsize. In a multi-threaded environment, the hits and misses are approximate.
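
If you do roll your own decorator and need it to be thread-safe, the usual approach is to guard the shared cache with a lock. A minimal sketch (the name cached_threadsafe is just for illustration, and argument values are assumed to be hashable); the value is computed outside the lock so one slow call does not block all others, at the cost of two threads possibly computing the same value:

import threading
from functools import wraps

def cached_threadsafe(fcn):
    lock = threading.Lock()

    @wraps(fcn)
    def wrapper(*args, **kwargs):
        key = (args, tuple(sorted(kwargs.items())))
        with lock:
            if key in wrapper.cache:
                return wrapper.cache[key]
        # Compute outside the lock; duplicate work is possible but harmless.
        value = fcn(*args, **kwargs)
        with lock:
            # setdefault keeps whichever value was stored first.
            return wrapper.cache.setdefault(key, value)

    wrapper.cache = {}
    return wrapper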


6 Comments

Yes, I see this pitfall now. Thanks. I know that implementations exist. My question really is: is it safe/robust to use custom function attributes for this sort of purpose? Another example would be if I wanted to log all function calls together with some additional info, e.g. time spent in the call. It's more a question about using function attributes to collect info at runtime.
@R.Matveev In Python you can (and it's safe to) save whatever you want on any object you want. That's a very different question than what you asked.
@R.Matveev Regarding threading, you have the same pitfalls as when multiple threads are setting the same variable; there is nothing special about attributes on a function.
Indeed, I did not formulate the question well. Your last two comments answer the intended question. Should one edit the question in this situation?
@R.Matveev No, that would be very disrespectful to all the work the answerers put in (in this case only me). Better to ask a new question, which should be much more concise (no need to mention caches, for example, or how you implemented them). Thanks for asking and not just editing; that's not a given for new users at all.
