Usage and Implementation of lru_cache in Python (2)

Date: 2021-05-18 Category: Programmer's Life

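The actual logic of lru_cache is implemented in the _lru_cache_wrapper function. As before, the source is listed below with its comments kept.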

def _lru_cache_wrapper(user_function, maxsize, typed, _CacheInfo):
    # Constants shared by all lru cache instances:
    sentinel = object()          # unique object used to signal cache misses
    make_key = _make_key         # build a key from the function arguments
    PREV, NEXT, KEY, RESULT = 0, 1, 2, 3   # names for the link fields

    cache = {}
    hits = misses = 0
    full = False
    cache_get = cache.get    # bound method to lookup a key or return None
    cache_len = cache.__len__  # get cache size without calling len()
    lock = RLock()           # because linkedlist updates aren't threadsafe
    root = []                # root of the circular doubly linked list
    root[:] = [root, root, None, None]     # initialize by pointing to self

    if maxsize == 0:

        def wrapper(*args, **kwds):
            # No caching -- just a statistics update
            nonlocal misses
            misses += 1
            result = user_function(*args, **kwds)
            return result

    elif maxsize is None:

        def wrapper(*args, **kwds):
            # Simple caching without ordering or size limit
            nonlocal hits, misses
            key = make_key(args, kwds, typed)
            result = cache_get(key, sentinel)
            if result is not sentinel:
                hits += 1
                return result
            misses += 1
            result = user_function(*args, **kwds)
            cache[key] = result
            return result

    else:

        def wrapper(*args, **kwds):
            # Size limited caching that tracks accesses by recency
            nonlocal root, hits, misses, full
            key = make_key(args, kwds, typed)
            with lock:
                link = cache_get(key)
                if link is not None:
                    # Move the link to the front of the circular queue
                    link_prev, link_next, _key, result = link
                    link_prev[NEXT] = link_next
                    link_next[PREV] = link_prev
                    last = root[PREV]
                    last[NEXT] = root[PREV] = link
                    link[PREV] = last
                    link[NEXT] = root
                    hits += 1
                    return result
                misses += 1
            result = user_function(*args, **kwds)
            with lock:
                if key in cache:
                    # Getting here means that this same key was added to the
                    # cache while the lock was released.  Since the link
                    # update is already done, we need only return the
                    # computed result and update the count of misses.
                    pass
                elif full:
                    # Use the old root to store the new key and result.
                    oldroot = root
                    oldroot[KEY] = key
                    oldroot[RESULT] = result
                    # Empty the oldest link and make it the new root.
                    # Keep a reference to the old key and old result to
                    # prevent their ref counts from going to zero during the
                    # update. That will prevent potentially arbitrary object
                    # clean-up code (i.e. __del__) from running while we're
                    # still adjusting the links.
                    root = oldroot[NEXT]
                    oldkey = root[KEY]
                    oldresult = root[RESULT]
                    root[KEY] = root[RESULT] = None
                    # Now update the cache dictionary.
                    del cache[oldkey]
                    # Save the potentially reentrant cache[key] assignment
                    # for last, after the root and links have been put in
                    # a consistent state.
                    cache[key] = oldroot
                else:
                    # Put result in a new link at the front of the queue.
                    last = root[PREV]
                    link = [last, root, key, result]
                    last[NEXT] = root[PREV] = cache[key] = link
                    # Use the cache_len bound method instead of the len() function
                    # which could potentially be wrapped in an lru_cache itself.
                    full = (cache_len() >= maxsize)
            return result

    def cache_info():
        """Report cache statistics"""
        with lock:
            return _CacheInfo(hits, misses, maxsize, cache_len())

    def cache_clear():
        """Clear the cache and cache statistics"""
        nonlocal hits, misses, full
        with lock:
            cache.clear()
            root[:] = [root, root, None, None]
            hits = misses = 0
            full = False

    wrapper.cache_info = cache_info
    wrapper.cache_clear = cache_clear
    return wrapper

The first block of the function, immediately after the def line, defines a few key variables:

hits and misses count the number of cache hits and cache misses, respectively.

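root is the head node of a circular doubly linked list. Each node stores a previous pointer, a next pointer, a key, and the result for that key. The key is a hashable object computed by _make_key from the call arguments (not necessarily a string), and the result is whatever the decorated function returned for those arguments. Note that root itself is a sentinel: it holds no key or result.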

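cache is where the cached data actually lives; it is a dict. Its keys are the same hashable keys computed by _make_key from the arguments. When the cache size is limited, each value is the linked-list node for that key; when maxsize is None, the value is the result itself.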
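The relationship between root, the nodes, and cache can be sketched on its own, outside of lru_cache. This is a minimal illustration that reuses the field layout from the source; the helper names append and keys_oldest_first are made up for this example and are not part of functools.

```python
PREV, NEXT, KEY, RESULT = 0, 1, 2, 3  # same field layout as in the source

# An empty list: root is a sentinel whose PREV and NEXT point to itself.
root = []
root[:] = [root, root, None, None]
cache = {}  # key -> node, as in the size-limited wrapper


def append(key, result):
    """Hypothetical helper: insert a node just before root (the newest end)."""
    last = root[PREV]
    link = [last, root, key, result]
    last[NEXT] = root[PREV] = cache[key] = link
    return link


def keys_oldest_first():
    """Hypothetical helper: walk from root[NEXT] until we are back at root."""
    out = []
    node = root[NEXT]
    while node is not root:
        out.append(node[KEY])
        node = node[NEXT]
    return out


append("a", 1)
append("b", 2)
append("c", 3)
print(keys_oldest_first())   # ['a', 'b', 'c']
print(cache["b"][RESULT])    # 2
print(root[KEY])             # None: the sentinel carries no data
```

The node after root is always the least recently used entry, and the node before root is the most recently used one.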

Next, a different wrapper is defined depending on the value of maxsize.

maxsize == 0: there is effectively no cache, so no call can ever hit, and every call increments the miss counter misses by 1.

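maxsize is None: the cache size is unlimited. If a call misses, misses is incremented by 1 and the result is stored in cache; otherwise hits is incremented by 1 and the stored result is returned.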
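These two cases can be observed through the public cache_info() method. The sketch below is only an illustration; the decorated functions no_cache and unbounded are invented for the example.

```python
from functools import lru_cache


@lru_cache(maxsize=0)
def no_cache(x):
    # Illustrative function: with maxsize=0 nothing is ever stored.
    return x * 2


@lru_cache(maxsize=None)
def unbounded(x):
    # Illustrative function: with maxsize=None every result is kept.
    return x * 2


for f in (no_cache, unbounded):
    f(1)
    f(1)
    f(2)

print(no_cache.cache_info())
# CacheInfo(hits=0, misses=3, maxsize=0, currsize=0)
print(unbounded.cache_info())
# CacheInfo(hits=1, misses=2, maxsize=None, currsize=2)
```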

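Otherwise the cache size is limited, and cache has to be updated according to the LRU algorithm. This is the wrapper defined in the final else branch.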

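If the key hits the cache, the matching node is moved to the end of the circular doubly linked list (the position just before root, which is the most recently used end) and the result is returned (the "if link is not None" block).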

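Combining a dict with a circular doubly linked list makes it possible to find and unlink a given node in O(1) time: the dict returns the node directly, and the node's own previous and next pointers let it be removed without traversing the list.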
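The move on a cache hit can be sketched in isolation. This is a simplified, single-threaded illustration (no lock, no counters); the helpers add, touch, and order are hypothetical names for this example, while the pointer updates inside touch follow the hit branch of the source.

```python
PREV, NEXT, KEY, RESULT = 0, 1, 2, 3

root = []
root[:] = [root, root, None, None]
cache = {}


def add(key, result):
    """Hypothetical helper: append a new node at the most recently used end."""
    last = root[PREV]
    link = [last, root, key, result]
    last[NEXT] = root[PREV] = cache[key] = link


def touch(key):
    """Hypothetical helper: on a hit, move the node to just before root."""
    link = cache[key]                   # O(1) lookup, no list traversal
    link_prev, link_next, _key, result = link
    link_prev[NEXT] = link_next         # unlink the node from its old position
    link_next[PREV] = link_prev
    last = root[PREV]                   # re-link it at the newest end
    last[NEXT] = root[PREV] = link
    link[PREV] = last
    link[NEXT] = root
    return result


def order():
    """Hypothetical helper: keys from least to most recently used."""
    out, node = [], root[NEXT]
    while node is not root:
        out.append(node[KEY])
        node = node[NEXT]
    return out


for k, v in [("a", 1), ("b", 2), ("c", 3)]:
    add(k, v)

print(touch("a"))  # 1
print(order())     # ['b', 'c', 'a']
```

No matter how many nodes the list holds, touch performs a fixed number of pointer assignments.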

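If the key misses and the cache is full, the least recently used node (the node after root) has to be removed and a new node added at the end of the list. The implementation uses an optimization here: it writes the new key and result directly into the current root, then makes the node after root the new root, clearing that node's key and result and deleting its old key from cache. The resulting circular list has the same structure as if the node after root had been deleted and a new node appended at the end, but no node is actually removed or allocated (the "elif full" branch).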
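The root-recycling step can also be tried out on a small list. The sketch below is a single-threaded illustration under the same field layout; add, evict_and_insert, and order are hypothetical helper names, and the body of evict_and_insert mirrors the full-cache branch of the source without the lock and the extra reference to the old result.

```python
PREV, NEXT, KEY, RESULT = 0, 1, 2, 3

root = []
root[:] = [root, root, None, None]
cache = {}


def add(key, result):
    """Hypothetical helper: append a new node while the cache is not full."""
    last = root[PREV]
    link = [last, root, key, result]
    last[NEXT] = root[PREV] = cache[key] = link


def evict_and_insert(key, result):
    """Hypothetical helper: reuse the old root for the new entry."""
    global root
    oldroot = root
    oldroot[KEY] = key            # the old sentinel now holds the new entry
    oldroot[RESULT] = result
    root = oldroot[NEXT]          # the oldest node becomes the new sentinel
    oldkey = root[KEY]
    root[KEY] = root[RESULT] = None
    del cache[oldkey]             # forget the evicted key
    cache[key] = oldroot
    return oldkey


def order():
    """Hypothetical helper: keys from least to most recently used."""
    out, node = [], root[NEXT]
    while node is not root:
        out.append(node[KEY])
        node = node[NEXT]
    return out


for k, v in [("a", 1), ("b", 2), ("c", 3)]:
    add(k, v)

print(evict_and_insert("d", 4))  # 'a' is evicted
print(order())                   # ['b', 'c', 'd']
print(sorted(cache))             # ['b', 'c', 'd']
```

The list still has four list objects before and after the call (one sentinel plus three entries); only their roles have rotated.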

When reprinting, please credit the source: https://www.heiqu.com/wpzzfw.html
