A Guide to Python's Weak References Using weakref Module
Martin · 2024-06-26 · via Martin Heinz's Blog

Chances are that you have never touched, and maybe never even heard of, Python's weakref module. While it might not be commonly used in your own code, it's fundamental to the inner workings of many libraries, frameworks and even Python itself. So, in this article we will explore what it is, how it is helpful, and how you could incorporate it into your code as well.

The Basics

To understand the weakref module and weak references, we first need a little intro to garbage collection in Python.

Python uses reference counting as its main mechanism for garbage collection. In simple terms, Python keeps a reference count for each object we create; the count is incremented whenever a new reference to the object is created, and decremented whenever a reference goes away (e.g. a variable is set to None). If the reference count ever drops to zero, the memory for the object is deallocated (garbage-collected).

Let's look at some code to understand it a little more:


import sys

class SomeObject:
    def __del__(self):
        print(f"(Deleting {self=})")

obj = SomeObject()

print(sys.getrefcount(obj))  # 2

obj2 = obj
print(sys.getrefcount(obj))  # 3

obj = None
obj2 = None

# (Deleting self=<__main__.SomeObject object at 0x7d303fee7e80>)

Here we define a class that only implements a __del__ method, which is called when the object is garbage-collected (GC'ed). We do this so that we can see when the garbage collection happens.

After creating an instance of this class, we use sys.getrefcount to get the current number of references to this object. We would expect to get 1 here, but the count returned by getrefcount is generally one higher than you might expect. That's because when we call getrefcount, the reference is passed into the function's argument, temporarily bumping up the object's reference count.

Next, if we declare obj2 = obj and call getrefcount again, we get 3 because the object is now referenced by both obj and obj2. Conversely, if we assign None to these variables, the reference count decreases to zero, and we get the message from the __del__ method telling us that the object got garbage-collected.
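
Reference counting on its own cannot free objects that refer to each other, which is why CPython also ships a cycle collector in the gc module. The following is a minimal sketch of that situation, assuming CPython; the Tracked class and its deleted list are hypothetical helpers introduced only for this illustration.

```python
import gc


class Tracked:
    """Hypothetical helper that records which instances were destroyed."""

    deleted = []

    def __init__(self, name):
        self.name = name
        self.other = None

    def __del__(self):
        Tracked.deleted.append(self.name)


gc.disable()  # pause the cycle collector so only reference counting is at work

a = Tracked("a")
b = Tracked("b")
a.other = b  # a -> b
b.other = a  # b -> a, which closes the cycle

del a, b
# Nothing was destroyed: each object is still referenced by the other one.
deleted_by_refcount = list(Tracked.deleted)

gc.collect()  # the cycle collector finds and frees the unreachable pair
deleted_by_gc = sorted(Tracked.deleted)

gc.enable()
print(deleted_by_refcount)
print(deleted_by_gc)
```

Cycles like this one are a large part of the motivation for weak references: if one of the two links does not count as a reference, the cycle never forms in the first place.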

So how do weak references fit into this? If the only remaining references to an object are weak references, then the Python interpreter is free to garbage-collect this object. In other words, a weak reference to an object is not enough to keep the object alive:


import weakref

obj = SomeObject()

reference = weakref.ref(obj)

print(reference)  # <weakref at 0x734b0a514590; to 'SomeObject' at 0x734b0a4e7700>
print(reference())  # <__main__.SomeObject object at 0x707038c0b700>
print(obj.__weakref__)  # <weakref at 0x734b0a514590; to 'SomeObject' at 0x734b0a4e7700>

print(sys.getrefcount(obj))  # 2

obj = None

# (Deleting self=<__main__.SomeObject object at 0x70744d42b700>)

print(reference)  # <weakref at 0x7988e2d70590; dead>
print(reference())  # None

Here we again create an instance of our class and assign it to obj, but this time, instead of creating a second strong reference to the object, we create a weak reference and store it in the reference variable.

If we then check the reference count, we can see that it did not increase, and if we set the obj variable to None, we can see that the object immediately gets garbage-collected even though the weak reference still exists.

Finally, if we try to access the weak reference to the already garbage-collected object, printing it shows a "dead" reference and calling it returns None.
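
Because a weak reference can die at any time, code that uses one usually dereferences it once and checks the result for None. weakref.ref also accepts an optional callback that is invoked when the referent goes away. The sketch below shows both; the Resource class, the describe helper and the events list are hypothetical names used only for illustration.

```python
import gc
import weakref


class Resource:
    """Hypothetical class, used only for illustration."""

    def __init__(self, name):
        self.name = name


def describe(ref):
    # Dereference once and keep the result in a local variable, so the
    # object cannot disappear between the check and the use.
    target = ref()
    if target is None:
        return "gone"
    return f"alive: {target.name}"


events = []

res = Resource("db-connection")
# The optional second argument is a callback; it receives the weak
# reference object itself once the referent is about to be finalized.
ref = weakref.ref(res, lambda r: events.append("referent collected"))

before = describe(ref)
del res
gc.collect()  # not needed on CPython, where the object dies right at `del`
after = describe(ref)

print(before)
print(after)
print(events)
```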

Also notice that when we used the weak reference to access the object, we had to call it as a function (reference()) to retrieve the object. Therefore, it is often more convenient to use a proxy instead, especially if you need to access object attributes:


obj = SomeObject()

reference = weakref.proxy(obj)

print(reference)  # <__main__.SomeObject object at 0x78a420e6b700>

obj.attr = 1
print(reference.attr)  # 1
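
One thing to keep in mind with proxies is that there is no None to check for: once the referent is gone, using the proxy raises ReferenceError. The following is a small sketch of that behavior, with a hypothetical Config class standing in for any weak-referenceable object.

```python
import gc
import weakref


class Config:
    """Hypothetical class, used only for illustration."""

    def __init__(self):
        self.debug = True


cfg = Config()
proxy = weakref.proxy(cfg)
value_while_alive = proxy.debug  # attribute access is forwarded to cfg

del cfg
gc.collect()  # not needed on CPython, kept for other implementations

try:
    proxy.debug
    outcome = "still alive"
except ReferenceError:
    # Raised when a proxy is used after its referent was collected.
    outcome = "referent is gone"

print(value_while_alive, outcome)
```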

When To Use It

Now that we know how weak references work, let's look at some examples of how they could be useful.

A common use-case for weak references is tree-like data structures:


class Node:
    def __init__(self, value):
        self.value = value
        self._parent = None
        self.children = []

    def __repr__(self):
        return "Node({!r:})".format(self.value)

    @property
    def parent(self):
        return self._parent if self._parent is None else self._parent()

    @parent.setter
    def parent(self, node):
        self._parent = weakref.ref(node)

    def add_child(self, child):
        self.children.append(child)
        child.parent = self

root = Node("parent")
n = Node("child")
root.add_child(n)
print(n.parent)  # Node('parent')

del root
print(n.parent)  # None

Here we implement a tree using a Node class where child nodes hold a weak reference to their parent. In this relationship, the child Node can live without its parent Node, which allows the parent to be silently removed/garbage-collected.

Alternatively, we can flip this around:


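class Node:
    def __init__(self, value):
        self.value = value
        self._children = weakref.WeakValueDictionary()

    # Needed so that the printed output below shows Node('...')
    def __repr__(self):
        return "Node({!r:})".format(self.value)

    @property
    def children(self):
        return list(self._children.items())

    def add_child(self, key, child):
        self._children[key] = child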

root = Node("parent")
n1 = Node("child one")
n2 = Node("child two")
root.add_child("n1", n1)
root.add_child("n2", n2)
print(root.children)  # [('n1', Node('child one')), ('n2', Node('child two'))]

del n1
print(root.children)  # [('n2', Node('child two'))]

Here, instead, the parent keeps a dictionary of weak references to its children. This uses WeakValueDictionary: whenever the last strong reference to a value stored in the dictionary goes away elsewhere in the program, the entry is automatically removed from the dictionary too, so we don't have to manage the lifecycle of the dictionary items.
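
The weakref module also provides the mirror image of this class, WeakKeyDictionary, which holds its keys weakly. It can be handy for attaching extra data to objects without keeping them alive. Here is a minimal sketch; the Session class and the metadata mapping are hypothetical and exist only for this example.

```python
import gc
import weakref


class Session:
    """Hypothetical class; instances are hashable and weak-referenceable."""

    def __init__(self, user):
        self.user = user


# Keys are held weakly, values are held strongly.
metadata = weakref.WeakKeyDictionary()

s1 = Session("alice")
s2 = Session("bob")
metadata[s1] = {"requests": 3}
metadata[s2] = {"requests": 7}
count_before = len(metadata)

del s1  # the last strong reference to the "alice" session goes away
gc.collect()  # not needed on CPython, kept for other implementations

count_after = len(metadata)
remaining_users = [session.user for session in metadata]  # iterating yields keys

print(count_before, count_after, remaining_users)
```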

Another use of weakref is in the Observer design pattern:


class Observable:
    def __init__(self):
        self._observers = weakref.WeakSet()

    def register_observer(self, obs):
        self._observers.add(obs)

    def notify_observers(self, *args, **kwargs):
        for obs in self._observers:
            obs.notify(self, *args, **kwargs)


class Observer:
    def __init__(self, observable):
        observable.register_observer(self)

    def notify(self, observable, *args, **kwargs):
        print("Got", args, kwargs, "From", observable)


subject = Observable()
observer = Observer(subject)
subject.notify_observers("test", kw="python")
# Got ('test',) {'kw': 'python'} From <__main__.Observable object at 0x757957b892d0>

The Observable class keeps weak references to its observers, because it doesn't care if they get removed. As with the previous examples, this avoids having to manage the lifecycle of dependent objects. As you probably noticed, in this example we used WeakSet, which is another class from the weakref module. It behaves like WeakValueDictionary, but with set semantics: elements are dropped once no strong references to them remain.
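
To see the automatic unsubscription in action, here is a self-contained sketch of the same pattern in which one observer is deleted between two notifications. RecordingObserver and its received list are hypothetical additions that make the effect easy to inspect.

```python
import gc
import weakref


class Observable:
    def __init__(self):
        self._observers = weakref.WeakSet()

    def register_observer(self, obs):
        self._observers.add(obs)

    def notify_observers(self, message):
        for obs in self._observers:
            obs.notify(message)


class RecordingObserver:
    """Hypothetical observer that records every notification it receives."""

    received = []

    def __init__(self, name, observable):
        self.name = name
        observable.register_observer(self)

    def notify(self, message):
        RecordingObserver.received.append((self.name, message))


subject = Observable()
first = RecordingObserver("first", subject)
second = RecordingObserver("second", subject)

subject.notify_observers("one")  # reaches both observers

del first  # nobody unregisters it explicitly
gc.collect()  # not needed on CPython, kept for other implementations

subject.notify_observers("two")  # reaches only the surviving observer

print(RecordingObserver.received)
```

The order in which the two observers receive the first message is not guaranteed, since a WeakSet is unordered.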

The final example for this section is borrowed from the weakref docs:


import tempfile, shutil
from pathlib import Path

class TempDir:
    def __init__(self):
        self.name = tempfile.mkdtemp()
        self._finalizer = weakref.finalize(self, shutil.rmtree, self.name)

    def __repr__(self):
        return "TempDir({!r:})".format(self.name)

    def remove(self):
        self._finalizer()

    @property
    def removed(self):
        return not self._finalizer.alive

tmp = TempDir()
print(tmp)  # TempDir('/tmp/tmp8o0aecl3')
print(tmp.removed)  # False
print(Path(tmp.name).is_dir()) # True

This showcases one more feature of the weakref module, which is weakref.finalize. As the name suggests, it allows executing a finalizer function/callback when the dependent object is garbage-collected. In this case we implement a TempDir class which can be used to create a temporary directory. Ideally, we would always remember to clean up the TempDir when we don't need it anymore, but if we forget, the finalizer will automatically run rmtree on the directory when the TempDir object is GC'ed, or at the latest when the program exits.
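
Both cleanup paths can be exercised with a short, self-contained sketch. It repeats a trimmed-down TempDir and then checks the explicit remove() call as well as the "forgot to clean up" case. The variable names are hypothetical, and the immediate cleanup after del assumes CPython.

```python
import gc
import shutil
import tempfile
import weakref
from pathlib import Path


class TempDir:
    def __init__(self):
        self.name = tempfile.mkdtemp()
        self._finalizer = weakref.finalize(self, shutil.rmtree, self.name)

    def remove(self):
        self._finalizer()

    @property
    def removed(self):
        return not self._finalizer.alive


# Path 1: explicit cleanup.
explicit = TempDir()
explicit_path = Path(explicit.name)
explicit.remove()
explicit.remove()  # a finalizer runs at most once, so this call does nothing
explicit_removed = explicit.removed
explicit_exists = explicit_path.is_dir()

# Path 2: we "forget" to clean up and just drop the object.
forgotten = TempDir()
forgotten_path = Path(forgotten.name)
del forgotten
gc.collect()  # not needed on CPython, kept for other implementations
forgotten_exists = forgotten_path.is_dir()

print(explicit_removed, explicit_exists, forgotten_exists)
```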

Real-World Examples

The previous section has shown a couple of practical uses for weakref, but let's also take a look at some real-world examples, one of them being caching of instances:


import logging
a = logging.getLogger("first")
b = logging.getLogger("second")
print(a is b)  # False

c = logging.getLogger("first")
print(a is c)  # True

The above is basic usage of Python's built-in logging module. We can see that it associates only a single logger instance with a given name, meaning that when we retrieve the same logger multiple times, it always returns the same cached logger instance.

If we wanted to implement this, it could look something like this:


class Logger:
    def __init__(self, name):
        self.name = name

_logger_cache = weakref.WeakValueDictionary()

def get_logger(name):
    if name not in _logger_cache:
        l = Logger(name)
        _logger_cache[name] = l
    else:
        l = _logger_cache[name]
    return l

a = get_logger("first")
b = get_logger("second")
print(a is b)  # False

c = get_logger("first")
print(a is c)  # True
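
What makes the weak cache different from a plain dict is what happens once nobody uses a logger anymore: the entry disappears on its own. The sketch below shows this with a slightly adjusted get_logger that uses .get(), so that the lookup and the existence check happen in a single step. The variable names are hypothetical.

```python
import gc
import weakref


class Logger:
    def __init__(self, name):
        self.name = name


_logger_cache = weakref.WeakValueDictionary()


def get_logger(name):
    # .get() returns either a live object or None in one step, so the entry
    # cannot vanish between an `in` check and the actual lookup.
    logger = _logger_cache.get(name)
    if logger is None:
        logger = Logger(name)
        _logger_cache[name] = logger
    return logger


a = get_logger("first")
c = get_logger("first")
same_instance = a is c
cached_while_in_use = "first" in _logger_cache

del a, c  # drop every strong reference to the logger
gc.collect()  # not needed on CPython, kept for other implementations
cached_after_release = "first" in _logger_cache

print(same_instance, cached_while_in_use, cached_after_release)
```

This is a trade-off rather than a free win: any state stored on a logger is lost once the last user lets go of it, and the next get_logger call builds a fresh instance.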

And finally, Python itself uses weak references, e.g. in the implementation of OrderedDict:


from _weakref import proxy as _proxy

class OrderedDict(dict):

    def __new__(cls, /, *args, **kwds):
        self = dict.__new__(cls)
        self.__hardroot = _Link()
        self.__root = root = _proxy(self.__hardroot)
        root.prev = root.next = root
        self.__map = {}
        return self

The above is a snippet from CPython's collections module. Here, weakref.proxy is used to prevent circular references (see the docstrings and comments in the source for more details).
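
The same idea can be applied to a hand-written doubly linked list. The following is a simplified sketch, not CPython's actual implementation: the ListNode class and its alive counter are hypothetical, forward links are strong and backward links are weak, so no reference cycle is ever created.

```python
import gc
import weakref


class ListNode:
    """Hypothetical list node: `next` is a strong link, `prev` is a weak one."""

    alive = 0  # number of nodes that currently exist

    def __init__(self, value):
        self.value = value
        self.next = None
        self._prev = None
        ListNode.alive += 1

    def __del__(self):
        ListNode.alive -= 1

    @property
    def prev(self):
        return None if self._prev is None else self._prev()

    def append(self, node):
        self.next = node
        node._prev = weakref.ref(self)  # the back link does not keep self alive
        return node


gc.disable()  # show that plain reference counting is enough here

head = ListNode(1)
tail = head.append(ListNode(2)).append(ListNode(3))
back_link_value = tail.prev.value  # walking backwards still works
alive_before = ListNode.alive

del head, tail
alive_after = ListNode.alive  # every node was freed without the cycle collector

gc.enable()
print(back_link_value, alive_before, alive_after)
```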

Conclusion

weakref is a fairly obscure, but at times very useful tool that you should keep in your toolbox. It can be very helpful when implementing caches or data structures that have reference loops in them, such as doubly linked lists.

With that said, one should be aware of the limits of weakref support. Everything said here and in the docs is CPython-specific, and different Python implementations may have different weakref behavior. Also, many of the built-in types don't support weak references, such as list, tuple or int.
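
A quick way to find out whether a given object supports weak references is simply to try creating one. The sketch below does that for a few types on CPython; supports_weakref and the small example classes are hypothetical helpers written for this illustration.

```python
import weakref


def supports_weakref(obj):
    """Return True if a weak reference to obj can be created."""
    try:
        weakref.ref(obj)
    except TypeError:
        return False
    return True


class MyList(list):
    """Subclassing list is enough to gain weak reference support."""


class Slotted:
    __slots__ = ("x",)  # no __weakref__ slot, so no weak references


class SlottedWithWeakref:
    __slots__ = ("x", "__weakref__")  # opting back in explicitly


results = {
    "list": supports_weakref([1, 2]),
    "list subclass": supports_weakref(MyList([1, 2])),
    "int": supports_weakref(10),
    "tuple": supports_weakref((1, 2)),
    "set": supports_weakref({1, 2}),
    "__slots__ without __weakref__": supports_weakref(Slotted()),
    "__slots__ with __weakref__": supports_weakref(SlottedWithWeakref()),
}

for label, supported in results.items():
    print(f"{label}: {supported}")
```

Note the __slots__ case in particular: a class that defines __slots__ loses weak reference support unless "__weakref__" is listed among the slots.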