惯性聚合 高效追踪和阅读你感兴趣的博客、新闻、科技资讯
阅读原文 在惯性聚合中打开

推荐订阅源

F
Full Disclosure
WordPress大学
WordPress大学
小众软件
小众软件
Cloudbric
Cloudbric
AWS News Blog
AWS News Blog
腾讯CDC
量子位
人人都是产品经理
人人都是产品经理
大猫的无限游戏
大猫的无限游戏
freeCodeCamp Programming Tutorials: Python, JavaScript, Git & More
V
Vulnerabilities – Threatpost
Scott Helme
Scott Helme
Hugging Face - Blog
Hugging Face - Blog
博客园_首页
C
CXSECURITY Database RSS Feed - CXSecurity.com
The Hacker News
The Hacker News
奇客Solidot–传递最新科技情报
奇客Solidot–传递最新科技情报
IT之家
IT之家
Jina AI
Jina AI
Attack and Defense Labs
Attack and Defense Labs
S
SegmentFault 最新的问题
Simon Willison's Weblog
Simon Willison's Weblog
The Cloudflare Blog
阮一峰的网络日志
阮一峰的网络日志
T
Tailwind CSS Blog
Last Week in AI
Last Week in AI
博客园 - 【当耐特】
Google Online Security Blog
Google Online Security Blog
美团技术团队
OSCHINA 社区最新新闻
OSCHINA 社区最新新闻
V
Visual Studio Blog
罗磊的独立博客
L
LINUX DO - 最新话题
博客园 - Franky
博客园 - 叶小钗
Apple Machine Learning Research
Apple Machine Learning Research
The Last Watchdog
The Last Watchdog
J
Java Code Geeks
AI
AI
C
Cisco Blogs
酷 壳 – CoolShell
酷 壳 – CoolShell
C
Cyber Attacks, Cyber Crime and Cyber Security
Cisco Talos Blog
Cisco Talos Blog
博客园 - 三生石上(FineUI控件)
雷峰网
雷峰网
Help Net Security
Help Net Security
钛媒体:引领未来商业与生活新知
钛媒体:引领未来商业与生活新知
云风的 BLOG
云风的 BLOG
I
Intezer
S
Securelist

Martin Heinz's Blog

A Guide to Python's Weak References Using weakref Module Recent Docker BuildKit Features You're Missing Out On Modern Git Commands and Features You Should Be Using Everything You Can Do with Python's textwrap Module Monitoring Indoor Air Quality with Prometheus, Grafana and a CO2 Sensor Everything You Can Do with Python's bisect Module You Don't Need a Dedicated Cache Service - PostgreSQL as a Cache A Collection of Docker Images To Solve All Your Debugging Needs Weird Python "Features" That Might Catch You By Surprise Lessons Learned From Writing 100 Articles Debugging Crashes and Deadlocks in Python using PyStack Goodbye etcd, Hello PostgreSQL: Running Kubernetes with an SQL Database Remote Interactive Debugging of Python Applications Running in Kubernetes The Right Way to Run Shell Commands From Python Python's Missing Batteries: Essential Libraries You're Missing Out On Kubernetes-Native Synthetic Monitoring with Kuberhealthy Make Your CLI Demos a Breeze with Zero Stress and Zero Mistakes Reduce - The Power of a Single Python Function Why I Will Never Use Alpine Linux Ever Again Cgroups - Deep Dive into Resource Management in Kubernetes Dictionary Dispatch Pattern in Python Boost Your Python Application Performance using Continuous Profiling Lazy Evaluation Using Recursive Python Generators Python Magic Methods You Haven't Heard About Getting Started with Mastodon API in Python Backup-and-Restore of Containers with Kubernetes Checkpointing API Getting Started with Google APIs in Python Python CLI Tricks That Don't Require Any Code Whatsoever All The Ways To Introspect Python Objects at Runtime What is Python's "self" Argument, Anyway? Python List Comprehensions Are More Powerful Than You Might Think You Should Be Using Python's Walrus Operator - Here's Why Recipes and Tricks for Effective Structural Pattern Matching in Python It's Time to Say Goodbye to These Obsolete Python Libraries Advanced Features of Kubernetes' Horizontal Pod Autoscaler Data and System Visualization Tools That Will Boost Your Productivity Stop Messing with Kubernetes Finalizers Automate All the Boring Kubernetes Operations with Python End-to-End Monitoring with Grafana Cloud with Minimal Effort Bitly | bit.ly/3JLmSgA Bitly | bit.ly/3uETfbi Ultimate CI Pipeline for All of Your Python Projects Bitly | bit.ly/3M30D82 Bitly | bit.ly/3oMJ6qR Bitly | bit.ly/3IRD7IK Bitly | bit.ly/3A3B69t Profiling and Analyzing Performance of Python Programs Bitly | bit.ly/30uviIM Bitly | bit.ly/3E1X2mw Bitly | bit.ly/3Dv7JxP Bitly | bit.ly/3GG1BEz Bitly | bit.ly/3lLavs4 Bitly | bit.ly/39TqP3m Bitly | bit.ly/3A5Mpx8 Bitly | bit.ly/3kGwPl4 Bitly | bit.ly/3iHtulU Bitly | bit.ly/3xGjtKS Bitly | bit.ly/3h8DZg0 Bitly | bit.ly/2RQn1dG Bitly | bit.ly/3p2B5wW Bitly | bit.ly/3tULpb0 Bitly | bit.ly/2PHVudx Bitly | bit.ly/3uPtnb0 Bitly | bit.ly/3dg3QR9 Bitly | bit.ly/3qHtSkZ Bitly | bit.ly/3kIkTPr Bitly | bit.ly/3qlRAUN Bitly | bit.ly/3pCUJ26 Hardening Docker and Kubernetes with seccomp Bitly | bit.ly/34ZhIMt Bitly | bit.ly/3qSO7h0 Bitly | bit.ly/3muGLOk Bitly | bit.ly/35xN79v Bitly | bit.ly/3mLGshK Bitly | bit.ly/2IvkGQl Bitly | bit.ly/2Sk1KFK Bitly | bit.ly/3iCNIL6 Bitly | bit.ly/3beQPpy Saving Your Linux Machine from Certain Death New Features in Python 3.9 You Should Know About Deploy Any Python Project to Kubernetes Analyzing Docker Image Security Recursive SQL Queries with PostgreSQL Automating Every Aspect of Your Python Project Tour of Python Itertools Implementing 2D Physics in Javascript Ultimate Setup for Your Next Python Project Making Python Programs Blazingly Fast Security and Cryptography Mistakes You Are Probably Doing All The Time Going Serverless with OpenFaaS and Golang - Building Optimized Templates Going Serverless with OpenFaaS and Golang - The Ultimate Setup and Workflow Setting Up Swagger Docs for Golang API Building RESTful APIs in Golang Pytest Features, That You Need in Your (Testing) Life Setting up GitHub Package Registry with Docker and Golang Ultimate Setup for Your Next Golang Project Python Tips and Trick, You Haven't Already Seen, Part 2. Tricks for Postgres and Docker that will make your life easier Getting The Most Out of Reading Books - Reading The "Professional Way" Python Tips and Trick, You Haven't Already Seen
Real Multithreading is Coming to Python - Learn How You Can Use It Now
Martin · 2023-05-15 · via Martin Heinz's Blog

Python is 32 years old language, yet it still doesn't have proper, true parallelism/concurrency. This is going to change soon, thanks to introduction of a "Per-Interpreter GIL" (Global Interpreter Lock) which will land in Python 3.12. While release of Python 3.12 is some months away, the code is already there, so let's take an early peek at how we can use it to write truly concurrent Python code using sub-interpreters API.

Sub-Interpreters

Let's first explain how this "Per-Interpreter GIL" solves Python's lack of proper concurrency.

Simply put, GIL or Global Interpreter Lock is a mutex that allows only one thread to hold the control of the Python interpreter. This means that even if you create multiple threads in Python (e.g. using threading module) only one thread at the time will run.

With introduction of "Per-Interpreter GIL", individual Python interpreters don't share the same GIL anymore. This level of isolation allows each of these sub-interpreters to run really concurrently. This means, that we can bypass Python's concurrency limitations by spawning additional sub-interpreters, where each of them will have its own GIL (global state).

For more in-depth explanation see PEP 684 which describes this feature/change.

Setup

To use this new, bleeding edge feature, we need to install up-to-date Python version, that requires building from source:


# https://devguide.python.org/getting-started/setup-building/#unix-compiling
git clone https://github.com/python/cpython.git
cd cpython

./configure --enable-optimizations --prefix=$(pwd)/python-3.12
make -s -j2
./python
# Python 3.12.0a7+ (heads/main:22f3425c3d, May 10 2023, 12:52:07) [GCC 11.3.0] on linux
# Type "help", "copyright", "credits" or "license" for more information.

Where is it? (C-API)

We have the latest and greatest version installed, so how do we use the sub-interpreters? Can we simply import it? Well, unfortunately, not yet.

As pointed-out in PEP-684:

...this is an advanced feature meant for a narrow set of users of the C-API.

The features of Per-Interpreter GIL are - for now - only available using C-API, so there's no direct interface for Python developers. Such interface is expected to come with PEP 554, which - if accepted - is supposed to land in Python 3.13, until then we will have to hack our way to the sub-interpreter implementation.

So, while there is no documentation for it, or documented module we could import, there are bits and pieces in CPython codebase that show us a lot about how to use it.

We have 2 options here:

  • We can use _xxsubinterpreters module which is implemented in C, hence the weird name. Because it's implemented in C, you can't easily inspect the code (at least not in Python).
  • Or we can take advantage of CPython's test module which has sample Interpreter (and Channel) classes used for testing.

# Choose one of these:
import _xxsubinterpreters as interpreters
from test.support import interpreters

For the most part, in the following examples, we will be using the second option.

We've found the sub-interpreters, but we will also need to borrow some helper functions from Python's test module that we will use to pass code to sub-interpreter:


from textwrap import dedent
import os
# https://github.com/python/cpython/blob/
#   15665d896bae9c3d8b60bd7210ac1b7dc533b093/Lib/test/test__xxsubinterpreters.py#L75
def _captured_script(script):
    r, w = os.pipe()
    indented = script.replace('\n', '\n                ')
    wrapped = dedent(f"""
        import contextlib
        with open({w}, 'w', encoding="utf-8") as spipe:
            with contextlib.redirect_stdout(spipe):
                {indented}
        """)
    return wrapped, open(r, encoding="utf-8")


def _run_output(interp, request, channels=None):
    script, rpipe = _captured_script(request)
    with rpipe:
        interp.run(script, channels=channels)
        return rpipe.read()

Putting the interpreters module and the above helpers together, we can spawn our first sub-interpreter:


from test.support import interpreters

main = interpreters.get_main()
print(f"Main interpreter ID: {main}")
# Main interpreter ID: Interpreter(id=0, isolated=None)

interp = interpreters.create()

print(f"Sub-interpreter: {interp}")
# Sub-interpreter: Interpreter(id=1, isolated=True)

# https://github.com/python/cpython/blob/
#   15665d896bae9c3d8b60bd7210ac1b7dc533b093/Lib/test/test__xxsubinterpreters.py#L236
code = dedent("""
            from test.support import interpreters
            cur = interpreters.get_current()
            print(cur.id)
            """)

out = _run_output(interp, code)

print(f"All Interpreters: {interpreters.list_all()}")
# All Interpreters: [Interpreter(id=0, isolated=None), Interpreter(id=1, isolated=None)]
print(f"Output: {out}")  # Result of 'print(cur.id)'
# Output: 1

One way to spawn and run a new interpreter is to use the create function and then pass the interpreter to the _run_output helper function along with code we want to execute.

An easier way is to simply...


interp = interpreters.create()
interp.run(code)

...use the run method of an interpreter.

However, if we try to run either of the above 2 code snippets, we will receive the following error:


Fatal Python error: PyInterpreterState_Delete: remaining subinterpreters
Python runtime state: finalizing (tstate=0x000055b5926bf398)

To avoid it, we also need to clean up any dangling interpreters:


def cleanup_interpreters():
    for i in interpreters.list_all():
        if i.id == 0:  # main
            continue
        try:
            print(f"Cleaning up interpreter: {i}")
            i.close()
        except RuntimeError:
            pass  # already destroyed

cleanup_interpreters()
# Cleaning up interpreter: Interpreter(id=1, isolated=None)
# Cleaning up interpreter: Interpreter(id=2, isolated=None)

Threading

While running the code with above helper functions works, it might be more convenient to use familiar interface in threading module:


import threading

def run_in_thread():
    t = threading.Thread(target=interpreters.create)
    print(t)
    t.start()
    print(t)
    t.join()
    print(t)

run_in_thread()
run_in_thread()

# <Thread(Thread-1 (create), initial)>
# <Thread(Thread-1 (create), started 139772371633728)>
# <Thread(Thread-1 (create), stopped 139772371633728)>
# <Thread(Thread-2 (create), initial)>
# <Thread(Thread-2 (create), started 139772371633728)>
# <Thread(Thread-2 (create), stopped 139772371633728)>

Here we pass the interpreters.create function to the Thread which automatically spawns the new sub-interpreter inside a thread.

We can also combine the 2 approaches and pass the helper function to the threading.Thread:


import time

def run_in_thread():
    interp = interpreters.create(isolated=True)
    t = threading.Thread(target=_run_output, args=(interp, dedent("""
            import _xxsubinterpreters as _interpreters
            cur = _interpreters.get_current()

            import time
            time.sleep(2)
            # Can't print from here, won't bubble-up to main interpreter

            assert isinstance(cur, _interpreters.InterpreterID)
            """)))
    print(f"Created Thread: {t}")
    t.start()
    return t


t1 = run_in_thread()
print(f"First running Thread: {t1}")
t2 = run_in_thread()
print(f"Second running Thread: {t2}")
time.sleep(4)  # Need to sleep to give Threads time to complete
cleanup_interpreters()

Here we also demonstrate how to use the _xxsubinterpreters module instead of one in test.support. We also sleep for 2 seconds in each thread to simulate some "work". Notice, that we don't even bother calling join on the threads, we simply clean up the interpreters when they complete.

Channels

If we dig through the CPython test module a little more, we will also find that there is an implementation of RecvChannel and SendChannel classes which resemble the channels known from Golang. To use them:


# https://github.com/python/cpython/blob/
#   15665d896bae9c3d8b60bd7210ac1b7dc533b093/Lib/test/test_interpreters.py#L583
r, s = interpreters.create_channel()

print(f"Channel: {r}, {s}")
# Channel: RecvChannel(id=0), SendChannel(id=0)

orig = b'spam'
s.send_nowait(orig)
obj = r.recv()
print(f"Received: {obj}")
# Received: b'spam'

cleanup_interpreters()
# Need clean up, otherwise:

# free(): invalid pointer
# Aborted (core dumped)

This example shows how we can create a channel with receiver (r) and sender (s) end. We can then pass data to the sender using send_nowait and read it on the other side with recv function. This channel is really just another sub-interpreter - so same as before - we need to do clean up when we're done with it.

Digging Deeper

And finally, if we want to mess with or tweak the sub-interpreter options, which are generally set in C code, then we can use the code from test.support module, more specifically run_in_subinterp_with_config:


import test.support

def run_in_thread(script):
    test.support.run_in_subinterp_with_config(
        script,
        use_main_obmalloc=True,
        allow_fork=True,
        allow_exec=True,
        allow_threads=True,
        allow_daemon_threads=False,
        check_multi_interp_extensions=False,
        own_gil=True,
    )

code = dedent(f"""
            from test.support import interpreters
            cur = interpreters.get_current()
            print(cur)
            """)

run_in_thread(code)
# Interpreter(id=7, isolated=None)
run_in_thread(code)
# Interpreter(id=8, isolated=None)

This function is a Python API for this C function. It provides some sub-interpreter options such own_gil which specifies whether the sub-interpreter should have its own GIL.

Conclusion

I'm very happy to see that this change is finally coming to CPython and I want to highlight the amazing work and perseverance by Eric Snow, who made this happen.

With that said - as you could see here - the API isn't exactly easy to use, so unless you have C expertise or very urgent need for sub-interpreters, you might be better off waiting for proper support (hopefully) in Python 3.13. We have waited for many years, what's one more, right? Alternatively, you could try extrainterpreters project which provides a friendlier Python API to sub-interpreters.

While hard to use for average Python developer, I'm sure tool/library developers will put this to good use, and we will see performance improvements popping up in many libraries that can leverage sub-interpreters through the C-API.