惯性聚合 高效追踪和阅读你感兴趣的博客、新闻、科技资讯
阅读原文 在惯性聚合中打开

推荐订阅源

F
Full Disclosure
WordPress大学
WordPress大学
小众软件
小众软件
Cloudbric
Cloudbric
AWS News Blog
AWS News Blog
腾讯CDC
量子位
人人都是产品经理
人人都是产品经理
大猫的无限游戏
大猫的无限游戏
freeCodeCamp Programming Tutorials: Python, JavaScript, Git & More
V
Vulnerabilities – Threatpost
Scott Helme
Scott Helme
Hugging Face - Blog
Hugging Face - Blog
博客园_首页
C
CXSECURITY Database RSS Feed - CXSecurity.com
The Hacker News
The Hacker News
奇客Solidot–传递最新科技情报
奇客Solidot–传递最新科技情报
IT之家
IT之家
Jina AI
Jina AI
Attack and Defense Labs
Attack and Defense Labs
S
SegmentFault 最新的问题
Simon Willison's Weblog
Simon Willison's Weblog
The Cloudflare Blog
阮一峰的网络日志
阮一峰的网络日志
T
Tailwind CSS Blog
Last Week in AI
Last Week in AI
博客园 - 【当耐特】
Google Online Security Blog
Google Online Security Blog
美团技术团队
OSCHINA 社区最新新闻
OSCHINA 社区最新新闻
V
Visual Studio Blog
罗磊的独立博客
L
LINUX DO - 最新话题
博客园 - Franky
博客园 - 叶小钗
Apple Machine Learning Research
Apple Machine Learning Research
The Last Watchdog
The Last Watchdog
J
Java Code Geeks
AI
AI
C
Cisco Blogs
酷 壳 – CoolShell
酷 壳 – CoolShell
C
Cyber Attacks, Cyber Crime and Cyber Security
Cisco Talos Blog
Cisco Talos Blog
博客园 - 三生石上(FineUI控件)
雷峰网
雷峰网
Help Net Security
Help Net Security
钛媒体:引领未来商业与生活新知
钛媒体:引领未来商业与生活新知
云风的 BLOG
云风的 BLOG
I
Intezer
S
Securelist

Martin Heinz's Blog

A Guide to Python's Weak References Using weakref Module Recent Docker BuildKit Features You're Missing Out On Modern Git Commands and Features You Should Be Using Everything You Can Do with Python's textwrap Module Monitoring Indoor Air Quality with Prometheus, Grafana and a CO2 Sensor Everything You Can Do with Python's bisect Module You Don't Need a Dedicated Cache Service - PostgreSQL as a Cache A Collection of Docker Images To Solve All Your Debugging Needs Weird Python "Features" That Might Catch You By Surprise Lessons Learned From Writing 100 Articles Goodbye etcd, Hello PostgreSQL: Running Kubernetes with an SQL Database Remote Interactive Debugging of Python Applications Running in Kubernetes The Right Way to Run Shell Commands From Python Real Multithreading is Coming to Python - Learn How You Can Use It Now Python's Missing Batteries: Essential Libraries You're Missing Out On Kubernetes-Native Synthetic Monitoring with Kuberhealthy Make Your CLI Demos a Breeze with Zero Stress and Zero Mistakes Reduce - The Power of a Single Python Function Why I Will Never Use Alpine Linux Ever Again Cgroups - Deep Dive into Resource Management in Kubernetes Dictionary Dispatch Pattern in Python Boost Your Python Application Performance using Continuous Profiling Lazy Evaluation Using Recursive Python Generators Python Magic Methods You Haven't Heard About Getting Started with Mastodon API in Python Backup-and-Restore of Containers with Kubernetes Checkpointing API Getting Started with Google APIs in Python Python CLI Tricks That Don't Require Any Code Whatsoever All The Ways To Introspect Python Objects at Runtime What is Python's "self" Argument, Anyway? Python List Comprehensions Are More Powerful Than You Might Think You Should Be Using Python's Walrus Operator - Here's Why Recipes and Tricks for Effective Structural Pattern Matching in Python It's Time to Say Goodbye to These Obsolete Python Libraries Advanced Features of Kubernetes' Horizontal Pod Autoscaler Data and System Visualization Tools That Will Boost Your Productivity Stop Messing with Kubernetes Finalizers Automate All the Boring Kubernetes Operations with Python End-to-End Monitoring with Grafana Cloud with Minimal Effort Bitly | bit.ly/3JLmSgA Bitly | bit.ly/3uETfbi Ultimate CI Pipeline for All of Your Python Projects Bitly | bit.ly/3M30D82 Bitly | bit.ly/3oMJ6qR Bitly | bit.ly/3IRD7IK Bitly | bit.ly/3A3B69t Profiling and Analyzing Performance of Python Programs Bitly | bit.ly/30uviIM Bitly | bit.ly/3E1X2mw Bitly | bit.ly/3Dv7JxP Bitly | bit.ly/3GG1BEz Bitly | bit.ly/3lLavs4 Bitly | bit.ly/39TqP3m Bitly | bit.ly/3A5Mpx8 Bitly | bit.ly/3kGwPl4 Bitly | bit.ly/3iHtulU Bitly | bit.ly/3xGjtKS Bitly | bit.ly/3h8DZg0 Bitly | bit.ly/2RQn1dG Bitly | bit.ly/3p2B5wW Bitly | bit.ly/3tULpb0 Bitly | bit.ly/2PHVudx Bitly | bit.ly/3uPtnb0 Bitly | bit.ly/3dg3QR9 Bitly | bit.ly/3qHtSkZ Bitly | bit.ly/3kIkTPr Bitly | bit.ly/3qlRAUN Bitly | bit.ly/3pCUJ26 Hardening Docker and Kubernetes with seccomp Bitly | bit.ly/34ZhIMt Bitly | bit.ly/3qSO7h0 Bitly | bit.ly/3muGLOk Bitly | bit.ly/35xN79v Bitly | bit.ly/3mLGshK Bitly | bit.ly/2IvkGQl Bitly | bit.ly/2Sk1KFK Bitly | bit.ly/3iCNIL6 Bitly | bit.ly/3beQPpy Saving Your Linux Machine from Certain Death New Features in Python 3.9 You Should Know About Deploy Any Python Project to Kubernetes Analyzing Docker Image Security Recursive SQL Queries with PostgreSQL Automating Every Aspect of Your Python Project Tour of Python Itertools Implementing 2D Physics in Javascript Ultimate Setup for Your Next Python Project Making Python Programs Blazingly Fast Security and Cryptography Mistakes You Are Probably Doing All The Time Going Serverless with OpenFaaS and Golang - Building Optimized Templates Going Serverless with OpenFaaS and Golang - The Ultimate Setup and Workflow Setting Up Swagger Docs for Golang API Building RESTful APIs in Golang Pytest Features, That You Need in Your (Testing) Life Setting up GitHub Package Registry with Docker and Golang Ultimate Setup for Your Next Golang Project Python Tips and Trick, You Haven't Already Seen, Part 2. Tricks for Postgres and Docker that will make your life easier Getting The Most Out of Reading Books - Reading The "Professional Way" Python Tips and Trick, You Haven't Already Seen
Debugging Crashes and Deadlocks in Python using PyStack
Martin · 2023-07-18 · via Martin Heinz's Blog

There are certain bugs and issues that are very hard to troubleshoot. Just ask yourself, "How would I debug deadlock, segmentation fault, crashing application, or a hanging process?"

Now there's a tool to help you will all of that - PyStack is a powerful new debugger that - according to docs - "uses forbidden magic to let you inspect the stack frames of a running Python process or a Python core dump."

So, let's put that statement to a test and explore how to use PyStack to track down and troubleshoot all the above-mentioned issues and bugs...

Why?

There are many debugging tools out there, including interactive debuggers in your IDE, so you might be wondering, "Why is this even needed?"

While traditional debuggers or even simple print statements are often enough for debugging, there are bugs and issues that are much harder to track down and troubleshoot.

One such category are deadlocks or any application that gets stuck. When you have hanging process, you might ask yourself, "Is it doing something? Or is it stuck/dead/deadlocked?" and PyStack can help answer that.

Another category is hybrid applications - that is - applications that combine C/C++ and Python. That includes Python extension modules, but also libraries such as NumPy or TensorFlow which many of us use every day. Well, how would you debug NumPy crashing with segfault?

And then there are weird issues that happen in specific circumstances, e.g. under heavy load; after apps has been running for X hours, etc.

But aren't there tools for this already? Well, yes, but actually no.

Only real alternative is GDB which isn't exactly user-friendly, and it can't debug both Python and C/C++ at the same time.

There are also 2 Python tools austin and py-spy, those are however profilers and not debuggers.

Now, what is so special about PyStack and why it's superior to the above-mentioned tools? For a start, unlike GDB, PyStack only reads memory (optionally stops the process), and doesn't modify it. GDB can modify your code, which can be a big no-no.

It can also inspect all core dump files - which is a big statement, because core dumps can be wild. It's also able to prints local variables as part of stacktrace. And it automatically downloads debugging information (if not available) for the particular distro. That's just a sample of reasons why PyStack is very cool.

Setup

Before we start debugging, we need to do some setup:


sudo apt update
sudo apt install systemd-coredump python3-pip python3.10-venv

echo 0 | sudo tee /proc/sys/kernel/yama/ptrace_scope

python3 -m venv ./venv
source ./venv/bin/activate

pip install pystack pytest-pystack

One of the features of PyStack is ability to inspect core dumps, so we need to enable core dump generation in our system - the simplest way to achieve that is to install systemd-coredump. All the core dumps will then be automatically stored in /var/lib/systemd/coredump/. Also, to be able to trace a process, PyStack needs to send ptrace syscalls to the traced process, we can temporarily enable that with echo 0 | sudo tee /proc/sys/kernel/yama/ptrace_scope.

Finally, we need to install pystack itself, as well as the Pytest plugin for it.

Debugging

PyStack has 2 ways of debugging a program - either by attaching to a running process or by analyzing core dump of a crashed process.

Let's start with the former. We will use the following snippet that simply sleeps forever, simulating a hanging process:


# wait.py
from time import sleep

def wait():
    some_var = "data"
    other_var = [1, 2, 3]
    while True:
        sleep(5)
        print("Sleeping...")

wait()

We can run it in the background and then run pystack remote against its PID to see what's happening in the code:


nohup python wait.py &
# [1] 44000

pystack remote 44000 --locals --no-block
# Traceback for thread 44000 (python) [] (most recent call last):
#     (Python) File "/home/.../wait.py", line 18, in <module>
#         wait()
#     (Python) File "/home/.../wait.py", line 15, in wait
#         sleep(5)
#       Locals:
#         other_var: [1, 2, 3]
#         some_var: "data"

We already know that the program hangs on the sleep(5) call and the above output confirms it. It also shows us the local variables which can provide important context.

Here's also a screenshot, so that you can see nice color highlighting:

pystack remote

Unfortunately, the above example might not work on your Mac or Windows, because PyStack only supports Linux. It however works just fine in Docker!

Let's demonstrate that with another code snippet, this time actual deadlock of 2 threads:


import threading
import time

l1 = threading.Lock()
l2 = threading.Lock()

def f1(name):
    print('thread', name, 'about to lock l1')
    with l1:
        print('thread', name, 'has lock l1')
        time.sleep(0.3)
        print('thread', name, 'about to lock l2')
        with l2:
            print('thread', name, 'run into deadlock')

def f2(name):
    print('thread', name, 'about to lock l2')
    with l2:
        print('thread', name, 'has lock l2')
        time.sleep(0.3)
        print('thread', name, 'about to lock l1')
        with l1:
            print('thread', name, 'run into deadlock')


if __name__ == '__main__':
    t1 = threading.Thread(target=f1, args=['t1', ])
    t2 = threading.Thread(target=f2, args=['t2', ])

    t1.start()
    t2.start()
    t1.join()
    t2.join()

To make this into a container image, you can use the following Dockerfile:


FROM python:3.10
COPY deadlock.py .

RUN pip install pystack pytest-pystack

CMD python3 -u deadlock.py

To then build it, run it and debug it:


docker build -t python-pystack -f Dockerfile .
docker run --cap-add=SYS_PTRACE --name python-pystack --rm python-pystack

docker exec -it python-pystack /bin/bash
root@438137c1086c:/# ps aux

# USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
# root           1  0.0  0.0   2484   576 ?        Ss   15:20   0:00 /bin/sh -c python3 -u deadlock.py
# root           7  0.0  0.0 159348  7784 ?        Sl   15:20   0:00 python3 -u deadlock.py
# root          10  0.0  0.0   6056  3592 pts/0    Ss   15:20   0:00 /bin/bash
# root          16  0.0  0.0   8652  3204 pts/0    R+   15:20   0:00 ps aux

root@438137c1086c:/# pystack remote 7 --locals --no-block

When we run the container, we pass --cap-add=SYS_PTRACE to docker run - this gives the container privileges to trace a process. Next, we exec into it and find the process we want to trace. After that, we run pystack remote and get the following output:

Docker deadlock

Core Files

Sometimes we aren't so lucky to have the process running to be able to debug it live. Sometimes the application just crashes with nothing useful in logs. In those cases we might need to inspect the core dump of the process.

If you're unfamiliar with core files (core dumps), then here's definition:

The core dump refers to the recording of the state of the program, i.e. its resources in memory and processor.

Essentially, a snapshot of the process when it crashed - produced when the familiar message Segmentation fault (core dumped) occurs.

To analyze a core dump, we need to force Python to crash, for example by seg-faulting or aborting:


# crash.py
import os
import ctypes

def crash():
    some_var = "data"
    other_var = [1, 2, 3]
    print(some_var)
    ctypes.string_at(0)  # Segmentation Fault (core dumped)
    # os.abort()         # Aborted (core dumped)

crash()

If we then run this script, we will get the familiar message:


python crash.py
# Segmentation fault (core dumped)

cp /var/lib/systemd/coredump/core.python.1000.49...e5d.44204.1688657327000000.zst core.zst

unzstd core.zst
# core.zst            : 4292608 bytes

pystack core ./core --locals
# ...
pystack core ./core --native
# ...

And we can find the core file in the /var/lib/systemd/coredump/. It's in .zst format, which is a compressed archive. To uncompress it, we use unzstd. We can then analyze the core file with pystack core ..., which gives the following output:

pystack core

Notice that here we again see all the local variables, which is very cool because we're working with a crashed program, so you can't just call __repr__ on variables.

While debugging a live process is often more convenient, you can also force the process to crash with kill -SIGABRT <PID> which will generate a core file.

As was mentioned in the beginning, you're most likely to find use for PyStack when using libraries like NumPy or PyTorch that have modules written in C/C++. So, let's look at a real NumPy bug that causes a segfault:


# pip install numpy
from multiprocessing import shared_memory
import numpy as np

def np_array_from_shm_name(name, shape=(2,)):
    shm = shared_memory.SharedMemory(name=name)
    np_array = np.ndarray(shape, buffer=shm.buf, dtype=np.int32)
    np_array[:] = (0, 0)
    return np_array

if __name__ == "__main__":
    shm = shared_memory.SharedMemory(create=True, size=8)
    np_array = np_array_from_shm_name(shm.name)
    np_array[:] = (0, 0)  # segfaults
    shm.unlink()

This is a code snippet adapted from this GitHub issue. If you run it, you will get a segfault and analyzing the generated core file with PyStack gives much more information for the issue, which would be very hard to gather without it:

NumPy segfault

Pytest

During initial setup we installed pytest-pystack along with pystack itself, so let's put it to some use. This Pytest plugin allows us to automatically run pystack remote on the process running our tests in case the particular test exceeds given timeout.

For demonstration, will use following two test cases:


# examples_test.py
import time
import threading
import deadlock

def test_sleeping_test():
    print("start")
    time.sleep(6)
    print("finished")

def test_deadlock():
    t1 = threading.Thread(target=deadlock.f1, args=['t1', ])
    t2 = threading.Thread(target=deadlock.f2, args=['t2', ])

    t1.start()
    t2.start()
    t1.join()
    t2.join()

First one simply sleeps for a couple of seconds, while the other reuses the deadlock example from earlier. To run the test suite with PyStack plugin enabled, use:


pytest -s --pystack-threshold=2 --pystack-args='--locals' --pystack-output-file='./pystack.log'

This will execute the tests as usual, but if any of them exceeds 2 seconds, then PyStack will inspect the process and output the trace into specified file. I won't paste the generated file here, because it's quite long, but you can view it in this Gist.

Closing Thoughts

While PyStack is a tool with quite a narrow use case, and you won't need it when debugging normal issues, it can be invaluable when issue like segfault or deadlock actually arises.

Being able to figure out what a running program is up to or what a program was doing when it crashed is incredibly useful and there really aren't any other tools that can do it so well. Also, if you're already using a profiler (e.g. py-spy or austin) or other tools that inspect stack information, then you should give PyStack a shot because these tools complement each other nicely.