Remote Interactive Debugging of Python Applications Running in Kubernetes
Martin · 2023-06-20 · via Martin Heinz's Blog

Let's imagine a situation - you have multiple Python applications running on Kubernetes that interact with each other. There's a bug that you can't reproduce locally, but it surfaces every time you hit a particular API endpoint. If only you could attach to the remote running application processes, set breakpoints and debug them live... how easy it would be to troubleshoot the bug...

But you can! In this tutorial we will create a setup for remote debugging of Python applications running in Kubernetes, which will allow you to set breakpoints, step through code, and interactively debug your applications without any change to your code or deployment.

The Goal

Before we start debugging, let's first define what we want to achieve here. Obviously, we know that we want to remotely debug some Python applications, but in doing so, we also want to:

  • Avoid modifying the application code,
  • Avoid compromising application security,
  • Avoid redirecting traffic to a local machine - we want to debug the actual remote code,
  • Set breakpoints and step through the code,
  • Debug more than one container/Pod at the same time, because microservices don't exist in isolation,
  • Keep the setup simple(-ish) and streamlined.

Note: This article was inspired by the KubeCon talk Breakpoints in Your Pod: Interactively Debugging Kubernetes Applications, which focuses on Go applications, but the same rationale applies here.

Setup

To do any debugging, we first need to create a couple of applications and deploy them somewhere. For the purpose of this tutorial, we will use a minikube cluster:


minikube start --kubernetes-version=v1.26.3

We will also deploy 2 Python applications, so that we can demonstrate that it's possible to debug multiple containers at the same time. For your convenience, sample application code is available in this repository, where the applications have the following layout:


project-root/
├── app1/
│   ├── __init__.py
│   ├── Dockerfile
│   ├── main.py
│   └── requirements.txt
└── app2/
    ├── __init__.py
    ├── Dockerfile
    ├── main.py
    └── requirements.txt

We only really care about the code in the main.py files. For the first application we have:


# app1/main.py
from fastapi import FastAPI
import os
import requests

app = FastAPI()

API = os.environ.get("API", "")

@app.get("/")
def sample_endpoint():
    r = requests.get(f"{API}/api/test")
    return {"data": r.json()}

It's a trivial FastAPI application with a single endpoint (/), which sends a request to the second application and returns whatever it gets back. Speaking of which, here is the second application's code:


# app2/main.py
from fastapi import FastAPI

app = FastAPI()

@app.get("/api/test")
def test_api():
    return {"key": "some data"}

This one simply returns a JSON response from the /api/test endpoint, which is called by the first application. With this setup, a single request to the first app is enough to trigger breakpoints in both apps at the same time.

Additionally, to build these apps, we need a Dockerfile for each of them:


FROM python:3.11.4-slim-buster

WORKDIR /code

COPY ./requirements.txt /code/requirements.txt

RUN pip install --no-cache-dir --upgrade -r /code/requirements.txt
RUN pip install debugpy

COPY ./main.py ./__init__.py /code/app/

CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "5000"]

This is a basic setup for a FastAPI image based on the docs. The only change is the addition of RUN pip install debugpy, which we need for the debugger to work. If you want to implement this debugging setup in your existing applications, this is the only change you have to make to your codebase.

To then build and deploy these:


docker build -f app1/Dockerfile -t docker.io/martinheinz/python-debugging-app1:v1.0 app1
docker build -f app2/Dockerfile -t docker.io/martinheinz/python-debugging-app2:v1.0 app2

minikube image load docker.io/martinheinz/python-debugging-app1:v1.0
minikube image load docker.io/martinheinz/python-debugging-app2:v1.0

# ... or docker push ...

# Deploy to cluster
kubectl apply -f deployment.yaml

Here we use minikube image load ... to get the images into the cluster. If you're using a real cluster, you would push the images to a registry instead. As for deployment.yaml (available in the repository), it's a basic application deployment with a Deployment and a Service object for each of the 2 applications.
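Since deployment.yaml itself isn't shown here, the following is only a rough Python sketch of what "a Deployment and a Service per application" could look like, rendered as JSON (which kubectl apply -f also accepts). The function names, the single replica and the API value http://app2:5000 are assumptions made for illustration; the app=<name> label, the Service name and port 5000 are taken from the commands used elsewhere in this article. The real file in the repository may differ.

```python
import json
from typing import Optional


def build_manifests(name: str, image: str, port: int = 5000,
                    env: Optional[dict] = None) -> list:
    """Return a Deployment and a Service for one application as plain dicts."""
    labels = {"app": name}  # matches the `-l app=<name>` selector used later on
    container = {
        "name": name,  # the ephemeral container later targets this name
        "image": image,
        "ports": [{"containerPort": port}],
        "env": [{"name": k, "value": v} for k, v in (env or {}).items()],
    }
    deployment = {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": 1,  # assumption: one Pod per app keeps debugging simple
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [container]},
            },
        },
    }
    service = {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {"selector": labels, "ports": [{"port": port, "targetPort": port}]},
    }
    return [deployment, service]


def render(manifests: list) -> str:
    """Wrap the objects in a v1 List so they can be applied as one JSON file."""
    return json.dumps({"apiVersion": "v1", "kind": "List", "items": manifests}, indent=2)
```

For example, render(build_manifests("app1", "docker.io/martinheinz/python-debugging-app1:v1.0", env={"API": "http://app2:5000"})) produces a document that could be saved to a file and passed to kubectl apply -f.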

Finally, we can test whether the applications work:


# Terminal 1 (blocks while forwarding):
kubectl port-forward svc/app1 5000
# Terminal 2:
curl localhost:5000/
# {"data":{"key":"some data"}}

We forward the application port to our local machine and query it, which returns the expected response passed on by the second app.

Deploy Debugger

With that out of the way, we can move on to deploying the debugger. As stated earlier, we want to debug the applications without making any changes to them, so we will use a new-ish Kubernetes feature - ephemeral containers, which the docs describe as "a special type of container that runs temporarily in an existing Pod to accomplish user-initiated actions such as troubleshooting".

They're essentially temporary sidecar containers that can be injected into an existing Pod.

For our ephemeral debugging container we will use the following image:


# debugger.Dockerfile
FROM python:3.11.4-slim-buster
RUN apt-get update && apt-get install -y gdb
RUN pip install debugpy

ENV DEBUGPY_LOG_DIR=/logs

It's necessary to build the debugger image with (more-or-less) the same base image as the applications that will be debugged. It also has to include gdb, the GNU Project debugger, as well as debugpy. Additionally, we set the DEBUGPY_LOG_DIR environment variable, which tells the debugger to write logs to files in that directory in case we need to inspect/troubleshoot the debugger itself.

To build this image:


docker build -f debugger.Dockerfile -t docker.io/martinheinz/python-debugger:v1.0 .
minikube image load docker.io/martinheinz/python-debugger:v1.0

Next, we need to inject the ephemeral container into the application Pods:


APP1_POD=$(kubectl get -l=app=app1 pod --output=jsonpath='{.items[0].metadata.name}')
APP2_POD=$(kubectl get -l=app=app2 pod --output=jsonpath='{.items[0].metadata.name}')
./create-debug-container.sh default "$APP1_POD" app1
./create-debug-container.sh default "$APP2_POD" app2

We first find the Pod names using label selectors and then run a script that injects the following container into the Pods:


# ...
  spec:
    # ... the existing application container here...
    ephemeralContainers:
    - image: docker.io/martinheinz/python-debugger:v1.0
      name: debugger
      command:
      - sleep
      args:
      - infinity
      tty: true
      stdin: true
      securityContext:
        privileged: true
        capabilities:
          add:
          - SYS_PTRACE
        runAsNonRoot: false
        runAsUser: 0
        runAsGroup: 0
      targetContainerName: "app1"  # or app2

It specifies the target container (targetContainerName) to attach to, as well as a securityContext that gives it elevated privileges and an extra Linux capability (SYS_PTRACE), which it needs to be able to attach to the application process.
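To make the shape of that injected container easier to reuse, here is a minimal Python sketch that builds the same spec as a patch body. The function names are hypothetical and this is not the author's script; how the script actually submits the change isn't shown in this article. The sketch only assumes that the patch is sent to the Pod's ephemeralcontainers subresource, which is the API path Kubernetes provides for adding ephemeral containers.

```python
import json


def ephemeral_container_patch(target: str, image: str, name: str = "debugger") -> dict:
    """Build a patch body that adds the debugger as an ephemeral container."""
    container = {
        "name": name,
        "image": image,
        "command": ["sleep"],
        "args": ["infinity"],  # keep the container alive until we exec into it
        "tty": True,
        "stdin": True,
        "targetContainerName": target,  # the application container to attach to
        "securityContext": {
            "privileged": True,
            "capabilities": {"add": ["SYS_PTRACE"]},  # needed to attach to a process
            "runAsNonRoot": False,
            "runAsUser": 0,
            "runAsGroup": 0,
        },
    }
    return {"spec": {"ephemeralContainers": [container]}}


def ephemeral_containers_path(namespace: str, pod: str) -> str:
    """API path of the Pod subresource that accepts ephemeral containers."""
    return f"/api/v1/namespaces/{namespace}/pods/{pod}/ephemeralcontainers"
```

Calling json.dumps(ephemeral_container_patch("app1", "docker.io/martinheinz/python-debugger:v1.0")) gives the JSON equivalent of the YAML fragment above.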

After the container is injected, the script also runs:


kubectl exec "$POD_NAME" --container=debugger -- python -m debugpy --listen 0.0.0.0:5678 --pid 1

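This starts the debugger on port 5678 and attaches it to PID 1, which is the process ID of the actual application. The application's PID 1 is visible from the debugger container because an ephemeral container with targetContainerName set runs in the same process namespace as its target.

I've omitted the full script here for clarity, but you can find it in the repository here.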
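If you would rather drive these steps from Python than from a shell script, the two kubectl invocations shown in this section could be assembled along these lines. This is only a sketch with hypothetical helper names; it builds the argument lists and leaves running them (for example with subprocess.run) to the caller.

```python
import shlex


def find_pod_cmd(app_label: str, namespace: str = "default") -> list[str]:
    """kubectl command that prints the name of the first Pod labelled app=<app_label>."""
    return [
        "kubectl", "get", "pod",
        "--namespace", namespace,
        "-l", f"app={app_label}",
        "--output", "jsonpath={.items[0].metadata.name}",
    ]


def attach_cmd(pod: str, namespace: str = "default", container: str = "debugger",
               port: int = 5678, pid: int = 1) -> list[str]:
    """kubectl command that starts debugpy in the debugger container and attaches to a PID."""
    return [
        "kubectl", "exec", pod,
        "--namespace", namespace,
        f"--container={container}",
        "--", "python", "-m", "debugpy",
        "--listen", f"0.0.0.0:{port}",
        "--pid", str(pid),
    ]


def as_shell(cmd: list[str]) -> str:
    """Render an argument list as a copy-pasteable shell command."""
    return shlex.join(cmd)
```

Passing lists rather than a single string to subprocess.run avoids shell quoting issues with the jsonpath expression.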

Finally, to be able to access the debugger as well as the application endpoint, we start port forwarding:


# Each port-forward blocks, so run them in separate terminals
kubectl port-forward "$APP1_POD" 5000 5678
kubectl port-forward "$APP2_POD" 5679:5678

For the first application we forward both the application port (5000), where we will query its endpoint, and port 5678, where the debugger listens for connections. For the second app, we only need to forward the debugger port, this time mapping it from 5678 (in the container) to 5679 (on the local machine), because local port 5678 is already taken by the first app.
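Before connecting an IDE, it can be handy to confirm that the forwards are actually up. The sketch below (helper names are hypothetical) simply tries to open a TCP connection to each local port. Note that it only tells you that something is accepting connections locally, i.e. that kubectl port-forward is running; it does not prove that debugpy is listening inside the Pod.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts
        return False


def check_forwards(ports, host: str = "127.0.0.1") -> dict:
    """Map each local port to whether it currently accepts connections."""
    return {port: port_is_open(host, port) for port in ports}
```

With the forwards from this section running, check_forwards([5000, 5678, 5679]) should report True for all three ports.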

Debugging

With the debugger waiting for connections, all that's left to do is connect. For that we need a run/debug configuration in VS Code:


{
  "version": "0.2.0",
  "configurations": [
    {
      "name": "Python: Remote Attach App 1",
      "type": "python",
      "request": "attach",
      "connect": {
        "host": "127.0.0.1",
        "port": 5678
      },
      "pathMappings": [
        {
          "localRoot": "${workspaceFolder}/app1",
          "remoteRoot": "/code/app/"
        }
      ],
      "justMyCode": true
    },
    {
      "name": "Python: Remote Attach App 2",
      "type": "python",
      "request": "attach",
      "connect": {
        "host": "127.0.0.1",
        "port": 5679
      },
      "pathMappings": [
        {
          "localRoot": "${workspaceFolder}/app2",
          "remoteRoot": "/code/app/"
        }
      ],
      "justMyCode": true
    }
  ]
}

This configuration resides in the .vscode/launch.json file. The important parts are the connect.port values, which specify the local ports that we're forwarding. Also notice the localRoot and remoteRoot values - the former specifies the local code directory, while the latter is the directory to which the application code got copied during the image build.
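With more than two services, writing these near-identical entries by hand gets tedious. As a rough sketch (the helper names are made up for illustration), the same structure can be generated from a list of app names, local debugger ports and source directories, assuming every image copies its code to /code/app/ as in the Dockerfile above:

```python
import json


def attach_config(name: str, port: int, local_dir: str,
                  remote_root: str = "/code/app/") -> dict:
    """One VS Code 'attach' configuration for a forwarded debugger port."""
    return {
        "name": f"Python: Remote Attach {name}",
        "type": "python",
        "request": "attach",
        "connect": {"host": "127.0.0.1", "port": port},
        "pathMappings": [
            {
                # ${workspaceFolder} is left literal; VS Code expands it
                "localRoot": f"${{workspaceFolder}}/{local_dir}",
                "remoteRoot": remote_root,
            }
        ],
        "justMyCode": True,
    }


def launch_json(apps) -> str:
    """Render launch.json content for (name, local_port, local_dir) tuples."""
    configs = [attach_config(name, port, local_dir) for name, port, local_dir in apps]
    return json.dumps({"version": "0.2.0", "configurations": configs}, indent=2)
```

For the two apps in this tutorial, launch_json([("App 1", 5678, "app1"), ("App 2", 5679, "app2")]) yields the same configuration as shown above.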

Now it's time to start the debugging session(s). In VS Code, select Run and Debug and start both debug configurations:

[Image: Run and Debug]

Now we can set breakpoints anywhere in the code and trigger them with curl localhost:5000:

[Image: Breakpoint]

And we have a hit! We have successfully hit a breakpoint in remote code and can debug it locally. If we now step through the code, we will see that we can hit breakpoints in the second application as well.

Note: If you're trying this with your own application and requests to it hang (e.g. with a Flask app), that's most likely due to the debugger blocking the application process. The only solution I found is switching to a different web server.

Conclusion

At first glance, this might seem a little complicated to set up, but you really only need to run the script that injects the ephemeral container and forward the debugger ports, which only takes a couple of seconds.

This then gives you proper remote debugging capabilities without having to modify your application code, without re-deploying it, and without having to reproduce the issue locally, while also allowing you to debug multiple containers/applications at the same time.

Also, while this demonstrates how to debug in VS Code, it should also work in PyCharm, which uses pydevd, the library underlying debugpy. You will, however, need the "Professional" edition/license (see docs).