

The Right Way to Run Shell Commands From Python
Martin · 2023-06-06 · via Martin Heinz's Blog

Python is a popular choice for automating anything and everything, including system administration tasks and tasks that require running other programs or interacting with the operating system. There are, however, many ways to achieve this in Python, most of which are arguably bad.

So, in this article we will look at all the options you have in Python for running other processes - the bad; the good; and most importantly, the right way to do it.

The Options

Python has way too many builtin options for interfacing with other programs, some of them better, some of them worse, and honestly I don't like any of them. Let's quickly glance over each option and see when (if ever) it makes sense to use the particular module.

Native Tools

A general rule of thumb is to use native functions instead of directly calling other programs or OS commands. So, first let's look at the native Python options:

  • pathlib - If you need to create or delete a file or directory, check if a file exists, change permissions, etc., there's absolutely no reason to run system commands. Just use pathlib, it has everything you need. When you start using pathlib, you will also realise that you can forget about other Python modules, such as glob or os.path.
  • tempfile - Similarly, if you need a temporary file, just use the tempfile module; don't mess with /tmp manually.
  • shutil - pathlib should satisfy most of your file-related needs in Python, but if you need to, for example, copy, move, chown, run which, or create an archive, then you should turn to shutil.
  • signal - in case you need to use signal handlers.
  • syslog - for an interface to Unix syslog.
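
As a rough illustration of the first three modules in the list, here is a minimal sketch that does typical "shell-like" file work without spawning a single process. The file and directory names are made up for the example, and the archive format is an arbitrary choice.

```python
import shutil
import tempfile
from pathlib import Path

# tempfile: work inside a throwaway directory instead of touching /tmp by hand.
with tempfile.TemporaryDirectory() as tmp:
    workdir = Path(tmp)

    # pathlib: create a directory and a file, check existence, change permissions.
    data_dir = workdir / "data"
    data_dir.mkdir()
    report = data_dir / "report.txt"  # hypothetical file name
    report.write_text("hello\n", encoding="utf-8")
    report.chmod(0o600)
    print(report.exists())  # True

    # pathlib also covers globbing, so no separate 'glob' module is needed.
    print([p.name for p in data_dir.glob("*.txt")])  # ['report.txt']

    # shutil: copy a file and build an archive of the directory.
    backup = shutil.copy(report, workdir / "report.bak")
    archive = shutil.make_archive(str(workdir / "data_backup"), "tar", root_dir=data_dir)
    print(Path(backup).name, Path(archive).name)

    # shutil.which is the native counterpart of the 'which' command;
    # it returns None when the program is not on PATH.
    print(shutil.which("python3"))
```

Everything is cleaned up automatically when the with block exits, which is the main reason to prefer tempfile over hand-rolled paths under /tmp.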

If none of the above builtin options satisfy your needs, only then does it make sense to start interacting with the OS or other programs directly...

OS Module

Let's start with the worst option, the os module. It provides low-level functions for interacting with the OS, many of which have been superseded by functions in other modules.

If you simply wanted to call some other program, you could use os.system function, but you shouldn't. I don't even want to give you an example, because you simply should not use it.

While os should not be your first choice, there are a couple of functions that you might find useful:


import os

print(os.getenv('PATH'))
# /home/martin/.local/bin:/usr/local/sbin:/usr/local/bin:...
print(os.uname())
# posix.uname_result(sysname='Linux', nodename='...', release='...', version='...', machine='x86_64')
print(os.times())
# posix.times_result(user=0.01, system=0.0, children_user=0.0, children_system=0.0, elapsed=1740.63)
print(os.cpu_count())
# 16
print(os.getloadavg())
# (2.021484375, 2.35595703125, 2.04052734375)
old_umask = os.umask(0o022)
# Do stuff with files...
os.umask(old_umask)  # restore old umask

# Only if you need better random numbers than pseudo-random numbers from 'random' module:
from base64 import b64encode

random_bytes = os.urandom(64)
print(b64encode(random_bytes).decode('utf-8'))
# C2F3kHjdzxcP7461ETRj/YZredUf+NH...hxz9MXXHJNfo5nXVH7e5olqLwhahqFCe/mzLQ==

Apart from the functions shown above, there are also functions for creating file descriptors and pipes, opening a PTY, chroot, chmod, mkdir, kill, stat, and more, but I'd like to discourage you from using them as there are better options. There's even a section in the subprocess docs that shows how to replace os functions with the subprocess module, so don't even think about using os.popen, os.spawn or os.system.
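
To make that replacement concrete, here is a minimal sketch of what an os.system or os.popen call looks like when rewritten with subprocess.run. Using the current interpreter (sys.executable) as the child program is an assumption made only so the snippet runs anywhere; substitute your own command.

```python
import subprocess
import sys

# Instead of:  os.system('some command')      -> gives you only an exit status
# Instead of:  os.popen('some command').read() -> gives you only the output
# subprocess.run gives you both, without going through a shell.

# Hypothetical child command: the current interpreter printing one line.
cmd = [sys.executable, "-c", "print('hello from child')"]

result = subprocess.run(cmd, capture_output=True, text=True, check=True)
print(result.returncode)      # 0
print(result.stdout.strip())  # hello from child
```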

The same goes for using the os module for file/path operations - please don't. The pathlib docs have a whole section on how to use pathlib instead of os.path and other path-related functions.
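
A few of the most common os.path calls and their pathlib counterparts, as a quick sketch. The path is hypothetical and never touched on disk, which is why the example uses PurePosixPath; in real code you would normally use Path, which adds filesystem methods such as exists() and is_file().

```python
from pathlib import PurePosixPath

p = PurePosixPath("/var/log/app/server.log.gz")  # hypothetical path

print(p.name)                   # os.path.basename(path)     -> server.log.gz
print(p.parent)                 # os.path.dirname(path)      -> /var/log/app
print(p.suffix)                 # os.path.splitext(path)[1]  -> .gz
print(p.parent / "other.log")   # os.path.join(dirname, 'other.log') -> /var/log/app/other.log
print(p.is_absolute())          # os.path.isabs(path)        -> True
```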

Most of the remaining functions in the os module are a direct interface to the OS (or C language) API, e.g. os.dup, os.splice, os.mkfifo, os.execv, os.fork, etc. If you need to use all of those, then I'm not sure whether Python is the right language for the task...

Subprocess Module

The second - slightly better - option that we have in Python is the subprocess module:


import subprocess

p = subprocess.run('ls -la', shell=True, check=True, capture_output=True, encoding='utf-8')

# 'p' is instance of 'CompletedProcess(args='ls -la', returncode=0)'
print(f'Command {p.args} exited with {p.returncode} code, output: \n{p.stdout}')
# Command ls -la exited with 0 code

# total 36
# drwxrwxr-x  2 martin martin  4096 apr 22 12:53 .
# drwxrwxr-x 42 martin martin 20480 apr 22 11:01 ..
# ...

As stated in docs:

The recommended approach to invoking subprocesses is to use the run() function for all use cases it can handle.

In most cases it should be enough for you to use subprocess.run, passing in kwargs to alter its behavior, e.g. shell=True allows you to pass the command as a single string that is interpreted by the shell, check=True causes it to throw an exception if the exit code is not 0, and capture_output=True populates the stdout and stderr attributes.
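
The effect of these kwargs is easiest to see side by side. The sketch below passes the command as a list, so no shell is involved, and again uses the current interpreter as a stand-in child program (an assumption for portability, not something the kwargs require).

```python
import subprocess
import sys

# capture_output=True fills in both stdout and stderr on the result object.
child_code = "import sys; print('out'); print('err', file=sys.stderr)"
ok = subprocess.run(
    [sys.executable, "-c", child_code],
    capture_output=True,
    encoding="utf-8",
    check=True,
)
print(ok.stdout.strip())  # out
print(ok.stderr.strip())  # err

# Without check=True a failing command raises nothing;
# the failure is only visible through returncode.
failed = subprocess.run([sys.executable, "-c", "import sys; sys.exit(3)"])
print(failed.returncode)  # 3
```

Passing a list also means arguments containing spaces or shell metacharacters reach the child unchanged, which is why shell=True is best reserved for cases where you really need shell features.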

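While subprocess.run() is the recommended way to invoke processes, the module has other, older functions that you rarely need: call, check_call, check_output, getstatusoutput, getoutput. They aren't formally deprecated (the docs list them as the older high-level API and legacy shell invocation functions), but run covers what they do. Generally, you should use only run and Popen: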


with subprocess.Popen(['ls', '-la'], stdout=subprocess.PIPE, encoding='utf-8') as process:
    # process.wait(timeout=5)  # Returns only code: 0
    outs, errs = process.communicate(timeout=5)
    print(f'Command {process.args} exited with {process.returncode} code, output: \n{outs}')

# Pipes
import shlex
ls = shlex.split('ls -la')
awk = shlex.split("awk '{print $9}'")
ls_process = subprocess.Popen(ls, stdout=subprocess.PIPE)
awk_process = subprocess.Popen(awk, stdin=ls_process.stdout, stdout=subprocess.PIPE, encoding='utf-8')
ls_process.stdout.close()  # Allow 'ls' to receive SIGPIPE if 'awk' exits first

for line in awk_process.stdout:
    print(line.strip())
    # .
    # ..
    # examples.py
    # ...

The first example above shows the Popen equivalent of the previously shown subprocess.run. However, you should only use Popen when you need more flexibility than run provides, e.g. the second example shows how to pipe the output of one command into another, effectively running ls -la | awk '{print $9}'. You can also see that we used shlex.split, which is a convenience function that splits a string into a list of tokens that can be passed to Popen or run without using shell=True.
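
To see what shlex.split actually produces, here is a small sketch. It also shows shlex.join (available since Python 3.8), which goes the other way and is handy for logging a command in copy-pasteable form.

```python
import shlex

# Quoting is honoured the way a POSIX shell would, so the awk program
# stays a single argument even though it contains a space.
tokens = shlex.split("awk '{print $9}'")
print(tokens)  # ['awk', '{print $9}']

# shlex.join rebuilds a shell-safe string from a list of arguments.
print(shlex.join(tokens))  # awk '{print $9}'
```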

When using Popen you can additionally use terminate(), kill() and send_signal() for more interactions with the process.
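
A minimal sketch of that kind of interaction: start a long-running child, ask it to stop, and escalate if it doesn't. The sleeping child and the 5-second grace period are arbitrary choices for the example.

```python
import subprocess
import sys

# Hypothetical long-running child: the current interpreter sleeping for a minute.
process = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])

print(process.poll())  # None while the child is still running

process.terminate()  # sends SIGTERM on POSIX
try:
    process.wait(timeout=5)
except subprocess.TimeoutExpired:
    process.kill()  # escalate if the child ignored the polite request
    process.wait()

# On POSIX, a child ended by a signal gets a negative return code
# (the negated signal number).
print(process.returncode)
```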

In the previous examples we didn't really do any error handling, but there's a lot that can go wrong when running other processes. For simple-ish scripting, check=True is probably enough, as it raises CalledProcessError when the subprocess exits with a non-zero return code, so your program will fail fast and loud, which is good. If you also set the timeout argument, you can get a TimeoutExpired exception as well. All of the exceptions defined in the subprocess module inherit from SubprocessError, so if you want to catch them all, you can simply catch SubprocessError.
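
Put together, the exception handling described above might look like the sketch below. The run_checked helper is a hypothetical name introduced for this example, and the child commands are stand-ins that use the current interpreter.

```python
import subprocess
import sys


def run_checked(args, timeout):
    """Run a command and describe how it ended (hypothetical helper)."""
    try:
        done = subprocess.run(
            args, check=True, capture_output=True, text=True, timeout=timeout
        )
    except subprocess.CalledProcessError as e:
        return f"failed with exit code {e.returncode}"
    except subprocess.TimeoutExpired as e:
        return f"timed out after {e.timeout}s"
    except subprocess.SubprocessError as e:
        # Base class of the two exceptions above; listed last as a catch-all.
        return f"other subprocess error: {e}"
    return f"ok: {done.stdout.strip()}"


print(run_checked([sys.executable, "-c", "print('fine')"], timeout=10))
# ok: fine
print(run_checked([sys.executable, "-c", "raise SystemExit(2)"], timeout=10))
# failed with exit code 2
print(run_checked([sys.executable, "-c", "import time; time.sleep(5)"], timeout=0.5))
# timed out after 0.5s
```

One thing SubprocessError does not cover is a missing executable: that surfaces as FileNotFoundError, which is an OSError, so handle it separately if the program might not be installed.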

The Right Way

Zen of Python states that:

There should be one-- and preferably only one --obvious way to do it.

But so far we've seen quite a few ways, all in Python's builtin modules. Which one is the right one, though? In my opinion... none of them...

While I love Python's standard library, I believe one of its missing "batteries" is a better subprocess module.

If you find yourself orchestrating lots of other processes in Python, then you should at least take a look at the sh library:


# https://pypi.org/project/sh/
# pip install sh
import sh

# Run any command in $PATH...
print(sh.ls('-la'))

ls_cmd = sh.Command('ls')
print(ls_cmd('-la'))  # Explicit
# total 36
# drwxrwxr-x  2 martin martin  4096 apr  8 14:18 .
# drwxrwxr-x 41 martin martin 20480 apr  7 15:23 ..
# -rw-rw-r--  1 martin martin    30 apr  8 14:18 examples.py

# If command is not in PATH:
custom_cmd = sh.Command('/path/to/my/cmd')
custom_cmd('some', 'args')

with sh.contrib.sudo:
    # Do stuff using 'sudo'...
    ...

When we invoke sh.some_command, the sh library tries to look for a builtin shell command or a binary in your $PATH with that name. If it finds such a command, it will simply execute it for you. If the command is not in $PATH, then you can create an instance of Command and call it that way. In case you need to use sudo, you can use the sudo context manager from the contrib module. So simple and straightforward, right?

To write the output of a command to a file, you only need to provide the _out argument to the function:


sh.ip.address(_out='/tmp/ipaddr')
# Same as 'ip address > /tmp/ipaddr'

The above also shows how to invoke subcommands - just use dots.

And finally, you can also use pipes (|) by using the _in argument:


print(sh.awk('{print $9}', _in=sh.ls('-la')))
# Same as "ls -la | awk '{print $9}'"

print(sh.wc('-l', _in=sh.ls('.', '-1')))
# Same as "ls -1 | wc -l"

As for the error handling, you can simply watch for ErrorReturnCode or TimeoutException exceptions:


try:
    sh.cat('/tmp/doesnt/exist')
except sh.ErrorReturnCode as e:
    print(f'Command {e.full_cmd} exited with {e.exit_code}')
    # Command /usr/bin/cat /tmp/doesnt/exist exited with 1

curl = sh.curl('https://httpbin.org/delay/5', _bg=True)
try:
    curl.wait(timeout=3)
except sh.TimeoutException:
    print("Command timed out...")
    curl.kill()

Additionally, if your process is terminated by a signal, you will receive a SignalException. You can check for a specific signal with e.g. SignalException_SIGKILL (or _SIGTERM, _SIGSTOP, ...).

This library also has builtin logging support, all you have to do is turn it on:


import logging

# Turn on default logging:
logging.basicConfig(level=logging.INFO)
sh.ls('-la')
# INFO:sh.command:<Command '/usr/bin/ls -la', pid 1631463>: process started

# Change log level:
logging.getLogger('sh').setLevel(logging.DEBUG)
sh.ls('-la')
# INFO:sh.command:<Command '/usr/bin/ls -la', pid 1631661>: process started
# DEBUG:sh.command:<Command '/usr/bin/ls -la'>: starting process
# DEBUG:sh.command.process:<Command '/usr/bin/ls -la'>.<Process 1631666 ['/usr/bin/ls', '-la']>: started process
# ...

The above examples should cover most use cases, but if you're trying to do something more advanced or obscure, then do check out the tutorials or FAQ in the library docs, which have additional examples.

Closing Thoughts

I want to stress again - you should always prefer native Python functions over resorting to system commands. Also, always prefer third-party client libraries such as kubernetes-client or your cloud provider's SDK over running CLI commands directly. That - in my opinion - applies even if you're coming from a sysadmin background and are more comfortable with shell than Python. And finally, while Python is a great and much more robust language than shell, if you need to string together too many other programs/commands, maybe, just maybe, you should write a shell script instead.