惯性聚合 高效追踪和阅读你感兴趣的博客、新闻、科技资讯
阅读原文 在惯性聚合中打开

推荐订阅源

F
Full Disclosure
WordPress大学
WordPress大学
小众软件
小众软件
Cloudbric
Cloudbric
AWS News Blog
AWS News Blog
腾讯CDC
量子位
人人都是产品经理
人人都是产品经理
大猫的无限游戏
大猫的无限游戏
freeCodeCamp Programming Tutorials: Python, JavaScript, Git & More
V
Vulnerabilities – Threatpost
Scott Helme
Scott Helme
Hugging Face - Blog
Hugging Face - Blog
博客园_首页
C
CXSECURITY Database RSS Feed - CXSecurity.com
The Hacker News
The Hacker News
奇客Solidot–传递最新科技情报
奇客Solidot–传递最新科技情报
IT之家
IT之家
Jina AI
Jina AI
Attack and Defense Labs
Attack and Defense Labs
S
SegmentFault 最新的问题
Simon Willison's Weblog
Simon Willison's Weblog
The Cloudflare Blog
阮一峰的网络日志
阮一峰的网络日志
T
Tailwind CSS Blog
Last Week in AI
Last Week in AI
博客园 - 【当耐特】
Google Online Security Blog
Google Online Security Blog
美团技术团队
OSCHINA 社区最新新闻
OSCHINA 社区最新新闻
V
Visual Studio Blog
罗磊的独立博客
L
LINUX DO - 最新话题
博客园 - Franky
博客园 - 叶小钗
Apple Machine Learning Research
Apple Machine Learning Research
The Last Watchdog
The Last Watchdog
J
Java Code Geeks
AI
AI
C
Cisco Blogs
酷 壳 – CoolShell
酷 壳 – CoolShell
C
Cyber Attacks, Cyber Crime and Cyber Security
Cisco Talos Blog
Cisco Talos Blog
博客园 - 三生石上(FineUI控件)
雷峰网
雷峰网
Help Net Security
Help Net Security
钛媒体:引领未来商业与生活新知
钛媒体:引领未来商业与生活新知
云风的 BLOG
云风的 BLOG
I
Intezer
S
Securelist

Martin Heinz's Blog

A Guide to Python's Weak References Using weakref Module Recent Docker BuildKit Features You're Missing Out On Modern Git Commands and Features You Should Be Using Monitoring Indoor Air Quality with Prometheus, Grafana and a CO2 Sensor Everything You Can Do with Python's bisect Module You Don't Need a Dedicated Cache Service - PostgreSQL as a Cache A Collection of Docker Images To Solve All Your Debugging Needs Weird Python "Features" That Might Catch You By Surprise Lessons Learned From Writing 100 Articles Debugging Crashes and Deadlocks in Python using PyStack Goodbye etcd, Hello PostgreSQL: Running Kubernetes with an SQL Database Remote Interactive Debugging of Python Applications Running in Kubernetes The Right Way to Run Shell Commands From Python Real Multithreading is Coming to Python - Learn How You Can Use It Now Python's Missing Batteries: Essential Libraries You're Missing Out On Kubernetes-Native Synthetic Monitoring with Kuberhealthy Make Your CLI Demos a Breeze with Zero Stress and Zero Mistakes Reduce - The Power of a Single Python Function Why I Will Never Use Alpine Linux Ever Again Cgroups - Deep Dive into Resource Management in Kubernetes Dictionary Dispatch Pattern in Python Boost Your Python Application Performance using Continuous Profiling Lazy Evaluation Using Recursive Python Generators Python Magic Methods You Haven't Heard About Getting Started with Mastodon API in Python Backup-and-Restore of Containers with Kubernetes Checkpointing API Getting Started with Google APIs in Python Python CLI Tricks That Don't Require Any Code Whatsoever All The Ways To Introspect Python Objects at Runtime What is Python's "self" Argument, Anyway? Python List Comprehensions Are More Powerful Than You Might Think You Should Be Using Python's Walrus Operator - Here's Why Recipes and Tricks for Effective Structural Pattern Matching in Python It's Time to Say Goodbye to These Obsolete Python Libraries Advanced Features of Kubernetes' Horizontal Pod Autoscaler Data and System Visualization Tools That Will Boost Your Productivity Stop Messing with Kubernetes Finalizers Automate All the Boring Kubernetes Operations with Python End-to-End Monitoring with Grafana Cloud with Minimal Effort Bitly | bit.ly/3JLmSgA Bitly | bit.ly/3uETfbi Ultimate CI Pipeline for All of Your Python Projects Bitly | bit.ly/3M30D82 Bitly | bit.ly/3oMJ6qR Bitly | bit.ly/3IRD7IK Bitly | bit.ly/3A3B69t Profiling and Analyzing Performance of Python Programs Bitly | bit.ly/30uviIM Bitly | bit.ly/3E1X2mw Bitly | bit.ly/3Dv7JxP Bitly | bit.ly/3GG1BEz Bitly | bit.ly/3lLavs4 Bitly | bit.ly/39TqP3m Bitly | bit.ly/3A5Mpx8 Bitly | bit.ly/3kGwPl4 Bitly | bit.ly/3iHtulU Bitly | bit.ly/3xGjtKS Bitly | bit.ly/3h8DZg0 Bitly | bit.ly/2RQn1dG Bitly | bit.ly/3p2B5wW Bitly | bit.ly/3tULpb0 Bitly | bit.ly/2PHVudx Bitly | bit.ly/3uPtnb0 Bitly | bit.ly/3dg3QR9 Bitly | bit.ly/3qHtSkZ Bitly | bit.ly/3kIkTPr Bitly | bit.ly/3qlRAUN Bitly | bit.ly/3pCUJ26 Hardening Docker and Kubernetes with seccomp Bitly | bit.ly/34ZhIMt Bitly | bit.ly/3qSO7h0 Bitly | bit.ly/3muGLOk Bitly | bit.ly/35xN79v Bitly | bit.ly/3mLGshK Bitly | bit.ly/2IvkGQl Bitly | bit.ly/2Sk1KFK Bitly | bit.ly/3iCNIL6 Bitly | bit.ly/3beQPpy Saving Your Linux Machine from Certain Death New Features in Python 3.9 You Should Know About Deploy Any Python Project to Kubernetes Analyzing Docker Image Security Recursive SQL Queries with PostgreSQL Automating Every Aspect of Your Python Project Tour of Python Itertools Implementing 2D Physics in Javascript Ultimate Setup for Your Next Python Project Making Python Programs Blazingly Fast Security and Cryptography Mistakes You Are Probably Doing All The Time Going Serverless with OpenFaaS and Golang - Building Optimized Templates Going Serverless with OpenFaaS and Golang - The Ultimate Setup and Workflow Setting Up Swagger Docs for Golang API Building RESTful APIs in Golang Pytest Features, That You Need in Your (Testing) Life Setting up GitHub Package Registry with Docker and Golang Ultimate Setup for Your Next Golang Project Python Tips and Trick, You Haven't Already Seen, Part 2. Tricks for Postgres and Docker that will make your life easier Getting The Most Out of Reading Books - Reading The "Professional Way" Python Tips and Trick, You Haven't Already Seen
Everything You Can Do with Python's textwrap Module
Martin · 2024-02-08 · via Martin Heinz's Blog

Python has many options for formatting strings and text, including f-strings, format() function, templates and more. There's however one module that few people know about and it's called textwrap.

This module is specifically built to help you with line-wrapping, indentation, trimming and more, and in this article we will look at all the things you can use it for.

Shorten

Let's start with very simple, yet very useful function from the textwrap module, called shorten:


from textwrap import shorten

shorten("This is a long text or sentence.", width=10)
# 'This [...]'
shorten("This is a long text or sentence.", width=15)
# 'This is a [...]'
shorten("This is a long text or sentence.", width=15, placeholder=" <...>")
# 'This is a <...>'

As the name suggests, shorten allows us to trim text to certain length (width) if the specified string is too long. By default, the placeholder for the trimmed text is [...], but that can be overridden with the placeholder argument.

Wrap

A more interesting function from this module is wrap. The obvious use-case for it is to split long text into lines of same length, but there are more things we can do with it:


from textwrap import wrap
s = '1234567890'
wrap(s, 3)
# ['123', '456', '789', '0']

In this example we split a string into equal chunks which can be useful for batch processing, rather than just formatting.

Using this function however, has some caveats:


s = '12\n3  45678\t9\n0'
wrap(s, 3)
# ['12', '3  ', '456', '78', '9 0']
# the first ("12") element "includes" newline
# the 4th element ("78") "includes" tab
wrap(s, 3, drop_whitespace=False, tabsize=1)
# ['12 ', '3  ', '456', '78 ', '9 0']

You should be careful with whitespaces, when using wrap - above you can see the behaviour with newline, tab and space characters. You can see that the first element (12) "includes" newline, and 4th element (78) "includes" tab, those are however, dropped by default, therefore these elements only have 2 characters instead of 3.

We can specify the drop_whitespace keyword argument to preserve them and maintain the proper length of chunks.

It might be obvious, but wrap is also great for reformatting whole files to certain line width:


with open("some-text.md", "r", encoding="utf-8") as f:
    formatted = wrap(f.read(), width=80)  # List of lines
    formatted = fill(f.read(), width=80)  # Single string that includes line breaks
    # ... write it back

We can also use the fill function which is a shorthand for "\n".join(wrap(text, ...)). The difference between the 2 is that wrap will give us a list of lines that we would need to concatenate ourselves, and fill gives us a single string that's already joined using newlines.

TextWrapper

textwrap module also includes a more powerful version wrap function, which is a TextWrapper class:


import textwrap

w = textwrap.TextWrapper(width=120, placeholder=" <...>")
for s in list_of_strings:
    w.wrap(s)
    # ...

This class and its wrap method are great if we need to call wrap with the same parameters multiple times as shown above.

And while we're looking that the TextWrapper, let's also try out some more keyword arguments:


user = "John"
prefix = user + ": "
width = 50
wrapper = TextWrapper(initial_indent=prefix, width=width, subsequent_indent=" " * len(prefix))
messages = ["...", "...", "..."]
for m in messages:
    print(wrapper.fill(m))

# John: Lorem Ipsum is simply dummy text of the
#       printing and typesetting industry. Lorem
# John: Ipsum has been the industry's standard dummy
#       text ever since the 1500s, when an
# John: unknown printer took a galley of type and
#       scrambled it to make a type specimen

Here we can see the use of initial_indent and subsequent_indent for indenting the first line of paragraph and subsequent ones, respectively. There are couple more options, which you can find in docs.

Furthermore, because TextWrapper is a class, we can also extend it and completely override some of its methods:


from textwrap import TextWrapper

class DocumentWrapper(TextWrapper):

    def wrap(self, text):
        split_text = text.split('\n')
        lines = [line for par in split_text for line in TextWrapper.wrap(self, par)]
        return lines


text = """First line,

Another, much looooooonger line of text and/or sentence"""
d = DocumentWrapper(width=50)
print(d.fill(text))

# First line,
# Another, much looooooonger line of text and/or
# sentence

This is a nice example of changing the wrap method to preserve existing line breaks and to print them properly.

For a more complete example for handling multiple paragraphs with TextWrapper, check out this article.

Indentation

Finally, textwrap also includes two functions for indentation, first one being dedent:


# Ugly formatting:
multiline_string = """
First line
Second line
Third line
"""

from textwrap import dedent

multiline_string = """
                   First line
                   Second line
                   Third line
                   """

print(dedent(multiline_string))

# First line
# Second line
# Third line

# Notice the leading blank line...
# You can use:
multiline_string = """\
                   First line
                   Second line
                   Third line
                   """

# or
from inspect import cleandoc
cleandoc(multiline_string)
# 'First line\nSecond line\nThird line'

By default, multiline strings in Python honor any indentation used in the string, therefore we need to use the ugly formatting shown in the first variable in the snippet above. But we can use the dedent function to improve formatting - we simply indent the variable value however we like and then call dedent on it before using it.

Alternatively, we could also use inspect.cleandoc, which also strips the leading newline. This function however encodes the whitespaces as special characters (\n and \t), so you might need to reformat it again.

Naturally, when there is dedent, then there needs to be also indent function:


from textwrap import indent

indented = indent(text, "    ", lambda x: not text.splitlines()[0] in x)

We simply supply the text and the string that each line will be indented with (here just 4 spaces, we could - for example - use >>> to make it look like REPL). Additionally, we can supply a predicate that will decide whether the line should be indented or not. In the example above, the lambda function makes it so that first line of string (paragraph) is not indented.

Closing Thoughts

textwrap is a simple module with just a few functions/methods, but it once again shows that Python really comes with "batteries included" for things that don't necessarily need to be standard library, but they can save you so much time when you happen to need them.

If you happen to do a lot of text processing, then I also recommend checking out the whole docs section dedicated to working with text. There are many more modules and little functions that you didn't know you needed. 😉