惯性聚合 高效追踪和阅读你感兴趣的博客、新闻、科技资讯
阅读原文 在惯性聚合中打开

推荐订阅源

GbyAI
GbyAI
Y
Y Combinator Blog
Recent Announcements
Recent Announcements
D
Docker
Blog — PlanetScale
Blog — PlanetScale
罗磊的独立博客
美团技术团队
V
V2EX
Last Week in AI
Last Week in AI
D
DataBreaches.Net
T
The Blog of Author Tim Ferriss
宝玉的分享
宝玉的分享
Microsoft Security Blog
Microsoft Security Blog
Microsoft Azure Blog
Microsoft Azure Blog
人人都是产品经理
人人都是产品经理
M
MIT News - Artificial intelligence
P
Proofpoint News Feed
B
Blog RSS Feed
博客园_首页
B
Blog
博客园 - 叶小钗
I
InfoQ
WordPress大学
WordPress大学
L
LangChain Blog
Apple Machine Learning Research
Apple Machine Learning Research
freeCodeCamp Programming Tutorials: Python, JavaScript, Git & More
让小产品的独立变现更简单 - ezindie.com
让小产品的独立变现更简单 - ezindie.com
A
About on SuperTechFans
The GitHub Blog
The GitHub Blog
The Register - Security
The Register - Security
MyScale Blog
MyScale Blog
云风的 BLOG
云风的 BLOG
博客园 - 司徒正美
Latest news
Latest news
W
WeLiveSecurity
T
The Exploit Database - CXSecurity.com
OSCHINA 社区最新新闻
OSCHINA 社区最新新闻
aimingoo的专栏
aimingoo的专栏
小众软件
小众软件
Cyberwarzone
Cyberwarzone
Scott Helme
Scott Helme
D
Darknet – Hacking Tools, Hacker News & Cyber Security
C
CERT Recently Published Vulnerability Notes
C
CXSECURITY Database RSS Feed - CXSecurity.com
Recent Commits to openclaw:main
Recent Commits to openclaw:main
N
News and Events Feed by Topic
S
Secure Thoughts
The Hacker News
The Hacker News
Cyber Security Advisories - MS-ISAC
Cyber Security Advisories - MS-ISAC
Google DeepMind News
Google DeepMind News

John D. Cook

Distinguishing variables from parameters Silver Rectangles and the Ways of Kings Derivative equals inverse Who you gonna believe: Grok or the docs? Brace expansion tree When will the decimals in a/b repeat? Height of harmonic numbers Writing down harmonic numbers Hart’s theorem Incircles and Excircles of Pythagorean triangles Consecutive Pythagorean triangle sides The Star Trek lemma Lobachevsky’s integral formula Queens on a prime order board All pieces on a 6 by 5 board Formalizing a ring theorem with Lean 4 and Claude Partial fraction decomposition Three examples suffice Testing pentagonal numbers Quaternion Rotations, Claude, and Lean Writing Prolog with ChatGPT RSA munitions T-shirt Solving a chess puzzle with Claude and Prolog Formally proving a calculation with Claude and Lean Pulling on a thread Aitken acceleration before Aitken The Laplace limit A crank formula for π From Kepler to Bessel Mr. Bessel’s eponymous functions
Regular expressions that work “everywhere”
John · 2026-06-25 · via John D. Cook

The most frustrating aspect of regular expressions is that implementations vary. Features supported in one tool may not be supported at all in another tool, or they may be supported with slightly different syntax.

I learned regular expressions in the context Perl, a maximalist regex environment. This led to frustration when features I expect to work are missing [1]. One way around this is to use Perl analogs of other tools, but this is very non-standard. I want to be able to send colleagues and clients code that works out of the box.

As I mentioned in my post on computational survivalism, I occasionally need to work on computers that I cannot install software on. So a better approach is to identify a subset of regex features that work everywhere. The stricter your definition of “everywhere” the less this includes. The strictest subset would be

  • literals
  • character classes […]
  • the special characters . * ^ $

A more relaxed definition of “everywhere” would be the tools you most care about. Currently the tools I most want to use with regular expressions are sed, awk, grep, and Emacs.

Awk as lowest common denominator

If you use the Gnu versions of sed, awk, and grep, and use the -E option with sed and grep, then the list of common features is bigger. The regular expression features of of the three tools are similar, and awk’s features are supported in the other tools, with one exception: word boundaries in awk are \< and \> rather than \b and \B.

I wrote about Awk’s regex features here.

Emacs as the oddball

Emacs supports analogs of most of awk’s regex features. However, the characters

    + ? ( ) { } |

all require a backslash in front in order to act like the awk counterparts. Also, the analog of \s and \S in awk is \s- and \S- in Emacs.

Instead of meaning space or nonspace, \s and \S in Emacs begin a (negated) character class, and one of those classes is - for space. But there are many others. For example, \s. stands for a punctuation character and \S. stands for a non-punctuation character.

What works everywhere

So for my definition of “everywhere,” with the caveats mentioned above, the following features work everywhere. YMMV.

    .
    ^, $
    […], [^…]
    *
    \w, \W, \s, \S
    \1 - \9 backreferences
    \b \B
    ? + 
    | alternation
    {n,m} for counting matches
    (...) capturing

One footnote is that gawk supports backreferences in replacement strings but not in regular expressions per se.

[1] To some extent, basic Perl features work elsewhere and advanced features do not, depending on your idea of what is basic or advanced. I think of look-around features as advanced, and that tracks. But I think of \d for digits as basic, but that’s not supported in many regex flavors.