惯性聚合 高效追踪和阅读你感兴趣的博客、新闻、科技资讯
阅读原文 在惯性聚合中打开

推荐订阅源

Google DeepMind News
Google DeepMind News
H
Help Net Security
Cyber Security Advisories - MS-ISAC
Cyber Security Advisories - MS-ISAC
V
Vulnerabilities – Threatpost
MongoDB | Blog
MongoDB | Blog
Threat Intelligence Blog | Flashpoint
Threat Intelligence Blog | Flashpoint
A
Arctic Wolf
The GitHub Blog
The GitHub Blog
Security Latest
Security Latest
G
GRAHAM CLULEY
Cyberwarzone
Cyberwarzone
S
Schneier on Security
奇客Solidot–传递最新科技情报
奇客Solidot–传递最新科技情报
P
Privacy & Cybersecurity Law Blog
IT之家
IT之家
D
Darknet – Hacking Tools, Hacker News & Cyber Security
博客园 - 聂微东
T
Threat Research - Cisco Blogs
AWS News Blog
AWS News Blog
The Hacker News
The Hacker News
B
Blog RSS Feed
云风的 BLOG
云风的 BLOG
Scott Helme
Scott Helme
P
Proofpoint News Feed
T
The Exploit Database - CXSecurity.com
L
LangChain Blog
F
Full Disclosure
I
Intezer
V
V2EX
C
Cyber Attacks, Cyber Crime and Cyber Security
Cisco Talos Blog
Cisco Talos Blog
OSCHINA 社区最新新闻
OSCHINA 社区最新新闻
Spread Privacy
Spread Privacy
美团技术团队
Engineering at Meta
Engineering at Meta
C
Cybersecurity and Infrastructure Security Agency CISA
罗磊的独立博客
T
Tenable Blog
D
DataBreaches.Net
M
MIT News - Artificial intelligence
S
Securelist
C
CERT Recently Published Vulnerability Notes
Recent Announcements
Recent Announcements
Microsoft Azure Blog
Microsoft Azure Blog
cs.CL updates on arXiv.org
cs.CL updates on arXiv.org
让小产品的独立变现更简单 - ezindie.com
让小产品的独立变现更简单 - ezindie.com
NISL@THU
NISL@THU
The Register - Security
The Register - Security
L
LINUX DO - 热门话题
P
Palo Alto Networks Blog

John D. Cook

Distinguishing variables from parameters Silver Rectangles and the Ways of Kings Derivative equals inverse Who you gonna believe: Grok or the docs? Brace expansion tree When will the decimals in a/b repeat? Height of harmonic numbers Writing down harmonic numbers Hart’s theorem Incircles and Excircles of Pythagorean triangles Consecutive Pythagorean triangle sides The Star Trek lemma Lobachevsky’s integral formula Queens on a prime order board All pieces on a 6 by 5 board Formalizing a ring theorem with Lean 4 and Claude Partial fraction decomposition Three examples suffice Testing pentagonal numbers Quaternion Rotations, Claude, and Lean Writing Prolog with ChatGPT RSA munitions T-shirt Solving a chess puzzle with Claude and Prolog Formally proving a calculation with Claude and Lean Pulling on a thread Aitken acceleration before Aitken The Laplace limit A crank formula for π From Kepler to Bessel Mr. Bessel’s eponymous functions
Regular expressions that work “everywhere”
John · 2026-06-25 · via John D. Cook

The most frustrating aspect of regular expressions is that implementations vary. Features supported in one tool may not be supported at all in another tool, or they may be supported with slightly different syntax.

I learned regular expressions in the context Perl, a maximalist regex environment. This led to frustration when features I expect to work are missing [1]. One way around this is to use Perl analogs of other tools, but this is very non-standard. I want to be able to send colleagues and clients code that works out of the box.

As I mentioned in my post on computational survivalism, I occasionally need to work on computers that I cannot install software on. So a better approach is to identify a subset of regex features that work everywhere. The stricter your definition of “everywhere” the less this includes. The strictest subset would be

  • literals
  • character classes […]
  • the special characters . * ^ $

A more relaxed definition of “everywhere” would be the tools you most care about. Currently the tools I most want to use with regular expressions are sed, awk, grep, and Emacs.

Awk as lowest common denominator

If you use the Gnu versions of sed, awk, and grep, and use the -E option with sed and grep, then the list of common features is bigger. The regular expression features of of the three tools are similar, and awk’s features are supported in the other tools, with one exception: word boundaries in awk are \< and \> rather than \b and \B.

I wrote about Awk’s regex features here.

Emacs as the oddball

Emacs supports analogs of most of awk’s regex features. However, the characters

    + ? ( ) { } |

all require a backslash in front in order to act like the awk counterparts. Also, the analog of \s and \S in awk is \s- and \S- in Emacs.

Instead of meaning space or nonspace, \s and \S in Emacs begin a (negated) character class, and one of those classes is - for space. But there are many others. For example, \s. stands for a punctuation character and \S. stands for a non-punctuation character.

What works everywhere

So for my definition of “everywhere,” with the caveats mentioned above, the following features work everywhere. YMMV.

    .
    ^, $
    […], [^…]
    *
    \w, \W, \s, \S
    \1 - \9 backreferences
    \b \B
    ? + 
    | alternation
    {n,m} for counting matches
    (...) capturing

One footnote is that gawk supports backreferences in replacement strings but not in regular expressions per se.

[1] To some extent, basic Perl features work elsewhere and advanced features do not, depending on your idea of what is basic or advanced. I think of look-around features as advanced, and that tracks. But I think of \d for digits as basic, but that’s not supported in many regex flavors.