惯性聚合 高效追踪和阅读你感兴趣的博客、新闻、科技资讯
阅读原文 在惯性聚合中打开

推荐订阅源

S
SegmentFault 最新的问题
人人都是产品经理
人人都是产品经理
Blog — PlanetScale
Blog — PlanetScale
cs.AI updates on arXiv.org
cs.AI updates on arXiv.org
Cisco Talos Blog
Cisco Talos Blog
Spread Privacy
Spread Privacy
Scott Helme
Scott Helme
C
CXSECURITY Database RSS Feed - CXSecurity.com
S
Securelist
酷 壳 – CoolShell
酷 壳 – CoolShell
Threat Intelligence Blog | Flashpoint
Threat Intelligence Blog | Flashpoint
I
Intezer
博客园 - 叶小钗
让小产品的独立变现更简单 - ezindie.com
让小产品的独立变现更简单 - ezindie.com
雷峰网
雷峰网
量子位
Security Latest
Security Latest
P
Proofpoint News Feed
P
Privacy International News Feed
P
Palo Alto Networks Blog
D
DataBreaches.Net
大猫的无限游戏
大猫的无限游戏
www.infosecurity-magazine.com
www.infosecurity-magazine.com
Google Online Security Blog
Google Online Security Blog
Webroot Blog
Webroot Blog
云风的 BLOG
云风的 BLOG
N
Netflix TechBlog - Medium
Vercel News
Vercel News
博客园 - 【当耐特】
C
CERT Recently Published Vulnerability Notes
Hugging Face - Blog
Hugging Face - Blog
月光博客
月光博客
Hacker News - Newest:
Hacker News - Newest: "LLM"
K
Kaspersky official blog
K
KPMG report finds enterprise disconnect between AI and its ROI | CIO
CTFtime.org: upcoming CTF events
CTFtime.org: upcoming CTF events
Stack Overflow Blog
Stack Overflow Blog
AWS News Blog
AWS News Blog
博客园 - Franky
爱范儿
爱范儿
T
Tor Project blog
The GitHub Blog
The GitHub Blog
宝玉的分享
宝玉的分享
小众软件
小众软件
L
LINUX DO - 最新话题
Application and Cybersecurity Blog
Application and Cybersecurity Blog
W
WeLiveSecurity
SecWiki News
SecWiki News
L
LangChain Blog
I
InfoQ

Yi blog

Improving on Vi Improved - Yi Modularization - Yi Release 0.14 - Yi Dynamic and static compilation - Yi Prototypes - Encoding Object Oriented inheritance in Haskell Demo - Yi Configuration - Yi Overall Structure - Yi
Incremental parsing - Yi
2014-09-04 · via Yi blog

Posted on September 4, 2014 by Jeanphilippe Bernardy

In this post I will motivate Yi’s incremental parsing library and describe the main ideas behind it.

Why another parsing library?

Why bothering developing another parsing framework while there exist plenty already?

First, since we want to parse many languages, in many flavors, we want to be able to reuse pieces of grammars. Since we are using Haskell, the easiest way to achieve this is to through a parser-combinator library.

Second, we want to give timely feedback to users. Therefore, the parser has to be efficient. In particular, responsivity should not depend on the length of file. On the other hand, the input file will change incrementally (as the user edits it), and the parser should take advantage of this. It should reuse previous results to speed up the parse.

Third, the parser must gracefully handle errors in the input: while files are being edited, they will inevitably contain syntactically incorrect programs at several moments.

No parsing framework combining these characteristics exists.

Approach

Hughes and Swierstra have shown how online parsers can be constructed. An online parser takes advantage of lazy evaluation: the input is analyzed only as far as needed to produce the part of the result that is forced.

Effectively, this solves part of the incremental parsing problem: since the user can see only one page of text at a time, the editor will only force so much of the result tree, and only the corresponding part of the input will be analyzed, making the parser response time independent of the length of the input.

This does not completely solve the problem though. If the user edits the end of the file, the whole input will have to be analyzed at each key-press.

Caching results

The proposed solution is to cache intermediate parsing results. For given positions in the input (say every half-page), we will store a partially evaluated state of the parsing automaton. Whenever the input is modified, the new parsing result will be computed by using the most relevant cached state, and applying the new input to it. The cached states that became invalidated will also be recomputed on the basis of the most relevant state.

Of course, the cached states will only be computed lazily, so that no cost is paid for cached states that will be discarded.

Conclusion

The strategy sketched above has several advantages:

  • The design is relatively straightforward, and adds only a hundred lines of code compared to the polish parsers of Hughes and Swierstra.

  • There is no start-up cost. A non-online approach would need to parse the whole file the first time it is loaded. In our approach loading is instantaneous, and parsing proceeds as the user scrolls down the file.

  • The caching strategy is independent of the underlying parsing automaton. We only require it to accept partial inputs.

Its notable that the design has a strong functional flavor:

  • We never update a parse tree (no references, no zipper) and still achieve incremental parsing.

  • We take advantage of lazy evaluation in a cool way.

The main drawback is that the user code must use the parse tree lazily, and there is no way to enforce this in any current implementation of Haskell.