惯性聚合 高效追踪和阅读你感兴趣的博客、新闻、科技资讯
阅读原文 在惯性聚合中打开

推荐订阅源

K
Kaspersky official blog
Martin Fowler
Martin Fowler
CTFtime.org: upcoming CTF events
CTFtime.org: upcoming CTF events
V
Visual Studio Blog
博客园_首页
Engineering at Meta
Engineering at Meta
The Cloudflare Blog
MongoDB | Blog
MongoDB | Blog
Blog — PlanetScale
Blog — PlanetScale
T
The Blog of Author Tim Ferriss
雷峰网
雷峰网
D
Docker
博客园 - 司徒正美
S
SegmentFault 最新的问题
M
MIT News - Artificial intelligence
博客园 - 叶小钗
博客园 - 三生石上(FineUI控件)
U
Unit 42
J
Java Code Geeks
A
About on SuperTechFans
N
Netflix TechBlog - Medium
freeCodeCamp Programming Tutorials: Python, JavaScript, Git & More
S
Security Affairs
I
Intezer
Cisco Talos Blog
Cisco Talos Blog
C
Cyber Attacks, Cyber Crime and Cyber Security
Threat Intelligence Blog | Flashpoint
Threat Intelligence Blog | Flashpoint
B
Blog RSS Feed
P
Privacy & Cybersecurity Law Blog
T
Tenable Blog
T
Threatpost
H
Hacker News: Front Page
G
Google Developers Blog
博客园 - 【当耐特】
Hugging Face - Blog
Hugging Face - Blog
Apple Machine Learning Research
Apple Machine Learning Research
L
Lohrmann on Cybersecurity
大猫的无限游戏
大猫的无限游戏
Google DeepMind News
Google DeepMind News
A
Arctic Wolf
S
Secure Thoughts
GbyAI
GbyAI
NISL@THU
NISL@THU
S
Security @ Cisco Blogs
Exploit-DB.com RSS Feed
Exploit-DB.com RSS Feed
Webroot Blog
Webroot Blog
C
CXSECURITY Database RSS Feed - CXSecurity.com
O
OpenAI News
Spread Privacy
Spread Privacy
Application and Cybersecurity Blog
Application and Cybersecurity Blog

Yi blog

Improving on Vi Improved - Yi Modularization - Yi Release 0.14 - Yi Dynamic and static compilation - Yi Prototypes - Encoding Object Oriented inheritance in Haskell Demo - Yi Configuration - Yi Overall Structure - Yi
Incremental parsing - Yi
2014-09-04 · via Yi blog

Posted on September 4, 2014 by Jeanphilippe Bernardy

In this post I will motivate Yi’s incremental parsing library and describe the main ideas behind it.

Why another parsing library?

Why bothering developing another parsing framework while there exist plenty already?

First, since we want to parse many languages, in many flavors, we want to be able to reuse pieces of grammars. Since we are using Haskell, the easiest way to achieve this is to through a parser-combinator library.

Second, we want to give timely feedback to users. Therefore, the parser has to be efficient. In particular, responsivity should not depend on the length of file. On the other hand, the input file will change incrementally (as the user edits it), and the parser should take advantage of this. It should reuse previous results to speed up the parse.

Third, the parser must gracefully handle errors in the input: while files are being edited, they will inevitably contain syntactically incorrect programs at several moments.

No parsing framework combining these characteristics exists.

Approach

Hughes and Swierstra have shown how online parsers can be constructed. An online parser takes advantage of lazy evaluation: the input is analyzed only as far as needed to produce the part of the result that is forced.

Effectively, this solves part of the incremental parsing problem: since the user can see only one page of text at a time, the editor will only force so much of the result tree, and only the corresponding part of the input will be analyzed, making the parser response time independent of the length of the input.

This does not completely solve the problem though. If the user edits the end of the file, the whole input will have to be analyzed at each key-press.

Caching results

The proposed solution is to cache intermediate parsing results. For given positions in the input (say every half-page), we will store a partially evaluated state of the parsing automaton. Whenever the input is modified, the new parsing result will be computed by using the most relevant cached state, and applying the new input to it. The cached states that became invalidated will also be recomputed on the basis of the most relevant state.

Of course, the cached states will only be computed lazily, so that no cost is paid for cached states that will be discarded.

Conclusion

The strategy sketched above has several advantages:

  • The design is relatively straightforward, and adds only a hundred lines of code compared to the polish parsers of Hughes and Swierstra.

  • There is no start-up cost. A non-online approach would need to parse the whole file the first time it is loaded. In our approach loading is instantaneous, and parsing proceeds as the user scrolls down the file.

  • The caching strategy is independent of the underlying parsing automaton. We only require it to accept partial inputs.

Its notable that the design has a strong functional flavor:

  • We never update a parse tree (no references, no zipper) and still achieve incremental parsing.

  • We take advantage of lazy evaluation in a cool way.

The main drawback is that the user code must use the parse tree lazily, and there is no way to enforce this in any current implementation of Haskell.