惯性聚合 高效追踪和阅读你感兴趣的博客、新闻、科技资讯
阅读原文 在惯性聚合中打开

推荐订阅源

SecWiki News
SecWiki News
D
Darknet – Hacking Tools, Hacker News & Cyber Security
I
Intezer
月光博客
月光博客
Cyberwarzone
Cyberwarzone
雷峰网
雷峰网
Security Latest
Security Latest
量子位
博客园 - 聂微东
小众软件
小众软件
NISL@THU
NISL@THU
C
Cisco Blogs
The GitHub Blog
The GitHub Blog
C
Cybersecurity and Infrastructure Security Agency CISA
T
Tor Project blog
Y
Y Combinator Blog
V
V2EX
博客园 - 三生石上(FineUI控件)
P
Privacy & Cybersecurity Law Blog
F
Full Disclosure
Cisco Talos Blog
Cisco Talos Blog
Microsoft Security Blog
Microsoft Security Blog
S
Security @ Cisco Blogs
The Register - Security
The Register - Security
Google DeepMind News
Google DeepMind News
J
Java Code Geeks
cs.CL updates on arXiv.org
cs.CL updates on arXiv.org
IT之家
IT之家
Webroot Blog
Webroot Blog
cs.AI updates on arXiv.org
cs.AI updates on arXiv.org
aimingoo的专栏
aimingoo的专栏
腾讯CDC
S
Schneier on Security
L
LINUX DO - 最新话题
Latest news
Latest news
Simon Willison's Weblog
Simon Willison's Weblog
罗磊的独立博客
A
Arctic Wolf
MyScale Blog
MyScale Blog
云风的 BLOG
云风的 BLOG
让小产品的独立变现更简单 - ezindie.com
让小产品的独立变现更简单 - ezindie.com
S
Secure Thoughts
S
Securelist
Stack Overflow Blog
Stack Overflow Blog
T
Troy Hunt's Blog
Recorded Future
Recorded Future
I
InfoQ
The Cloudflare Blog
H
Heimdal Security Blog
Hugging Face - Blog
Hugging Face - Blog

Ethereum Foundation Blog

Checkpoint #9: Apr 2026 | Ethereum Foundation Blog How L1 and L2s can build the strongest possible Ethereum | Ethereum Foundation Blog The Promise of Ethereum: Introducing the EF Mandate | Ethereum Foundation Blog This Is Fine (Until the Grant Runs Out) | Ethereum Foundation Blog Treasury Staking Initiative | Ethereum Foundation Blog The Ethereum Foundation's Commitment to DeFi | Ethereum Foundation Blog Protocol Priorities Update for 2026 | Ethereum Foundation Blog Announcing the Platform Team at EF | Ethereum Foundation Blog Ethereum Protocol Studies 2026 | Ethereum Foundation Blog Executive Leadership Update | Ethereum Foundation Blog An update from Tomasz | Ethereum Foundation Blog Introducing the EF Academic Secretariat 2026 PhD Fellowship | Ethereum Foundation Blog Trillion Dollar Security Day at Devconnect | Ethereum Foundation Blog Allocation Update - Q4 2025 | Ethereum Foundation Blog Checkpoint #8: Jan 2026 | Ethereum Foundation Blog Devcon 8 is coming to Mumbai, India in November 2026 | Ethereum Foundation Blog Hegota Upgrade EIP Proposal Timelines | Ethereum Foundation Blog Shipping an L1 zkEVM #2: The Security Foundations | Ethereum Foundation Blog The Future of Ethereum’s State | Ethereum Foundation Blog Devconnect Argentina Recap | Ethereum Foundation Blog Allocation Update - Q3 2025 | Ethereum Foundation Blog Making Ethereum Feel Like One Chain Again | Ethereum Foundation Blog Checkpoint #7: Nov 2025 | Ethereum Foundation Blog Fusaka Mainnet Announcement | Ethereum Foundation Blog 2 weeks to Devconnect: Everything you need to know | Ethereum Foundation Blog Unveiling ESP's New Grants Program | Ethereum Foundation Blog Fusaka Update – Transaction Gas Limit Cap arrives with EIP-7825 | Ethereum Foundation Blog Fusaka Update - Information for Blob users | Ethereum Foundation Blog Announcing the 2026 EF Internship | Ethereum Foundation Blog Supporting privacy with new funding mechanisms | Ethereum Foundation Blog The Ethereum Foundation’s Commitment to Privacy | Ethereum Foundation Blog Checkpoint #6: Oct 2025 | Ethereum Foundation Blog Privacy Cluster Leadership Announcement | Ethereum Foundation Blog Fusaka Testnet Announcement | Ethereum Foundation Blog Announcing the districts of the Ethereum World’s Fair | Ethereum Foundation Blog Fusaka $2,000,000 Audit Contest! | Ethereum Foundation Blog Holešky Testnet Shutdown Announcement | Ethereum Foundation Blog The Ecosystem Support Program's Next Chapter | Ethereum Foundation Blog Protocol Update 003 — Improve UX | Ethereum Foundation Blog Protocol Update 002 - Scale Blobs | Ethereum Foundation Blog Trillion Dollar Security - Phase 2 | Ethereum Foundation Blog Join Us: EF Protocol Reddit AMA - August 29th, 2025 | Ethereum Foundation Blog Protocol Update 001 – Scale L1 | Ethereum Foundation Blog lean Ethereum | Ethereum Foundation Blog Celebrating 10 Years of Ethereum | Ethereum Foundation Blog Checkpoint #5: July 2025 | Ethereum Foundation Blog Allocation Update - Q2 2025 | Ethereum Foundation Blog The Future of Ecosystem Development at the EF | Ethereum Foundation Blog Shipping an L1 zkEVM #1: Realtime Proving | Ethereum Foundation Blog Partial history expiry announcement | Ethereum Foundation Blog Checkpoint #4: Berlinterop | Ethereum Foundation Blog World Experience: Updates from the Next Billion Fellowship | Ethereum Foundation Blog Now accepting interns - Join the Ethereum Season of Internships | Ethereum Foundation Blog Tickets are live for the Ethereum World’s Fair! And we're launching the Supporter Program | Ethereum Foundation Blog Ethereum Foundation Treasury Policy | Ethereum Foundation Blog Checkpoint #3: June 2025 | Ethereum Foundation Blog Announcing the Devconnect ARG Scholars Program | Ethereum Foundation Blog Announcing Protocol | Ethereum Foundation Blog Nyota Interop Recap ✨ | Ethereum Foundation Blog Allocation Update - Q1 2024 | Ethereum Foundation Blog Announcing the Ethereum Protocol Fellowship Cohort 5 | Ethereum Foundation Blog Ethereum Protocol Fellowship Cohort 4 Recap | Ethereum Foundation Blog Sepolia Incident | Ethereum Foundation Blog Announcing the Devcon SEA venue! | Ethereum Foundation Blog Devconnect Scholars Program - Ethereum Stories from Istanbul and Beyond | Ethereum Foundation Blog Dencun Mainnet Announcement | Ethereum Foundation Blog ZK Grants Round | Ethereum Foundation Blog Eth2 at ETHWaterloo: Prizes for Eth2 education, tooling, and research | Ethereum Foundation Blog eth2 quick update no. 2 | Ethereum Foundation Blog Devcon4 Ticket Sales | Ethereum Foundation Blog Announcing Swarm Proof-of-Concept Release 3 | Ethereum Foundation Blog Devcon4 Announcement | Ethereum Foundation Blog Announcing May 2018 Cohort of EF Grants | Ethereum Foundation Blog Announcing World Trade Francs: The Official Ethereum Stablecoin | Ethereum Foundation Blog Announcing Beneficiaries of the Ethereum Foundation Grants | Ethereum Foundation Blog Geth 1.8 - Iceberg¹ | Ethereum Foundation Blog Farewell and Welcome | Ethereum Foundation Blog Security Alert - Solidity - Variables can be overwritten in storage | Ethereum Foundation Blog Uncle Rate and Transaction Fee Analysis | Ethereum Foundation Blog Announcement of imminent hard fork for EIP150 gas cost changes | Ethereum Foundation Blog Dev Update: Formal Methods | Ethereum Foundation Blog On Inflation, Transaction Fees and Cryptocurrency Monetary Policy | Ethereum Foundation Blog Onward from the Hard Fork | Ethereum Foundation Blog C++ DEV Update - July edition | Ethereum Foundation Blog The Devcon2 site is now live! | Ethereum Foundation Blog Security Alert - DoS Vulnerability in the Soft Fork | Ethereum Foundation Blog DAO Wars: Your voice on the soft-fork dilemma | Ethereum Foundation Blog Smart Contract Security | Ethereum Foundation Blog Security Alert – Geth suffers from a very low probable DoS attack vector - Update immediately | Ethereum Foundation Blog On Settlement Finality | Ethereum Foundation Blog Ethereum Foundation and Wanxiang Blockchain Labs announce a blockbuster event combining Devcon2 and the 2nd Global Blockchain Summit in Shanghai, September 19–24, 2016 | Ethereum Foundation Blog Ethereum Partners with R3CEV on Lizardcoin, Bringing Together the Best of Centralized Finance and Blockchain Technology | Ethereum Foundation Blog From Smart Contracts to Courts with not so Smart Judges | Ethereum Foundation Blog BTC Relay included in Ethereum Bounty Program | Ethereum Foundation Blog Ethereum DEV Update: C++ Roadmap | Ethereum Foundation Blog Cut and try: building a dream | Ethereum Foundation Blog Ambients Applied to Ethereum | Ethereum Foundation Blog Mihai’s Ethereum Project Update. The First Year. | Ethereum Foundation Blog Getting to the Frontier | Ethereum Foundation Blog The Ethereum Development Process | Ethereum Foundation Blog
The Burden of Proof(s): Code Merkleization | Ethereum Foundation Blog
2020-11-30 · via Ethereum Foundation Blog

A note about the Stateless Ethereum initiative: Research activity has (understandably) slowed in the second half of 2020 as all contributors have adjusted to life on the weird timeline. But as the ecosystem moves incrementally closer to Serenity and the Eth1/Eth2 merge, Stateless Ethereum work will become increasingly relevant and impactful. Expect a more substantial year-end Stateless Ethereum retrospective in the coming weeks.

Let's roll through the re-cap one more time: The ultimate goal of Stateless Ethereum is to remove the requirement of an Ethereum node to keep a full copy of the updated state trie at all times, and to instead allow for changes of state to rely on a (much smaller) piece of data that proves a particular transaction is making a valid change. Doing this solves a major problem for Ethereum; a problem that has so far only been pushed further out by improved client software: State growth.

The Merkle proof needed for Stateless Ethereum is called a 'witness', and it attests to a state change by providing all of the unchanged intermediate hashes required to arrive at a new valid state root. Witnesses are theoretically a lot smaller than the full Ethereum state (which takes 6 hours at best to sync), but they are still a lot larger than a block (which needs to propagate to the whole network in just a few seconds). Leaning out the size of witnesses is therefore paramount to getting Stateless Ethereum to minimum-viable-utility.

Just like the Ethereum state itself, a lot of the extra (digital) weight in witnesses comes from smart contract code. If a transaction makes a call to a particular contract, the witness will by default need to include the contract bytecode in its entirety with the witness. Code Merkelization is a general technique to reduce burden of smart contract code in witnesses, so that contract calls only need to include the bits of code that they 'touch' in order to prove their validity. With this technique alone we might see a substantial reduction in witness, but there are a lot of details to consider when breaking up smart contract code into byte-sized chunks.

What is Bytecode?

There are some trade-offs to consider when splitting up contract bytecode. The question we will eventually need to ask is "how big will the code chunks be?" – but for now, let's look at some real bytecode in a very simple smart contract, just to understand what it is:

pragma solidity >=0.4.22 <0.7.0;

contract Storage {

    uint256 number;

    function store(uint256 num) public {
        number = num;
    }

    function retrieve() public view returns (uint256){
        return number;
    }
}

When this simple storage contract is compiled, it turns into the machine code meant to run 'inside' the EVM. Here, you can see the same simple storage contract shown above, but complied into individual EVM instructions (opcodes):

PUSH1 0x80 PUSH1 0x40 MSTORE CALLVALUE DUP1 ISZERO PUSH1 0xF JUMPI PUSH1 0x0 DUP1 REVERT JUMPDEST POP PUSH1 0x4 CALLDATASIZE LT PUSH1 0x32 JUMPI PUSH1 0x0 CALLDATALOAD PUSH1 0xE0 SHR DUP1 PUSH4 0x2E64CEC1 EQ PUSH1 0x37 JUMPI DUP1 PUSH4 0x6057361D EQ PUSH1 0x53 JUMPI JUMPDEST PUSH1 0x0 DUP1 REVERT JUMPDEST PUSH1 0x3D PUSH1 0x7E JUMP JUMPDEST PUSH1 0x40 MLOAD DUP1 DUP3 DUP2 MSTORE PUSH1 0x20 ADD SWAP2 POP POP PUSH1 0x40 MLOAD DUP1 SWAP2 SUB SWAP1 RETURN JUMPDEST PUSH1 0x7C PUSH1 0x4 DUP1 CALLDATASIZE SUB PUSH1 0x20 DUP2 LT ISZERO PUSH1 0x67 JUMPI PUSH1 0x0 DUP1 REVERT JUMPDEST DUP2 ADD SWAP1 DUP1 DUP1 CALLDATALOAD SWAP1 PUSH1 0x20 ADD SWAP1 SWAP3 SWAP2 SWAP1 POP POP POP PUSH1 0x87 JUMP JUMPDEST STOP JUMPDEST PUSH1 0x0 DUP1 SLOAD SWAP1 POP SWAP1 JUMP JUMPDEST DUP1 PUSH1 0x0 DUP2 SWAP1 SSTORE POP POP JUMP INVALID LOG2 PUSH5 0x6970667358 0x22 SLT KECCAK256 DUP13 PUSH7 0x1368BFFE1FF61A 0x29 0x4C CALLER 0x1F 0x5C DUP8 PUSH18 0xA3F10C9539C716CF2DF6E04FC192E3906473 PUSH16 0x6C634300060600330000000000000000

As explained in a previous post, these opcode instructions are the basic operations of the EVM's stack architecture. They define the simple storage contract, and all of the functions it contains. You can find this contract as one of the example solidity contracts in the Remix IDE (Note that the machine code above is an example of the storage.sol after it's already been deployed, and not the output of the Solidity compiler, which will have some extra 'bootstrapping' opcodes). If you un-focus your eyes and imagine a physical stack machine chugging along with step-by-step computation on opcode cards, in the blur of the moving stack you can almost see the outlines of functions laid out in the Solidity contract.

Whenever the contract receives a message call, this code runs inside every Ethereum node validating new blocks on the network. In order to submit a valid transaction on Ethereum today, one needs a full copy of the contract's bytecode, because running that code from beginning to end is the only way to obtain the (deterministic) output state and associated hash.

Stateless Ethereum, remember, aims to change this requirement. Let's say that all you want to do is call the function retrieve() and nothing more. The logic describing that function is only a subset of the whole contract, and in this case the EVM only really needs two of the basic blocks of opcode instructions in order to return the desired value:

PUSH1 0x0 DUP1 SLOAD SWAP1 POP SWAP1 JUMP,

JUMPDEST PUSH1 0x40 MLOAD DUP1 DUP3 DUP2 MSTORE PUSH1 0x20 ADD SWAP2 POP POP PUSH1 0x40 MLOAD DUP1 SWAP2 SUB SWAP1 RETURN

In the Stateless paradigm, just as a witness provides the missing hashes of un-touched state, a witness should also provide the missing hashes for un-executed pieces of machine code, so that a stateless client only requires the portion of the contract it's executing.

The Code's Witness

Smart contracts in Ethereum live in the same place that externally-owned accounts do: as leaf nodes in the enormous single-rooted state trie. Contracts are in many ways no different than the externally-owned accounts humans use. They have an address, can submit transactions, and hold a balance of Ether and any other token. But contract accounts are special because they must contain their own program logic (code), or a hash thereof. Another associated Merkle-Patricia Trie, called the storageTrie keeps any variables or persistent state that an active contract uses to go about its business during execution.

witness

This witness visualization provides a good sense of how important code merklization could be in reducing the size of witnesses. See that giant chunk of colored squares and how much bigger it is than all the other elements in the trie? That's a single full serving of smart contract bytecode.

Next to it and slightly below are the pieces of persistent state in the storageTrie, such as ERC20 balance mappings or ERC721 digital item ownership manifests. Since this is example is of a witness and not a full state snapshot, those too are made mostly of intermediate hashes, and only include the changes a stateless client would require to prove the next block.

Code merkleization aims to split up that giant chunk of code, and to replace the field codeHash in an Ethereum account with the root of another Merkle Trie, aptly named the codeTrie.

Worth its Weight in Hashes

Let's look at an example from this Ethereum Engineering Group video, which analyzes some methods of code chunking using an ERC20 token contract. Since many of the tokens you've heard of are made to the ERC-20 standard, this is a good real-world context to understand code merkleization.

Because bytecode is long and unruly, let's use a simple shorthand of replacing four bytes of code (8 hexidecimal characters) with either an . or X character, with the latter representing bytecode required for the execution of a specific function (in the example, the ERC20.transfer() function is used throughout).

In the ERC20 example, calling the transfer() function uses a little less than half of the whole smart contract:

XXX.XXXXXXXXXXXXXXXXXX..........................................
.....................XXXXXX.....................................
............XXXXXXXXXXXX........................................
........................XXX.................................XX..
......................................................XXXXXXXXXX
XXXXXXXXXXXXXXXXXX...............XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX..................................
.......................................................XXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXX..................................X
XXXXXXXX........................................................
....

If we wanted to split up that code into chunks of 64 bytes, only 19 out of the 41 chunks would be required to execute a stateless transfer() transaction, with the rest of the required data coming from a witness.

|XXX.XXXXXXXXXXXX|XXXXXX..........|................|................
|................|.....XXXXXX.....|................|................
|............XXXX|XXXXXXXX........|................|................
|................|........XXX.....|................|............XX..
|................|................|................|......XXXXXXXXXX
|XXXXXXXXXXXXXXXX|XX..............|.XXXXXXXXXXXXXXX|XXXXXXXXXXXXXXXX
|XXXXXXXXXXXXXXXX|XXXXXXXXXXXXXX..|................|................
|................|................|................|.......XXXXXXXXX
|XXXXXXXXXXXXXXXX|XXXXXXXXXXXXX...|................|...............X
|XXXXXXXX........|................|................|................
|....

Compare that to 31 out of 81 chunks in a 32 byte chunking scheme:

|XXX.XXXX|XXXXXXXX|XXXXXX..|........|........|........|........|........
|........|........|.....XXX|XXX.....|........|........|........|........
|........|....XXXX|XXXXXXXX|........|........|........|........|........
|........|........|........|XXX.....|........|........|........|....XX..
|........|........|........|........|........|........|......XX|XXXXXXXX
|XXXXXXXX|XXXXXXXX|XX......|........|.XXXXXXX|XXXXXXXX|XXXXXXXX|XXXXXXXX
|XXXXXXXX|XXXXXXXX|XXXXXXXX|XXXXXX..|........|........|........|........
|........|........|........|........|........|........|.......X|XXXXXXXX
|XXXXXXXX|XXXXXXXX|XXXXXXXX|XXXXX...|........|........|........|.......X
|XXXXXXXX|........|........|........|........|........|........|........
|....

On the surface it seems like smaller chunks are more efficient than larger ones, because the mostly-empty chunks are less frequent. But here we need to remember that the unused code has a cost as well: each un-executed code chunk is replaced by a hash of fixed size. Smaller code chunks mean a greater number of hashes for the unused code, and those hashes could be as large as 32 bytes each (or as small as 8 bytes). You might at this point exclaim "Hol' up! If the hash of code chunks is a standard size of 32 bytes, how would it help to replace 32 bytes of code with 32 bytes of hash!?".

Recall that the contract code is merkleized, meaning that all hashes are linked together in the codeTrie -- the root hash of which we need to validate a block. In that structure, any sequential un-executed chunks only require one hash, no matter how many there are. That is to say, one hash can stand in for a potentially large limb full of sequential chunk hashes on the merkleized code trie, so long as none of them are required for coded execution.

We Must Collect Additional Data

The conclusion we've been building to is a bit of an anticlimax: There is no theoretically 'optimal' scheme for code merkleization. Design choices like fixing the size of code chunks and hashes depend on data collected about the 'real world'. Every smart contract will merkleize differently, so the burden is on researchers to choose the format that provides the largest efficiency gains to observed mainnet activity. What does that mean, exactly?

overhead

One thing that could indicate how efficient a code merkleization scheme is Merkleization overhead, which answers the question "how much extra information beyond executed code is getting included in this witness?"

Already we have some promising results, collected using a purpose-built tool developed by Horacio Mijail from Consensys' TeamX research team, which shows overheads as small as 25% -- not bad at all!

In short, the data shows that by-and-large smaller chunk sizes are more efficient than larger ones, especially if smaller hashes (8-bytes) are used. But these early numbers are by no means comprehensive, as they only represent about 100 recent blocks. If you're reading this and interested in contributing to the Stateless Ethereum initiative by collecting more substantial code merkleization data, come introduce yourself on the ethresear.ch forums, or the #code-merkleization channel on the Eth1x/2 research discord!

And as always, if you have questions, feedback, or requests related to "The 1.X Files" and Stateless Ethereum, DM or @gichiba on twitter.