Skip to content

Merkle Trees

A Merkle tree is a binary hash tree where every leaf contains the hash of a data block, every intermediate node contains the hash of its two children, and the single root hash at the top serves as a fingerprint for the entire dataset. Invented in 1979 by Ralph Merkle for efficient digital signature authentication, this data structure is now foundational to virtually every blockchain — Bitcoin uses it for transaction verification, Ethereum extends it for global state management, and Smart Contracts use it for gas-efficient airdrop distribution and allowlist verification.


TL;DR

  • A Merkle tree hashes data recursively into a single root that represents the entire dataset — change one byte anywhere and the root changes.
  • Merkle proofs verify data membership in O(log n) steps instead of O(n) — verifying 1 transaction in a block of 4,096 requires only 12 hashes, not 4,096.
  • Bitcoin block headers contain a Merkle root of all transactions, enabling SPV light clients to verify transactions using only ~464 bytes instead of a full ~1.7 MB block.
  • Ethereum uses a Modified Merkle Patricia Trie for state, transactions, and receipts — three roots embedded in every block header.
  • The Merkle Distributor pattern (popularized by the Uniswap UNI airdrop) stores a single 32-byte root on-chain instead of 250,000+ individual addresses, reducing deployment gas by orders of magnitude.
  • Merkle proofs are not bulletproof — second preimage attacks, duplicate leaf bugs, and odd-leaf-count edge cases have caused real vulnerabilities (CVE-2012-2459 in Bitcoin).
  • Verkle trees (vector commitment + Merkle) are Ethereum's planned replacement, shrinking proof sizes by 20-30x vs the current hexary Patricia trie.

What Is a Merkle Tree?

At its core, a Merkle tree converts a list of data into a single hash that commits to every element. Construction is bottom-up:

  1. Hash each data element individually to produce leaf nodes.
  2. Pair adjacent leaves and hash each pair to produce the next level.
  3. Repeat until one root hash remains.
flowchart BT
    Root["Merkle Root<br/>H( H(AB) + H(CD) )"]
    subgraph Intermediate
        I1["H( H(A) + H(B) )"]
        I2["H( H(C) + H(D) )"]
    end
    subgraph Leaves
        L1["H(A)"]
        L2["H(B)"]
        L3["H(C)"]
        L4["H(D)"]
    end

    L1 --> I1
    L2 --> I1
    L3 --> I2
    L4 --> I2
    I1 --> Root
    I2 --> Root

Figure 1: A four-leaf Merkle tree. Each leaf is the hash of raw data. Each parent is the hash of its two children. The root commits to the entire dataset.

Change a single byte in data block C, and H(C) changes, which changes H(CD), which changes the root. The corruption propagates upward deterministically.

Hash Functions by Chain

CHAIN HASH FUNCTION CONSTRUCTION NOTES
Bitcoin Double SHA-256 Merkle-Damgard Double-hashing defends against length-extension attacks
Ethereum Keccak-256 Sponge Not vulnerable to length-extension; single hash sufficient
Hedera SHA-384 Merkle-Damgard Truncated SHA-512; faster than SHA-256 on 64-bit hardware, 192-bit collision resistance
Solana SHA-256 Merkle-Damgard Used in transaction and account state verification

Merkle Proofs

A Merkle proof answers one question: "Is this data in the tree?" Without one, you would need the entire dataset to verify membership. With one, you need only the sibling hashes along the path from the target leaf to the root.

How Verification Works

To prove leaf B exists in the tree from Figure 1:

  1. Start with H(B) (the leaf to verify).
  2. Provide sibling H(A) — hash them together to get H(AB).
  3. Provide sibling H(CD) — hash with H(AB) to get the root.
  4. Compare computed root against the known root. Match = valid.

The proof for B is just two hashes: [H(A), H(CD)]. The verifier never sees data blocks A, C, or D.

Proof Size Comparison

TREE SIZE (LEAVES) PROOF DEPTH PROOF SIZE FULL DATASET SIZE COMPRESSION
128 7 224 bytes 4 KB ~18x
1,024 10 320 bytes 32 KB ~100x
10,000 ~14 448 bytes 312 KB ~700x
1,000,000 ~20 640 bytes 32 MB ~50,000x

Each proof step is one 32-byte hash. Total proof size = log2(n) × 32 bytes. A tree with a million leaves requires only 640 bytes to prove membership — compared to scanning 32 MB of hashes without the tree.


Blockchain Applications

Bitcoin: SPV and Light Clients

Every Bitcoin block header (80 bytes) contains a Merkle root of all transactions in that block. Satoshi described Simplified Payment Verification (SPV) in Section 8 of the Bitcoin whitepaper: a light client downloads only block headers and requests Merkle proofs from full nodes to verify specific transactions.

METRIC FULL BLOCK SPV PROOF
Data per block ~1.7 MB average 80 bytes (header) + ~384 bytes (proof)
All headers since genesis ~1.4 TB (full chain) ~70 MB (headers only)
Verify 1 tx in 4,096 tx block Process all 4,096 hashes 12 sibling hashes

This is what makes mobile Bitcoin wallets possible. They do not download the full blockchain — they trust block headers (validated by Proof of Work) and use Merkle proofs to confirm their transactions exist.

Ethereum: Three Tries Per Block

Ethereum extends the concept with a Modified Merkle Patricia Trie (MPT) — a hexary (16-way branching) trie that combines Merkle hashing with path-based key lookup. Every block header contains three MPT roots:

  • stateRoot: Maps every address to its account state (nonce, balance, storage root, code hash). Covers 130+ million addresses.
  • transactionsRoot: Maps transaction index to transaction data for that block. Immutable after mining.
  • receiptsRoot: Maps transaction index to its receipt (gas used, logs, status).

Note

Ethereum's Patricia Trie is not a binary tree. Branch nodes have 16 children (one per hex nibble) plus a value slot, and extension nodes shortcut through sparse regions. This makes the structure wider but shallower than a standard Merkle tree.

Airdrop Claims: The Merkle Distributor

The Merkle Distributor pattern stores a single bytes32 merkleRoot on-chain — one storage slot — instead of writing every eligible address to contract storage. Recipients submit proofs off-chain to claim tokens.

The canonical implementation is Uniswap's MerkleDistributor.sol, deployed for the September 2020 UNI airdrop (0x090D4613...0821D256e (etherscan)).

How it works:

  1. Off-chain: Build a Merkle tree where each leaf = keccak256(abi.encodePacked(index, account, amount)).
  2. Deploy contract with just the Merkle root. Publish the full tree to IPFS.
  3. User calls claim(index, account, amount, proof).
  4. Contract verifies the proof against the stored root and transfers tokens.
  5. A bitmap (mapping(uint256 => uint256)) tracks claimed status — each uint256 stores 256 claim flags as individual bits.

Gas comparison — 250,000 eligible addresses:

APPROACH SETUP COST PER-CLAIM COST
Store all addresses on-chain ~5.5 billion gas (250K × 22,100 gas SSTORE) ~5,000 gas (mapping lookup)
Merkle root only ~20,000 gas (one SSTORE) ~45,000-55,000 gas (proof verification)

The trade-off is clear: massively cheaper deployment in exchange for slightly higher per-claim cost. For large distributions, it is not close.

Rollup State Verification

Both Optimistic and ZK Bridge rollups post state roots (Merkle roots) to L1 as commitments:

  • ZK Rollups: Sequencer computes new state root, generates a validity proof (ZK-SNARK/STARK), and posts both to L1. The L1 verifier checks the proof and accepts the new root. Withdrawals include a Merkle proof of the burn transaction's inclusion.
  • Optimistic Rollups: Sequencer posts a new state root. It is assumed valid unless challenged during a ~7-day dispute window, where fraud proofs re-execute disputed transactions and compare state roots.

Rollup state anchoring is covered in more detail in Anchoring.


How It Works in Solidity

On-chain Merkle verification is straightforward. OpenZeppelin's MerkleProof.sol is the standard library:

function verify(
    bytes32[] memory proof,
    bytes32 root,
    bytes32 leaf
) internal pure returns (bool)

The function walks the proof array, hashing the current value with each sibling hash using commutative hashing (sorts the pair before hashing so proof generators do not need to track left/right positioning). After processing all proof elements, it compares the result to the root.

Gas Costs

Keccak-256 in the EVM: 30 gas base + 6 gas per 32-byte word of input. Each proof step hashes 64 bytes (two 32-byte values) = 42 gas for the hash itself, plus memory and calldata overhead.

TREE SIZE (LEAVES) PROOF DEPTH ESTIMATED VERIFICATION GAS
128 7 ~30,000 gas
1,024 10 ~35,000 gas
10,000 ~14 ~40,000 gas
250,000 ~18 ~50,000 gas

For comparison: ECDSA signature verification costs ~29,000 gas, and a single SSTORE to a new slot costs 22,100 gas.


Risks & What Can Go Wrong

Second Preimage Attacks

Internal nodes are computed as H(left || right) — two concatenated 32-byte hashes (64 bytes total). An attacker can submit those 64 bytes as a fake "leaf" with a shortened proof, and verification passes because the contract cannot distinguish a leaf hash from an internal node hash.

OpenZeppelin's JavaScript library mitigates this by double-hashing leaves: keccak256(keccak256(leaf)). But the Solidity MerkleProof.verify function does not enforce this — developers must apply double-hashing themselves. OpenZeppelin GitHub issue #3091 documents the vulnerability.

Odd Number of Leaves (CVE-2012-2459)

When the leaf count is not a power of 2, the standard approach is to duplicate the last leaf. Bitcoin does this, and it introduced a real vulnerability: transaction lists [1,2,3,4,5,6] and [1,2,3,4,5,6,5,6] produce the same Merkle root. An attacker could create a mutated block with duplicate transactions that shares the Merkle root of a valid block, causing nodes to cache the invalid version and reject the real one.

A variation was reintroduced in Bitcoin Core 0.13.0 (2016), fixed in 0.14.0 (2017), and another variant discovered and fixed in 2023.

Airdrop Front-Running

Merkle proofs are public data — they appear in transaction calldata once broadcast. However, the Merkle Distributor pattern binds each leaf to a specific address: keccak256(index, account, amount). Even if a MEV Searcher copies the proof, the contract verifies the account matches the recipient. The bot cannot redirect tokens to themselves.

The real risk is poorly implemented contracts that do not bind claims to msg.sender.

Verkle Trees: The Planned Replacement

Ethereum's current hexary Patricia Trie requires large proofs — a single state proof can exceed 1 KB. Verkle trees (coined by John Kuszmaul, 2018) replace sibling hashes with polynomial commitments (IPA or KZG), eliminating the need to provide neighbor data entirely.

PROPERTY MERKLE PATRICIA TRIE VERKLE TREE
Proof size (1B data points) ~1 KB+ ~150 bytes
Branching factor 16 (hexary) 256+
Requires sibling hashes
Quantum resistant ☑ (hash-based) ☒ (elliptic curve)
Status Production (since 2015) Testnet (devnets running)

The quantum vulnerability concern has prompted parallel research into binary Merkle trees with more efficient hash functions (EIP-7864) as a potential alternative path.


References


Changelog

DATE AUTHOR NOTES
2026-03-18 Artificial. Generated by robots. Gas: 75 tok
2026-03-19 Denizen. Reviewed, edited, and curated by humans.