Merkle Trees for Airdrops: Build Proofs Without a Backend
The first airdrop I deployed had 412 addresses on the allowlist and I stored them in a mapping. The second one had 38,000 addresses and the deploy transaction ran out of gas before the constructor finished. That is the moment every Solidity engineer discovers Merkle trees. You cannot put 38,000 addresses on chain — not in a constructor, not in a loop, not split across multiple transactions at any sane cost. But you can commit a single 32-byte root that represents all of them, and let each claimer prove they belong with roughly 500 bytes of calldata.
The trick is older than Ethereum itself. Ralph Merkle described the structure in 1979 for authenticating large sets of documents. Bitcoin uses it for block transactions, Git uses a variant for commits, and every major L1/L2 uses it for state. For airdrops it solves a very concrete problem: how do you let 100,000 users claim from a contract that only knows a single hash?
What is a Merkle tree?
A Merkle tree is a binary tree of hashes. The leaves at the bottom are hashes of your data items. Each parent node is the hash of its two children concatenated together. You keep walking up the tree, pairing and hashing, until you end up with a single value at the top: the Merkle root.
A tiny example with four addresses. Imagine a list:
0xAbc...001   // leaf0 = keccak256(0xAbc...001)
0xDef...002   // leaf1 = keccak256(0xDef...002)
0x123...003   // leaf2 = keccak256(0x123...003)
0x456...004   // leaf3 = keccak256(0x456...004)

parent01 = keccak256(leaf0 || leaf1)   // concatenate, then hash
parent23 = keccak256(leaf2 || leaf3)
root = keccak256(parent01 || parent23)
The property that makes this useful: flip a single bit anywhere in the input list and the root changes completely. keccak256 is collision-resistant, so no attacker can produce a different address list that hashes to the same root. The root is a 32-byte commitment to the entire set.
The other useful property is local verifiability. To prove that leaf2 is in the tree, you do not need to reveal the whole list. You only need the sibling nodes along the path from leaf2 up to the root: leaf3 and parent01 in the example above. That is two hashes, 64 bytes, to prove membership in a set of four. For a set of 100,000 items the proof is 17 hashes, 544 bytes.
Why airdrops use Merkle trees
Consider the naive alternative. You want to airdrop a token to 50,000 addresses. Option A: push tokens to each address from a batch script. At ~50,000 gas per ERC-20 transfer and even a quiet 20 gwei gas price, that is 50,000 × 50,000 × 20 × 10^-9 = 50 ETH just to move the tokens. Add the fact that many recipients never want the token and you are burning money on unwanted transfers.
Option B: store an on-chain mapping of {address => amount} and let users claim. Each SSTORE of a new non-zero slot costs 20,000 gas. 50,000 entries × 20,000 gas × 20 gwei = 20 ETH of deployment cost, plus the constructor loop hitting block gas limits so you have to split it across dozens of transactions. Still brutal.
Option C, the Merkle approach: store a single bytes32 root in the constructor. That is one SSTORE, 20,000 gas, ~$2 at typical gas prices. Claimers submit a proof when they want their tokens, paying their own gas. Verification costs roughly 1,200 gas per proof element, so about 20,000 gas for a tree of 50,000 leaves. Total cost to the deployer: one small transaction. Total cost to non-claimers: zero.
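The arithmetic behind those three options is worth sanity-checking once. A sketch using the same round numbers as the text (not live gas prices):

```javascript
// Back-of-envelope deployer cost for 50,000 recipients at 20 gwei.
// Round gas figures from the text, not measured values.
const recipients = 50_000;
const gasPriceGwei = 20;
const toEth = (gas) => (gas * gasPriceGwei) / 1e9; // gwei -> ETH

const optionA = toEth(recipients * 50_000); // push an ERC-20 transfer to everyone
const optionB = toEth(recipients * 20_000); // SSTORE one mapping slot per entry
const optionC = toEth(20_000);              // one SSTORE for the Merkle root

console.log(optionA, optionB, optionC); // 50 20 0.0004
```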
Every major token launch in the last three years — Uniswap, ENS, Arbitrum, Optimism, zkSync — has used this pattern. The allowlist data lives off-chain in IPFS or a plain static JSON file, and the contract only ever sees the root. The economic argument is lopsided enough that I cannot remember the last time I saw a production airdrop that did not use Merkle proofs. Even tiny allowlists of a few hundred addresses often ship with a root now, because the code path is familiar and the marginal cost is essentially the same as the naive version.
There is a second, quieter benefit: the contract stops being a privacy sink. With option B, the entire allowlist is visible on chain forever. Anyone can query addresses, sort them by allocation, build dashboards of who got what. With a Merkle root, only addresses that actually claim reveal themselves, and even then only the claim transaction is public. The full off-chain list only matters to people checking eligibility. That is a much smaller attack surface for targeted phishing against known recipients.
Building the tree
The tree is only as trustworthy as the leaf hashing convention. You must decide exactly how a leaf is produced, and the off-chain tree builder and the on-chain verifier must agree byte-for-byte.
Address-only leaves (allowlist)
If the airdrop gives every eligible address the same amount, the leaf is just the hashed address:
// Solidity
bytes32 leaf = keccak256(abi.encodePacked(claimer));

// JavaScript (ethers v6)
const leaf = ethers.solidityPackedKeccak256(["address"], [claimer]);
Address + amount leaves (variable airdrop)
If different addresses get different amounts (a common pattern for contributor or retroactive rewards), encode both:
// Solidity
bytes32 leaf = keccak256(abi.encode(claimer, amount));

// JavaScript (ethers v6) — must match abi.encode, not the packed variant
const coder = ethers.AbiCoder.defaultAbiCoder();
const leaf = ethers.keccak256(
  coder.encode(["address", "uint256"], [claimer, amount])
);
A practical note on abi.encode versus abi.encodePacked: encodePacked is shorter but has collision risks when types can be ambiguous (two dynamic values side by side, for instance). encode pads each value to 32 bytes and is safer when mixing types. For the common address+uint256 pair, encodePacked is fine because both fields have fixed width; most modern guides use abi.encode anyway for consistency.
Double-hashing
OpenZeppelin's standard library recommends hashing leaves twice: keccak256(bytes.concat(keccak256(abi.encode(addr, amount)))). The outer hash is a defense against a subtle second-preimage attack where an attacker crafts an internal node that collides with a hand-chosen leaf. For most airdrops the single-hash form has been safe in practice, but if you are copying the reference OpenZeppelin StandardMerkleTree implementation, double-hashing is the default and you should match it.
Pair ordering
When you hash two sibling nodes together, do you hash(left, right) or hash(right, left)? The answer matters because the on-chain verifier needs the same convention. OpenZeppelin sorts the pair so the smaller hash always comes first:
function _hashPair(bytes32 a, bytes32 b) private pure returns (bytes32) {
return a < b
? keccak256(abi.encodePacked(a, b))
: keccak256(abi.encodePacked(b, a));
}

Sorted pairs let the proof omit direction bits. Each proof element just gets combined with the running hash in the correct sorted order. This is simpler and marginally cheaper than the unsorted variant that needs a parallel bitmap of left/right flags.
Odd node at a level
A tree with 5, 7, or 38,001 leaves has levels where you run out of pairs. Two common strategies: duplicate the last node to pair with itself (Bitcoin uses this), or promote the unpaired node unchanged to the next level. OpenZeppelin's StandardMerkleTree promotes unchanged; that is the convention you should follow if you want interoperability with the widely audited verifier.
If you want to skip the implementation details for a one-off airdrop, BeautiCode's Merkle Tree Generator builds the root, all intermediate nodes, and per-leaf proofs from a pasted list. It matches the OpenZeppelin sorted-pair convention so the output plugs into MerkleProof.verify without glue code. Before feeding it in, run your addresses through ETH Address Checksum to normalize everything to EIP-55 casing — mixed-case duplicates will produce different leaf hashes and silent claim failures.
Generating a proof for a leaf
A Merkle proof is the list of sibling hashes you need to walk from a leaf up to the root. Start at the leaf, grab the sibling, hash them together (sorted), move up one level, grab that level's sibling, hash, repeat until you produce the root.
Proof size is simply the tree depth:
- 100 leaves → depth 7 → proof is 7 × 32 bytes = 224 bytes
- 10,000 leaves → depth 14 → proof is 14 × 32 bytes = 448 bytes
- 100,000 leaves → depth 17 → proof is 17 × 32 bytes = 544 bytes
- 1,000,000 leaves → depth 20 → proof is 20 × 32 bytes = 640 bytes
Proof size grows logarithmically, so doubling the allowlist adds one hash to every proof. Verification gas follows the same curve: about 1,200 gas per proof element plus a fixed overhead, so even a million-leaf tree verifies in well under 30,000 gas.
The proof is not a secret. You can publish every address's proof in a static JSON file on IPFS or GitHub, and your claim UI just looks up proofs[userAddress] when the user connects their wallet. No backend, no database, no server. A typical 10,000-address airdrop produces a proof file on the order of 10 MB of hex JSON — each entry carries a ~14-element proof — that you can stick on Cloudflare Pages and serve for effectively free.
// proofs.json (example entry)
{
"0xAbc1230000000000000000000000000000000001": {
"amount": "1000000000000000000000",
"proof": [
"0x7d3e6f...",
"0x9a2b48...",
"0x12c6ff...",
"0x80be33..."
]
},
"0xDef4560000000000000000000000000000000002": {
"amount": "500000000000000000000",
"proof": ["0xabcdef...", "0x123456...", "0x789abc...", "0xdef012..."]
}
}

Verifying on-chain with MerkleProof.verify
OpenZeppelin's MerkleProof.sol is the reference verifier. You do not need to roll your own — it is audited, gas-optimized, and handles the sorted-pair convention out of the box. The full contract surface you will touch is three functions, and in 95% of airdrops you only use verify.
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;
import "@openzeppelin/contracts/utils/cryptography/MerkleProof.sol";
import "@openzeppelin/contracts/token/ERC20/IERC20.sol";
import "@openzeppelin/contracts/utils/ReentrancyGuard.sol";
contract MerkleAirdrop is ReentrancyGuard {
bytes32 public immutable merkleRoot;
IERC20 public immutable token;
mapping(address => bool) public claimed;
event Claimed(address indexed account, uint256 amount);
constructor(bytes32 _root, IERC20 _token) {
merkleRoot = _root;
token = _token;
}
function claim(uint256 amount, bytes32[] calldata proof)
external
nonReentrant
{
require(!claimed[msg.sender], "already claimed");
// Reconstruct the leaf exactly as the off-chain builder did
bytes32 leaf = keccak256(abi.encode(msg.sender, amount));
// MerkleProof.verify walks the proof and checks against the root
require(
MerkleProof.verify(proof, merkleRoot, leaf),
"invalid proof"
);
// Effects before interactions
claimed[msg.sender] = true;
// Transfer the tokens
require(token.transfer(msg.sender, amount), "transfer failed");
emit Claimed(msg.sender, amount);
}
}

Three things to notice. First, msg.sender is baked into the leaf, so a proof is only valid for the address making the call. Nobody can steal someone else's allocation by copying their proof. Second, claimed[msg.sender] = true happens before the transfer, following the checks-effects-interactions pattern. Third, merkleRoot is immutable: set once in the constructor and fixed forever, which means your deploy transaction is the only chance to get the root right.
MerkleProof.verify itself is tiny:
function verify(bytes32[] memory proof, bytes32 root, bytes32 leaf)
internal pure returns (bool)
{
return processProof(proof, leaf) == root;
}
function processProof(bytes32[] memory proof, bytes32 leaf)
internal pure returns (bytes32)
{
bytes32 computedHash = leaf;
for (uint256 i = 0; i < proof.length; i++) {
computedHash = _hashPair(computedHash, proof[i]);
}
return computedHash;
}

That is the entire verification. Walk the proof, hash pairs, compare the result to the stored root. No loops over thousands of entries, no giant storage reads, just 17 or so keccak256 calls.
Common mistakes
Almost every bug in a Merkle airdrop comes down to the off-chain builder and the on-chain verifier disagreeing about exactly one thing. Here is the list, ordered by how often I have seen them go wrong in production.
Leaf hashing mismatch
The builder uses keccak256(abi.encodePacked(addr, amount)), the contract uses keccak256(abi.encode(addr, amount)), nothing verifies, every claim reverts. Pick one encoding, document it in a code comment right above the line in both places, and write a test that hashes a known address on both sides and asserts equality.
Sorted vs unsorted pairs
If your off-chain builder does not sort sibling hashes before concatenating, the root will not match what OpenZeppelin's verifier reconstructs. Either sort in the builder, or use a custom verifier with direction bits. Sorted is the path of least surprise.
Duplicate leaves
If the same address appears twice in the input list, it gets two leaves and two proofs, and nothing in MerkleProof.verify prevents both from validating. The claimed[] mapping in the contract above is keyed by address, so it blocks the second claim either way; but if the amounts differ, the claimer simply submits the larger leaf first, and you have a real allocation error. Distributors that track claims by leaf index rather than by address would let both claims through. De-duplicate before building the tree.
Checksummed vs lowercase addresses
keccak256 on 0xAbC... and 0xabc... as strings produces different hashes. Solidity's address type normalizes to 20 raw bytes so this does not bite you on-chain, but it absolutely bites you when the off-chain builder takes addresses as strings. Normalize everything to EIP-55 or pure lowercase before hashing and keep the convention consistent across the full pipeline.
Reentrancy on claim
If you are airdropping a token that calls back into the caller (ERC-777, custom hooks, NFTs with onERC721Received), and you mark claimed after the transfer, a reentrancy attack can drain the contract. The example above uses ReentrancyGuard and writes to claimed before the transfer. Do both. One is belt, the other is suspenders.
Wrong root deployed
I have personally deployed the wrong root once — rebuilt the tree from a newer CSV, copied the old root from a Slack message. Every claim failed, I had to redeploy, and the first contract still holds 250k tokens, stranded forever. Verify the root on a public block explorer immediately after deploy, spot-check three claims from three known addresses before you publish the claim UI, and keep the source-of-truth CSV in a commit on your repo.
Multi-proofs and when to use them
A multi-proof verifies several leaves in a single call, sharing any sibling nodes they have in common. If a user is claiming three entries from the tree, a multi-proof is smaller and cheaper than three independent proofs that each walk their own path.
// OpenZeppelin multi-proof signature
function multiProofVerify(
bytes32[] memory proof,
bool[] memory proofFlags,
bytes32 root,
bytes32[] memory leaves
) internal pure returns (bool);

The proofFlags array tells the verifier at each step whether to consume the next hash from the proof array or pair up two already-computed hashes. It is cleverer than a single proof, and the gas savings scale with how many leaves share ancestors.
Use multi-proofs when:
- A single caller legitimately owns many leaves (e.g. a DAO treasury claiming on behalf of multiple historical addresses).
- You are batching claims in a single transaction to save calldata — aggregators sometimes do this for users who have multiple small allocations across a tree.
- You want to prove that a specific subset exists, for example for a voting snapshot.
Skip multi-proofs when every claimer has exactly one leaf. The added code path is not worth the complexity for a single-claim-per-address airdrop, and the individual proof is already cheap.
A realistic example: 5,000-address airdrop
Putting it all together, here is the flow I ran for a mid-sized airdrop last year. Start to finish took an afternoon, and the contract has been live without incident for 14 months.
Step 1: prepare the CSV
Export the allocation from whatever tool produced it (snapshot query, internal spreadsheet, contributor ledger). Two columns, no header:
0xAbC1230000000000000000000000000000000001,1000
0xDeF4560000000000000000000000000000000002,500
0x1234560000000000000000000000000000000003,2500
0x4567890000000000000000000000000000000004,1000
...
De-duplicate (sort | uniq on the address column). Sum the amount column and cross-check against your total supply. Few things are worse than deploying with an allocation that exceeds the tokens you are minting into the contract.
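Both pre-flight checks fit in a short script. A sketch with three hypothetical inline rows standing in for the exported file:

```javascript
// Pre-flight on the allocation CSV: catch duplicate addresses
// (including casing variants) and sum the total for a supply check.
// The inline rows are hypothetical example data.
const csv = `0xAbC1230000000000000000000000000000000001,1000
0xDeF4560000000000000000000000000000000002,500
0xabc1230000000000000000000000000000000001,1000`;

const seen = new Map();
const duplicates = [];
for (const line of csv.trim().split("\n")) {
  const [addr, amountStr] = line.split(",");
  const key = addr.toLowerCase(); // casing variants are the same address
  if (seen.has(key)) duplicates.push(addr);
  else seen.set(key, BigInt(amountStr)); // BigInt: token amounts overflow Number
}
const total = [...seen.values()].reduce((a, b) => a + b, 0n);
console.log(duplicates.length, total); // 1 1500n
```

Note that plain sort | uniq misses casing variants of the same address, which is why the lookup key is lowercased here.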
Step 2: checksum the addresses
Run the address column through ETH Address Checksum to produce EIP-55 mixed-case versions. This catches typos (a bad checksum reveals a bit flip), and it normalizes the input so two representations of the same address do not produce different leaves later.
Step 3: build the tree and pull the root
Paste the cleaned CSV into the Merkle Tree Generator. You get back the 32-byte root and a JSON bundle of per-address proofs. Save the root, the JSON, and the original CSV in the same commit so the tree is reproducible later if someone disputes their allocation.
Step 4: deploy the contract
Take the MerkleAirdrop contract from earlier, pass the root and token address to the constructor, and deploy. Mint or transfer the total allocation (check the CSV sum) into the contract as a second transaction. Verify the source on the block explorer so claimers can audit the verify logic themselves.
Step 5: publish the claim UI
Host proofs.json on IPFS, GitHub Pages, or Cloudflare R2 — anywhere static files are cheap. The UI is just a lookup: user connects wallet, you look up proofs[address], you call claim(amount, proof) on the contract. If the address is not in the JSON, show a "not eligible" message and stop. There is no backend, no database, no API keys. The entire claim site fits on a static file host.
// Minimal claim UI logic (ethers v6)
import proofs from "./proofs.json";
async function claim(signer) {
  // NOTE: keys in proofs.json must be lowercased too, or this lookup misses
  const addr = (await signer.getAddress()).toLowerCase();
  const entry = proofs[addr];
if (!entry) {
alert("Address not eligible");
return;
}
const contract = new ethers.Contract(
AIRDROP_ADDRESS,
["function claim(uint256, bytes32[]) external"],
signer
);
const tx = await contract.claim(entry.amount, entry.proof);
await tx.wait();
}

Step 6: announce and monitor
Watch the Claimed events for the first few hours. If claims are reverting, you have a leaf hashing bug, not a proof bug: fix the off-chain builder, rebuild the tree, and redeploy. Unless you added a rescue function, the tokens in the first contract are stranded, so fund the new deployment separately. If claims succeed, you are done. The contract runs itself from here.
Wrapping up
Merkle trees are the cleanest example in crypto of a small cryptographic primitive solving a practical problem. One 32-byte root stands in for an arbitrarily large allowlist. Proofs scale logarithmically. Verification is cheap enough that users barely notice the gas cost. No backend, no database, no centralized claim API — just a static JSON file and an immutable contract.
When you are ready to run your own airdrop, the two pieces you need are the tree builder and the address validator. BeautiCode's Merkle Tree Generator handles the tree and produces OpenZeppelin-compatible proofs, and the ETH Address Checksum tool keeps your input clean so the on-chain verifier agrees with your off-chain hashes. Both run entirely in the browser — paste your CSV, copy the root, deploy. The rest is just Solidity you have already written.
The first Merkle airdrop I ran taught me that the hard part is never the cryptography — it is the off-chain pipeline. Get the CSV right, get the encoding right, test one leaf end-to-end before you deploy, and the tree takes care of itself.
Related Tools
Merkle Tree Generator
Build a Merkle tree from an address / amount list and generate the root plus inclusion proofs for each leaf.
Ethereum Address Checksum
Convert Ethereum addresses to EIP-55 checksum format and validate address integrity.
Token Decimals Calculator
Convert between token raw units and human amounts respecting each token's decimals (USDC 6, ETH 18).
ABI Decoder & Encoder
Decode and encode Solidity function call data against an ABI. Supports tuples, arrays, and custom types.