
Understanding Merkle Trees: The Technology Behind Blockchain Verification

A deep dive into the cryptographic data structure invented by Ralph Merkle in 1979 that powers Bitcoin, Git, and enables Anchora to achieve 99.98% cost reduction in blockchain verification.

Introduction

Every time you use Git to commit code, verify a Bitcoin transaction, or download a file via BitTorrent, you're using a data structure invented in 1979 by computer scientist Ralph Merkle. This elegant invention - the Merkle tree (also called a hash tree) - is one of the most important building blocks of modern cryptography and blockchain technology.

In this article, we'll explore what Merkle trees are, how they work, and why they're fundamental to how Anchora achieves 99.98% cost reduction in blockchain anchoring while maintaining cryptographic security.

The Problem Merkle Trees Solve

Imagine you have 1,000 documents and you want to prove that each one hasn't been tampered with. The naive approach is to:

  1. Hash each document individually
  2. Store each hash on a blockchain
  3. Pay for 1,000 separate blockchain transactions

At roughly $10 per transaction (an illustrative Ethereum mainnet figure; actual fees vary with network congestion), that's $10,000 to verify 1,000 documents. This makes blockchain verification economically impractical for most applications.

Merkle trees solve this problem elegantly. By combining many hashes into a tree structure, you can anchor a whole batch of documents with a single blockchain transaction. For 1,000 documents, that reduces the cost to approximately $0.04 (using Polygon via Anchora).

Historical Note: Ralph Merkle invented hash trees in 1979 and patented them in 1982 (US Patent 4,309,569). The patent has long since expired, making Merkle trees freely available for everyone to use.

How Merkle Trees Work

A Merkle tree is a binary tree where:

  • Leaf nodes contain hashes of individual data blocks
  • Non-leaf nodes contain hashes of their children (combined)
  • The root (Merkle root) is a single hash that represents ALL data in the tree

Step-by-Step Construction

Let's build a Merkle tree for 4 documents:

Visual
Documents: [Doc1, Doc2, Doc3, Doc4]

Step 1: Hash each document (leaf nodes)
┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐
│   H1    │ │   H2    │ │   H3    │ │   H4    │
│ 0x7f3a..│ │ 0xb2c1..│ │ 0x9e4d..│ │ 0x1a2b..│
└─────────┘ └─────────┘ └─────────┘ └─────────┘
     │           │           │           │
     └─────┬─────┘           └─────┬─────┘
           │                       │
Step 2: Hash pairs together
     ┌─────────┐             ┌─────────┐
     │  H12    │             │  H34    │
     │ 0x5678..│             │ 0x9abc..│
     └─────────┘             └─────────┘
           │                       │
           └───────────┬───────────┘
                       │
Step 3: Hash to get Merkle Root
                 ┌─────────┐
                 │  ROOT   │
                 │ 0x9876..│
                 └─────────┘

Where ("+" denotes byte concatenation):
H1   = SHA256(Doc1)
H2   = SHA256(Doc2)
H3   = SHA256(Doc3)
H4   = SHA256(Doc4)
H12  = SHA256(H1 + H2)
H34  = SHA256(H3 + H4)
ROOT = SHA256(H12 + H34)

The key insight is that any change to any document changes the Merkle root. If someone modifies Doc2, then H2 changes, which changes H12, which changes ROOT. This is called tamper evidence.
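Here's a minimal sketch of that construction using Node's built-in crypto module. The `sha256` and `merkleRoot` helpers are illustrative names, not part of any library, and the sketch assumes a power-of-two number of documents, as in the 4-document example.

```javascript
const { createHash } = require('node:crypto');

// SHA-256 of a Buffer or string, returned as a Buffer.
const sha256 = (data) => createHash('sha256').update(data).digest();

// Compute the Merkle root of a list of documents.
// Assumes the document count is a power of two.
function merkleRoot(docs) {
  let level = docs.map((doc) => sha256(doc)); // leaf hashes H1..Hn
  while (level.length > 1) {
    const next = [];
    for (let i = 0; i < level.length; i += 2) {
      // Parent = SHA256(left + right), where + is byte concatenation
      next.push(sha256(Buffer.concat([level[i], level[i + 1]])));
    }
    level = next;
  }
  return level[0];
}

const original = merkleRoot(['Doc1', 'Doc2', 'Doc3', 'Doc4']);
const tampered = merkleRoot(['Doc1', 'Doc2 (edited)', 'Doc3', 'Doc4']);

console.log(original.toString('hex'));
console.log(original.equals(tampered)); // false: editing Doc2 changes the root
```

Real implementations also need a rule for batches whose size isn't a power of two, such as duplicating or promoting the last node.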

The Power of Merkle Proofs

Here's where Merkle trees become truly powerful. To prove that Doc1 is part of the tree, you don't need all documents - you only need:

  • H2 (sibling of H1)
  • H34 (sibling of H12)
  • The Merkle Root
Merkle Proof
To verify Doc1:

1. Compute H1 = SHA256(Doc1)
2. Compute H12 = SHA256(H1 + H2)     ← H2 is in proof
3. Compute ROOT = SHA256(H12 + H34)  ← H34 is in proof
4. Compare computed ROOT with stored ROOT
5. If match → Doc1 is authentic ✓

Proof for Doc1: [H2, H34]
Proof size: 2 hashes (O(log n))

For a tree with 256 documents, you need only 8 sibling hashes (log₂(256) = 8) to prove any single document, rather than the other 255 leaf hashes. This is incredibly efficient!
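To make the proof mechanics concrete, here's a hedged sketch that generates and checks a proof for the 4-document tree. It uses plain SHA-256 from Node's standard library and records on which side each sibling sits. The function names (`buildLevels`, `getProof`, `verifyProof`) are made up for this example, and the code assumes a power-of-two number of documents.

```javascript
const { createHash } = require('node:crypto');

const sha256 = (data) => createHash('sha256').update(data).digest();

// Build every level of the tree, from leaf hashes up to the root.
function buildLevels(docs) {
  const levels = [docs.map((doc) => sha256(doc))];
  while (levels[levels.length - 1].length > 1) {
    const prev = levels[levels.length - 1];
    const next = [];
    for (let i = 0; i < prev.length; i += 2) {
      next.push(sha256(Buffer.concat([prev[i], prev[i + 1]])));
    }
    levels.push(next);
  }
  return levels;
}

// Collect the sibling hash at each level, noting which side it sits on.
function getProof(levels, index) {
  const proof = [];
  for (let depth = 0; depth < levels.length - 1; depth++) {
    const siblingIndex = index ^ 1; // flip the last bit: 0<->1, 2<->3, ...
    proof.push({
      hash: levels[depth][siblingIndex],
      siblingOnLeft: siblingIndex < index,
    });
    index = Math.floor(index / 2); // move up to the parent
  }
  return proof;
}

// Recompute the root from one document and its proof.
function verifyProof(doc, proof, root) {
  let hash = sha256(doc);
  for (const { hash: sibling, siblingOnLeft } of proof) {
    hash = siblingOnLeft
      ? sha256(Buffer.concat([sibling, hash]))
      : sha256(Buffer.concat([hash, sibling]));
  }
  return hash.equals(root);
}

const levels = buildLevels(['Doc1', 'Doc2', 'Doc3', 'Doc4']);
const root = levels[levels.length - 1][0];
const proof = getProof(levels, 0); // proof for Doc1: [H2, H34]

console.log(proof.length);                              // 2
console.log(verifyProof('Doc1', proof, root));          // true
console.log(verifyProof('Doc1 (edited)', proof, root)); // false
```

The verifier never sees Doc2, Doc3, or Doc4, only the two sibling hashes and the root.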

Proof Size Comparison

Documents in Batch | Proof Size | Proof Hashes as % of Documents
-------------------+------------+-------------------------------
4                  | 2 hashes   | 50%
16                 | 4 hashes   | 25%
256                | 8 hashes   | 3.1%
1,024              | 10 hashes  | 0.98%
1,000,000          | 20 hashes  | 0.002%
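The proof sizes above follow from the tree depth, ceil(log₂(n)). Here's a small sketch that recomputes them; `proofLength` is an illustrative helper, and the 32-byte digest size applies to both SHA-256 and keccak256.

```javascript
// Number of sibling hashes in a proof for a batch of n leaves: ceil(log2(n)).
// Uses integer doubling rather than Math.log2 to avoid floating-point rounding.
function proofLength(n) {
  let depth = 0;
  let capacity = 1;
  while (capacity < n) {
    capacity *= 2;
    depth++;
  }
  return depth;
}

for (const n of [4, 16, 256, 1024, 1_000_000]) {
  const hashes = proofLength(n);
  const bytes = hashes * 32; // 32 bytes per digest
  console.log(`${n} leaves -> ${hashes} hashes (${bytes} bytes)`);
}
```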

Real-World Applications

1. Bitcoin and Blockchain

Every Bitcoin block contains thousands of transactions organized in a Merkle tree. The block header only stores the Merkle root (32 bytes), not all transactions. This allows Simplified Payment Verification (SPV) - lightweight clients can verify transactions without downloading the entire blockchain.
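Bitcoin's variant differs from the textbook tree in two details: every hash is a double SHA-256, and a level with an odd number of nodes pairs its last node with itself. Here's a rough sketch of that scheme. The function name and placeholder inputs are invented for illustration, and the inputs are treated as 32-byte hashes in internal byte order (block explorers display transaction IDs byte-reversed).

```javascript
const { createHash } = require('node:crypto');

const sha256 = (data) => createHash('sha256').update(data).digest();
// Bitcoin hashes twice: SHA256(SHA256(data)).
const sha256d = (data) => sha256(sha256(data));

// Merkle root over transaction hashes given as 32-byte Buffers.
function bitcoinStyleMerkleRoot(txHashes) {
  if (txHashes.length === 0) {
    throw new Error('a block needs at least one transaction');
  }
  let level = txHashes;
  while (level.length > 1) {
    // With an odd number of nodes, the last one is paired with itself.
    if (level.length % 2 === 1) {
      level = [...level, level[level.length - 1]];
    }
    const next = [];
    for (let i = 0; i < level.length; i += 2) {
      next.push(sha256d(Buffer.concat([level[i], level[i + 1]])));
    }
    level = next;
  }
  return level[0];
}

// Placeholder hashes for illustration, not real transactions.
const txs = ['tx-a', 'tx-b', 'tx-c'].map((t) => sha256d(t));
console.log(bitcoinStyleMerkleRoot(txs).toString('hex'));
```

An SPV client holding only block headers can then check a transaction with a proof of sibling hashes, just as in the 4-document example.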

2. Git Version Control

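Git stores files, directories, and commits as content-addressed objects (blobs, trees, and commits) that together form a Merkle DAG. Each tree object lists the hashes of its files and subdirectories, and each commit records the hash of its root tree, so a commit hash pins down the entire repository state at that point. This enables efficient comparison between versions and detection of any file changes.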
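To see content addressing at the lowest level, here's a small sketch of how Git derives the ID of a file's contents (a blob) in a SHA-1 repository: it hashes a short header followed by the bytes. `gitBlobId` is an illustrative helper, not a Git API.

```javascript
const { createHash } = require('node:crypto');

// Git's object ID for a blob (in SHA-1 repositories):
// SHA-1 over the header "blob <byte length>\0" followed by the content.
function gitBlobId(content) {
  const body = Buffer.from(content);
  const header = Buffer.from('blob ' + body.length + '\0');
  return createHash('sha1').update(Buffer.concat([header, body])).digest('hex');
}

console.log(gitBlobId(''));
// e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 (Git's well-known empty-blob ID)
```

Tree and commit objects are hashed the same way, with their own headers, and refer to their children by these IDs. That's what gives the repository its Merkle structure.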

3. IPFS and Content Addressing

The InterPlanetary File System (IPFS) uses Merkle DAGs (Directed Acyclic Graphs) to address content by its hash. Files are split into chunks, and the Merkle root becomes the file's permanent address.

4. Certificate Transparency

Google's Certificate Transparency logs use Merkle trees to create an append-only log of SSL/TLS certificates, making it very difficult to issue fraudulent certificates without detection.

How Anchora Uses Merkle Trees

Anchora uses Merkle tree batching as its core innovation to make blockchain verification economically viable:

Anchora's Batching Process
1. Collect records (up to 256 per batch)
   Records: [R1, R2, R3, ... R256]

2. Hash each record using SHA-256
   Hashes: [H1, H2, H3, ... H256]

3. Build Merkle tree using keccak256
   (keccak256 is Ethereum's hash function)

4. Anchor ONLY the Merkle root to Polygon blockchain
   Contract.anchorRoot(merkleRoot, recordCount)
   Cost: ~$0.001 per batch (regardless of batch size!)

5. Generate proof for each record
   R1.proof = [H2, H34, H5678, ...]

6. Store proofs with records
   Each record can be verified independently
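Steps 1 and 2 of the process above can be sketched with the standard library alone. This is an illustration, not Anchora's code: the `toBatches` and `hashRecord` helpers, the sample records, and the choice to serialize records as JSON before hashing are all assumptions made for the example.

```javascript
const { createHash } = require('node:crypto');

const BATCH_SIZE = 256; // batch size taken from the article

// Step 1: split records into consecutive batches of at most `size` items.
function toBatches(records, size = BATCH_SIZE) {
  const batches = [];
  for (let i = 0; i < records.length; i += size) {
    batches.push(records.slice(i, i + size));
  }
  return batches;
}

// Step 2: SHA-256 each record (JSON serialization is an assumption here).
const hashRecord = (record) =>
  createHash('sha256').update(JSON.stringify(record)).digest('hex');

// 1,000 hypothetical records
const records = Array.from({ length: 1000 }, (_, i) => ({ id: i + 1 }));
const batches = toBatches(records).map((batch) => batch.map(hashRecord));

console.log(batches.length);    // 4 batches
console.log(batches[3].length); // 232 records in the last batch
```

Each batch of hashes would then feed one Merkle tree and one anchoring transaction, as in steps 3 to 6.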

Cost Comparison

Approach                  | 1,000 Records | Per Record
--------------------------+---------------+-------------
Traditional (1 tx/record) | $10,000       | $10.00
Anchora (Merkle batching) | $0.04         | $0.00004
Savings                   | $9,999.96     | over 99.98%
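As a quick check, the savings can be re-derived from the two totals quoted in the comparison. The inputs are the article's figures, and real fees vary with network conditions.

```javascript
// Figures quoted in the article; real fees vary with network conditions.
const records = 1000;
const costPerTx = 10;      // USD, one transaction per record
const batchedTotal = 0.04; // USD, quoted total for 1,000 records

const traditionalTotal = records * costPerTx;
const savings = traditionalTotal - batchedTotal;
const savingsPct = (savings / traditionalTotal) * 100;

console.log(traditionalTotal);                    // 10000
console.log(savings.toFixed(2));                  // 9999.96
console.log((batchedTotal / records).toFixed(5)); // 0.00004
console.log(savingsPct.toFixed(4));               // 99.9996
```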

Implementation Deep Dive

Here's how Anchora implements Merkle trees using the merkletreejs library:

JavaScript
import { MerkleTree } from 'merkletreejs';
import keccak256 from 'keccak256';

// Step 1: Collect record hashes
const recordHashes = [
  'e3b0c44298fc1c149afbf4c8996fb924...',
  'd7a8fbb307d7809469ca9abcb0082e4f...',
  // ... up to 256 hashes
];

// Step 2: Create leaf nodes (keccak256 of each hash)
const leaves = recordHashes.map(h =>
  keccak256(Buffer.from(h, 'hex'))
);

// Step 3: Build Merkle tree
const tree = new MerkleTree(leaves, keccak256, {
  sortPairs: true  // Sort each pair before hashing, so proofs need no left/right flags
});

// Step 4: Get Merkle root (this goes to blockchain)
const merkleRoot = '0x' + tree.getRoot().toString('hex');
console.log(merkleRoot);
// "0x9876543210fedcba9876543210fedcba..."

// Step 5: Generate proof for any record
const leaf = keccak256(Buffer.from(recordHashes[0], 'hex'));
const proof = tree.getProof(leaf);

// Convert proof to hex strings
const proofHex = proof.map(p =>
  '0x' + p.data.toString('hex')
);
console.log(proofHex);
// ["0x1234...", "0x5678...", "0x9abc...", ...]

Verification Process

JavaScript
import keccak256 from 'keccak256';

// Verify a record hash against the Merkle root stored on-chain
function verifyMerkleProof(hash, proof, expectedRoot) {
  // The tree's leaves are keccak256(recordHash), so start from the leaf,
  // not from the raw record hash
  let computedHash = keccak256(Buffer.from(hash, 'hex'));

  for (const sibling of proof) {
    const siblingBuffer = Buffer.from(
      sibling.replace(/^0x/, ''),
      'hex'
    );

    // Sort the pair, matching the tree's sortPairs: true option
    const combined = Buffer.compare(computedHash, siblingBuffer) < 0
      ? Buffer.concat([computedHash, siblingBuffer])
      : Buffer.concat([siblingBuffer, computedHash]);

    computedHash = keccak256(combined);
  }

  // Hex digits may differ in case, so normalize before comparing
  const computedRoot = '0x' + computedHash.toString('hex');
  return computedRoot === expectedRoot.toLowerCase();
}

// Usage
const isValid = verifyMerkleProof(
  'e3b0c44298fc1c149afbf4c8996fb924...',  // Record hash
  ['0x1234...', '0x5678...'],                // Merkle proof
  '0x9876543210fedcba...'                    // Root from blockchain
);

console.log(isValid); // true when the proof leads to the anchored root
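If you want to try sorted-pair verification end to end without installing anything, here's a self-contained sketch. It swaps in SHA-256 from Node's standard library as a stand-in for keccak256, so its hashes won't match on-chain values. The helper names are invented for this example, and it assumes a power-of-two number of leaves.

```javascript
const { createHash } = require('node:crypto');

// Stand-in hash: SHA-256 from Node's standard library (the article uses keccak256).
const hash = (data) => createHash('sha256').update(data).digest();

// Hash two nodes in sorted byte order, so verification needs no left/right flags.
const hashPair = (a, b) =>
  hash(Buffer.compare(a, b) < 0 ? Buffer.concat([a, b]) : Buffer.concat([b, a]));

// Build all tree levels from the leaves up to the root.
function buildLevels(leaves) {
  const levels = [leaves];
  while (levels[levels.length - 1].length > 1) {
    const prev = levels[levels.length - 1];
    const next = [];
    for (let i = 0; i < prev.length; i += 2) {
      next.push(hashPair(prev[i], prev[i + 1]));
    }
    levels.push(next);
  }
  return levels;
}

// A proof is just the list of sibling hashes; no position info is needed.
function getProof(levels, index) {
  const proof = [];
  for (let depth = 0; depth < levels.length - 1; depth++) {
    proof.push(levels[depth][index ^ 1]);
    index = Math.floor(index / 2);
  }
  return proof;
}

// Fold the siblings into the leaf and compare with the root.
const verify = (leaf, proof, root) =>
  proof.reduce((acc, sibling) => hashPair(acc, sibling), leaf).equals(root);

const leaves = ['r1', 'r2', 'r3', 'r4'].map((r) => hash(r));
const levels = buildLevels(leaves);
const root = levels[levels.length - 1][0];
const proof = getProof(levels, 2);

console.log(verify(leaves[2], proof, root));      // true
console.log(verify(hash('forged'), proof, root)); // false
```

Sorting each pair trades a little structure for simpler proofs: the verifier doesn't need to know whether a sibling was on the left or the right.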

Security Properties

Merkle trees inherit their security from the underlying hash function (SHA-256 or keccak256). They provide:

Collision Resistance

It's computationally infeasible to find two different inputs that produce the same hash. With 2^256 possible outputs, even a brute-force (birthday) search for a collision needs on the order of 2^128 hash evaluations, far beyond practical reach.

Tamper Evidence

Any modification to any leaf changes the Merkle root, so data can't be altered undetected without breaking the underlying hash function.

Privacy Preserving

Merkle proofs reveal only the siblings needed for verification. Other data in the tree remains hidden.

Non-Repudiation

Once a Merkle root is anchored to a blockchain, the data's existence at that time is permanently proven.

Why 256 Records Per Batch?

Anchora uses a batch size of 256 records for several reasons:

  • Perfect Binary Tree: 256 = 2^8, creating a tree of depth 8 with exactly 8 hashes per proof
  • Proof Size: 8 hashes × 32 bytes = 256 bytes per proof (manageable)
  • Latency Balance: At 30-second batch intervals, this provides good throughput while maintaining reasonable anchoring times
  • Cost Optimization: Gas cost is nearly constant regardless of batch size, so larger batches = lower per-record cost

The Math: With a batch size of 256 and gas cost of ~52,000 gas (~$0.001), the per-record cost is $0.001 / 256 ≈ $0.0000039. That's 256x cheaper than anchoring each record in its own ~$0.001 transaction!
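The same arithmetic in a few lines of code, using the article's approximate cost figure (the constant names are just for this example):

```javascript
const BATCH_SIZE = 256;
const HASH_BYTES = 32;            // digest size of SHA-256 and keccak256
const COST_PER_BATCH_USD = 0.001; // approximate figure from the article

const proofHashes = Math.log2(BATCH_SIZE);   // 8, exact because 256 is a power of two
const proofBytes = proofHashes * HASH_BYTES; // 256
const costPerRecord = COST_PER_BATCH_USD / BATCH_SIZE;

console.log(proofHashes, proofBytes);  // 8 256
console.log(costPerRecord.toFixed(8)); // 0.00000391
```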

Comparison with Other Approaches

Approach                | Pros                           | Cons
------------------------+--------------------------------+-------------------------------
Individual Transactions | Simple, immediate              | Expensive ($10/record)
Hash Chaining           | Ordered, sequential            | Can't verify individual items
Merkle Trees            | Efficient, verifiable, private | Slight complexity
ZK-Rollups              | Maximum compression            | Complex, expensive to generate

Conclusion

Merkle trees are a fundamental cryptographic primitive that enables efficient, secure, and privacy-preserving data verification. Their O(log n) proof size makes them ideal for blockchain applications where storage and bandwidth are expensive.

At Anchora, we leverage Merkle trees to batch up to 256 records into a single blockchain transaction, achieving 99.98% cost reduction while maintaining the same cryptographic security guarantees. Each record gets its own unique proof that can be verified independently without revealing other records in the batch.

Whether you're building credential verification systems, supply chain tracking, or audit logs, Merkle trees (and by extension, Anchora) provide the most efficient path to blockchain-backed data integrity.

Ready to leverage Merkle tree batching?

Anchora handles all the complexity of Merkle tree construction, blockchain anchoring, and proof generation. Start verifying data at a fraction of traditional costs.
