Ethereum Internals

Ethereum is the world computer that almost works. It is a replicated state machine with ~1 million validators executing the same transactions and reaching consensus on the result every 12 seconds. Underneath the user-visible "send some ETH" or "swap on Uniswap" sits a stack of carefully chosen primitives: an account model that's neither pure key-value nor pure UTXO, a Merkle Patricia Trie that turns global state into a cryptographic commitment, the EVM (a 256-bit stack machine with metered execution), an EIP-1559 fee market that burns base fees instead of paying them to miners, and — since The Merge in 2022 — a clean split between the execution layer (what the EVM runs) and the consensus layer (Casper FFG plus LMD GHOST proof-of-stake).

The protocol is governed by Ethereum Improvement Proposals (EIPs). Hard forks ship roughly every 6-12 months: London (1559), Paris (Merge), Shanghai (withdrawals), Dencun (4844 blobs), Pectra (account abstraction primitives).

Ethereum Architecture Overview

Key Numbers

Slot	12 s
Epoch	32 slots = 6.4 min
Validator stake	32 ETH
Active validators	~1.0M
Block gas limit	~30M gas
Finality	2 epochs (~12.8 min)
EVM word width	256 bits

Why Ethereum Exists

Bitcoin Was Just Money

Bitcoin's scripting language is intentionally restricted — no loops, no persistent state, no general computation. Ethereum's bet was that a Turing-complete scripting layer with metered execution opens up an entire app platform, not just a payment rail. Vitalik Buterin's whitepaper called it "a blockchain with a built-in programming language."

Replicated State Machine

Every full node executes every transaction, in order, against the same starting state, and arrives at the same ending state. That's the whole architecture. Consensus picks which sequence of transactions is canonical; execution applies them deterministically. The "world computer" framing is literal.

Trustless Programmability

A Solidity contract on Ethereum runs the same way for every caller, with no operator that can change the rules mid-game. That property — the ability to ship a program that nobody can rug — is what enabled DeFi, NFTs, governance tokens, and (more controversially) speculative tokens at scale.

The Account Model: EOA vs Contract

Ethereum has two account types, both identified by a 20-byte address:

Externally Owned Account (EOA) — controlled by a private key (secp256k1). Has a balance and a nonce. Cannot run code; can only sign and send transactions. The address is the last 20 bytes of keccak256(pubkey).
Contract account — controlled by code. Has a balance, a nonce, deployed bytecode, and a 2^256-slot storage trie. Cannot initiate transactions; can only respond to calls. The address at deploy time is keccak256(rlp([sender, nonce]))[12:], or with CREATE2, deterministic from the deployer + salt + bytecode hash.

An account state is a 4-tuple: (nonce, balance, storageRoot, codeHash). For an EOA, storageRoot is the empty-trie root and codeHash is the keccak256(""). For a contract, both are populated. The collection of all accounts forms the state trie, and its root hash is committed in every block header.

A transaction from an EOA carries: nonce (must equal the sender's current nonce, which enforces ordering and prevents the same signed transaction from being replayed), maxFeePerGas and maxPriorityFeePerGas (post-EIP-1559), gasLimit, to, value, data, plus an ECDSA signature (v, r, s). Leaving to empty makes the transaction a contract creation (the zero address 0x0 is an ordinary recipient, not a creation marker); the data field is then the init code, whose return value becomes the deployed bytecode.
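
The nonce rule can be illustrated with a minimal sketch. The Account class, the apply_transfer helper, and the flat fee argument are hypothetical names chosen for illustration; signature checks and gas metering are deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class Account:
    nonce: int = 0
    balance: int = 0  # wei


def apply_transfer(state: dict, sender: str, to: str,
                   value: int, tx_nonce: int, fee: int) -> None:
    """Apply a plain ETH transfer, or raise ValueError if it is invalid.

    Simplified: 'fee' is a flat amount standing in for gas accounting.
    """
    src = state.setdefault(sender, Account())
    if tx_nonce != src.nonce:
        raise ValueError(f"bad nonce: expected {src.nonce}, got {tx_nonce}")
    if src.balance < value + fee:
        raise ValueError("insufficient balance for value + fee")
    src.balance -= value + fee
    src.nonce += 1  # a replayed copy of this transaction now fails the nonce check
    state.setdefault(to, Account()).balance += value


state = {"alice": Account(nonce=0, balance=10_000)}
apply_transfer(state, "alice", "bob", value=1_000, tx_nonce=0, fee=21)
```

Submitting the same transaction again with tx_nonce=0 raises ValueError, because the sender's nonce has already advanced to 1.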

The Merkle Patricia Trie

Ethereum's state isn't stored as a flat key-value table; it's a Merkle Patricia Trie (MPT) — a radix trie where every node is hashed and every parent commits to its children's hashes. The root hash uniquely identifies the entire global state. A single contract storage slot's value can be verified against that root with a logarithmic-size proof.

An MPT mixes three node types to balance depth and branching:

Branch node — 17 entries. Indexed 0-15 (one per nibble of the path) plus an optional value at index 16 for a key that terminates exactly here.
Extension node — a (shared-path-prefix, child-hash) pair. Compresses a long single-child chain into one node.
Leaf node — a (remaining-path, value) pair at the end of a path.

Keys are split into nibbles (4-bit chunks) before insertion, giving 16-way branching. Ethereum uses the MPT for four logical tries, each with a root committed in the block header or in an account:

State trie — keyed by keccak256(address), value is the RLP-encoded account state.
Storage trie — one per contract, keyed by keccak256(slot), value is the RLP-encoded slot value.
Transaction trie — keyed by transaction index in the block.
Receipt trie — keyed by transaction index, value is the receipt (status, logs, gas used).
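
The nibble split, and the way leaf and extension nodes store a partial path, can be sketched as follows. The function names are illustrative. hex_prefix follows the compact "hex-prefix" path encoding used by the MPT, where the first nibble flags the node type and whether the path length is odd; hashing and RLP are omitted.

```python
def to_nibbles(key: bytes) -> list[int]:
    """Split each byte into its high and low 4-bit halves."""
    out = []
    for b in key:
        out.append(b >> 4)
        out.append(b & 0x0F)
    return out


def hex_prefix(nibbles: list[int], is_leaf: bool) -> bytes:
    """Pack a nibble path into bytes with a leading flag nibble.

    Flag bit 1 (value 2) marks a leaf; flag bit 0 (value 1) marks an odd-length path.
    """
    flag = 2 if is_leaf else 0
    if len(nibbles) % 2 == 1:
        first = ((flag | 1) << 4) | nibbles[0]  # odd: first path nibble shares the flag byte
        rest = nibbles[1:]
    else:
        first = flag << 4  # even: flag byte is padded with a zero nibble
        rest = nibbles
    body = bytes((rest[i] << 4) | rest[i + 1] for i in range(0, len(rest), 2))
    return bytes([first]) + body


print(to_nibbles(b"\xab\xcd"))
print(hex_prefix([1, 2, 3, 4, 5], is_leaf=False).hex())
```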

The MPT is widely acknowledged as one of Ethereum's worst design choices. Updates are I/O-amplified: changing one storage slot rewrites the leaf, every ancestor up to the storage root, the contract's account state in the state trie, and every state-trie ancestor up to the global root. A single tx that touches 10 contracts triggers ~50 MPT writes. The replacement, Verkle Tries (using vector commitments with much wider branching), is on the medium-term roadmap.

The EVM: Stack Machine, Gas, Opcodes

The Ethereum Virtual Machine is a 256-bit stack machine with three memory regions:

Stack — at most 1024 256-bit words. Almost all opcodes consume and produce stack values. The stack is per-call-frame; each CALL creates a fresh stack.
Memory — a byte-addressable, zero-initialized array, also per-call-frame. Grows on demand; the cost is close to linear for small sizes and has a quadratic term that dominates for large ones, which discourages gigantic allocations.
Storage — persistent 256-bit-key, 256-bit-value mapping per contract. Survives across calls and transactions. Writes are by far the most expensive operation in the EVM.

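Every opcode has a gas cost. Cheap ones (ADD, PUSH) cost 3 gas. Memory costs 3n + n²/512 gas in total for n 32-byte words, and each expansion is charged the difference from the previous size. SLOAD from cold storage is 2100 gas; warm SLOAD is 100 gas (EIP-2929 introduced the cold/warm distinction). SSTORE ranges from 100 (rewriting the same value) to 22100 (zero → non-zero on a cold slot). Clearing a slot earns a refund of 4800 gas since EIP-3529 (up to 19900 when a slot that was set earlier in the same transaction is reset to zero), and total refunds are capped at 20% of gas used. CALL costs 100 to a warm address or 2600 to a cold one, plus 9000 for a non-zero value transfer, plus the gas forwarded to the callee. CREATE2 is 32000 plus the cost of hashing the init code, executing the constructor, and storing the deployed code (200 gas per byte).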
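
As a worked example of the memory pricing, here is a small sketch of the total-cost formula of 3 gas per 32-byte word plus words²/512, with expansion charged as the difference between the new and old totals. The function names are illustrative.

```python
def memory_cost(size_bytes: int) -> int:
    """Total gas attributable to a memory of the given size."""
    words = (size_bytes + 31) // 32  # memory is priced in 32-byte words, rounded up
    return 3 * words + words * words // 512


def expansion_cost(old_bytes: int, new_bytes: int) -> int:
    """Gas charged when an opcode grows memory from old_bytes to new_bytes."""
    return max(0, memory_cost(new_bytes) - memory_cost(old_bytes))


for size in (32, 1024, 32 * 1024, 1024 * 1024):
    print(size, memory_cost(size))
```

At 1 KiB the quadratic term contributes only 2 of 98 gas; at 1 MiB it contributes about 2.1M of roughly 2.2M gas, which is why very large allocations are impractical.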

Gas serves three purposes simultaneously: it bounds execution time (no infinite loops), it prices network resources (storage is far more expensive than CPU), and it creates the fee market. A transaction provides a gasLimit; if execution exceeds it, everything reverts but the gas is still charged. If it succeeds with leftover gas, the remainder is refunded.

A precompile is a contract at a fixed low address implemented natively in the client for cryptography that would be prohibitively expensive in EVM bytecode: ECRECOVER, SHA256, RIPEMD-160, identity, modexp, BN254 curve ops, the BLAKE2 compression function, and KZG point evaluation at 0x01-0x0a, plus the BLS12-381 operations that Pectra added at 0x0b-0x11. Each has a hand-tuned gas formula; getting it wrong is consensus-critical.

EIP-1559: Base Fee, Priority Fee, Burn

Pre-1559, transactions had a single gasPrice field, and miners chose transactions purely by descending price. Block space was a first-price auction, which led to overbidding, gas wars, and unpredictable fees.

EIP-1559 (London, 2021) replaced that with a two-component fee:

Base fee — set by the protocol per block; it moves by up to ±12.5% per block depending on how far the previous block's gas used was above or below the 15M-gas target. Burned (deducted from the sender and credited to no one), not paid to the proposer.
Priority fee (tip) — extra ETH per gas the sender offers above the base fee, paid to the block proposer as a tip.
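
The base fee adjustment can be sketched directly from the EIP-1559 update rule. The constant names mirror the EIP; the function name and the sample numbers are illustrative.

```python
ELASTICITY_MULTIPLIER = 2            # gas limit is twice the gas target
BASE_FEE_MAX_CHANGE_DENOMINATOR = 8  # 1/8 = 12.5% maximum change per block


def next_base_fee(parent_base_fee: int, parent_gas_used: int, parent_gas_limit: int) -> int:
    """Compute the next block's base fee (in wei) from its parent block."""
    target = parent_gas_limit // ELASTICITY_MULTIPLIER
    if parent_gas_used == target:
        return parent_base_fee
    if parent_gas_used > target:
        delta = (parent_base_fee * (parent_gas_used - target)
                 // target // BASE_FEE_MAX_CHANGE_DENOMINATOR)
        return parent_base_fee + max(delta, 1)  # always rises by at least 1 wei
    delta = (parent_base_fee * (target - parent_gas_used)
             // target // BASE_FEE_MAX_CHANGE_DENOMINATOR)
    return parent_base_fee - delta


GWEI = 10**9
print(next_base_fee(100 * GWEI, 30_000_000, 30_000_000) / GWEI)  # completely full parent
print(next_base_fee(100 * GWEI, 0, 30_000_000) / GWEI)           # empty parent
```

A full parent block raises a 100 gwei base fee to 112.5 gwei, and an empty one lowers it to 87.5 gwei.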

Senders set maxFeePerGas (an upper bound, not a real bid) and maxPriorityFeePerGas (the most they are willing to tip). The price actually paid per gas is min(maxFeePerGas, baseFee + maxPriorityFeePerGas), so the tip shrinks when the base fee approaches the cap. If the base fee exceeds maxFeePerGas, the transaction sits in the mempool until the base fee drops or the tx is replaced.
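
A short sketch of how one transaction's fee splits into a burned part and a tip, following the min(...) rule above. The settle_fee helper and its return keys are hypothetical names; amounts are in arbitrary per-gas units.

```python
def settle_fee(base_fee: int, max_fee: int, max_priority_fee: int, gas_used: int) -> dict:
    """Split a transaction's fee into the burned base fee and the proposer's tip."""
    if max_fee < base_fee:
        raise ValueError("not includable: base fee exceeds maxFeePerGas")
    tip_per_gas = min(max_priority_fee, max_fee - base_fee)
    effective_price = base_fee + tip_per_gas  # equals min(max_fee, base_fee + max_priority_fee)
    return {
        "effective_gas_price": effective_price,
        "burned": base_fee * gas_used,
        "tip": tip_per_gas * gas_used,
    }


print(settle_fee(base_fee=20, max_fee=30, max_priority_fee=2, gas_used=21_000))
print(settle_fee(base_fee=29, max_fee=30, max_priority_fee=2, gas_used=21_000))
```

In the second call the cap leaves room for only 1 unit of tip per gas, so the proposer receives less than the sender was willing to pay.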

Three properties matter:

Predictability — base fee is observable and deterministic from the previous block. Wallets can quote a fee that's almost certain to land.
Burn — base fees are removed from circulation. At sustained high activity, ETH can become net deflationary. Roughly 4M ETH have been burned since 1559 launched.
No more gas wars — competitive bidding is now bounded; you outbid by tip, which is much smaller than the total fee.

Consensus: Casper FFG and LMD GHOST

Ethereum's proof-of-stake consensus is a hybrid of two algorithms operating on different timescales:

LMD GHOST (Latest Message Driven Greedy Heaviest Observed Subtree) is the fork-choice rule. Every slot, a randomly-selected validator (the proposer) builds and broadcasts a block. Every other validator in the slot's committee casts an attestation — a signed vote for the head of the chain they consider canonical. LMD GHOST picks the chain whose subtree has the most attestations, weighted by validator effective balance. Each validator's most recent attestation overwrites their previous one (the "latest message" part).
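
The fork-choice walk can be sketched in a few lines. This is a simplified illustration, not the consensus-spec implementation: the block names, the parents mapping (block to parent), and the tie-break by block name are assumptions, and details such as proposer boost and vote timing are ignored.

```python
def lmd_ghost_head(parents: dict, latest_votes: dict, balances: dict, root: str) -> str:
    """Walk from root, at each step descending into the child with the heaviest subtree."""
    children = {}
    for block, parent in parents.items():
        children.setdefault(parent, []).append(block)

    def is_ancestor(ancestor: str, block) -> bool:
        # True if 'ancestor' is 'block' itself or lies on its chain back to the root.
        while block is not None:
            if block == ancestor:
                return True
            block = parents.get(block)
        return False

    def weight(block: str) -> int:
        # Sum the balances of validators whose latest vote is in this block's subtree.
        return sum(balances[v] for v, voted in latest_votes.items()
                   if is_ancestor(block, voted))

    head = root
    while head in children:
        head = max(children[head], key=lambda b: (weight(b), b))
    return head


parents = {"A": "G", "B": "G", "C": "A", "D": "B"}  # G is the root
balances = {"v1": 32, "v2": 32, "v3": 32}
votes = {"v1": "C", "v2": "B", "v3": "D"}
print(lmd_ghost_head(parents, votes, balances, "G"))
```

Because latest_votes holds one entry per validator, a newer attestation simply overwrites the older one, which is the "latest message" behavior: if v2 switches its vote to C, the head moves from D to C.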

Casper FFG (Friendly Finality Gadget) sits on top and provides finality. Each attestation also carries a checkpoint vote naming a source and a target epoch-boundary block (an epoch is 32 slots = 6.4 min). A checkpoint becomes justified once votes from at least 2/3 of stake name it as the target, and it becomes finalized (economically irreversible) when the next epoch's checkpoint is justified on top of it. Reverting a finalized checkpoint would require at least 1/3 of total stake to be slashed (~10M ETH at current staking levels).
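
A deliberately simplified sketch of the 2/3 rule: it treats each epoch's checkpoint as justified when at least 2/3 of stake voted for it, and as finalized when the following epoch's checkpoint is justified too. The function name and input shape are illustrative, and the real rules also track the source checkpoint of every vote.

```python
def finalized_epochs(target_votes: list, total_stake: int) -> list:
    """target_votes[e] is the stake that voted for epoch e's checkpoint as target."""
    justified = [3 * votes >= 2 * total_stake for votes in target_votes]
    # Simplification: assumes the votes justifying epoch e+1 use epoch e as their source.
    return [e for e in range(len(justified) - 1) if justified[e] and justified[e + 1]]


# Stake (out of 100) voting for the checkpoints of epochs 0..4.
print(finalized_epochs([70, 80, 50, 90, 75], total_stake=100))
```

Epoch 2 misses the threshold, so epoch 1 is justified but never finalized by its direct successor; epochs 0 and 3 are finalized.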

The two layers serve different needs. LMD GHOST resolves single-block forks fast (within a slot or two) but offers only probabilistic safety. Casper FFG provides economic finality, but a block typically takes about two epochs to reach it. Together they give "fast head, slow finality": wallets can show pending blocks immediately, while exchanges wait ~13 minutes for finality before crediting deposits.

Validator Economics and Slashing

A validator is created by depositing 32 ETH to the deposit contract (since Pectra's EIP-7251, a single validator's effective balance can grow as high as 2048 ETH). The validator then has three duties:

Propose — produce a block when randomly selected. With 7,200 slots per day shared among ~1M validators, that works out to roughly once every four to five months per validator. Earns the priority fees + MEV.
Attest — vote on the head every epoch. Get a small reward for correct, timely votes.
Sync committee — every 256 epochs (~27 hours), 512 validators are selected to sign sync messages used by light clients. Extra reward for participation.

Total APR sits around 3-5% for a solo validator, depending on total staked ETH. (Issuance per validator scales with the inverse square root of total stake, so yield declines as more validators join.)
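
To see how yield depends on total stake, here is a rough sketch based on the consensus-layer base reward, which pays each 1 ETH of effective balance BASE_REWARD_FACTOR (64) divided by the integer square root of total active balance, per epoch. It assumes perfect participation and counts issuance only, so tips and MEV come on top; max_issuance_apr is an illustrative name.

```python
from math import isqrt

GWEI_PER_ETH = 10**9
BASE_REWARD_FACTOR = 64
SECONDS_PER_EPOCH = 32 * 12
EPOCHS_PER_YEAR = 365.25 * 24 * 3600 / SECONDS_PER_EPOCH


def max_issuance_apr(total_staked_eth: int) -> float:
    """Approximate yearly issuance yield for a validator with ideal performance."""
    total_gwei = total_staked_eth * GWEI_PER_ETH
    # Reward per epoch, in gwei, for each 1 ETH of effective balance.
    reward_per_eth_per_epoch = GWEI_PER_ETH * BASE_REWARD_FACTOR // isqrt(total_gwei)
    return reward_per_eth_per_epoch * EPOCHS_PER_YEAR / GWEI_PER_ETH


for staked in (8_000_000, 16_000_000, 32_000_000):
    print(f"{staked:>12,} ETH staked -> {max_issuance_apr(staked):.2%}")
```

Quadrupling the stake halves the yield, which is the inverse-square-root shape rather than a linear decline.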

Slashing punishes equivocation — provably cheating the protocol. Two slashable offenses:

Double proposal — signing two different blocks at the same slot.
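Surround / double attestation — casting two different attestations for the same target epoch, or two attestations where one's source-to-target span surrounds the other's. Neither can happen by accident when a validator follows the protocol with a single signing key.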

A slashed validator immediately loses 1/32 of its balance (a 1 ETH initial penalty on 32 ETH; Pectra's EIP-7251 reduced this initial penalty to 1/4096), is forcibly exited over the next ~36 days, and loses an additional correlation penalty proportional to how many other validators were slashed in the same window. A coordinated attack that slashes all attackers can wipe out their entire stake. Inactivity leak is the other major economic disincentive: if finality stalls for more than 4 epochs, validators that are not attesting have their balances gradually drained until the validators still attesting again hold more than 2/3 of the remaining stake and finality resumes.

Execution Layer / Consensus Layer Split

Pre-Merge, a single client (such as geth or parity) did everything: networking, mempool, EVM execution, PoW mining, fork choice. Post-Merge, the two halves of an Ethereum node are separate processes communicating over the Engine API (an authenticated JSON-RPC interface):

Execution Layer (EL) clients — geth, nethermind, besu, erigon, reth. Handle the EVM, the state trie, the transaction pool, and the eth_* JSON-RPC API.
Consensus Layer (CL) clients — prysm, lighthouse, teku, nimbus, lodestar. Handle the beacon chain, fork choice, attestation aggregation, validator duties, and slashing detection.

The Engine API has three core methods: engine_newPayload (the CL hands a block to the EL for execution validation), engine_forkchoiceUpdated (the CL tells the EL which head and finalized block to track, optionally requesting payload build), and engine_getPayload (when this validator is the proposer, the CL fetches the latest built block from the EL to broadcast).
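
For a sense of what travels over the Engine API, here is a sketch that builds the JSON-RPC body of a fork-choice update. The V3 method suffix is assumed to be the Cancun-era version, the block hashes are placeholders, and the HTTP transport and JWT authentication that a real CL client needs are omitted; forkchoice_updated_request is a hypothetical helper name.

```python
import json


def forkchoice_updated_request(head: str, safe: str, finalized: str,
                               payload_attributes=None, request_id: int = 1) -> dict:
    """Build the request the CL sends to tell the EL which blocks to track."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "engine_forkchoiceUpdatedV3",  # assumption: Cancun-era method version
        "params": [
            {
                "headBlockHash": head,
                "safeBlockHash": safe,
                "finalizedBlockHash": finalized,
            },
            payload_attributes,  # None means "just update the head, do not build a block"
        ],
    }


placeholder = "0x" + "11" * 32  # not a real block hash
request = forkchoice_updated_request(placeholder, placeholder, placeholder)
print(json.dumps(request, indent=2))
```

When the validator is about to propose, the second parameter carries payload attributes instead of null, and the EL responds with a payload identifier that a later engine_getPayload call uses to fetch the built block.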

The split has two big benefits: client diversity (no single bug can take down the network if you mix-and-match EL+CL pairs) and cleaner concerns (consensus changes don't churn execution code and vice versa). It also enabled MEV-Boost: an out-of-protocol marketplace where third-party builders construct blocks (often optimized for MEV extraction), bid for the right to have their block proposed, and the validator just signs the winning bid. ~90% of mainnet blocks are MEV-Boost blocks today.

Blob Transactions: EIP-4844 / Proto-Danksharding

Layer-2 rollups (Optimism, Arbitrum, Base, zkSync) execute transactions off-chain and post compressed transaction data back to L1 for data availability. Pre-4844, that data sat in calldata, costing ~16 gas per byte and consuming a meaningful fraction of every block's gas. Rollup fees were dominated by L1 calldata costs.

EIP-4844 (Dencun, 2024) introduced a new transaction type: blob-carrying transactions. A blob is a 128 KiB chunk of opaque data attached to a transaction but not stored in the EVM state. The EVM never sees the blob bytes, only a versioned hash of each blob's KZG commitment (exposed by the BLOBHASH opcode), and contracts can check claims about blob contents through the POINT_EVALUATION precompile. The blobs themselves are propagated through the consensus layer's gossip network and pruned after ~18 days.

Three properties:

Cheap — blob gas has its own EIP-1559-style market separate from execution gas. Blob fees during normal load are ~10-100x cheaper than equivalent calldata. Rollup transaction fees dropped 90%+ overnight.
Verifiable but ephemeral — the rollup's contract on L1 stores the blob hash, can verify proofs against it, but the blob bytes themselves don't burden state.
Forward-compatible — KZG commitments are the building block for full Danksharding (the long-term plan to split blob storage across the validator set, scaling data availability ~10x further).
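
The separate blob fee market can be sketched from the pricing rule in EIP-4844, where the blob base fee grows exponentially with the running "excess blob gas". The constants below are the Dencun-era values and are an assumption here, since later forks retuned them; the helper names other than fake_exponential are illustrative.

```python
GAS_PER_BLOB = 2**17                       # 131072 blob gas, one per byte of a 128 KiB blob
TARGET_BLOB_GAS_PER_BLOCK = 3 * GAS_PER_BLOB
MIN_BASE_FEE_PER_BLOB_GAS = 1              # wei
BLOB_BASE_FEE_UPDATE_FRACTION = 3338477


def fake_exponential(factor: int, numerator: int, denominator: int) -> int:
    """Integer approximation of factor * e**(numerator / denominator) via a Taylor series."""
    i = 1
    output = 0
    accum = factor * denominator
    while accum > 0:
        output += accum
        accum = (accum * numerator) // (denominator * i)
        i += 1
    return output // denominator


def blob_base_fee(excess_blob_gas: int) -> int:
    return fake_exponential(MIN_BASE_FEE_PER_BLOB_GAS, excess_blob_gas,
                            BLOB_BASE_FEE_UPDATE_FRACTION)


def next_excess_blob_gas(parent_excess: int, parent_blob_gas_used: int) -> int:
    """Excess accumulates when blocks use more than the target and drains when they use less."""
    return max(0, parent_excess + parent_blob_gas_used - TARGET_BLOB_GAS_PER_BLOCK)


excess = 0
for _ in range(100):  # 100 consecutive blocks each carrying 6 blobs
    excess = next_excess_blob_gas(excess, 6 * GAS_PER_BLOB)
print(excess, blob_base_fee(excess))
```

With no excess the fee sits at the 1 wei floor, which is why blobs are so cheap under normal load; sustained above-target usage compounds the fee block after block.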

Light Clients and State Sync

A full node holds the entire state trie (~200 GB and growing); a light client wants consensus security without the disk burden. The post-Merge sync committee (512 validators rotating every 256 epochs, signing every slot's header) gives light clients a compact, BLS-aggregated proof of which beacon block is canonical. Combined with the MPT proof for any specific account or storage slot, a light client can verify eth_getBalance or eth_call results against state roots without executing every transaction itself.

For full-node sync, modern clients use snap sync: download the latest finalized state directly from peers as flat key-value pairs, verify each chunk against the published state root, then process only the most recent ~64 blocks of history. A fresh node can sync in hours instead of weeks, at the cost of not having full historical receipts (which can be filled in via separate "history" sync). EIP-4444 (eventually) makes pruning history default, shrinking the per-node footprint dramatically.

Ethereum vs Other L1s

	Ethereum	Solana	Bitcoin	Cosmos (per chain)
State model	Account-based, MPT	Account-based, flat	UTXO	Account-based, IAVL+ tree
Block time	12s	~400ms	~10 min	~6s (Tendermint)
Finality	~12.8 min (Casper FFG)	~12s (TowerBFT)	Probabilistic, ~60 min	Instant (Tendermint BFT)
Consensus	PoS · Casper FFG + LMD GHOST	PoS · TowerBFT + PoH	PoW · longest chain	PoS · Tendermint BFT
Execution model	EVM (256-bit stack)	BPF (Sealevel parallel)	Stack-based Script (limited)	App-specific (Cosmos SDK)
Validator set	~1M (permissionless, low-stake)	~1500 (permissionless, high-stake)	Open mining	~150 per chain (top stakeholders)
Throughput target	~15-30 TPS L1, scaled by L2	~3000-5000 TPS	~7 TPS	varies; ~1000 TPS Tendermint
Smart contracts	Solidity, Vyper, Yul	Rust, C, Solana SDK	Limited	CosmWasm (Rust), Go modules

Tradeoffs and Honest Weaknesses

L1 throughput is low — 15-30 TPS on a chain that targets every laptop being able to verify. The strategy is to push throughput onto rollups (L2) and use L1 purely as a settlement and data-availability layer.
State growth is unbounded — every contract that's ever been deployed is part of the state forever (no automatic pruning). The state trie is ~200 GB and grows roughly linearly. Verkle Tries + statelessness are the medium-term fix; in the meantime, large nodes need fast SSDs and lots of RAM.
MEV centralization risk — most blocks are now built by a handful of professional builders (Beaverbuild, Flashbots, Titan). Validators rarely build their own. If those builders coordinate or are censored, the chain's neutrality erodes. The "PBS" (Proposer-Builder Separation) roadmap aims to enshrine this in-protocol with stronger guarantees.
Validator economics favor pools — running a solo validator requires 32 ETH (~$80-100k), uptime, slashing risk, and ops effort. Most retail stakers go through Lido, Coinbase, RocketPool, or Kraken. Lido alone holds ~30% of staked ETH, raising centralization concerns.
Reorg risk on the head — single-slot reorgs happen routinely (a few times per day) when proposers see the previous block too late. Multi-slot reorgs are rare but possible during high latency. Wallets and exchanges wait for finality, not just inclusion, before considering transactions safe.
Quantum risk on signatures — secp256k1 ECDSA falls to a sufficiently large quantum computer. There is a long-term plan to migrate to STARK-based or hash-based signatures, but it's a ~5-10 year horizon and would require migrating every existing account.
Account abstraction is half-finished — ERC-4337 added smart-contract-account flows without changing the protocol, EIP-7702 lets EOAs delegate to code, but the seed-phrase UX problem isn't fully solved at protocol level.

Frequently Asked Questions

How is finality different from confirmation?

"Confirmation" in PoW Bitcoin means the number of blocks built on top of yours — probabilistic, never absolute, six confirmations is conventional. Finality in Ethereum PoS is binary and economic: once 2/3 of stake votes for two consecutive checkpoints, reverting requires slashing at least 1/3 of stake. After finality, your transaction is, by definition, in the canonical chain forever — barring a social-layer hard fork.

What happens to a transaction with insufficient gas?

If gasLimit is set lower than execution actually requires, the EVM raises an out-of-gas error at the moment it runs out, reverts all state changes from the transaction (including any partial work and storage writes), charges the entire gasLimit (base fee burned, tip paid to the proposer), and produces a failed receipt. The nonce still increments: the transaction is "included and failed," not invalid. Re-submitting requires a new transaction with the next nonce.

Why does eth_call cost no gas?

eth_call runs the EVM locally on a node, without including the transaction in a block. There's no consensus, no state mutation, no fee. The node still meters gas to bound execution time (up to a client-configured cap, typically in the tens of millions of gas), but no ETH changes hands. It's how dApps query view functions and simulate transactions before broadcasting.

What's the difference between value and gas in a transaction?

value is ETH transferred from the sender to the recipient as part of the call. gas is the maximum amount of computation the transaction may consume; the sender pays for the gas actually used at gasPrice (or, post-1559, at the effective price of base fee plus tip, which never exceeds maxFeePerGas). Value moves between parties; the gas fee is either destroyed (base fee burn) or paid to the proposer (priority fee). A transaction that sends 0 ETH (a pure contract call) still costs gas.

How does CREATE2 produce a deterministic contract address?

The address is keccak256(0xff ++ deployer ++ salt ++ keccak256(initCode))[12:], where ++ denotes byte concatenation. Because all four inputs are known before deployment, anyone can compute the future address. This is the foundation for "counterfactual" contract patterns: a wallet can sign a transaction sending ETH to an address that doesn't exist yet, knowing the contract can be deployed there later through the same deployer (typically a factory contract that anyone can call with the same salt and init code).

Why does my transaction sometimes get "replaced by another transaction"?

You sent two transactions with the same nonce. The mempool only keeps one per (sender, nonce): the one offering the higher fee. The other is dropped. This is also how cancellations and speed-ups work: send a new transaction with the same nonce as the pending one but higher fees. Most clients only accept the replacement if it raises the fees by a minimum margin (10% by default in geth). The new one replaces the old; the old never confirms.
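
The replacement rule can be sketched as a dictionary keyed by (sender, nonce). This is an illustration of node-local policy, not consensus: the Mempool class is hypothetical, and the 10% minimum bump on both fee fields is an assumption modeled on geth's default.

```python
class Mempool:
    def __init__(self, min_bump_percent: int = 10):
        self.pending = {}  # (sender, nonce) -> transaction dict
        self.min_bump_percent = min_bump_percent

    def add(self, sender: str, nonce: int, max_fee: int,
            max_priority_fee: int, payload: str) -> bool:
        """Insert a transaction; return False if it fails to outbid the one it would replace."""
        key = (sender, nonce)
        old = self.pending.get(key)
        if old is not None:
            factor = 100 + self.min_bump_percent
            # Both the fee cap and the tip must rise by at least the minimum bump.
            if (max_fee * 100 < old["max_fee"] * factor
                    or max_priority_fee * 100 < old["max_priority_fee"] * factor):
                return False
        self.pending[key] = {"max_fee": max_fee,
                             "max_priority_fee": max_priority_fee,
                             "payload": payload}
        return True


pool = Mempool()
print(pool.add("alice", 7, max_fee=100, max_priority_fee=2, payload="swap"))
print(pool.add("alice", 7, max_fee=105, max_priority_fee=2, payload="cancel"))  # bump too small
print(pool.add("alice", 7, max_fee=110, max_priority_fee=3, payload="cancel"))  # replaces "swap"
```

A "cancel" is usually just a zero-value transfer to yourself with the same nonce: once it is mined, the original can never be included because its nonce has been used.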

What is "MEV" and why is it visible to users?

MEV (Maximal Extractable Value) is the profit a block proposer can extract by reordering, inserting, or censoring transactions in a block. Classic forms: front-running (placing a buy in front of a known large buy on a DEX), back-running (placing a sell right after the move), sandwich attacks (both at once around the victim). It's visible because the user's price is worse than the market price they expected to get. Mitigations: private mempools (Flashbots Protect, MEV-Share), CoW Swap-style batched auctions, and intent-based architectures.

What's the difference between blob gas and execution gas?

Two completely separate fee markets sharing a block. Execution gas is metered per-opcode and pays for EVM compute and state access — the classic gasLimit/gasPrice. Blob gas is metered in 128 KiB units (one "blob") and pays for the data-availability cost of attaching opaque blobs to the block. Each has its own EIP-1559-style base fee that adjusts independently. A single transaction can pay both: the execution part for any contract calls, plus blob gas if it carries blobs (typical of L2 batch posters).