Sync Pipeline
Torsten uses a pipelined multi-peer architecture for block synchronization, separating header collection from block fetching for maximum throughput.
Architecture
```mermaid
flowchart LR
subgraph Primary Peer
CS[ChainSync<br/>Header Collection]
end
CS -->|headers| HQ[Header Queue]
subgraph Block Fetch Pool
BF1[Peer 1<br/>BlockFetch]
BF2[Peer 2<br/>BlockFetch]
BF3[Peer N<br/>BlockFetch]
end
HQ -->|range 1| BF1
HQ -->|range 2| BF2
HQ -->|range N| BF3
BF1 -->|blocks| BP[Block Processor]
BF2 -->|blocks| BP
BF3 -->|blocks| BP
BP --> CDB[(ChainDB)]
BP --> LS[Ledger State]
```
Pipeline Stages
1. Header Collection (ChainSync)
A single primary peer is selected for header collection. The node requests block headers sequentially via the node-to-node (N2N) ChainSync mini-protocol (protocol version 14+) and collects them into batches.
The ChainSync protocol involves:
- MsgFindIntersect -- Find a common point between the node and the peer
- MsgRequestNext -- Request the next header
- MsgRollForward -- Receive a new header
- MsgRollBackward -- Handle a chain reorganization
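To make the message flow concrete, here is a minimal sketch of how a header batch evolves as MsgRollForward and MsgRollBackward messages arrive. The types and function names are illustrative only, not Torsten's actual API:

```rust
// Illustrative types; a real header carries much more than a slot and hash.
#[derive(Debug, Clone, PartialEq)]
struct Header { slot: u64, hash: u64 }

enum ChainSyncMsg {
    RollForward(Header),
    RollBackward { slot: u64 },
}

/// Fold a stream of ChainSync messages into a header batch:
/// RollForward appends; RollBackward truncates back to the rollback point.
fn apply_msgs(batch: &mut Vec<Header>, msgs: Vec<ChainSyncMsg>) {
    for msg in msgs {
        match msg {
            ChainSyncMsg::RollForward(h) => batch.push(h),
            ChainSyncMsg::RollBackward { slot } => {
                batch.retain(|h| h.slot <= slot);
            }
        }
    }
}

fn main() {
    let mut batch = Vec::new();
    apply_msgs(&mut batch, vec![
        ChainSyncMsg::RollForward(Header { slot: 10, hash: 1 }),
        ChainSyncMsg::RollForward(Header { slot: 20, hash: 2 }),
        ChainSyncMsg::RollBackward { slot: 10 }, // drops the slot-20 header
        ChainSyncMsg::RollForward(Header { slot: 21, hash: 3 }),
    ]);
    assert_eq!(batch.len(), 2);
    assert_eq!(batch.last().unwrap().slot, 21);
}
```

MsgFindIntersect and MsgRequestNext are request messages sent by the node and so do not appear in this receive-side fold.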
2. Block Fetch Pool
Collected headers are distributed across multiple peers for parallel block retrieval. The block fetch pool supports up to 4 concurrent peers, each fetching a range of blocks.
The BlockFetch protocol involves:
- MsgRequestRange -- Request a range of blocks by header hash
- MsgBlock -- Receive a block
- MsgBatchDone -- Signal the end of a batch
Blocks are fetched in batches of 500 headers, split into sub-batches of 100. Each sub-batch is decoded on a spawn_blocking task so that CBOR decoding does not stall the async runtime.
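The batch carving can be sketched as follows. Constants and names here are assumptions matching the figures above (4 peers, 500-header batches, 100-header sub-batches), not Torsten's actual API:

```rust
// Assumed constants taken from the description above.
const MAX_FETCH_PEERS: usize = 4;
const SUB_BATCH: usize = 100;

/// Split a header batch into contiguous ranges, one per fetch peer
/// (capped at MAX_FETCH_PEERS), for parallel MsgRequestRange calls.
fn peer_ranges(headers: &[u64], peers: usize) -> Vec<&[u64]> {
    let peers = peers.clamp(1, MAX_FETCH_PEERS);
    let chunk = ((headers.len() + peers - 1) / peers).max(1); // ceiling division
    headers.chunks(chunk).collect()
}

fn main() {
    let batch: Vec<u64> = (0..500).collect();
    // Four peers each receive a 125-header range.
    let ranges = peer_ranges(&batch, 4);
    assert_eq!(ranges.len(), 4);
    assert!(ranges.iter().all(|r| r.len() == 125));
    // Decoding runs over 100-header sub-batches (each dispatched to a
    // spawn_blocking task in the real pipeline).
    assert_eq!(batch.chunks(SUB_BATCH).count(), 5);
}
```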
3. Block Processing
Fetched blocks are applied to the ledger state in order:
- Deserialization -- Raw CBOR bytes are decoded into Torsten's internal block type using pallas
- Ledger validation -- Each block is validated against the current ledger state (UTxO checks, fee validation, certificate processing)
- Storage -- Valid blocks are added to the ChainDB (volatile database first, flushed to immutable when k-deep)
- Epoch transitions -- At epoch boundaries, stake snapshots are rotated and rewards are calculated
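The ordering of these steps can be sketched with a minimal per-block apply function. Types and names are hypothetical, decoding via pallas is elided, and validation is reduced to a slot-ordering check:

```rust
#[derive(Debug, PartialEq)]
enum SyncError { Validation }

struct Block { slot: u64, epoch: u64 }
struct Ledger { epoch: u64 }

/// Apply one decoded block: validate, handle epoch transition, store.
fn apply_block(
    ledger: &mut Ledger,
    volatile: &mut Vec<u64>,
    block: &Block,
) -> Result<(), SyncError> {
    // Ledger validation (sketch): slots must strictly increase.
    if volatile.last().map_or(false, |&s| s >= block.slot) {
        return Err(SyncError::Validation);
    }
    // Epoch transition: snapshot rotation and reward calculation
    // would run here at the boundary (elided).
    if block.epoch > ledger.epoch {
        ledger.epoch = block.epoch;
    }
    // Storage: append to the volatile chain (flushed to immutable when k-deep).
    volatile.push(block.slot);
    Ok(())
}

fn main() {
    let mut ledger = Ledger { epoch: 0 };
    let mut volatile = Vec::new();
    assert!(apply_block(&mut ledger, &mut volatile, &Block { slot: 10, epoch: 0 }).is_ok());
    assert!(apply_block(&mut ledger, &mut volatile, &Block { slot: 20, epoch: 1 }).is_ok());
    assert_eq!(ledger.epoch, 1);
    // An out-of-order block is rejected by validation.
    assert!(apply_block(&mut ledger, &mut volatile, &Block { slot: 15, epoch: 1 }).is_err());
}
```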
Batched Lock Acquisition
To minimize lock contention, the sync loop acquires a single lock on both the ChainDB and ledger state for each batch of 500 blocks, rather than locking per-block.
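The pattern is simply hoisting the lock out of the per-block loop. A minimal sketch under assumed names (Torsten's real state is a ChainDB plus ledger, not a counter):

```rust
use std::sync::Mutex;

struct State { blocks_applied: u64 }

/// Apply a whole batch under a single lock acquisition, instead of
/// locking and unlocking once per block.
fn apply_batch(state: &Mutex<State>, batch: &[u64]) {
    let mut guard = state.lock().unwrap(); // one acquisition per batch
    for _block in batch {
        guard.blocks_applied += 1; // per-block ChainDB/ledger updates go here
    }
    // Lock released when `guard` drops at end of scope.
}

fn main() {
    let state = Mutex::new(State { blocks_applied: 0 });
    let batch: Vec<u64> = (0..500).collect(); // one 500-block batch
    apply_batch(&state, &batch);
    assert_eq!(state.lock().unwrap().blocks_applied, 500);
}
```

The trade-off is that readers (e.g. query endpoints) wait for up to a full batch rather than a single block.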
Progress Reporting
Progress is logged every 5 seconds, showing:
- Current slot and block number
- Epoch number
- UTxO count
- Sync percentage (based on slot vs. wall-clock time)
- Blocks-per-second throughput metric
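The two derived metrics reduce to simple ratios. This sketch models "slot vs. wall-clock time" as the current tip slot over the slot corresponding to now; the formulas are assumptions consistent with the description, not Torsten's exact code:

```rust
/// Sync percentage: progress of the local tip slot toward the slot
/// that corresponds to the current wall-clock time.
fn sync_percent(current_slot: u64, now_slot: u64) -> f64 {
    if now_slot == 0 { return 100.0; }
    100.0 * current_slot as f64 / now_slot as f64
}

/// Blocks-per-second throughput over one reporting interval.
fn blocks_per_sec(blocks: u64, interval_secs: f64) -> f64 {
    blocks as f64 / interval_secs
}

fn main() {
    assert_eq!(sync_percent(50, 200), 25.0);
    // 1500 blocks applied during a 5-second reporting interval:
    assert_eq!(blocks_per_sec(1500, 5.0), 300.0);
}
```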
Rollback Handling
When the ChainSync peer sends a MsgRollBackward message, the node:
- Identifies the rollback point (a slot/hash pair)
- Removes rolled-back blocks from the VolatileDB
- Reverts the ledger state to the rollback point
- Resumes header collection from the new tip
Only blocks in the VolatileDB (the last k=2160 blocks) can be rolled back. Blocks that have been flushed to the ImmutableDB are permanent.
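The VolatileDB side of the steps above can be sketched as a truncation guarded by the immutable boundary. The chain is modeled here as a slot list and the types are assumed, not Torsten's; the symmetric ledger-state revert is elided:

```rust
#[derive(Debug, PartialEq)]
enum RollbackError { Immutable }

/// Truncate the volatile chain back to `slot`, returning how many
/// blocks were removed. Anything older than the volatile window's
/// first entry has been flushed to the ImmutableDB and cannot roll back.
fn rollback(volatile: &mut Vec<u64>, slot: u64) -> Result<usize, RollbackError> {
    if volatile.first().map_or(true, |&oldest| slot < oldest) {
        return Err(RollbackError::Immutable);
    }
    let keep = volatile.iter().take_while(|&&s| s <= slot).count();
    let dropped = volatile.len() - keep;
    volatile.truncate(keep);
    Ok(dropped)
}

fn main() {
    let mut volatile = vec![100, 110, 120, 130];
    assert_eq!(rollback(&mut volatile, 110), Ok(2)); // removes 120 and 130
    assert_eq!(volatile, vec![100, 110]);
    // A rollback past the immutable boundary is refused.
    assert_eq!(rollback(&mut volatile, 50), Err(RollbackError::Immutable));
}
```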
Performance Characteristics
- Header collection is serial per peer due to the ChainSync protocol design (~300 ms network round-trip per header request)
- Block fetching is parallelized across multiple peers
- Block processing is batched (500 blocks per batch) with single-lock acquisition
- Throughput depends on network latency, peer count, and block sizes
The primary bottleneck during initial sync is the serial header-by-header ChainSync protocol. Multi-peer ChainSync (fetching headers from multiple peers simultaneously) is planned for future optimization.