Zelarsoft

Not Your Grandma Gossip: Node Failure Detection at Scale

Several distributed protocols are predicated on a large number of nodes relying on each other to complete transactions; for example, highly durable persistence storage depends on data being replicated several times across nodes: while this improves the system’s resilience to failure, it impacts latency and throughput – crashed (or, possibly worse, slow-responding) nodes have a… Continue reading Not Your Grandma Gossip: Node Failure Detection at Scale