Knowledge (XXG)

Reliable multicast

A reliable multicast is any computer networking protocol that provides a reliable sequence of packets to multiple recipients simultaneously, making it suitable for applications such as multi-receiver file transfer.

Overview

Multicast is a network addressing method for the delivery of information to a group of destinations simultaneously. It uses the most efficient strategy to deliver the messages over each link of the network only once, creating copies only when the links to the multiple destinations split (typically network switches and routers). However, like the User Datagram Protocol, multicast does not guarantee the delivery of a message stream. Messages may be dropped, delivered multiple times, or delivered out of order. A reliable multicast protocol adds the ability for receivers to detect lost or out-of-order messages and take corrective action (similar in principle to TCP), resulting in a gap-free, in-order message stream.

Reliability

The exact meaning of reliability depends on the specific protocol instance. A minimal definition of reliable multicast is eventual delivery of all the data to all the group members, without enforcing any particular delivery order. However, not all reliable multicast protocols ensure this level of reliability; many of them trade reliability for efficiency, in different ways. For example, while TCP makes the sender responsible for transmission reliability, multicast NAK-based protocols shift the responsibility to receivers: the sender never knows for sure that all the receivers have in fact received all the data. RFC 2887 explores the design space for bulk data transfer, with a brief discussion of the various issues and some hints at the possible different meanings of reliable.

Reliable Group Data Delivery

Reliable Group Data Delivery (RGDD) is a form of multicasting where an object is to be moved from a single source to a fixed set of receivers known before transmission begins. A variety of applications may need such delivery: Hadoop Distributed File System (HDFS) replicates every chunk of data two additional times to specific servers; VM replication to multiple servers may be required to scale out applications; and data replication to multiple servers may be necessary for load balancing, allowing multiple servers to serve the same data from their local cached copies. Such delivery is frequent within datacenters because of the plethora of servers communicating while running highly distributed applications.

RGDD may also occur across datacenters and is sometimes referred to as inter-datacenter Point to Multipoint (P2MP) Transfers. Such transfers deliver huge volumes of data from one datacenter to multiple datacenters for various applications: search engines distribute search index updates periodically (e.g. every 24 hours), social media applications push new content to many cache locations across the world (e.g. YouTube and Facebook), and backup services make several geographically dispersed copies for increased fault tolerance. To maximize bandwidth utilization and reduce completion times of bulk transfers, a variety of techniques have been proposed for selection of multicast forwarding trees.

Virtual synchrony

Modern systems like the Spread Toolkit, Quicksilver, and Corosync can achieve data rates of 10,000 multicasts per second or more, and can scale to large networks with huge numbers of groups or processes.

Most distributed computing platforms support one or more of these models. For example, the widely supported object-oriented CORBA platforms all support transactions, and some CORBA products support transactional replication in the one-copy-serializability model. The "CORBA Fault Tolerant Objects standard" is based on the virtual synchrony model. Virtual synchrony was also used in developing the New York Stock Exchange fault-tolerance architecture, the French Air Traffic Control System, the US Navy AEGIS system, IBM's Business Process replication architecture for WebSphere, and Microsoft's Windows Clustering architecture for Windows Longhorn enterprise servers.

Systems that support virtual synchrony

Virtual synchrony was first supported by Cornell University, in a system called the "Isis Toolkit". Cornell's most current version, Vsync, was released in 2013 under the name Isis2 (the name was changed from Isis2 to Vsync in 2015 in the wake of a terrorist attack in Paris by an extremist organization called ISIS), with periodic updates and revisions since that time. The most current stable release is V2.2.2020, released on November 14, 2015; the V2.2.2048 release is currently available in Beta form. Vsync aims at the massive data centers that support cloud computing.

Other such systems include the Horus system, the Transis system, the Totem system, an IBM system called Phoenix, a distributed security key management system called Rampart, the "Ensemble system", the Quicksilver system, "The OpenAIS project", its derivative the Corosync Cluster Engine, and a number of products (including the IBM and Microsoft ones mentioned earlier).

Other existing or proposed protocols

Data Distribution Service
Pragmatic General Multicast (PGM)
QuickSilver Scalable Multicast
Scalable Reliable Multicast
SMART Multicast

Library support

JGroups (Java API)
Spread: C/C++ API, Java API
RMF (C# API)
hmbdc: open source (headers only) C++ middleware; ultra-low latency/high throughput, scalable and reliable inter-thread, IPC and network messaging

References

1. Floyd, S.; Jacobson, V.; Liu, C.-G.; McCanne, S.; Zhang, L. (December 1997). "A reliable multicast framework for light-weight sessions and application level framing". IEEE/ACM Transactions on Networking. 5 (6): 784–803. doi:10.1109/90.650139. S2CID 221634489.
2. Diot, C.; Dabbous, W.; Crowcroft, J. (April 1997). "Multipoint communication: A survey of protocols, functions, and mechanisms" (PDF). IEEE Journal on Selected Areas in Communications. 15 (3): 277–290. doi:10.1109/49.564128.
3. C. Guo; et al. (November 1, 2012). "Datacast: A Scalable and Efficient Reliable Group Data Delivery Service For Data Centers". ACM. Retrieved July 26, 2017.
4. T. Zhu; et al. (Oct 18, 2016). "MCTCP: Congestion-aware and robust multicast TCP in Software-Defined networks". 2016 IEEE/ACM 24th International Symposium on Quality of Service (IWQoS). IEEE. pp. 1–10. doi:10.1109/IWQoS.2016.7590433. ISBN 978-1-5090-2634-0. S2CID 28159768.
5. M. Noormohammadpour; et al. (July 10, 2017). "DCCast: Efficient Point to Multipoint Transfers Across Datacenters". USENIX. Retrieved July 26, 2017.
6. M. Noormohammadpour; et al. (2018). "QuickCast: Fast and Efficient Inter-Datacenter Transfers using Forwarding Tree Cohorts". Retrieved January 23, 2018.
7. "Catalog of CORBA/IIOP Specifications". 2004-10-09. Archived from the original on 2004-10-09. Retrieved 2024-09-19.
8. K. P. Birman (July 1999). "A Review of Experiences with Reliable Multicast". Software: Practice and Experience. 29 (9): 741–774. doi:10.1002/(SICI)1097-024X(19990725)29:9<741::AID-SPE259>3.0.CO;2-I. hdl:1813/7380.
9. "Isis Toolkit".
10. "Vsync Cloud Computing Library".
11. "Horus system".
12. "Ensemble system".
13. "The OpenAIS project".

Further reading

Reliable Distributed Systems: Technologies, Web Services and Applications. K.P. Birman. Springer Verlag (1997). Textbook, covers a broad spectrum of distributed computing concepts, including virtual synchrony.

Distributed Systems: Principles and Paradigms (2nd Edition). Andrew S. Tanenbaum, Maarten van Steen (2002). Textbook, covers a broad spectrum of distributed computing concepts, including virtual synchrony.

"The process group approach to reliable distributed computing". K.P. Birman, Communications of the ACM 36:12 (Dec. 1993). Written for non-experts.

"Group communication specifications: a comprehensive study". Gregory V. Chockler, Idit Keidar, Roman Vitenberg. ACM Computing Surveys 33:4 (2001). Introduces a mathematical formalism for these kinds of models, then uses it to compare their expressive power and their failure detection assumptions.

"The part-time parliament". Leslie Lamport. ACM Transactions on Computer Systems (TOCS), 16:2 (1998). Introduces the Paxos implementation of replicated state machines.

"Exploiting virtual synchrony in distributed systems". K.P. Birman and T. Joseph. Proceedings of the 11th ACM Symposium on Operating Systems Principles (SOSP), Austin, Texas, Nov. 1987. Earliest use of the term, but probably not the best exposition of the topic.
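
Illustrative sketch: receiver-side loss detection

The receiver-driven recovery described under Overview and Reliability (detecting lost, duplicated, or out-of-order messages and asking for retransmission with a NAK) can be sketched in a few lines. This is a minimal illustration and not the mechanism of any protocol named in this article. The class name NakReceiver, its methods, and the assumption that packets carry consecutive sequence numbers starting at 0 are all hypothetical choices made for the example.

```python
from typing import Dict, List, Tuple


class NakReceiver:
    """Receiver-side state for a hypothetical NAK-based reliable multicast stream.

    Assumes packets carry consecutive sequence numbers starting at 0.
    Out-of-order packets are buffered, duplicates are dropped, and missing
    sequence numbers are reported so the caller can send a NAK for them.
    """

    def __init__(self) -> None:
        self.next_seq = 0                    # next sequence number to deliver
        self.pending: Dict[int, bytes] = {}  # packets that arrived ahead of a gap

    def on_packet(self, seq: int, payload: bytes) -> Tuple[List[bytes], List[int]]:
        """Process one packet.

        Returns (payloads now deliverable in order, sequence numbers to NAK).
        """
        if seq < self.next_seq or seq in self.pending:
            return [], self.missing()        # duplicate: ignore the payload
        self.pending[seq] = payload
        delivered: List[bytes] = []
        # Deliver the longest gap-free run starting at next_seq.
        while self.next_seq in self.pending:
            delivered.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return delivered, self.missing()

    def missing(self) -> List[int]:
        """Sequence numbers below the highest buffered packet that have not arrived."""
        if not self.pending:
            return []
        return [s for s in range(self.next_seq, max(self.pending))
                if s not in self.pending]


if __name__ == "__main__":
    rx = NakReceiver()
    for seq, data in [(0, b"a"), (2, b"c"), (1, b"b")]:
        delivered, naks = rx.on_packet(seq, data)
        print(seq, delivered, naks)
```

In this sketch the sender never learns which receivers are complete; each receiver notices its own gaps and reports them, which is the shift of responsibility the Reliability section attributes to NAK-based protocols. A real protocol would also need timers, NAK suppression across receivers, and a bound on the buffer, none of which are modelled here.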


Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.