Knowledge

Graph canonization

Source 📝

316:. Begin by labeling each vertex with the string 01. Iteratively for each non-leaf x remove the leading 0 and trailing 1 from x's label; then sort x's label along with the labels of all adjacent leaves in lexicographic order. Concatenate these sorted labels, add back a leading 0 and trailing 1, make this the new label of x, and delete the adjacent leaves. If there are two vertices remaining, concatenate their labels in lexicographic order. 354:
use canonicalization steps in their computation, which is essentially the canonicalization of the graph which represents the molecule. These identifiers are designed to provide a standard (and sometimes human-readable) way to encode molecular information and to facilitate the search for such
285:
step produce canonical labeling of such uniformly-chosen random graphs in linear expected time. This result sheds some light on the fact why many reported graph isomorphism algorithms behave well in practice. This was an important breakthrough in
86:: every two isomorphic graphs have the same canonical form, and every two non-isomorphic graphs have different canonical forms. Conversely, every complete invariant of graphs may be used to construct a canonical form. The vertex set of an 756:
Scheider, Nadine; Sayle, Roger A.; Landrum, Gregory A. (October 2015). "Get Your Atoms in Order — An Open-Source Implementation of a Novel and Robust Molecular Canonicalization Algorithm".
237: 263: 131: 547: 374:
Computer Science – Theory and Applications: Third International Computer Science Symposium in Russia, CSR 2008 Moscow, Russia, June 7-12, 2008, Proceedings
290:
which became widely known in its manuscript form and which was still cited as an "unpublished manuscript" long after it was reported at a symposium.
686: 702:
Weininger, David; Weininger, Arthur; Weininger, Joseph L. (May 1989). "SMILES. 2. Algorithm for generation of unique SMILES notation".
277:)), a simple vertex classification algorithm produces a canonical labeling of a graph chosen uniformly at random from the set of all 482: 287: 266: 154: 146: 412:
Algorithms and Computation: 18th International Symposium, ISAAC 2007, Sendai, Japan, December 17-19, 2007, Proceedings
170: 80: 372:
Arvind, Vikraman; Das, Bireswar; Köbler, Johannes (2008), "A logspace algorithm for partial 2-tree canonization",
265:. While the existence of (deterministic) polynomial algorithms for graph isomorphism is still an open problem in 158: 324:
Graph canonization is the essence of many graph isomorphism algorithms. One of the leading tools is Nauty.
166: 187: 797: 294: 181: 142: 444: 305:
considered as a linear string. However, the computation of the lexicographically smallest graph is
719: 654: 626: 560: 343: 282: 578: 542: 523: 473: 270: 177: 56:. Thus, from a solution to the graph canonization problem, one could also solve the problem of 773: 682: 676: 646: 332: 298: 150: 57: 41: 765: 711: 636: 552: 487: 415: 377: 312:
For trees, a concise polynomial canonization algorithm requiring O(n) space is presented by
302: 242: 605: 459: 429: 389: 738: 601: 455: 425: 385: 169:
to graph canonization. However, it is still an open question whether the two problems are
98:, and using such an identification a canonical form of a graph may also be described as a 83: 33: 25: 791: 672: 414:, Lecture Notes in Comput. Sci., vol. 4835, Springer, Berlin, pp. 822–833, 658: 564: 723: 376:, Lecture Notes in Comput. Sci., vol. 5010, Springer, Berlin, pp. 40–51, 17: 128:
Is graph canonization polynomial time equivalent to the graph isomorphism problem?
420: 281:-vertex graphs after only two refinement steps. Small modifications and an added 477: 381: 328: 99: 641: 619:
McKay, Brendan D.; Piperno, Adolfo (2014), "Journal of Symbolic Computation",
339: 769: 650: 508: 406:
Arvind, V.; Das, Bireswar; Köbler, Johannes (2007), "The space complexity of
545:; Kucera, L. (1979), "Canonical labeling of graphs in linear average time", 120: 777: 596:
Read, Ronald C. (1972), "The coding of various kinds of unlabeled trees",
492: 273:
reported that with probability at least 1 − exp(−O(
715: 556: 306: 91: 452:
Bulletin of the European Association for Theoretical Computer Science
347: 548:
Proc. 20th Annual IEEE Symposium on Foundations of Computer Science
301:, which is the graph of the class with lexicographically smallest 631: 351: 184:
algorithm for graph canonization, that is, one with running time
620: 162: 76:), and test whether these two canonical forms are identical. 102:
of its vertices. Canonical forms of a graph are also called
675:; Holder, Lawrence B. (2007), "6.2.1. Canonical Labeling", 327:
A common application of graph canonization is in graphical
153:. Clearly, the graph canonization problem is at least as 245: 190: 106:, and graph canonization is also sometimes known as 68:
are isomorphic, compute their canonical forms Canon(
581:; Luks, E. (1983), "Canonical labeling of graphs", 118:
Unsolved problem in computational complexity theory
257: 231: 510:Canonical Form for Graphs in Quasipolynomial Time 79:The canonical form of a graph is an example of a 583:Proc. 15th ACM Symposium on Theory of Computing 483:Proc. 15th ACM Symposium on Theory of Computing 600:, Academic Press, New York, pp. 153–182, 48:, such that every graph that is isomorphic to 8: 758:Journal of Chemical Information and Modeling 704:Journal of Chemical Information and Modeling 132:(more unsolved problems in computer science) 681:, John Wiley & Sons, pp. 120–122, 640: 630: 491: 419: 355:information in databases and on the web. 244: 218: 195: 189: 90:-vertex graph may be identified with the 480:(1983), "Canonical labeling of graphs", 401: 399: 364: 293:A commonly known canonical form is the 161:. In fact, graph isomorphism is even 7: 313: 123:Unsolved problem in computer science 232:{\displaystyle 2^{O((\log n)^{c})}} 145:of determining whether two finite 14: 625:, vol. 60, pp. 94–112, 445:"From invariants to canonization" 622:Practical graph isomorphism, II 507:Babai, László (June 23, 2019), 288:probabilistic complexity theory 267:computational complexity theory 52:has the same canonical form as 224: 215: 202: 199: 1: 60:: to test whether two graphs 421:10.1007/978-3-540-77120-3_71 24:is the problem of finding a 382:10.1007/978-3-540-79709-8_8 20:, a branch of mathematics, 814: 737:Kelley, Brian (May 2003). 598:Graph Theory and Computing 528:On the Isomorphism Problem 295:lexicographically smallest 171:polynomial time equivalent 642:10.1016/j.jsc.2013.09.003 159:graph isomorphism problem 139:graph isomorphism problem 770:10.1021/acs.jcim.5b00543 739:"Graph Canonicalization" 530:, unpublished manuscript 114:Computational complexity 32:. A canonical form is a 443:Gurevich, Yuri (1997), 259: 258:{\displaystyle c>0} 233: 108:graph canonicalization 493:10.1145/800061.808746 260: 234: 182:quasi-polynomial time 143:computational problem 486:, pp. 171–183, 410:-tree isomorphism", 243: 188: 155:computationally hard 716:10.1021/ci00062a008 557:10.1109/SFCS.1979.8 344:chemical substances 331:, in particular in 104:canonical labelings 743:Dr. Dobb's Journal 585:, pp. 171–183 551:, pp. 39–46, 283:depth-first search 255: 229: 22:graph canonization 764:(10): 2111–2120. 688:978-0-470-07303-2 678:Mining Graph Data 333:chemical database 299:isomorphism class 297:graph within the 58:graph isomorphism 28:of a given graph 805: 782: 781: 753: 747: 746: 734: 728: 727: 699: 693: 691: 669: 663: 661: 644: 634: 616: 610: 608: 593: 587: 586: 575: 569: 567: 539: 533: 531: 520: 514: 513: 504: 498: 496: 495: 470: 464: 462: 449: 440: 434: 432: 423: 403: 394: 392: 369: 303:adjacency matrix 264: 262: 261: 256: 238: 236: 235: 230: 228: 227: 223: 222: 124: 813: 812: 808: 807: 806: 804: 803: 802: 788: 787: 786: 785: 755: 754: 750: 736: 735: 731: 701: 700: 696: 689: 671: 670: 666: 618: 617: 613: 595: 594: 590: 577: 576: 572: 541: 540: 536: 522: 521: 517: 506: 505: 501: 472: 471: 467: 454:(63): 115–119, 447: 442: 441: 437: 405: 404: 397: 371: 370: 366: 361: 322: 241: 240: 239:for some fixed 214: 191: 186: 185: 135: 134: 129: 126: 122: 119: 116: 84:graph invariant 12: 11: 5: 811: 809: 801: 800: 790: 789: 784: 783: 748: 729: 694: 687: 673:Cook, Diane J. 664: 611: 588: 570: 534: 515: 499: 465: 435: 395: 363: 362: 360: 357: 335:applications. 321: 318: 254: 251: 248: 226: 221: 217: 213: 210: 207: 204: 201: 198: 194: 130: 127: 121: 117: 115: 112: 26:canonical form 13: 10: 9: 6: 4: 3: 2: 810: 799: 796: 795: 793: 779: 775: 771: 767: 763: 759: 752: 749: 744: 740: 733: 730: 725: 721: 717: 713: 710:(2): 97–101. 709: 705: 698: 695: 690: 684: 680: 679: 674: 668: 665: 660: 656: 652: 648: 643: 638: 633: 628: 624: 623: 615: 612: 607: 603: 599: 592: 589: 584: 580: 579:Babai, László 574: 571: 566: 562: 558: 554: 550: 549: 544: 543:Babai, László 538: 535: 529: 525: 524:Babai, László 519: 516: 512: 511: 503: 500: 494: 489: 485: 484: 479: 475: 474:Babai, László 469: 466: 461: 457: 453: 446: 439: 436: 431: 427: 422: 417: 413: 409: 402: 400: 396: 391: 387: 383: 379: 375: 368: 365: 358: 356: 353: 349: 345: 341: 336: 334: 330: 325: 319: 317: 315: 310: 308: 304: 300: 296: 291: 289: 284: 280: 276: 272: 268: 252: 249: 246: 219: 211: 208: 205: 196: 192: 183: 179: 174: 172: 168: 164: 160: 156: 152: 148: 144: 140: 133: 113: 111: 109: 105: 101: 97: 93: 89: 85: 82: 77: 75: 71: 67: 63: 59: 55: 51: 47: 43: 39: 35: 34:labeled graph 31: 27: 23: 19: 798:Graph theory 761: 757: 751: 742: 732: 707: 703: 697: 677: 667: 621: 614: 597: 591: 582: 573: 546: 537: 527: 518: 509: 502: 481: 478:Luks, Eugene 468: 451: 438: 411: 407: 373: 367: 338:A number of 337: 326: 323: 320:Applications 311: 292: 278: 274: 271:László Babai 180:announced a 178:László Babai 175: 138: 136: 107: 103: 95: 87: 78: 73: 72:) and Canon( 69: 65: 61: 53: 49: 45: 37: 29: 21: 18:graph theory 15: 346:, such as 340:identifiers 329:data mining 314:Read (1972) 100:permutation 359:References 269:, in 1977 151:isomorphic 94:from 1 to 42:isomorphic 40:) that is 651:0747-7171 632:1301.1493 209:⁡ 176:In 2019, 167:reducible 792:Category 778:26441310 659:17930927 565:14697933 526:(1977), 92:integers 81:complete 724:6621315 606:0344150 460:1621595 430:2472661 390:2475148 307:NP-hard 157:as the 141:is the 776:  722:  685:  657:  649:  604:  563:  458:  428:  388:  348:SMILES 147:graphs 36:Canon( 720:S2CID 655:S2CID 627:arXiv 561:S2CID 448:(PDF) 352:InChI 774:PMID 683:ISBN 647:ISSN 350:and 342:for 250:> 149:are 137:The 64:and 766:doi 712:doi 637:doi 553:doi 488:doi 416:doi 378:doi 309:. 206:log 44:to 16:In 794:: 772:. 762:55 760:. 741:. 718:. 708:29 706:. 653:, 645:, 635:, 602:MR 559:, 476:; 456:MR 450:, 426:MR 424:, 398:^ 386:MR 384:, 173:. 163:AC 110:. 780:. 768:: 745:. 726:. 714:: 692:. 662:. 639:: 629:: 609:. 568:. 555:: 532:. 497:. 490:: 463:. 433:. 418:: 408:k 393:. 380:: 279:n 275:n 253:0 247:c 225:) 220:c 216:) 212:n 203:( 200:( 197:O 193:2 165:- 125:: 96:n 88:n 74:H 70:G 66:H 62:G 54:G 50:G 46:G 38:G 30:G

Index

graph theory
canonical form
labeled graph
isomorphic
graph isomorphism
complete
graph invariant
integers
permutation
(more unsolved problems in computer science)
computational problem
graphs
isomorphic
computationally hard
graph isomorphism problem
AC
reducible
polynomial time equivalent
László Babai
quasi-polynomial time
computational complexity theory
László Babai
depth-first search
probabilistic complexity theory
lexicographically smallest
isomorphism class
adjacency matrix
NP-hard
Read (1972)
data mining

Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.