Total variation distance of probability measures

Not to be confused with Total variation.

Figure: Total variation distance is half the absolute area between the two curves (half the shaded area in the figure).

In probability theory, the total variation distance is a distance measure for probability distributions. It is an example of a statistical distance metric, and is sometimes called the statistical distance, statistical difference or variational distance.

Definition

Consider a measurable space $(\Omega, \mathcal{F})$ and probability measures $P$ and $Q$ defined on $(\Omega, \mathcal{F})$. The total variation distance between $P$ and $Q$ is defined as

$$\delta(P,Q) = \sup_{A \in \mathcal{F}} \left| P(A) - Q(A) \right|.$$

This is the largest absolute difference between the probabilities that the two probability distributions assign to the same event.

Properties

The total variation distance is an f-divergence and an integral probability metric.

Relation to other distances

The total variation distance is related to the Kullback–Leibler divergence by Pinsker's inequality:

$$\delta(P,Q) \leq \sqrt{\frac{1}{2} D_{\mathrm{KL}}(P \parallel Q)}.$$

One also has the following inequality, due to Bretagnolle and Huber (see also Tsybakov 2009, Equation 2.25), which has the advantage of providing a non-vacuous bound even when $D_{\mathrm{KL}}(P \parallel Q) > 2$:

$$\delta(P,Q) \leq \sqrt{1 - e^{-D_{\mathrm{KL}}(P \parallel Q)}}.$$

The total variation distance is half of the $L^1$ distance between the probability functions: on discrete domains, this is the distance between the probability mass functions

$$\delta(P,Q) = \frac{1}{2} \sum_{x} |P(x) - Q(x)|,$$

and when the distributions have standard probability density functions $p$ and $q$,

$$\delta(P,Q) = \frac{1}{2} \int |p(x) - q(x)| \, \mathrm{d}x$$

(or the analogous distance between Radon-Nikodym derivatives with any common dominating measure). This result can be shown by noticing that the supremum in the definition is achieved exactly at the set where one distribution dominates the other.

The total variation distance is related to the Hellinger distance $H(P,Q)$ as follows:

$$H^{2}(P,Q) \leq \delta(P,Q) \leq \sqrt{2}\, H(P,Q).$$

These inequalities follow immediately from the inequalities between the 1-norm and the 2-norm.

Connection to transportation theory

The total variation distance (or half the $L^1$ norm) arises as the optimal transportation cost, when the cost function is $c(x,y) = \mathbf{1}_{x \neq y}$, that is,

$$\frac{1}{2} \|P - Q\|_{1} = \delta(P,Q) = \inf \big\{ \mathbb{P}(X \neq Y) : \text{Law}(X) = P, \text{Law}(Y) = Q \big\} = \inf_{\pi} \operatorname{E}_{\pi}[\mathbf{1}_{x \neq y}],$$

where the expectation is taken with respect to the probability measure $\pi$ on the space where $(x,y)$ lives, and the infimum is taken over all such $\pi$ with marginals $P$ and $Q$, respectively.

See also

Total variation
Kolmogorov–Smirnov test
Wasserstein metric

References

Chatterjee, Sourav. "Distances between probability measures" (PDF). UC Berkeley. Archived from the original (PDF) on July 8, 2008. Retrieved 21 June 2013.
Bretagnolle, J.; Huber, C, Estimation des densités: risque minimax, Séminaire de Probabilités, XII (Univ. Strasbourg, Strasbourg, 1976/1977), pp. 342–363, Lecture Notes in Math., 649, Springer, Berlin, 1978, Lemma 2.1 (French).
Tsybakov, Alexandre B., Introduction to nonparametric estimation, Revised and extended from the 2004 French original. Translated by Vladimir Zaiats. Springer Series in Statistics. Springer, New York, 2009. xii+214 pp. ISBN 978-0-387-79051-0, Equation 2.25.
David A. Levin, Yuval Peres, Elizabeth L. Wilmer, Markov Chains and Mixing Times, 2nd. rev. ed. (AMS, 2017), Proposition 4.2, p. 48.
Tsybakov, Aleksandr B. (2009). Introduction to nonparametric estimation (rev. and extended version of the French Book ed.). New York, NY: Springer. Lemma 2.1. ISBN 978-0-387-79051-0.
Devroye, Luc; Györfi, Laszlo; Lugosi, Gabor (1996-04-04). A Probabilistic Theory of Pattern Recognition (Corrected ed.). New York: Springer. ISBN 978-0-387-94618-4.
Harsha, Prahladh (September 23, 2011). "Lecture notes on communication complexity" (PDF).
Villani, Cédric (2009). Optimal Transport, Old and New. Grundlehren der mathematischen Wissenschaften. Vol. 338. Springer-Verlag Berlin Heidelberg. p. 10. doi:10.1007/978-3-540-71050-9. ISBN 978-3-540-71049-3.

Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.
