Spurious correlation of ratios

In statistics, spurious correlation of ratios is a form of spurious correlation that arises between ratios of absolute measurements which themselves are uncorrelated.

The phenomenon of spurious correlation of ratios is one of the main motivations for the field of compositional data analysis, which deals with the analysis of variables that carry only relative information, such as proportions, percentages and parts-per-million.

Spurious correlation is distinct from misconceptions about correlation and causality.

Figure: An illustration of spurious correlation. The figure shows 500 observations of x/z plotted against y/z. The sample correlation is 0.53, even though x, y, and z are statistically independent of each other (i.e., the pairwise correlations between each of them are zero). The z-values are highlighted on a colour scale.

Illustration of spurious correlation

Pearson states a simple example of spurious correlation:

“Select three numbers within certain ranges at random, say x, y, z, these will be pair and pair uncorrelated. Form the proper fractions x/z and y/z for each triplet, and correlation will be found between these indices.”

The scatter plot illustrates this example using 500 observations of x, y, and z. The variables x, y and z are drawn from normal distributions with means 10, 10, and 30, respectively, and standard deviations 1, 1, and 3, respectively, i.e.,

$$\begin{aligned} x, y &\sim N(10, 1) \\ z &\sim N(30, 3) \end{aligned}$$

Even though x, y, and z are statistically independent and therefore uncorrelated, in the depicted typical sample the ratios x/z and y/z have a correlation of 0.53. This is because of the common divisor (z), and it can be better understood if we colour the points in the scatter plot by the z-value. Trios of (x, y, z) with relatively large z values tend to appear in the bottom left of the plot; trios with relatively small z values tend to appear in the top right.
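The illustration in this article (independent x, y ~ N(10, 1) and z ~ N(30, 3), 500 observations) can be reproduced with a short simulation. The following is a minimal sketch, not the code used to produce the article's figure: the use of NumPy, the seed, and the helper name `corr` are all assumptions made for this example, so the sample correlation it prints will differ somewhat from the 0.53 quoted for the depicted sample.

```python
import numpy as np

# Assumed seed and sample size; the article's figure uses 500 observations.
rng = np.random.default_rng(0)
n = 500

# Independent draws: x, y ~ N(10, 1) and z ~ N(30, 3) (mean, standard deviation).
x = rng.normal(10.0, 1.0, n)
y = rng.normal(10.0, 1.0, n)
z = rng.normal(30.0, 3.0, n)


def corr(a, b):
    """Sample Pearson correlation between two 1-D arrays (hypothetical helper)."""
    return float(np.corrcoef(a, b)[0, 1])


r_xy = corr(x, y)              # near zero: x and y are independent
r_ratio = corr(x / z, y / z)   # clearly positive: induced by the common divisor z

print(f"corr(x, y)     = {r_xy:+.3f}")
print(f"corr(x/z, y/z) = {r_ratio:+.3f}")
```

Changing the standard deviation of z in this sketch is a quick way to see how strongly the variability of the common divisor drives the correlation between the two ratios.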
Approximate amount of spurious correlation

Pearson derived an approximation of the correlation that would be observed between two indices ($x_1/x_3$ and $x_2/x_4$), i.e., ratios of the absolute measurements $x_1, x_2, x_3, x_4$:

$$\rho = \frac{r_{12} v_1 v_2 - r_{14} v_1 v_4 - r_{23} v_2 v_3 + r_{34} v_3 v_4}{\sqrt{v_1^2 + v_3^2 - 2 r_{13} v_1 v_3}\,\sqrt{v_2^2 + v_4^2 - 2 r_{24} v_2 v_4}}$$

where $v_i$ is the coefficient of variation of $x_i$, and $r_{ij}$ is the Pearson correlation between $x_i$ and $x_j$.

This expression can be simplified for situations where there is a common divisor by setting $x_3 = x_4$ (so that $v_4 = v_3$ and $r_{34} = 1$) and assuming that $x_1, x_2, x_3$ are uncorrelated, giving the spurious correlation:

$$\rho_0 = \frac{v_3^2}{\sqrt{v_1^2 + v_3^2}\,\sqrt{v_2^2 + v_3^2}}.$$

For the special case in which all coefficients of variation are equal (as in the illustration, where they are 1/10, 1/10 and 3/30), $\rho_0 = 0.5$.
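Pearson's approximation and its common-divisor special case are simple enough to evaluate directly. The sketch below is illustrative only: the function names `pearson_ratio_correlation` and `common_divisor_correlation`, and the dictionary layout used for the correlations r_ij, are assumptions made for this example rather than part of any established library.

```python
import math


def pearson_ratio_correlation(v, r):
    """Pearson's approximation to corr(x1/x3, x2/x4).

    v: coefficients of variation (v1, v2, v3, v4).
    r: dict mapping index pairs such as (1, 2) to the correlation r_ij.
    """
    v1, v2, v3, v4 = v
    numerator = (r[(1, 2)] * v1 * v2 - r[(1, 4)] * v1 * v4
                 - r[(2, 3)] * v2 * v3 + r[(3, 4)] * v3 * v4)
    denominator = (math.sqrt(v1**2 + v3**2 - 2 * r[(1, 3)] * v1 * v3)
                   * math.sqrt(v2**2 + v4**2 - 2 * r[(2, 4)] * v2 * v4))
    return numerator / denominator


def common_divisor_correlation(v1, v2, v3):
    """Simplified form rho_0 for x3 = x4 with x1, x2, x3 uncorrelated."""
    return v3**2 / (math.sqrt(v1**2 + v3**2) * math.sqrt(v2**2 + v3**2))


# Common divisor: x4 is x3, so r34 = 1 and v4 = v3; every other correlation is 0.
r = {(1, 2): 0.0, (1, 3): 0.0, (1, 4): 0.0,
     (2, 3): 0.0, (2, 4): 0.0, (3, 4): 1.0}
cv = 0.1  # 1/10 for x and y, 3/30 for z in the article's illustration

rho_general = pearson_ratio_correlation((cv, cv, cv, cv), r)
rho_0 = common_divisor_correlation(cv, cv, cv)
print(rho_general, rho_0)
```

With equal coefficients of variation, both the general expression and the simplified one should evaluate to approximately 0.5, which is in line with the sample correlation of 0.53 reported for the illustration.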
Relevance to biology and other sciences

Pearson was joined by Sir Francis Galton and Walter Frank Raphael Weldon in cautioning scientists to be wary of spurious correlation, especially in biology, where it is common to scale or normalize measurements by dividing them by a particular variable or total. The danger Pearson saw was that conclusions would be drawn from correlations that are artifacts of the analysis method, rather than actual “organic” relationships.

However, it would appear that spurious correlation (and its potential to mislead) is not yet widely understood. In 1986 John Aitchison, who pioneered the log-ratio approach to compositional data analysis, wrote:

“It seems surprising that the warnings of three such eminent statistician-scientists as Pearson, Galton and Weldon should have largely gone unheeded for so long: even today uncritical applications of inappropriate statistical methods to compositional data with consequent dubious inferences are regularly reported.”

More recent publications suggest that this lack of awareness prevails, at least in molecular bioscience.

References

Pearson, Karl (1896). "Mathematical Contributions to the Theory of Evolution – On a Form of Spurious Correlation Which May Arise When Indices Are Used in the Measurement of Organs". Proceedings of the Royal Society of London. 60 (359–367): 489–498. doi:10.1098/rspl.1896.0076. JSTOR 115879.

Aldrich, John (1995). "Correlations Genuine and Spurious in Pearson and Yule". Statistical Science. 10 (4): 364–376. doi:10.1214/ss/1177009870.

Aitchison, John (1986). The statistical analysis of compositional data. Chapman & Hall. ISBN 978-0-412-28060-3.

Pawlowsky-Glahn, Vera; Buccianti, Antonella, eds. (2011). Compositional Data Analysis: Theory and Applications. Wiley. doi:10.1002/9781119976462. ISBN 978-0470711354.

Galton, Francis (1896). "Note to the memoir by Professor Karl Pearson, F.R.S., on spurious correlation". Proceedings of the Royal Society of London. 60 (359–367): 498–502. doi:10.1098/rspl.1896.0077. S2CID 170846631.

Jackson, DA; Somers, KM (1991). "The Spectre of 'Spurious' Correlation". Oecologia. 86 (1): 147–151. Bibcode:1991Oecol..86..147J. doi:10.1007/bf00317404. JSTOR 4219582. PMID 28313173. S2CID 1116627.

Lovell, David; Müller, Warren; Taylor, Jen; Zwart, Alec; Helliwell, Chris (2011). "Chapter 14: Proportions, Percentages, PPM: Do the Molecular Biosciences Treat Compositional Data Right?". In Pawlowsky-Glahn, Vera; Buccianti, Antonella (eds.). Compositional Data Analysis: Theory and Applications. Wiley. doi:10.1002/9781119976462. ISBN 9780470711354.

Lovell, David; Pawlowsky-Glahn, Vera; Egozcue, Juan José; Marguerat, Samuel; Bähler, Jürg (16 March 2015). "Proportionality: A Valid Alternative to Correlation for Relative Data". PLOS Computational Biology. 11 (3): e1004075. Bibcode:2015PLSCB..11E4075L. doi:10.1371/journal.pcbi.1004075. PMC 4361748. PMID 25775355.

Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.
