Knowledge

Talk:Independent and identically distributed random variables


This article is rated Start-class on Knowledge's content assessment scale. It is of interest to the following WikiProjects:

Statistics (Mid-importance): This article is within the scope of WikiProject Statistics, a collaborative effort to improve the coverage of statistics on Knowledge. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks. This article has been rated as Mid-importance on the importance scale.

Untitled

In the "Generalizations" section, I am missing a mention of pairwise/k-wise independence (i.e. any pair/k-tuple in the sequence is independent, but larger subsets are not necessarily independent). Pairwise/k-wise independence is used in theoretical CS. --David Pal

Wiki Education Foundation-supported course assignment

This article was the subject of a Wiki Education Foundation-supported course assignment, between 27 August 2021 and 19 December 2021. Further details are available on the course page. Student editor(s): Hanshenli. Peer reviewers: Yibeiiiii, C.Hua Wang, Joannetsai.

Above undated message substituted from Template:Dashboard.wikiedu.org assignment by PrimeBOT (talk) 22:56, 17 January 2022 (UTC)

auto correlation

I think correlation alone can be used to determine whether observations are IID. Autocorrelation is simply the version for a time domain or similar. Correlation is more general, I think. Preceding unsigned comment added by Chrisparker126 (talk • contribs) 18:40, 7 October 2019 (UTC)

Link to German Version

Looks like this would be the corresponding article in German Knowledge:
http://de.wikipedia.org/search/?title=Unabh%C3%A4ngig_und_identisch_verteilt&redirect=no
It links to:
http://de.wikipedia.org/Zufallsvariable#unabh.C3.A4ngig_und_identisch_verteilt
There in the text you will find "i.i.d. (für independent and identically distributed)".

I am leaving this in the 'talk' page, in case my edit is sloppy and removed, but I aim to include some important information I learned today about IUDs and female anatomy, which is very mundane but little-known information: 'uterine malformation' is a common occurrence in women. We are not informed of its likelihood when purchasing a potentially expensive IUD.

According to wiki's Uterine Malformation page, an estimated 7% of women (other sources report as many as one fifth of women) are born with this condition.

When a uterus has an unusual shape, it cannot always accommodate an IUD in such a way that it is effective. The uterus may be cleft in half, making 2 uteri. Some women have 2 cervixes, or 2 vaginas.

These unusually common malformations (they are here: https://en.wikipedia.org/Uterine_malformation) absolutely must have a link in this article, for people considering the use and potential functionality of an intrauterine device.

A uterus with 2 chambers cannot be protected from pregnancy by this contraceptive as well as a normal uterus can, and I had never heard of the prevalence of this condition until today. It has taken me 25 years to hear about it. It would be better for consumers if this practical information were more common knowledge.

Any consumer of this product who is unaware of the link, or of the structure of their uterus, runs a risk of pregnancy and of wasting money.

In short, an informational relationship between the IUD page and the Uterine Malformation page would be a very helpful one. https://en.wikipedia.org/Uterine_malformation

^ This seems to be on the wrong page. This is IID, not IUD. 203.91.225.198 (talk) 23:23, 25 January 2021 (UTC)

Not sure how to add the language link in this page. Preceding unsigned comment added by 95.208.167.145 (talk) 17:59, 8 May 2013 (UTC)

How about explaining 'independent but not identically distributed' variables? The meaning of independence and identical distribution, and their implications, should be more explicitly stated... in my opinion, that is. Preceding unsigned comment added by 182.216.110.134 (talk) 04:52, 21 April 2016 (UTC)

IID consistency

After noticing the lead contained a mixture of both, I made a bold edit in favour of IID which I personally find less visually distracting than the dots in i.i.d. when the term is dropped into every second sentence. However, IID is not exactly beautiful, either, and typographically I would advise IID (i.e. {{sc2|IID}}), except that this is apparently discouraged in the MOS. This article might be the one where it makes sense to go against the recommended-style grain, though it's above my pay grade to decide this unilaterally. MaxEnt 00:57, 20 March 2017 (UTC)

Your decision has my support, since dropping the full stops (or periods) from initialisms (and other abbreviations) has been throughout the last century or more, and continues to be, a productive process in English writing, as seen in the following usages, for example:
International Business Machines → I.B.M. → IBM
Company → Co. → Co
Proprietary Limited → Pty. Ltd. → Pty Ltd
et caetera (or et cetera) → etc. → etc
As an aside, it seems to me a pity that the equivalent, but more euphonious, IDI – standing, obviously, for Identically Distributed and Independent – did not become the standard usage, as advocated by one of my lecturers in my youth. Oh well, c'est la guerre! yoyo (talk) 15:16, 19 November 2017 (UTC)

I saw "IID" on another page, and did not know what it meant. In contrast, "i.i.d." would have been immediately clear. Hence, I much prefer "i.i.d.".
What I prefer, however, matters little. Similarly for what you prefer. Rather, Knowledge should generally follow what is most commonly used by reliable sources. For this, "i.i.d." is used far more commonly than "IID". Hence, I have changed the article to use the former. SolidPhase (talk) 12:45, 23 March 2019 (UTC)

Usage of the phrase "random variables"

"An element in the sequence is independent of the random variables that came before it" ... "the probability distribution for the nth random variable is a function of the previous random variable in the sequence"
Shouldn't we use "element" or value instead of the phrase "random variables" in those sentences? Each value in the sequence is a random variable? Or the whole sequence is represented by a random variable? Sarmadys (talk) 05:46, 12 June 2017 (UTC)

There's a bigger problem. The lead asserts this:
Note that IID refers to sequences of random variables. "Independent and identically distributed" implies an element in the sequence is independent of the random variables that came before it.
The reference given supporting the definition of IID rvv is to Professor Aaron Clauset's notes on a probability primer for a complex systems modelling course. They're fine for their stated purpose, but don't pretend to be a rigorous mathematical treatment of the underlying probability theory. Even so, nowhere in that ref is there a mention of a sequence of random variables. What is there is an indexed set of observations, the index values i coming from an initial segment of positive integers 1, 2, … n. And that's all that IID talks about - a set of observations, each assumed to come from the same underlying probability distribution.
This is the first time I've seen IID defined in terms of a sequence of rvv. Arguably, one (informal) usage of the term sequence in maths is as a set indexed by the first so many (non-zero) "counting numbers", as above. But the usual connotations of the word sequence include that the ordering of the elements is essential - that's what most general readers would expect and possibly infer. However, I assert that the order of the elements is not of the essence in defining IID rvv! To say that it is essential, we need a better source. yoyo (talk) 23:29, 19 November 2017 (UTC)

I agree that there are big problems here. As well as those identified, the word "sequence" usually implies "countable", but sets of non-countably many iid rvs are often defined in the literature (e.g. here). The notion of sequence also adds a point of confusion for the reader when the article comes to "independent of the random variables that came before it". What about "after it"? What if there is no natural order? It has to be rewritten without the concept of the rvs coming in some order at all, which isn't so difficult. Then there is the section "Definition", which only defines pairwise independence, and I strongly suspect that definition is wrong. McKay (talk) 03:01, 20 November 2017 (UTC)

I think white noise is not IID

White noise implies constant mean and variance and zero autocorrelation. Correlation only measures linear relationships, and hence does not imply independence, nor does it imply an identical probability distribution for every random variable in the sequence, since it only concerns itself with the first two mean-centered moments of the distribution. IntelligentET (talk) 22:13, 10 November 2018 (UTC)

Machine Learning section

The "In machine learning" section has a number of serious issues. First, it is much too detailed and specific for an article on as general a statistical concept as i.i.d. random variables. Even inclusion of a sentence that amounts to something like "In machine learning, each vector of variables in a dataset is often assumed to be an i.i.d. random vector" would be of doubtful value in this article. The assumption of i.i.d. sampling is pervasive across various applications of statistical analysis of data, serving as the simplest assumption about a data-generating process. Mentioning machine learning can mislead a reader by giving the impression the assumption is particular to machine learning (while in fact it's independent(!) of it). Second, there are claims that are normative (e.g. "currently acquired massive quantities of data to deliver faster, more accurate results") and ill-defined (e.g. "The computer is very efficient to calculate multiple additions, but it is not efficient to calculate the multiplication"). Third, of the two URLs linked as references in the section, one no longer works and the other is not in English, which is not suitable for the English-language version of Knowledge. Fourth, the section is written as an answer to the question posed at its beginning: "Why assume the data in machine learning are independent and identically distributed?". The gist of the answer provided is that the log-likelihood function is additive, a simplification that makes for a more tractable optimization problem. But this again isn't particular to machine learning, but to maximum likelihood estimation. Moreover, i.i.d. sampling does not mean that the distribution function is known, so this is implicitly being assumed by the section. And then there's the fact that most machine learning methods are quite different from maximum likelihood. Finally, a good answer to this question would tackle the numerous problems with the assumption of independence in many real-world datasets, due to sample selection, autocorrelation, unobserved confounders, etc. Undsoweiter (talk) 09:00, 13 January 2022 (UTC)

I fully agree with @Undsoweiter. There is no reason for explicitly mentioning machine learning. Moreover, I also do not understand the first reason at the end. Why is the central limit theorem of any relevance at this point? One is not adding together random variables during likelihood optimization. Nmdwolf (talk) 15:01, 26 May 2022 (UTC)

Degraded quality of article due to edits since November 2021

While I understand the value of having students edit Knowledge articles for a class, numerous issues have been introduced into the article. These include: the use of the pronoun "we"; the inconsistent math fonts for independence of events (which also has other issues); the machine learning section as detailed above in a separate comment; the elimination of important examples illustrating how the i.i.d. sampling can be a flawed assumption; the unnecessary mentioning of "data mining" and "signal processing"; etc. With the semester already finished, it is doubtful the issues will be remedied by members of the class. I think all edits since November 2021 should be undone. Undsoweiter (talk) 09:12, 13 January 2022 (UTC)
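
On the additivity point in the "Machine Learning section" comment: the following is a minimal Python sketch, not taken from the article. It assumes a normal model with made-up parameter values and a made-up sample, and the function names are hypothetical. The distribution family is an extra assumption here, since, as that comment notes, i.i.d. sampling alone does not supply it. Under that assumption, the log of the joint density of independent observations equals the sum of the per-observation log densities.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of the normal distribution N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def joint_log_likelihood(xs, mu, sigma):
    """Log of the joint density; independence lets the joint factor into a product."""
    joint = 1.0
    for x in xs:
        joint *= normal_pdf(x, mu, sigma)
    return math.log(joint)


def summed_log_likelihood(xs, mu, sigma):
    """Sum of per-observation log densities (the additive form)."""
    return sum(math.log(normal_pdf(x, mu, sigma)) for x in xs)


# Made-up illustrative sample and parameters (assumptions, not data from the article).
sample = [0.3, -1.2, 0.8, 2.1, -0.4]
mu, sigma = 0.0, 1.0

product_form = joint_log_likelihood(sample, mu, sigma)
additive_form = summed_log_likelihood(sample, mu, sigma)
print(math.isclose(product_form, additive_form, rel_tol=1e-9))
```

Nothing in the sketch is specific to machine learning; it only uses the factorization of a joint density under independence, which is the point the comment makes about maximum likelihood estimation.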
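
On the "I think white noise is not IID" thread: a minimal Python sketch of a series that has zero mean, constant variance, and zero autocorrelation, yet is not independent. The construction x_t = z_t * z_(t-1), with z_t i.i.d. standard normal, is an illustrative assumption chosen for this note and does not come from the article; the helper names are hypothetical. The series itself is serially uncorrelated, but its squares are correlated at lag 1 (0.25 in theory), so neighbouring terms are dependent.

```python
import random


def correlation(a, b):
    """Sample Pearson correlation of two equal-length lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def lag1_autocorrelation(series):
    """Correlation between the series and itself shifted by one step."""
    return correlation(series[:-1], series[1:])


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 200_000
z = [rng.gauss(0.0, 1.0) for _ in range(n + 1)]

# Product of neighbouring i.i.d. normals: uncorrelated over time, but dependent.
x = [z[t] * z[t - 1] for t in range(1, n + 1)]
squares = [v * v for v in x]

acf_x = lag1_autocorrelation(x)         # close to 0: looks like white noise
acf_sq = lag1_autocorrelation(squares)  # clearly positive: terms are dependent
print(round(acf_x, 3), round(acf_sq, 3))
```

The terms of this particular series are identically distributed, so it illustrates only the failure of independence; a white-noise series whose terms have different distributions with the same mean and variance would illustrate the other half of the comment.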
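
On the pairwise/k-wise independence raised in the "Untitled" thread, and the remark that the "Definition" section only defines pairwise independence: a minimal Python sketch of the standard XOR example, with hypothetical function names. Two fair bits X and Y are drawn independently and Z is set to X XOR Y. Every pair of the three variables is independent, but the three together are not, because any two of them determine the third.

```python
from fractions import Fraction
from itertools import product

# Sample space: (x, y, z) with z = x XOR y; each (x, y) pair has probability 1/4.
outcomes = {(x, y, x ^ y): Fraction(1, 4) for x, y in product((0, 1), repeat=2)}


def prob(event):
    """Exact probability of the set of outcomes satisfying the predicate."""
    return sum(p for w, p in outcomes.items() if event(w))


def pairwise_independent():
    """Check P(Vi=a, Vj=b) == P(Vi=a) * P(Vj=b) for every pair and every value."""
    for i, j in ((0, 1), (0, 2), (1, 2)):
        for a, b in product((0, 1), repeat=2):
            joint = prob(lambda w: w[i] == a and w[j] == b)
            if joint != prob(lambda w: w[i] == a) * prob(lambda w: w[j] == b):
                return False
    return True


def mutually_independent():
    """Check that the full joint distribution factors into the three marginals."""
    for a, b, c in product((0, 1), repeat=3):
        joint = prob(lambda w: w == (a, b, c))
        marginals = (prob(lambda w: w[0] == a)
                     * prob(lambda w: w[1] == b)
                     * prob(lambda w: w[2] == c))
        if joint != marginals:
            return False
    return True


print(pairwise_independent(), mutually_independent())
```

Exact fractions are used instead of simulation so that the equalities are checked without sampling error.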


Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.