will be ignoring the "bad roll". What else would you expect them to do? If the squirrel eats the die, then maybe they would count it as "rolling a 10 trillion"?? No! That's silly, no one would ever think to do that. The only sensible thing to do is to re-roll. If they did not have a spare die, they would say "I was not able to successfully roll the die, go ask someone else." People understand that "rolling a die" is not complete until you have an answer which is either 1,2,3,4,5,6.
"Borel's law of large numbers, named after Émile Borel, states that if an experiment is repeated a large number of times, independently under identical conditions, then the proportion of times that any specified event occurs approximately equals the probability of the event's occurrence on any particular trial;" The LLN as stated might as well be Bernoulli's original LLN from 1713. There seems to be no reason to attribute *that* statement to Borel. Compare e.g.
). They mention that "C is uniformly learnable if and only if the VC dimension of C is finite," where "a learning function for C is a function that, given a large enough randomly drawn sample of any target concept in C, returns a region in E (a hypothesis) that is with high probability a good approximation to the target concept." My understanding of "concept" here is function. Perhaps my understanding of "concept" is wrong or the LLN has limitations.
the variance would need to exist): in fact your source seems to be using this in its proof. In this case your source is stating a sufficient condition for the law to hold ... it can and does hold under weaker conditions. The "law" is the description of the behaviour of the mean, not really any one statement of conditions under which it can be said/shown to hold. However, the article could do with better citations.
(a run of heads is compensated by a run of tails, so difference between heads and tails approaches 0, which is false). The way it's presented here now is gambler's fallacy. And gambler's fallacy seems somewhat more common online, but I didn't look enough to be sure. Therefore my initial impression is that this page should be a disambiguation or short article sending interested readers to learn more at either
"It follows from the law of large numbers that the empirical probability of success in a series of Bernoulli trials will converge to the theoretical probability. For a Bernoulli random variable, the expected value is the theoretical probability of success, and the average of n such variables (assuming they are independent and identically distributed (i.i.d.)) is precisely the relative frequency."
"Convergence in probability is also called weak convergence of random variables". I don't think this is standard or fortunate. Convergence in distribution is already called weak convergence. The MSC (Mathematics Subject Classification) category 60F05 is "Central limit and other weak theorems", meaning theorems with convergence in distribution, not convergence in probability (as far as I know).
. A sequence of sample means won't converge, because the average of n samples drawn from the Cauchy distribution has *exactly* the same distribution as the samples. I think the article definitely needs a section about this misconception with examples and a neat graph of diverging sequence of averages, but as you might see, my English is too bad for writing it myself. --
(eg tossing a coin or betting on roulette) tends to give a pattern over a large number of incidences? Why is it that we can anticipate (roughly) what the average will be, rather than its being completely random and not able to be anticipated? Surely there's a place for that issue in the article, if there is some literature on it. Thanks.
"The strong law implies the weak law but not vice versa, when the strong law conditions hold the variable converges both strongly (almost surely) and weakly (in probability). However the weak law may hold in conditions where the strong law does not hold and then the convergence is only weak (in probability)."
and I think it would be good to underline the fact that only the average converges, and not the sum of the outcomes minus the theoretical outcome, by adding a paragraph about this, and also a diagram showing a dice-throwing experiment (with Excel, with the number of throws on the abscissa, and the sum of
There are two requirements for the term "black swan events" to be technically applicable: The events are (1) rare, and (2) sufficiently consequential to affect long-term average behavior. For example, the performance of a stock market investor, even averaged over 15 years, may be significantly altered
First of all, you removed information without explaining in an edit summary -- twice. That makes the edits subject to revert. Secondly, it might help the rest of us who can't read your mind to explain what you are referring to with your comment "lead to Nasim Taleb's vanity page". And finally, give a
Why on earth does an article on the law of large numbers lead to Nasim Taleb's vanity page? I have also deleted the rest of the sentence, which was unencyclopedic and unnecessary. If you disagree, could you please show a reference from the serious LLN literature that mentions lightning or references
OK, I'll be up front and admit that the math on this page is beyond me. I looked up the Law of Large Numbers to try and find out why it happens. (I mean why it happens, not how it happens.) So can someone explain in plain (or even complicated) English why something that is random each time you do it
Looking through the references currently given for the uniform law of large numbers, I notice a technical issue in the current statement. One of the references (Newey & McFadden 1994) gives a formulation of the uniform LLN which allows for the function f to be continuous almost everywhere, but
There is a proof of the Strong Law of Large Numbers that is accessible to students with an undergraduate study of measure theory; it's established by applying the dominated convergence theorem to the limit of indicator functions, and then using the Weak Law of Large Numbers on the resulting limit of
The section on the strong law gets excessively wordy describing exactly what it means to be strong (in an unclear way, since a theorem can be strong when the hypothesis is weaker, (so that it implies the weak one and applies to more cases) or when both the hypothesis and conclusion are stronger, as
I don't know if this page is active, but it's a classic mistake to think that because the empirical average gets closer to its theoretical value as the number of trials increases, the same holds for the sum of the outcomes, which would then get closer to the sum of the average outcomes, while in fact it's
allowing a set of measure 0 on which discontinuities may occur), but gives the stronger mode of a.s. convergence. Is there an obvious synthesis of these two statements which yields the hybrid given in the article (a.e. continuous function f as well as a.s. convergence) or is there a reference out
There are citations separately (slightly later) for the weak and strong forms of the law and when they hold. The article is quite specific about what these laws mean, but it may be that your source says the "law" means something else, such as the variance of the mean decreasing to zero (for which
Generally I agree. But in this instance I don't. We are talking about convergence. The sample average converges towards the mean; this should simply be interpreted as the distance between the two growing smaller as the number in the sample increases and tends to infinity. That we are dealing with two
Let's say we do coin tosses and let's say we assume P(head) = 0.1 and P(tail) = 0.9 as probabilities. That's a legitimate probability function according to Kolmogorov. But now the LLN becomes obviously false. So there must be some premise of the LLN that forbids this constellation. Which one is it?
integral over and came out with an answer that was very close to 9. When I switched back to the presented function I did not get answers consistent with the article. Concerned that I was doing something wrong in SAS, I also carried out the same process in R and got the same answers I got in SAS.
This whole section about "Lebesgue integrable" random variables makes *no mathematical sense whatsoever*. The person who wrote this clearly does not know the slightest thing about math. You cannot just take any random variable and start "integrating" it with respect to the Lebesgue measure. First
By the way, I was motivated to make one myself, so I just put in the animated gif of red and blue balls. I think my animated one and your non-animated one are complementary and both should be in the article...different readers may respond better to one or the other. Let me know if you have any
The section on application that uses Monte Carlo simulation and the Law of Large Numbers to approximate an integral may be erroneous. I wrote a program in SAS to carry out this approximation and was getting strange results. I changed the function to x^2 and used my program to approximate this
It is true that it is simply an explanation of convergence in probability in words. However, this may be very insightful to those who are not well versed in probability theory or even mathematical formalism. The section that includes this paragraph would be substantially poorer without this
The six numbers on a die are interchangeable with any other set of symbols - is an integer mean relevant? My first guess is that the result would be 3, if I'm looking at random integers, rather than 7/2, which seems like part of a different concept, or an artifact of the way dice are labeled
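For what it's worth, here is a short worked calculation, assuming a fair die whose faces carry the usual labels 1 to 6 (the symbol X for the face value is introduced here only for illustration). The expected value is a weighted average of the labels, so it need not be an integer or an attainable outcome:

```latex
\mathbb{E}[X] = \sum_{k=1}^{6} k \cdot \frac{1}{6} = \frac{1+2+3+4+5+6}{6} = \frac{21}{6} = \frac{7}{2}
```

With faces labelled 0 to 5 instead, the same calculation gives 15/6 = 5/2, so the particular value 7/2 does depend on the labelling, while the convergence of the sample average to whatever the expected value is does not.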
This section also gets wordy on another count, as it appears some are arguing as to whether the strong and weak forms are possibly equivalent. There are examples of probability distributions for which the weak law applies, but not the strong law. As such, I suggest the following be removed:
, for which we already have a fairly developed article; the second example should be dubbed “idiot’s fallacy” or something like that (really, is there a person who would think that out of 99 coin tosses, exactly 49.5 of them should be heads?); the third example is just a corollary from the
usage is more common and the focus of the article.) The difficulty in preventing students from conflating the so-called-"law" of averages/Gambler's fallacy with the law of large numbers is an extremely common problem for introductory probability and statistics instructors. I agree that
Interpreting this result, the weak law essentially states that for any nonzero margin specified, no matter how small, with a sufficiently large sample there will be a very high probability that the average of the observations will be close to the expected value, that is, within the
"Differences between the weak law and the strong law". It may be interesting to add here that the Weak Law may hold even if the expected value does not exist (see e.g. Feller's book). This underlines that, in their full generality, neither law follows directly from the other.
"Uniform law of large numbers". The uniform LLN holds under considerably weaker hypotheses. This is definitely uninteresting to the average reader, but a reference to the Blum-DeHardt LLN or the Glivenko-Cantelli problem might be very valuable to a small fraction of readers.
From reading this article many can get the wrong impression that a sequence of averages almost surely converges, and converges to the expected value. But in reality the law of large numbers only works when the expected value of the distribution exists, and there are many
the results on the ordinate, so that we can see that the curve of the results doesn't converge towards the theoretical curve), and next to it the curve of the empirical average (of the same experiment) which we would see converging towards the theoretical line
by the amount he lost in a single hour during a market crash. So that's a black swan event. When you are rolling dice, a black swan event is impossible because the distribution of possible numerical results is so restricted: 1,2,3,4,5,6. It is never 10 billion!
such as a die landing on edge or being struck by lightning mid-roll are not possible or ignored if they do occur" is "unencyclopedic and unnecessary"; it is linked to a page with well sourced explanations as to why it is encyclopedic. Additionally, if you think
wheel, its earnings will tend towards a predictable percentage over a large number of spins. Any winning streak by a player will eventually be overcome by the parameters of the game. Importantly, the law applies (as the name indicates) only when a
and with source code available. It also looks a little different and has different data (new data may be generated by anyone with the inclination using my provided source code (or their own)). I would like to propose that we switch to my image,
I don't understand your message, but why would the LLN "become obviously false"? If you toss your coin a large number of times, the number of tails divided by the number of tosses will tend toward 0.9; I don't understand your problem with that.
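The point of this reply can be checked with a small simulation. The following is only a sketch in Python, assuming independent tosses with P(tail) = 0.9 as in the question; the function name `tail_frequency` and the seed are made up for illustration.

```python
import random

def tail_frequency(n_tosses, p_tail=0.9, seed=0):
    """Fraction of tails in n_tosses independent tosses of a biased coin."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    tails = sum(1 for _ in range(n_tosses) if rng.random() < p_tail)
    return tails / n_tosses

# The relative frequency settles near p_tail (0.9), not near 0.5.
for n in (10, 1_000, 100_000):
    print(n, tail_frequency(n))
```

The LLN says nothing about the coin being fair; the relative frequency approaches whatever the true probability is, here 0.9.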
The Cauchy distribution is a bad example; it does not have a mean, and hence no finite variance or higher moments. And the proof using characteristic functions does not seem to use the assumption of finite variance. I'll add a citation.
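To make the Cauchy behaviour concrete, here is a rough Python sketch (not taken from any source in this thread). It uses the fact that the mean of n independent standard Cauchy draws is again standard Cauchy, so the chance that the sample mean lies outside [-1, 1] stays at 1/2 however large n is. The helper names `cauchy_draw` and `frac_mean_outside`, and the choices n = 100 and 2000 replications, are assumptions for illustration.

```python
import math
import random

def cauchy_draw(rng):
    # Inverse-CDF sampling of a standard Cauchy variate.
    return math.tan(math.pi * (rng.random() - 0.5))

def frac_mean_outside(draw, n, reps=2000, bound=1.0, seed=1):
    """Fraction of replications whose n-sample mean exceeds bound in absolute value."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        mean = sum(draw(rng) for _ in range(n)) / n
        if abs(mean) > bound:
            hits += 1
    return hits / reps

# Cauchy: the sample mean is as spread out as a single draw (fraction near 0.5).
cauchy_frac = frac_mean_outside(cauchy_draw, n=100)
# Standard normal, for contrast: the sample mean concentrates around 0.
normal_frac = frac_mean_outside(lambda rng: rng.gauss(0.0, 1.0), n=100)
print(cauchy_frac, normal_frac)
```

Averaging does not help at all in the Cauchy case, whereas for the normal comparison the fraction drops to essentially zero.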
Yes; if the VC dimension is infinite the learning function (with high probability) still converges pointwise to the correct characteristic function. But the convergence won't be uniform. Unfortunately, this is hardly a "quick" answer....
"According to the law of large numbers, if a large number of six-sided die are rolled, the average of their values (sometimes called the sample mean) is likely to be close to 3.5, with the precision increasing as more dice are rolled."
add a category to recall the fact that although the deviation from the mean decreases with increasing numbers, the standard deviation (that is, the raw deviation between the theoretical output and the actual one) increases when the number of trials
forms of convergence and that their exact definitions are different from one another is not of interest to the non-mathematician. And if the reader is interested in the exact difference between the two then there's an article about
this article (meaning Law of Averages), with redirect to LLN. The cited source uses the term “law of averages” as a synonym for LLN, and does not provide the interpretation given in this article. The examples section looks like
I believe my programs are producing the correct answers, which is why I am concerned that the example on the LLN page may be erroneous. Can someone please verify my calculations or tell me where the error is in my code?
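As an independent cross-check of the procedure described here, the following is a minimal Monte Carlo sketch in Python (rather than SAS). It estimates an integral over [a, b] as (b - a) times the average of f at uniform random points. The integration limits are not stated in this thread: [0, 3] for the x^2 check is assumed because it gives the exact value 9 mentioned above, and [-1, 2] for the other integrand is purely a hypothetical choice. The names `mc_integrate` and `article_f` are made up.

```python
import math
import random

def mc_integrate(f, a, b, n, seed=0):
    """Estimate the integral of f over [a, b] as (b - a) * mean of f(U), U ~ Uniform(a, b)."""
    rng = random.Random(seed)
    total = sum(f(a + (b - a) * rng.random()) for _ in range(n))
    return (b - a) * total / n

def article_f(x):
    # Integrand from the SAS snippet: cos(x)^2 * sqrt(x^3 + 1).
    # max(..., 0.0) guards against a tiny negative argument from rounding at x = -1.
    return math.cos(x) ** 2 * math.sqrt(max(x ** 3 + 1.0, 0.0))

# Process check: the integral of x^2 over the assumed interval [0, 3] is exactly 9.
check = mc_integrate(lambda x: x * x, 0.0, 3.0, 200_000)
# Assumed (hypothetical) limits [-1, 2] for the second integrand.
estimate = mc_integrate(article_f, -1.0, 2.0, 200_000)
print(check, estimate)
```

If the x^2 check reproduces the known value but the second estimate disagrees with the article, that would point at the article's quoted figure or limits rather than at the method.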
The problem with this is to make it clear exactly what "Borel's law of large numbers" is in the context of the larger article, since presumably Borel's law of large numbers is notable enough to be mentioned specifically.
all the relative frequencies involved in the given reasoning are stable in the first place, the difference from a finite number of trials between the measured and "ideal" mean is likely to be less than so and so.
of observations are considered. There is no principle that a small number of observations will coincide with the expected value or that a streak of one value will immediately be "balanced" by the others (see the
to clarify that it was the margin to which the paragraph referred. I did so because, despite knowing the LLN and its mathematical formulation, it wasn't immediately clear to me what margin was being discussed.
* Set up function for integration manually *;
/* fi = xi**2; Simple Example for process checking */
fi = cos(xi)*cos(xi)*sqrt((xi*xi*xi)+1); /* Knowledge Example 1 */
There is a universal common-sense intuitive understanding of what it means to "roll a die". According to that understanding, black swan events are irrelevant and the simple sentence is completely correct.
I'm looking for a quick answer, trying to resolve a certain issue. Does the LLN hold even when we're collecting samples from and for a model/function that has infinite VC dimension?
; the last example is not even funny: people don't think that in the long run a good team and a bad team would perform equally; that would contradict the mere notion of “skill”.
that the empirical mean deviates by 1/2 from the theoretical expectation; the strong LLN only states that for these bad initial n tosses the error will a.s. be corrected later.
Is false: the difference in absolute value will be unbounded, but it will also be 0 infinitely many times. The lim inf will be zero, and the lim sup will be infinity.
there which gives the stronger statement? If not, it may be worth revising the statement to more accurately reflect the references. Gillespie 22:17, 21 March 2021 (UTC)
Whitt, Ward (2002) Stochastic-Process Limits, An Introduction to Stochastic-Process Limits and their Application to Queues, Chapter 1: Experiencing Statistical Regularity
provides only for uniform convergence in probability. The other (Jennrich 1969) gives a formulation which allows only for the function f to be continuous everywhere (
I removed the annoying references/citations tag and added a few references. Should there be more citations? Should I have left the tag where it was? I don't think so.
"with the accuracy increasing as more dice are rolled." This is not correct, and in the figure the accuracy for n=100 is greater than for n=200 or even 300.
study math (and measure theory) before you come to Knowledge to "educate" people with your wisdom that is nothing more than pure ignorance and stupidity.
I think any time you can add text interpretation to a math article it is hugely helpful, even if those who already know it all find it just extra words.
as well as the clarification needed prior, and the citation needed after. Is the StackExchange conversation a sufficient reference to make such an edit?
that might be of use? I added a {{mergeto}} tag to that article. If there is nothing worthwhile then perhaps simply replace it with a redirect?
not the case (and on the contrary, the standard deviation increases), and this value converges only if we divide it by the number of trials.
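The distinction made here can be illustrated numerically. This is a sketch in Python, assuming fair six-sided dice. For n rolls the typical size of the raw deviation (sum minus 3.5n) grows like sqrt(n), with root-mean-square value sqrt(35n/12), while the deviation of the average from 3.5 shrinks like 1/sqrt(n). The function name `rms_deviations`, the number of replications and the seed are made up for illustration.

```python
import math
import random

MU = 3.5  # expected value of one fair six-sided die

def rms_deviations(n, reps=200, seed=2):
    """Root-mean-square of (sum - n*MU) and of (mean - MU) over reps runs of n rolls."""
    rng = random.Random(seed)
    sq_sum_dev = 0.0
    for _ in range(reps):
        total = sum(rng.randint(1, 6) for _ in range(n))
        sq_sum_dev += (total - n * MU) ** 2
    rms_sum = math.sqrt(sq_sum_dev / reps)
    # The mean's deviation is the sum's deviation divided by n.
    return rms_sum, rms_sum / n

for n in (100, 2500):
    rms_sum, rms_mean = rms_deviations(n)
    print(n, round(rms_sum, 2), round(rms_mean, 4))
```

Going from 100 to 2500 rolls, the typical raw deviation of the sum gets larger while the typical deviation of the average gets smaller, which is the distinction a diagram in the article could show.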
This is simply explaining in words what convergence in probability is. I don't consider it useful. I'll remove it shortly if no-one objects.
The LLN is important because it guarantees stable long-term results for the averages of some random events. For example, while a
I hope that someone knowledgeable about this topic can rewrite the introduction so that it is comprehensible to most readers.
The two ideas are not the same, and there's already a cross-reference in both articles' "See also" sections. I don't think an
requires additional assumptions, equivalent to assuming the law itself true a priori. See a more elaborate explanation here:
So again, black swan events should certainly not be mentioned in the context of rolling dice. It is an irrelevant tangent. --
as an example where the variance is not finite, and the Law of Large Numbers does not hold (in the section 'Cauchy case').
detailed explanation as to why you think the sentence "This assumes that all possible die roll outcomes are known and that
do i = 1 to &n; xi = &a+%sysevalf(&b-&a)*ranuni(0); * Generate U random values *;
infinite number of trials to a definite number, this "P()". The only thing the theorem allows one to conclude is that
should be part of a collection of non-mathematically-formal articles partly related to gambling, as in mention of
, the meaning of the term "theoretical results" (the previous term, but this time inside quotation marks) is
To date it has not been possible to prove that the strong law conditions are the same as those of the weak law.
that describes the result of performing the same experiment a large number of times. According to the law, the
probabilities. Would this be appropriate for inclusion with the group of articles on the Laws of Large Numbers?
other formulas that look similar are not verified, such as the raw deviation from "theoretical results":
required, without citation or justification. Every single other source I saw said that finite variance
.". I do not believe this sentence (if I am wrong, please ignore me and delete this post). If you fix
despite it being pointed out that discussion is here, so I have copied the immediately above to here.
condition E(|X|) < ∞ is the same as saying that the random variable X has a Lebesgue-integrable expectation
depends on the point in the set of convergence (the non-uniformity alluded to in the second comment).
These two comments seem correct to me: the mistake in the article stems from the fact that, given
This seems like a fairly serious mistake to me so I'll immediately remove the wrong statement.
What if we used a different legitimate integration method to define the expectation, such as:
Then we will surely have random variables with finite expectation for which the LLN does not hold.
It is also important to note that the LLN only applies to the average. Therefore, while
Yao, Kai; Gao, Jinwu (2016). "Law of Large Numbers for Uncertain Random Variables".
Preferably by someone who understands what writing an encyclopedia article entails.
is here.) I think it would be better to rewrite most of that verbiage, specifically:
In my opinion, many statements expressed on this page are not correct, such as:
and tends to become closer to the expected value as more trials are performed.
of the results obtained from a large number of trials should be close to the
For example, for the flip of a fair coin, for any n there is a probability
Birthday date problem issue sir
$\sum_{i=1}^{n} X_i - n \times \overline{X}$
I was reading some papers on statistical learning (
