# Topic: **Statistics**

You are looking at all articles with the topic **"Statistics"**. We found 68 matches.


# Timeline of the far future

While the future can never be predicted with absolute certainty, present understanding in various scientific fields allows for the prediction of some far-future events, if only in the broadest outline. These fields include astrophysics, which has revealed how planets and stars form, interact, and die; particle physics, which has revealed how matter behaves at the smallest scales; evolutionary biology, which predicts how life will evolve over time; and plate tectonics, which shows how continents shift over millennia.

All projections of the future of Earth, the Solar System, and the universe must account for the second law of thermodynamics, which states that entropy, or a loss of the energy available to do work, must rise over time. Stars will eventually exhaust their supply of hydrogen fuel and burn out. Close encounters between astronomical objects gravitationally fling planets from their star systems, and star systems from galaxies.

Physicists expect that matter itself will eventually come under the influence of radioactive decay, as even the most stable materials break apart into subatomic particles. Current data suggest that the universe has a flat geometry (or very close to flat), and thus will not collapse in on itself after a finite time, and the infinite future allows for the occurrence of a number of massively improbable events, such as the formation of Boltzmann brains.

The timelines displayed here cover events from the beginning of the 11th millennium to the furthest reaches of future time. A number of alternative future events are listed to account for questions still unresolved, such as whether humans will become extinct, whether protons decay, and whether the Earth survives when the Sun expands to become a red giant.

### Discussed on

- "Timeline of the Far Future" | 2024-01-14 | 16 Upvotes 1 Comments
- "Timeline of the Far Future" | 2020-05-17 | 168 Upvotes 112 Comments
- "Timeline of the far future" | 2018-07-18 | 696 Upvotes 258 Comments
- "Timeline of the far future" | 2012-05-06 | 294 Upvotes 88 Comments

# Simpson's Paradox

**Simpson's paradox**, which goes by several names, is a phenomenon in probability and statistics, in which a trend appears in several different groups of data but disappears or reverses when these groups are combined. This result is often encountered in social-science and medical-science statistics and is particularly problematic when frequency data is unduly given causal interpretations. The paradox can be resolved when causal relations are appropriately addressed in the statistical modeling.
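The reversal is easy to check numerically. The sketch below uses illustrative counts patterned on the frequently cited kidney-stone treatment example (the subgroup labels and numbers are for illustration only): treatment A has the higher success rate within each subgroup, yet B has the higher rate overall.

```python
# Illustrative (successes, trials) counts, patterned on the classic
# kidney-stone example: two treatments, two subgroups by stone size.
a = {"small": (81, 87), "large": (192, 263)}
b = {"small": (234, 270), "large": (55, 80)}

def rate(successes, trials):
    return successes / trials

def pooled_rate(groups):
    """Success rate after amalgamating all subgroups."""
    return rate(sum(s for s, _ in groups.values()),
                sum(n for _, n in groups.values()))

# Treatment A wins within every subgroup...
for size in ("small", "large"):
    assert rate(*a[size]) > rate(*b[size])

# ...yet treatment B wins once the subgroups are pooled: the reversal.
assert pooled_rate(b) > pooled_rate(a)
print(f"A overall: {pooled_rate(a):.0%}, B overall: {pooled_rate(b):.0%}")
# -> A overall: 78%, B overall: 83%
```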

Simpson's paradox has been used as an exemplar to illustrate to the non-specialist or public audience the kind of misleading results that misapplied statistics can generate. Martin Gardner wrote a popular account of Simpson's paradox in his March 1976 Mathematical Games column in *Scientific American*.

Edward H. Simpson first described this phenomenon in a technical paper in 1951, but the statisticians Karl Pearson et al., in 1899, and Udny Yule, in 1903, had mentioned similar effects earlier. The name *Simpson's paradox* was introduced by Colin R. Blyth in 1972.

It is also referred to as **Simpson's reversal**, the **Yule–Simpson effect**, the **amalgamation paradox**, or the **reversal paradox**.

### Discussed on

- "Simpson's Paradox" | 2024-03-11 | 365 Upvotes 106 Comments
- "Simpson's Paradox" | 2022-02-06 | 11 Upvotes 3 Comments
- "Simpson's paradox" | 2011-07-29 | 174 Upvotes 34 Comments
- "Simpson's paradox: why mistrust seemingly simple statistics" | 2009-08-28 | 152 Upvotes 17 Comments

# Goodhart's Law

**Goodhart's law** is an adage named after economist Charles Goodhart, which has been phrased by Marilyn Strathern as "When a measure becomes a target, it ceases to be a good measure." One way this can occur is when individuals anticipate the effect of a policy and then take actions that alter its outcome.

### Discussed on

- "Goodhart's Law" | 2021-09-17 | 178 Upvotes 83 Comments
- "Goodhart's Law: When a measure becomes a target, it ceases to be a good measure" | 2018-06-15 | 229 Upvotes 134 Comments

# The German tank problem

In the statistical theory of estimation, the **German tank problem** consists of estimating the maximum of a discrete uniform distribution from sampling without replacement. In simple terms, suppose we have an unknown number of items which are sequentially numbered from 1 to *N*. We take a random sample of these items and observe their sequence numbers; the problem is to estimate *N* from these observed numbers.

The problem can be approached using either frequentist inference or Bayesian inference, leading to different results. Estimating the population maximum based on a *single* sample yields divergent results, whereas estimation based on *multiple* samples is a practical estimation question whose answer is simple (especially in the frequentist setting) but not obvious (especially in the Bayesian setting).
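The standard frequentist answer is the minimum-variance unbiased estimator $\hat{N} = m + m/k - 1$, where $m$ is the largest serial number observed and $k$ is the sample size. A minimal sketch (the sample values below are illustrative):

```python
def estimate_n(serials):
    """Frequentist minimum-variance unbiased estimate of the population
    maximum N: the sample maximum plus the average gap between samples,
    N_hat = m + m/k - 1."""
    m, k = max(serials), len(serials)
    return m + m / k - 1

# Illustrative sample of four observed serial numbers:
print(estimate_n([19, 40, 42, 60]))  # -> 74.0
```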

The problem is named after its historical application by Allied forces in World War II to the estimation of the monthly rate of German tank production from very few data. This exploited the manufacturing practice of assigning and attaching ascending sequences of serial numbers to tank components (chassis, gearbox, engine, wheels), with some of the tanks eventually being captured in battle by Allied forces.

### Discussed on

- "German Tank Problem" | 2023-11-26 | 39 Upvotes 7 Comments
- "The German tank problem" | 2016-12-03 | 64 Upvotes 6 Comments
- "German tank problem" | 2014-02-21 | 231 Upvotes 83 Comments
- "German tank problem" | 2011-03-02 | 28 Upvotes 1 Comments
- "The German Tank Problem" | 2009-06-23 | 103 Upvotes 18 Comments

# Secretary Problem

The **secretary problem** is a problem that demonstrates a scenario involving optimal stopping theory. The problem has been studied extensively in the fields of applied probability, statistics, and decision theory. It is also known as the **marriage problem**, the **sultan's dowry problem**, the **fussy suitor problem**, the **googol game**, and the **best choice problem**.

The basic form of the problem is the following: imagine an administrator who wants to hire the best secretary out of $n$ rankable applicants for a position. The applicants are interviewed one by one in random order. A decision about each particular applicant is to be made immediately after the interview. Once rejected, an applicant cannot be recalled. During the interview, the administrator gains information sufficient to rank the applicant among all applicants interviewed so far, but is unaware of the quality of yet unseen applicants. The question is about the optimal strategy (stopping rule) to maximize the probability of selecting the best applicant. If the decision can be deferred to the end, this can be solved by the simple maximum selection algorithm of tracking the running maximum (and who achieved it), and selecting the overall maximum at the end. The difficulty is that the decision must be made immediately.

The shortest rigorous proof known so far is provided by the odds algorithm (Bruss 2000). It implies that the optimal win probability is always at least $1/e$ (where *e* is the base of the natural logarithm), and that the latter holds even in a much greater generality (2003). The optimal stopping rule prescribes always rejecting the first $\sim n/e$ applicants that are interviewed and then stopping at the first applicant who is better than every applicant interviewed so far (or continuing to the last applicant if this never occurs). Sometimes this strategy is called the $1/e$ stopping rule, because the probability of stopping at the best applicant with this strategy is about $1/e$ already for moderate values of $n$. One reason why the secretary problem has received so much attention is that the optimal policy for the problem (the stopping rule) is simple and selects the single best candidate about 37% of the time, irrespective of whether there are 100 or 100 million applicants.
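The $\sim n/e$ rule is easy to verify by simulation; the sketch below (a rough Monte Carlo check, not from the article) confirms a success rate near $1/e \approx 0.368$ for $n = 100$:

```python
import math
import random

def secretary(n, trials=100_000, seed=0):
    """Win rate of the classic stopping rule: skip the first r ~ n/e
    applicants, then accept the first one better than all seen so far
    (falling back to the last applicant if none is better)."""
    rng = random.Random(seed)
    r = round(n / math.e)
    wins = 0
    for _ in range(trials):
        ranks = list(range(n))  # rank n - 1 is the best applicant
        rng.shuffle(ranks)
        best_seen = max(ranks[:r], default=-1)
        chosen = next((x for x in ranks[r:] if x > best_seen), ranks[-1])
        wins += chosen == n - 1
    return wins / trials

print(f"{secretary(100):.3f}")  # close to 1/e ~ 0.368
```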

### Discussed on

- "Secretary Problem" | 2024-04-12 | 31 Upvotes 7 Comments
- "The Secretary Problem" | 2022-08-18 | 202 Upvotes 120 Comments
- "Secretary Problem" | 2017-10-27 | 145 Upvotes 62 Comments

# Monty Hall Problem

The **Monty Hall problem** is a brain teaser, in the form of a probability puzzle, loosely based on the American television game show *Let's Make a Deal* and named after its original host, Monty Hall. The problem was originally posed (and solved) in a letter by Steve Selvin to the *American Statistician* in 1975 (Selvin 1975a), (Selvin 1975b). It became famous as a question from a reader's letter quoted in Marilyn vos Savant's "Ask Marilyn" column in *Parade* magazine in 1990 (vos Savant 1990a):

> Suppose you're on a game show, and you're given the choice of three doors: Behind one door is a car; behind the others, goats. You pick a door, say No. 1, and the host, who knows what's behind the doors, opens another door, say No. 3, which has a goat. He then says to you, "Do you want to pick door No. 2?" Is it to your advantage to switch your choice?

Vos Savant's response was that the contestant should switch to the other door (vos Savant 1990a). Under the standard assumptions, contestants who switch have a 2/3 chance of winning the car, while contestants who stick to their initial choice have only a 1/3 chance.
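The 2/3-versus-1/3 split is straightforward to confirm by simulation. The sketch below assumes the standard rules (the host always opens a goat door the contestant did not pick and always offers the switch); the host's tie-break when both unchosen doors hide goats does not affect the win rates.

```python
import random

def monty(trials=100_000, switch=True, seed=1):
    """Simulate the standard Monty Hall game and return the win rate."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        car = rng.randrange(3)
        pick = rng.randrange(3)
        # Host opens a goat door that is neither the pick nor the car.
        opened = next(d for d in range(3) if d != pick and d != car)
        if switch:
            # Switch to the one remaining closed door.
            pick = next(d for d in range(3) if d != pick and d != opened)
        wins += pick == car
    return wins / trials

print(f"switch: {monty(switch=True):.3f}")   # near 2/3
print(f"stay:   {monty(switch=False):.3f}")  # near 1/3
```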

The given probabilities depend on specific assumptions about how the host and contestant choose their doors. A key insight is that, under these standard conditions, there is more information about doors 2 and 3 than was available at the beginning of the game when door 1 was chosen by the player: the host's deliberate action adds value to the door he did not choose to eliminate, but not to the one chosen by the contestant originally. Another insight is that switching doors is a different action than choosing between the two remaining doors at random, as the first action uses the previous information and the latter does not. Other possible behaviors than the one described can reveal different additional information, or none at all, and yield different probabilities. Yet another insight is that your chance of winning by switching doors is directly related to your chance of choosing the winning door in the first place: if you choose the correct door on your first try, then switching loses; if you choose a wrong door on your first try, then switching wins; your chance of choosing the correct door on your first try is 1/3, and the chance of choosing a wrong door is 2/3.

Many readers of vos Savant's column refused to believe switching is beneficial despite her explanation. After the problem appeared in *Parade*, approximately 10,000 readers, including nearly 1,000 with PhDs, wrote to the magazine, most of them claiming vos Savant was wrong (Tierney 1991). Even when given explanations, simulations, and formal mathematical proofs, many people still do not accept that switching is the best strategy (vos Savant 1991a). Paul Erdős, one of the most prolific mathematicians in history, remained unconvinced until he was shown a computer simulation demonstrating vos Savant's predicted result (Vazsonyi 1999).

The problem is a paradox of the *veridical* type, because the correct choice (that one should switch doors) is so counterintuitive it can seem absurd, but is nevertheless demonstrably true. The Monty Hall problem is mathematically closely related to the earlier Three Prisoners problem and to the much older Bertrand's box paradox.

### Discussed on

- "Monty Hall Problem" | 2022-06-09 | 24 Upvotes 116 Comments
- "Monty Hall Problem" | 2019-10-24 | 122 Upvotes 252 Comments
- "Monty Hall problem" | 2010-02-22 | 14 Upvotes 27 Comments

# Benford's Law

**Benford's law**, also called the **Newcomb–Benford law**, the **law of anomalous numbers**, or the **first-digit law**, is an observation about the frequency distribution of leading digits in many real-life sets of numerical data. The law states that in many naturally occurring collections of numbers, the leading significant digit is likely to be small. For example, in sets that obey the law, the number 1 appears as the leading significant digit about 30% of the time, while 9 appears as the leading significant digit less than 5% of the time. If the digits were distributed uniformly, they would each occur about 11.1% of the time. Benford's law also makes predictions about the distribution of second digits, third digits, digit combinations, and so on.
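For base 10, the first-digit probabilities are given by $P(d) = \log_{10}(1 + 1/d)$; a short sketch tabulates them:

```python
import math

# First-digit probabilities under Benford's law: P(d) = log10(1 + 1/d)
probs = {d: math.log10(1 + 1 / d) for d in range(1, 10)}

for d, p in probs.items():
    print(d, f"{p:.1%}")

# Digit 1 leads about 30.1% of the time and digit 9 about 4.6%,
# versus the 11.1% each that a uniform distribution would give.
assert abs(sum(probs.values()) - 1.0) < 1e-12
```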

The graph to the right shows Benford's law for base 10. There is a generalization of the law to numbers expressed in other bases (for example, base 16), and also a generalization from leading 1 digit to leading *n* digits.

It has been shown that this result applies to a wide variety of data sets, including electricity bills, street addresses, stock prices, house prices, population numbers, death rates, lengths of rivers, and physical and mathematical constants. Like other general principles about natural data (for example, the fact that many data sets are well approximated by a normal distribution), there are illustrative examples and explanations that cover many of the cases where Benford's law applies, though many other cases where it applies resist a simple explanation. The law tends to be most accurate when values are distributed across multiple orders of magnitude, especially if the process generating the numbers is described by a power law (power laws are common in nature).

It is named after physicist Frank Benford, who stated it in 1938 in a paper titled "The Law of Anomalous Numbers", although it had been previously stated by Simon Newcomb in 1881.

### Discussed on

- "Benford's Law" | 2020-02-15 | 145 Upvotes 93 Comments
- "Benford's Law" | 2017-11-19 | 107 Upvotes 44 Comments
- "Benford's law" | 2014-05-24 | 56 Upvotes 19 Comments
- "Random numbers need not be uniform" | 2010-06-14 | 25 Upvotes 32 Comments

# Kelly Criterion

In probability theory and intertemporal portfolio choice, the **Kelly criterion** (or **Kelly strategy** or **Kelly bet**), also known as the scientific gambling method, is a formula for bet sizing that almost surely leads to higher wealth than any other strategy in the long run (i.e. in the limit as the number of bets goes to infinity). The Kelly bet size is found by maximizing the expected value of the logarithm of wealth, which is equivalent to maximizing the expected geometric growth rate. The criterion prescribes betting a predetermined fraction of assets, which can seem counterintuitive. It was described by J. L. Kelly Jr, a researcher at Bell Labs, in 1956.

For an even-money bet, the Kelly criterion computes the optimal wager size by multiplying the percent chance to win by two, then subtracting one hundred percent. So, for a bet with a 70% chance to win, the optimal wager size is 40% of available funds.
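The even-money rule above can be sketched in a few lines; the general $b$-to-1 form $f^* = p - (1-p)/b$ is included for comparison (the even-money rule is its $b = 1$ special case):

```python
def kelly_even_money(p):
    """Even-money Kelly fraction: double the win probability, subtract 100%."""
    return 2 * p - 1

def kelly(p, b):
    """General Kelly fraction for a bet paying b-to-1: f* = p - (1 - p) / b."""
    return p - (1 - p) / b

# The worked example from the text: 70% chance to win an even-money bet.
print(f"{kelly_even_money(0.70):.0%}")  # -> 40%

# Even money is the b = 1 special case of the general formula.
assert abs(kelly(0.70, 1) - kelly_even_money(0.70)) < 1e-12
```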

The practical use of the formula has been demonstrated for gambling and the same idea was used to explain diversification in investment management. In the 2000s, Kelly-style analysis became a part of mainstream investment theory and the claim has been made that well-known successful investors including Warren Buffett and Bill Gross use Kelly methods. William Poundstone wrote an extensive popular account of the history of Kelly betting.

### Discussed on

- "Kelly Criterion" | 2021-04-16 | 330 Upvotes 194 Comments

# Micromort

A **micromort** (from micro- and mortality) is a unit of risk defined as a one-in-a-million chance of death. Micromorts can be used to measure the riskiness of various day-to-day activities. A **microprobability** is a one-in-a-million chance of some event; thus a micromort is the microprobability of death. The micromort concept was introduced by Ronald A. Howard, who pioneered the modern practice of decision analysis.

Micromorts for future activities can only be rough assessments, as specific circumstances will always have an impact. However, past historical rates of events can be used to provide a ballpark, average figure.

### Discussed on

- "Micromort" | 2023-08-26 | 15 Upvotes 3 Comments
- "Micromort" | 2020-06-19 | 152 Upvotes 72 Comments
- "Micromort" | 2013-08-23 | 173 Upvotes 99 Comments

# Friendship Paradox

The **friendship paradox** is the phenomenon first observed by the sociologist Scott L. Feld in 1991 that most people have fewer friends than their friends have, on average. It can be explained as a form of sampling bias in which people with greater numbers of friends have an increased likelihood of being observed among one's own friends. In contradiction to this, most people believe that they have more friends than their friends have.

The same observation can be applied more generally to social networks defined by other relations than friendship: for instance, most people's sexual partners have had (on the average) a greater number of sexual partners than they have.
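The sampling-bias mechanism shows up even in a toy network: a well-connected hub appears on many people's friend lists, inflating the average degree of one's friends. A minimal sketch (the graph below is illustrative, not from the article):

```python
# A small star-like network: hub 0 knows everyone, plus one extra edge.
edges = [(0, 1), (0, 2), (0, 3), (0, 4), (1, 2)]

friends = {}
for a, b in edges:
    friends.setdefault(a, set()).add(b)
    friends.setdefault(b, set()).add(a)

deg = {v: len(fs) for v, fs in friends.items()}
mean_degree = sum(deg.values()) / len(deg)

# Average, over people, of the mean degree of each person's friends.
mean_friend_degree = sum(
    sum(deg[f] for f in fs) / len(fs) for fs in friends.values()
) / len(friends)

print(mean_degree, mean_friend_degree)  # -> 2.0 3.1
assert mean_friend_degree > mean_degree  # the friendship paradox
```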

### Discussed on

- "The Friendship Paradox" | 2021-07-04 | 27 Upvotes 4 Comments
- "Friendship Paradox" | 2014-08-04 | 288 Upvotes 69 Comments
- "The Friendship Paradox: Why People's Friends Have More Friends Than They Do" | 2010-04-12 | 41 Upvotes 15 Comments