What Is T-Distribution?
The T-Distribution is a continuous probability distribution. It is used when the sample size is small, say fewer than 30 observations, and the population standard deviation is unknown. The associated t-test measures the disparity between the sample mean and the population mean.
In statistics, it is used to determine the lower and upper limits of a confidence interval for the mean of a normally distributed population when the sample is small. Just like the normal distribution, the T-Distribution forms a symmetric bell-shaped curve. But T-curves have fatter tails than the normal curve, so extreme values are more probable and confidence intervals are wider.
Key Takeaways
- The T-Distribution is a probability distribution from which probabilities (p-values) are obtained. It is used to assess statistical significance when the sample size is small, i.e., less than 30, and the population standard deviation is unknown.
- The mean of a T-Distribution is zero, and its variance is v/(v-2) for v > 2, where v is the degree of freedom.
- It is used to find the lower and upper limits of a confidence interval.
T-Distribution in Statistics Explained
William Sealy Gosset introduced the T-Distribution in 1908, publishing under the pseudonym "Student". It is applied to cases with a small sample size and an unknown population standard deviation.
The following properties of T-Distribution differentiate it from the other kinds of probability distributions:
- The T-Distribution, or Student's T-Distribution, forms a symmetric bell-shaped curve with fatter tails than the normal curve.
- Its mean is zero.
- The value of the distribution ranges between -∞ and ∞.
- Its variance is v/(v-2), where ‘v’ denotes the degree of freedom and v > 2: ‘Var (t) = v/(v -2)’.
- The variance is always greater than 1. As the degree of freedom approaches infinity, it approaches 1, the variance of the standard normal distribution.
- Compared to the standard normal distribution, Student's T-Distribution is more dispersed. However, for larger sample sizes, i.e., n ≥ 30, it closely resembles the normal distribution.
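The variance property can be checked with a few lines of Python. This is only a minimal sketch: the function name `t_variance` and the listed degrees of freedom are illustrative choices, not part of any library.

```python
def t_variance(df: float) -> float:
    """Variance of the T-Distribution, v / (v - 2), defined only for v > 2."""
    if df <= 2:
        raise ValueError("the variance is finite only for more than 2 degrees of freedom")
    return df / (df - 2)


# Illustrative degrees of freedom (assumed values, chosen to show the trend).
for df in (3, 5, 10, 30, 100, 1000):
    print(f"v = {df:>4}: Var(t) = {t_variance(df):.4f}")
```

The printed variances stay above 1 and shrink toward 1, the variance of the standard normal distribution, as v grows.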
[Figure: T-Distribution graph]
Although it is widely used, the method is criticized for its inaccuracy in certain cases. Also, for larger sample sizes, the normal distribution tends to be a better option.
Calculation
This method involves the computation of two values:
#1 - T-Score
The formula used to determine the t-score is as follows:
t = (x̄ - μ)/(s/√n)
Here,
- x̄ is the sample mean;
- μ is the population mean;
- s is the sample standard deviation;
- n is the sample size.
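As a rough illustration, the t-score can be computed in Python as below. The function names and the sample figures are made up for this sketch; only the standard library is used.

```python
import math
import statistics


def t_score(sample_mean: float, population_mean: float, sample_sd: float, n: int) -> float:
    """t = (sample mean - population mean) / (s / sqrt(n))."""
    if n < 2:
        raise ValueError("at least two observations are needed")
    if sample_sd <= 0:
        raise ValueError("the sample standard deviation must be positive")
    standard_error = sample_sd / math.sqrt(n)
    return (sample_mean - population_mean) / standard_error


def t_score_from_data(data: list[float], population_mean: float) -> float:
    """Same statistic, computed from raw observations."""
    # statistics.stdev uses the n - 1 denominator, i.e. the sample standard deviation.
    return t_score(statistics.mean(data), population_mean, statistics.stdev(data), len(data))


# Hypothetical figures: sample mean 52, claimed mean 50, s = 4, n = 16.
print(t_score(52.0, 50.0, 4.0, 16))  # (52 - 50) / (4 / 4) = 2.0
```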
#2 - Degree of Freedom
The degree of freedom determines the shape and the variance of the T-Distribution for the given data series. It is computed as sample size minus 1:
df = n - 1
Here,
- ‘df’ is the degree of freedom;
- ‘n’ is the sample size.
The first formula gives the t-score. Then, the t-score and the degree of freedom are used to determine the p-value, or probability, using the T-Distribution table. This way, the probability of observing a result at least as extreme as the sample is determined.
Alternatively, online T-Distribution calculators can be used to derive the results.
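What a table or calculator does can be approximated by integrating the T-Distribution density numerically. The sketch below is one possible approach using only the standard library, not the method of any particular calculator. The names `t_pdf`, `t_cdf`, and `p_value` are illustrative, and Simpson's rule is a choice made here for simplicity.

```python
import math


def t_pdf(x: float, df: float) -> float:
    """Density of the T-Distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2) - 0.5 * math.log(df * math.pi)
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(t: float, df: float, steps: int = 2000) -> float:
    """P(T <= t), using Simpson's rule on [0, |t|] plus the symmetry about zero."""
    a = abs(t)
    if a == 0:
        return 0.5
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # P(0 <= T <= |t|)
    return 0.5 + area if t > 0 else 0.5 - area


def p_value(t: float, df: float, two_tailed: bool = True) -> float:
    """Tail probability beyond |t|, doubled for a two-tailed test."""
    tail = 1.0 - t_cdf(abs(t), df)
    return 2 * tail if two_tailed else tail


# Hypothetical input: t = 2.0 with 15 degrees of freedom.
print(f"two-tailed p = {p_value(2.0, 15):.4f}")
print(f"one-tailed p = {p_value(2.0, 15, two_tailed=False):.4f}")
```

For this hypothetical input, the two-tailed p-value falls between 0.05 and 0.10, which agrees with the standard table entries for 15 degrees of freedom (1.753 and 2.131).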
Example
ABC Poultry Farms supplies eggs. The company claims its eggs remain fresh for five days if refrigerated. An analyst samples 25 eggs to test this claim. The average freshness of the sampled eggs was 4.5 days, with a standard deviation of one day. If the company's claim is true, find the probability of observing a sample mean at least this far from five days.
Solution:
Given:
- x̄ = 4.5 days
- μ = 5 days
- s = 1 day
- n = 25
Therefore,
t = (x̄-µ)/(s/√n)
t = (4.5 – 5)/(1/√25)
t = -0.5/0.2 = -2.5
Since the T-Distribution is symmetric, the sign does not affect a two-tailed probability, so we use |t| = 2.5.
Degree of Freedom (df) = n – 1
df = 25 – 1 = 24
Thus, according to the t-test, the two-tailed probability (p-value) of a sample mean at least 0.5 days away from the claimed five days is 0.01965418. The one-tailed probability of the sample mean being 4.5 days or less is half of that, about 0.0098.
Note: To find the p-value, we entered the t-score and the degree of freedom into an online calculator, which returned 0.01965418.
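The figure from the calculator can be sanity-checked by simulation. The sketch below assumes, purely for illustration, that egg freshness is normally distributed around the claimed five days. The article does not state this, and the function name, trial count, and seed are arbitrary choices. It repeatedly draws samples of 25, computes each t-score, and counts how often it is as extreme as 2.5.

```python
import math
import random


def simulate_tail_fractions(n=25, mu=5.0, sigma=1.0, t_obs=2.5, trials=40_000, seed=0):
    """Return (share of t <= -t_obs, share of |t| >= t_obs) over simulated samples."""
    rng = random.Random(seed)
    lower = both = 0
    for _ in range(trials):
        # Assumption: freshness ~ Normal(mu, sigma) when the company's claim holds.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        mean = sum(sample) / n
        variance = sum((x - mean) ** 2 for x in sample) / (n - 1)  # sample variance
        t = (mean - mu) / math.sqrt(variance / n)
        if t <= -t_obs:
            lower += 1
        if abs(t) >= t_obs:
            both += 1
    return lower / trials, both / trials


one_sided, two_sided = simulate_tail_fractions()
print(f"share with t <= -2.5:  {one_sided:.4f}")
print(f"share with |t| >= 2.5: {two_sided:.4f}")
```

With enough trials, the second share should land near 0.0197 and the first near half of that, which suggests that 0.01965418 corresponds to the two-tailed probability.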
T-Distribution vs Normal Distribution
The T-Distribution differs from the normal distribution, but both form a symmetric bell-shaped curve. Also, the T-Distribution and the standard normal distribution both have a mean of zero. If the degree of freedom is high, Student's T-Distribution closely approximates the standard normal distribution.
The differences are as follows:
Basis | T-Distribution | Normal Distribution |
---|---|---|
Meaning | It is a continuous probability distribution; the t-score and the degree of freedom together give the p-value for a data set. It is used when the sample size is small and the population standard deviation is unknown. | The normal distribution is the most common continuous probability distribution. It is used to model independent random variables. |
Sample Size | It is applied when the sample size is small, i.e., less than 30. | It is applied when the sample size is large. |
Population Standard Deviation | The population standard deviation is not given. | The population standard deviation is known. |
Curve | The T-curve is flatter at the peak and heavier in the tails. | The standard normal distribution curve is taller at the peak and thinner in the tails. |
Conservativeness | More conservative | Less conservative |
Formula | 'T = (x̄-µ)/(s/√n)' Here x̄ is the sample mean; μ is the population mean; s is the sample standard deviation; and n is the sample size. | 'Z = (X - µ)/σ' Here Z is the Z-score of an observation X; µ is the population mean; and σ is the population standard deviation. |
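The "Conservativeness" row can be illustrated by comparing 95% margins of error. In the sketch below, the t critical values are copied from a standard t table and rounded to three decimals, the sample figures (s = 1, n = 25) reuse the egg example, and the names are illustrative only.

```python
import math
from statistics import NormalDist

# Two-sided 95% critical values, copied from a standard t table (three decimals).
T_CRITICAL_95 = {5: 2.571, 10: 2.228, 24: 2.064, 30: 2.042, 120: 1.980}
Z_CRITICAL_95 = NormalDist().inv_cdf(0.975)  # about 1.960


def margin_of_error(critical_value: float, sd: float, n: int) -> float:
    """Half-width of the confidence interval: critical value * sd / sqrt(n)."""
    return critical_value * sd / math.sqrt(n)


sd, n = 1.0, 25  # figures from the egg example; df = n - 1 = 24
t_margin = margin_of_error(T_CRITICAL_95[n - 1], sd, n)
z_margin = margin_of_error(Z_CRITICAL_95, sd, n)
print(f"t-based margin: {t_margin:.4f} days")
print(f"z-based margin: {z_margin:.4f} days")

# The t critical value shrinks toward the z value as the degree of freedom grows.
for df, t_crit in sorted(T_CRITICAL_95.items()):
    print(f"df = {df:>3}: t = {t_crit:.3f} vs z = {Z_CRITICAL_95:.3f}")
```

The t-based interval is wider for every listed degree of freedom, which is what "more conservative" means in the table.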
Frequently Asked Questions (FAQs)
The properties of the T-Distribution are as follows:
• Symmetric bell-shaped curve with fatter tails;
• Mean (t) = 0;
• T-value ranges between -∞ and ∞;
• 'Var (t) = v/(v -2).' Here v > 2 and 'v' denotes the degree of freedom;
• Var (t) > 1;
• This distribution approaches the standard normal distribution as the degree of freedom approaches infinity;
• Although it is more dispersed than the normal distribution, it resembles a normal distribution when the sample size is large, i.e., n ≥ 30.
The T-Distribution is applied for hypothesis testing in statistics, medicine, finance, and business. Also, it is used to find the lower and upper limits of a confidence interval. Moreover, it is used for determining p-values in t-tests and for testing the coefficients in regression analysis.
The t-score is computed by dividing the difference between the sample and population means by the standard error, i.e., the sample standard deviation divided by the square root of the sample size. It is mathematically represented as follows:
't = (x̄-µ)/(s/√n)'