Bernoulli Distribution Definition
The Bernoulli distribution is a discrete univariate probability distribution. A Bernoulli trial or experiment results in a binary outcome: success or failure (1 or 0). The probability of success is denoted as p (x = 1), and the probability of failure is expressed as 1 - p (x = 0).
This probability distribution is widely applied in machine learning, data analytics, data science, medicine, and finance. In addition, it is considered a convenient method of determining probability in real-world scenarios. For example, it can model the success or failure of a medical test, a student's exam, or an interview selection.
The Bernoulli distribution of an event is calculated using the following formula: $P(X = x) = p^x (1 - p)^{1 - x}$, where x is either 0 or 1.
Key Takeaways
- The Bernoulli distribution is a discrete probability distribution. It describes the possible outcomes of a single random experiment (Bernoulli trial). Such a trial can only have two results, success or failure.
- It is different from the Binomial distribution, which gives the probability of a number of successes across multiple Bernoulli trials.
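The relationship between the two distributions can be sketched in a few lines of Python. The helper names `bernoulli_pmf` and `binomial_pmf` and the value p = 0.3 are assumptions chosen for illustration. The sketch suggests that a Binomial distribution with a single trial (n = 1) gives the same probabilities as a Bernoulli distribution.

```python
from math import comb


def bernoulli_pmf(x, p):
    """P(X = x) for a single Bernoulli trial, with x in {0, 1}."""
    return p**x * (1 - p) ** (1 - x)


def binomial_pmf(k, n, p):
    """P(K = k) for k successes in n independent Bernoulli trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


p = 0.3  # assumed success probability, for illustration only

# With n = 1, the Binomial probabilities coincide with the Bernoulli ones.
for x in (0, 1):
    print(x, bernoulli_pmf(x, p), binomial_pmf(x, 1, p))
```

For n greater than 1, `binomial_pmf` spreads the probability over 0 to n successes, whereas the Bernoulli case only ever has the two outcomes 0 and 1.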
Bernoulli Distribution Explained
The Bernoulli distribution is used when researchers want to find the probability of a binary outcome from a single Bernoulli trial or random experiment. The result can be a success (x = 1) or a failure (x = 0).
Swiss mathematician Jakob Bernoulli proposed the Bernoulli probability distribution. It was published posthumously in 1713.
The following features differentiate Bernoulli probability from the other probability distributions:
- First, it is a univariate probability distribution.
- Second, it is a discrete distribution.
- Third, each Bernoulli trial is independent of the others.
- Finally, possible outcomes are binary, i.e., success or failure.
The properties of a Bernoulli distribution are as follows:
- The Bernoulli trial can provide only two possible outcomes: 0 or 1, i.e., failure or success.
- The probability of success is denoted as p, while the probability of failure is expressed as q or 1-p.
- The two probabilities must sum to 1, i.e., p + q = 1.
- The probability of success p remains constant across successive Bernoulli trials, and the trials are independent of one another.
- The expected value (mean) of the Bernoulli distribution is E(X) = p, where X is the random variable.
- The variance of the random variable X is Var(X) = p(1 - p).
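The mean and variance properties listed above can be checked empirically with a small simulation. This is only a sketch: the helper `bernoulli_trial`, the success probability p = 0.3, the number of trials, and the fixed seed are all assumptions made for illustration.

```python
import random


def bernoulli_trial(p, rng):
    """Return 1 (success) with probability p, otherwise 0 (failure)."""
    return 1 if rng.random() < p else 0


p = 0.3                  # assumed success probability, for illustration only
n_trials = 100_000       # assumed number of independent trials
rng = random.Random(42)  # fixed seed so that the run is reproducible

samples = [bernoulli_trial(p, rng) for _ in range(n_trials)]

sample_mean = sum(samples) / n_trials
sample_var = sum((s - sample_mean) ** 2 for s in samples) / n_trials

print(f"sample mean     = {sample_mean:.4f} (theory: p = {p})")
print(f"sample variance = {sample_var:.4f} (theory: p(1 - p) = {p * (1 - p):.4f})")
```

With a large number of trials, the sample mean should land close to p and the sample variance close to p(1 - p). The agreement is approximate and depends on the number of trials.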
The Bernoulli model is convenient for real-world scenarios because their outcomes map naturally onto its two categories: success refers to the expected result, and failure is its opposite.
Formula
The Bernoulli probability is denoted by P; it covers only two outcomes, success or failure. It is computed using the following formula:
$P(X = x) = p^x (1 - p)^{1 - x}$
- Here, 'x' is the outcome, which can be either a success (x = 1) or a failure (x = 0).
- 'p' is the probability of success.
- q = 1 - p denotes the probability of failure.
- The value of p lies in the range 0 < p < 1.
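The formula and the conditions on x and p can be expressed as a short function. This is a minimal sketch; the function name `bernoulli_pmf` and the sample value p = 0.25 are assumptions for illustration.

```python
def bernoulli_pmf(x, p):
    """Return P(X = x) = p**x * (1 - p)**(1 - x) for x in {0, 1}."""
    if x not in (0, 1):
        raise ValueError("x must be 0 (failure) or 1 (success)")
    if not 0 < p < 1:
        raise ValueError("p must satisfy 0 < p < 1")
    return p**x * (1 - p) ** (1 - x)


p = 0.25  # assumed success probability, for illustration only

print(bernoulli_pmf(1, p))  # probability of success, equal to p
print(bernoulli_pmf(0, p))  # probability of failure, equal to q = 1 - p
```

For x = 1 the second factor becomes 1 and the formula reduces to p. For x = 0 the first factor becomes 1 and it reduces to 1 - p.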
Mean And Variance Of Bernoulli Distribution
The expected value (mean) of the Bernoulli distribution is the long-run average of the outcomes of many independent trials of the random variable X. It is computed by weighting each possible outcome by its probability. Now, let us understand the mean formula:
According to the previous formula: $P(X = 1) = p$
$P(X = 0) = q = 1 - p$
$E(X) = P(X = 1) \times 1 + P(X = 0) \times 0$
$E(X) = p \times 1 + (1 - p) \times 0$
$E(X) = p$
Hence, the expected mean of the Bernoulli distribution is p.
With the help of the mean, we can compute the Bernoulli distribution variance. It is the difference between the expected value of X squared and the square of the expected value of X. Let us see its mathematical representation:
$\mathrm{Var}(X) = E(X^2) - (E(X))^2$
$E(X^2) = \sum_{x} x^2 \, P(X = x)$
$E(X^2) = 1^2 \times P(X = 1) + 0^2 \times P(X = 0)$
$E(X^2) = 1 \times p + 0 \times (1 - p)$
$E(X^2) = p$
$\mathrm{Var}(X) = p - p^2 = p(1 - p) = pq$
Thus, the variance of the Bernoulli distribution is pq.
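The derivation can be reproduced numerically by applying the definition of expectation to the two outcomes. This is a sketch only; the helper name `bernoulli_moments` and the value p = 3/10 are assumptions, and `Fraction` is used so that the arithmetic stays exact.

```python
from fractions import Fraction


def bernoulli_moments(p):
    """Compute E(X) and Var(X) directly from the definition of expectation."""
    pmf = {1: p, 0: 1 - p}  # P(X = 1) = p and P(X = 0) = 1 - p
    mean = sum(x * prob for x, prob in pmf.items())        # E(X)
    mean_sq = sum(x**2 * prob for x, prob in pmf.items())  # E(X^2)
    variance = mean_sq - mean**2                           # E(X^2) - (E(X))^2
    return mean, variance


p = Fraction(3, 10)  # assumed success probability, for illustration only
mean, variance = bernoulli_moments(p)

print(mean, variance)                       # prints: 3/10 21/100
print(mean == p, variance == p * (1 - p))   # prints: True True
```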
Examples
Let us consider a few Bernoulli distribution examples to understand the concept:
Example #1
Let us assume that out of every 50 people in a city, 1 is a business owner. So, if one citizen is selected randomly, what is the Bernoulli distribution of that citizen being a business owner?
Solution:
Given:
p = 1/50
$P(X = x) = p^x (1 - p)^{1 - x}$
Thus, $P(X = x) = (1/50)^x (1 - 1/50)^{1 - x}$
Let us compute for x = 0, 1
- If, x = 1
Then P (X = 1) = 1/50 = 0.02
- If, x = 0
Then P (X = 0) = q = 1 - p = 1 - 1/50 = 49/50 = 0.98
Thus, the probability of success, i.e., the selected citizen being a business owner, is 0.02, and the probability of failure, i.e., the selected citizen not being a business owner, is 0.98.
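The calculation in Example #1 can be sketched in Python as follows. The variable name `distribution` is an assumption for illustration; p = 1/50 comes from the example itself, and `Fraction` keeps the values exact before they are converted to decimals.

```python
from fractions import Fraction

p = Fraction(1, 50)  # one business owner per 50 people, as given in Example #1

# P(X = x) = p^x (1 - p)^(1 - x), evaluated for both possible outcomes
distribution = {x: p**x * (1 - p) ** (1 - x) for x in (0, 1)}

for x, prob in distribution.items():
    print(f"P(X = {x}) = {prob} = {float(prob):.2f}")
```

The two probabilities add up to 1, as required for a Bernoulli distribution.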
Example #2
If 1 out of every 15 stocks in a portfolio performs extraordinarily, then what is the Bernoulli distribution of the performance of a stock randomly selected from the portfolio?
Solution:
Given:
p = 1/15
$P(X = x) = p^x (1 - p)^{1 - x}$
Thus, $P(X = x) = (1/15)^x (1 - 1/15)^{1 - x}$
Let us compute for x = 0, 1
- If, x = 1
Then P (X = 1) = 1/15 ≈ 0.07
- If, x = 0
Then P (X = 0) = q = 1 - p = 1 - 1/15 = 14/15 ≈ 0.93
Thus, the probability of getting an extraordinarily performing stock (success) is approximately 0.07. Similarly, the probability of finding a stock not performing extraordinarily (failure) is approximately 0.93.
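Because 1/15 is not a terminating decimal, the figures in Example #2 are rounded values. The short sketch below keeps the exact fractions next to their decimal approximations; the variable names are assumptions for illustration.

```python
from fractions import Fraction

p = Fraction(1, 15)  # one extraordinary performer per 15 stocks, from Example #2
q = 1 - p            # probability that the selected stock is not extraordinary

# Show the exact value, then four-decimal and two-decimal roundings.
print(f"P(X = 1) = {p} ~ {float(p):.4f} ~ {float(p):.2f}")
print(f"P(X = 0) = {q} ~ {float(q):.4f} ~ {float(q):.2f}")
```

Working with the exact fractions until the final step avoids accumulating rounding error if the probabilities are reused in later calculations.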
Example #3
In a medical examination, the chances of error are 15%. Now, find the Bernoulli distribution if one patient is randomly selected out of 60 patients.
Solution:
Number of error reports when 60 patients are examined = 15% of 60 = 9 patients
Thus, the number of patients getting the correct reports = 60 - 9 = 51
p = 51/60 = 17/20
$P(X = x) = p^x (1 - p)^{1 - x}$
Thus, $P(X = x) = (17/20)^x (1 - 17/20)^{1 - x}$
Computing for x = 0, 1
- If, x = 1
Then P (X = 1) = 17/20 = 0.85
- If, x = 0
Then P (X = 0) = q = 1 - p = 1 - 17/20 = 3/20 = 0.15
Thus, the probability of getting a successful result in the medical test is 0.85, whereas the probability of error (failure) is 0.15.
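The steps of Example #3, from the error rate to the success probability, can be sketched as follows. The variable names are assumptions for illustration; the figures (60 patients, 15% error rate) come from the example.

```python
from fractions import Fraction

total_patients = 60
error_rate = Fraction(15, 100)  # 15% chance of an erroneous report

error_reports = error_rate * total_patients       # 15% of 60 = 9
correct_reports = total_patients - error_reports  # 60 - 9 = 51

p = Fraction(correct_reports, total_patients)  # success = correct report
q = 1 - p                                      # failure = erroneous report

print(f"p = {p} = {float(p):.2f}")  # prints: p = 17/20 = 0.85
print(f"q = {q} = {float(q):.2f}")  # prints: q = 3/20 = 0.15
```

The patient count cancels out in the ratio, so p is simply 1 minus the error rate.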
Graph
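Let us plot the above example on a graph:
Given that p = 0.85 and q or 1 - p = 0.15.
The Bernoulli distribution graph for this example has two bars: one of height 0.15 at x = 0 (failure) and one of height 0.85 at x = 1 (success). It indicates the chances of success or failure in a medical examination.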
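A rough version of such a graph can be produced without any plotting library. The sketch below draws a text bar chart of the two probabilities from Example #3. The function name `text_bar_chart` and the bar width are assumptions for illustration; a dedicated plotting library would normally be used for a real figure.

```python
def text_bar_chart(pmf, width=40):
    """Build a horizontal text bar chart with bars proportional to each probability."""
    lines = []
    for x, prob in sorted(pmf.items()):
        bar = "#" * round(prob * width)  # bar length scales with the probability
        lines.append(f"x = {x} | {bar} {prob:.2f}")
    return "\n".join(lines)


pmf = {0: 0.15, 1: 0.85}  # failure and success probabilities from Example #3
chart = text_bar_chart(pmf)
print(chart)
```

The bar for x = 1 is much longer than the bar for x = 0, reflecting that a correct report is far more likely than an erroneous one.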
Applications
The Bernoulli method is easy to apply, especially when a single trial provides only two results: success or failure. This method is applied in data science, mining, machine learning, analytics, medicine, finance, statistics, and sports.
For example, using this tool, the probability of side effects caused by a new medication can be measured. It can determine the probability of a medical test's success or failure. It is used to gauge the probability of an email being spam. In marketing, this distribution models the probability of a customer buying or not buying a particular product.
This method effectively predicts the probability of a student passing or failing a test. A researcher can determine the chances of selecting or rejecting a recruit. It can also predict the probability of winning or losing a bet.