Types of Statistical Distribution

This article explains what a statistical distribution is and then goes over the main types of statistical distribution.

What is a Distribution?

A statistical distribution describes a variable by listing the values it can take and how often (or how likely) each of those values occurs.

We will understand this by taking the most common example, which is a die.
A die has 6 sides (1, 2, 3, 4, 5, 6). Suppose we want to figure out the probability of getting a 1. For a fair die, the probability is 1/6. The same can be said for 2, 3, 4, 5 and 6. However, getting a 7 is impossible. Therefore, the probability of getting a 7 when rolling a die is 0.

The distribution of an event covers all of its possible values, not only the values we happened to observe. Therefore, the distribution of the event (rolling a die) will be the following.

The probability of getting (each value is 1/6, rounded to 0.17):

  • 1 = 0.17
  • 2 = 0.17
  • 3 = 0.17
  • 4 = 0.17
  • 5 = 0.17
  • 6 = 0.17
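
The die example can be written down as a small lookup table. The following is a minimal Python sketch, assuming a fair die; the names `die_distribution` and `probability` are made up for illustration. It uses exact fractions so that the six probabilities add up to exactly 1, which the rounded value 0.17 does not.

```python
from fractions import Fraction

# A fair six-sided die: each face has probability 1/6 (assumption: the die is fair).
faces = [1, 2, 3, 4, 5, 6]
die_distribution = {face: Fraction(1, 6) for face in faces}

def probability(value):
    """Return the probability of rolling `value`; impossible values get 0."""
    return die_distribution.get(value, Fraction(0))

total = sum(die_distribution.values())

print(probability(1))  # 1/6
print(probability(7))  # 0
print(total)           # 1
```

The probabilities of all possible values always sum to 1, and any value outside the table has probability 0.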

Now that we have understood the meaning of distribution, let us look at the types of statistical distribution.

Types of Statistical Distribution

The following are the types of statistical distribution:

  • Continuous probability distribution
  • Discrete probability distribution

Continuous

When a variable can take any value within a range, so that there are infinitely many possible values, its distribution is known as a continuous distribution. For instance, time is continuous: between 0 hours and 1 hour there are infinitely many possible durations. In a continuous probability distribution, the random variable X can take any value in an interval. Since X can take infinitely many values, the probability of X taking on any one specific value is zero, and probabilities are instead assigned to ranges of values.

The following are the types of continuous probability distribution:

  • Normal / Gaussian Distribution
  • Uniform Distribution
  • Exponential Distribution
  • Chi-Square Distribution

Let us understand them all one by one.

Normal or Gaussian Distribution

The Normal or Gaussian distribution is a continuous probability distribution. It fits, at least approximately, the distribution of many real-world quantities, such as the heights of people in a population or measurement errors.
The following is the formula for its probability density function:

$$ f(x) = \frac{1}{\sigma \sqrt{2\pi}} \, e^{-\frac{(x - \mu)^2}{2\sigma^2}} $$

where:

  • μ = value of the mean
  • σ = standard deviation
  • x = a value of the random variable

A normal distribution with μ = 0 and σ = 1 is called the standard normal distribution.

The following parameters can be used to define a Gaussian distribution:

  • Mean: The expected value of the distribution
  • Variance: The average squared deviation of observations from the mean
  • Standard Deviation: The square root of the variance, which describes the dispersion of data around the mean in the same units as the data

A normal distribution is also called a “Bell Curve” since it has a distribution curve shaped like a bell. It is symmetrical on both sides of the mean. Its mean, median and mode coincide with each other. The area under the distribution curve is equal to 1.
We should note that although all Gaussian distributions are symmetrical, not all symmetrical ones are Gaussian.
This type of distribution is very popular in data science.
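
To make the properties of the bell curve concrete, here is a minimal Python sketch of the normal density with mean μ and standard deviation σ. The function names `normal_pdf` and `area_under_curve` are illustrative, and the area is only approximated numerically with a simple midpoint rule over μ ± 8σ.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))

def area_under_curve(mu, sigma, steps=10_000):
    """Approximate the area between mu - 8*sigma and mu + 8*sigma (midpoint rule)."""
    lo, hi = mu - 8 * sigma, mu + 8 * sigma
    width = (hi - lo) / steps
    heights = (normal_pdf(lo + (i + 0.5) * width, mu, sigma) for i in range(steps))
    return sum(heights) * width

print(round(normal_pdf(0.0), 4))             # 0.3989, the peak of the standard normal
print(normal_pdf(-1.0) == normal_pdf(1.0))   # True: symmetric around the mean
print(round(area_under_curve(5.0, 2.0), 6))  # 1.0
```

The peak sits at the mean, the curve is symmetric on both sides of it, and the total area is 1 whatever values of μ and σ are chosen.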

Uniform Distribution

The Uniform or Rectangular Distribution exists in both a continuous and a discrete form. In either form, every possible outcome has an equal chance of happening. In simple words, all the outcomes here have the same probability. Because the probability is constant across all possible values, the plot of the distribution is a rectangle. This kind of distribution is the starting point for generating random numbers from other distributions, for example with the inversion method.

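For a continuous variable X to be uniformly distributed between a lower bound a and an upper bound b, its probability density function should be the following:

$$ f(x) = \begin{cases} \dfrac{1}{b - a} & \text{if } a \le x \le b \\ 0 & \text{otherwise} \end{cases} $$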
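
The following is a minimal Python sketch of the continuous uniform density and of the inversion method in its simplest form. The bounds `a` and `b` and the function names are assumptions made for illustration: a uniform number u between 0 and 1 is passed through the inverse of the cumulative distribution function, which for the uniform case is a + (b - a) * u.

```python
import random

def uniform_pdf(x, a, b):
    """Density of a continuous uniform distribution on [a, b]."""
    if b <= a:
        raise ValueError("b must be greater than a")
    return 1.0 / (b - a) if a <= x <= b else 0.0

def sample_uniform(a, b, rng=random):
    """Inversion method: map u ~ U(0, 1) through the inverse CDF a + (b - a) * u."""
    u = rng.random()
    return a + (b - a) * u

rng = random.Random(0)  # fixed seed so that the run is repeatable
samples = [sample_uniform(2.0, 6.0, rng) for _ in range(10_000)]
sample_mean = sum(samples) / len(samples)

print(uniform_pdf(3.0, 2.0, 6.0))  # 0.25
print(uniform_pdf(7.0, 2.0, 6.0))  # 0.0
```

The density is the same constant everywhere inside the interval and zero outside it, and the sample mean should land close to the midpoint (a + b) / 2.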

Exponential Distribution

This kind of distribution is concerned with the amount of time until a certain event happens, when events occur independently at a constant average rate. For example, if customers arrive at a shop at a steady average rate, how long must you wait until the next customer arrives?

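A variable X follows an exponential distribution when its probability density function is:

$$ f(x; \lambda) = \begin{cases} \lambda e^{-\lambda x} & \text{if } x \ge 0 \\ 0 & \text{if } x < 0 \end{cases} $$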

Where λ stands for rate and is always larger than zero.

The exponential distribution is often used in survival studies, such as modelling how long a product lasts before it fails or expires.
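
As a rough illustration, the Python sketch below evaluates the exponential density and cumulative probability for a rate λ and draws random waiting times with the standard library's `random.expovariate`. The rate of 0.5 events per unit of time and the helper names are assumptions chosen for the example.

```python
import math
import random

def exponential_pdf(x, lam):
    """Density of an exponential distribution with rate lam (lam > 0)."""
    if lam <= 0:
        raise ValueError("lam must be larger than zero")
    return lam * math.exp(-lam * x) if x >= 0 else 0.0

def exponential_cdf(x, lam):
    """Probability that the waiting time is at most x."""
    if lam <= 0:
        raise ValueError("lam must be larger than zero")
    return 1.0 - math.exp(-lam * x) if x >= 0 else 0.0

lam = 0.5               # hypothetical rate: 0.5 events per unit of time
rng = random.Random(1)  # fixed seed so that the run is repeatable
waits = [rng.expovariate(lam) for _ in range(20_000)]
mean_wait = sum(waits) / len(waits)

print(exponential_pdf(0.0, lam))  # 0.5
print(mean_wait)                  # close to 1 / lam = 2
```

The average waiting time is 1/λ, so a higher rate means shorter waits.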

Chi-Square Distribution

This is a continuous distribution that describes the sum of the squares of k independent standard normal random variables, where k is called the degrees of freedom. Several statistical tests rely on it. In those tests, the chi-square statistic is a single value that indicates how much difference there is between the counts you saw and the counts you would anticipate if the population had no association at all. The Chi-square goodness of fit test and the Chi-square test of independence are two typical tests that use the Chi-square distribution.

The following is the formula for calculating the chi-square statistic, where O_i are the observed counts and E_i are the expected counts:

$$ \chi^2 = \sum_{i} \frac{(O_i - E_i)^2}{E_i} $$
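
As a sketch of how the comparison between observed and expected counts works, the Python snippet below computes the usual chi-square statistic, the sum of (observed - expected)² / expected over all categories. The counts are hypothetical, invented for the example: 60 rolls of a die that would show each face 10 times on average if it were fair.

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [8, 12, 9, 11, 6, 14]  # hypothetical counts from 60 die rolls
expected = [10] * 6               # a fair die: 60 rolls / 6 faces
stat = chi_square_statistic(observed, expected)

print(round(stat, 2))  # 4.2
```

In a goodness of fit test, this value would then be compared against a chi-square distribution with 5 degrees of freedom (number of categories minus one). A larger statistic means the observed counts are further from the expected ones.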

Discrete

When data can only take specific, separate values, its distribution is known as a discrete distribution. For instance, when you toss a coin, the possible outcomes are heads or tails. You can’t get half heads and half tails; you will get either heads or tails, never both.

The following are the types of discrete probability distribution:

  • Bernoulli Distribution
  • Binomial Distribution
  • Geometric Distribution
  • Poisson Distribution

Bernoulli Distribution

This is one of the basic distributions from which more complicated distributions can be derived. It describes a single trial that has only two potential values: 1 (success), with probability p, and 0 (failure), with probability 1 - p. When such a trial is repeated, each repetition has the same chance of success and is independent of the others.

The following is the probability mass function for this, where p is the probability of success:

$$ P(X = x) = p^x (1 - p)^{1 - x}, \quad x \in \{0, 1\} $$
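
A minimal Python sketch of a Bernoulli variable follows. It assumes a success probability `p`; the names `bernoulli_pmf` and `bernoulli_trial` are illustrative. A single trial is simulated by comparing a uniform random number with `p`.

```python
import random

def bernoulli_pmf(x, p):
    """Probability of outcome x (1 = success, 0 = failure) for success probability p."""
    if not 0 <= p <= 1:
        raise ValueError("p must be between 0 and 1")
    if x not in (0, 1):
        return 0.0
    return p if x == 1 else 1 - p

def bernoulli_trial(p, rng=random):
    """Simulate one trial: return 1 with probability p, otherwise 0."""
    return 1 if rng.random() < p else 0

rng = random.Random(42)  # fixed seed so that the run is repeatable
outcomes = [bernoulli_trial(0.3, rng) for _ in range(10_000)]
success_rate = sum(outcomes) / len(outcomes)

print(bernoulli_pmf(1, 0.3))  # 0.3
print(success_rate)           # close to 0.3
```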

Binomial Distribution

We use the Binomial Distribution when each trial has exactly two mutually exclusive outcomes (success or failure) and we want to count the number of successes over a fixed number of trials. The likelihood of success must be the same in every trial, but it does not have to equal the probability of failure. Tossing a coin a number of times is one example. It is sometimes called a bi-parametric distribution because two parameters define it.

The following factors determine it:

  • n = the number of trials
  • p = the probability of success in each trial

The following formula can be used to calculate the probability of exactly x successes within n trials:

$$ P(X = x) = \binom{n}{x} p^x (1 - p)^{n - x} $$

Each trial is independent of the others in this distribution, which implies that the outcome of one trial has no effect on the outcome of other trials. Each trial has two possible outcomes, success or failure, with probabilities p and q = 1 - p.
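
The calculation can be sketched in a few lines of Python using `math.comb` for the number of ways to choose which trials succeed. The function name `binomial_pmf` and the coin toss numbers are chosen here for illustration.

```python
import math

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials with success probability p."""
    if not 0 <= p <= 1:
        raise ValueError("p must be between 0 and 1")
    if x < 0 or x > n:
        return 0.0
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

# Probability of exactly 5 heads in 10 tosses of a fair coin.
print(binomial_pmf(5, 10, 0.5))  # 0.24609375

# The probabilities over all possible success counts add up to 1.
total = sum(binomial_pmf(x, 10, 0.3) for x in range(11))
```

With n = 1 this reduces to a single success-or-failure trial, which is why the binomial distribution can be seen as a sum of n independent Bernoulli trials.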

Geometric Distribution

The Geometric Distribution deals with the number of attempts necessary to get a single success. In other words, it gives the probability that the first success comes right after a given number of failures. It is a special case of the negative binomial distribution in which the required number of successes is one. In this distribution, each trial has two possible outcomes: success or failure, and all trials are independent of one another and have the same chance of success.

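The following formula gives the probability that the first success occurs on trial k, where p is the probability of success in each trial:

$$ P(X = k) = (1 - p)^{k - 1} \, p, \quad k = 1, 2, 3, \ldots $$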
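
As a small worked example, the Python sketch below computes the probability that the first success arrives on trial k, that is, after k - 1 failures in a row. It assumes the "number of trials until the first success" convention, with k starting at 1, and the die example is made up for illustration.

```python
def geometric_pmf(k, p):
    """Probability that the first success occurs on trial k (k = 1, 2, 3, ...)."""
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    if k < 1:
        return 0.0
    return (1 - p) ** (k - 1) * p

# Probability that the first six appears on the third roll of a fair die.
print(round(geometric_pmf(3, 1 / 6), 4))  # 0.1157

# Average number of rolls needed, approximated over the first 500 trials.
expected_rolls = sum(k * geometric_pmf(k, 1 / 6) for k in range(1, 501))
```

The average number of attempts needed is 1/p, so `expected_rolls` comes out very close to 6 for a fair die.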

Poisson Distribution

The Poisson Distribution describes the probability of a certain number of events occurring in a fixed time period. This simply means that it tells you how likely it is that an event happens a given number of times in that period. Under this distribution, the probability of an event occurring in a short interval is proportional to the length of the interval, so any two intervals of equal length are equally likely to contain an event. As the interval grows shorter, the chance of an event occurring in it approaches zero. Events occur independently of one another, and the average rate of occurrence is constant.

The formula below gives the probability of observing exactly x events under a Poisson distribution:

$$ P(X = x) = \frac{\lambda^x e^{-\lambda}}{x!}, \quad x = 0, 1, 2, \ldots $$

where:

  • λ = the average number of events that occur in the given amount of time
  • X = the number of events that occurred within that time period
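
The following Python sketch evaluates the Poisson probability of seeing exactly x events when the average is λ events per period. The function name `poisson_pmf` and the rate of 3 events per period are assumptions chosen for the example.

```python
import math

def poisson_pmf(x, lam):
    """Probability of exactly x events when the average number of events is lam."""
    if lam <= 0:
        raise ValueError("lam must be larger than zero")
    if x < 0:
        return 0.0
    return lam ** x * math.exp(-lam) / math.factorial(x)

lam = 3.0  # hypothetical average: 3 events per period

print(round(poisson_pmf(2, lam), 3))  # about 0.224

# Summing over a generous range of counts recovers a total of 1 and a mean of lam.
total = sum(poisson_pmf(x, lam) for x in range(60))
mean = sum(x * poisson_pmf(x, lam) for x in range(60))
```

Unlike the binomial case, there is no upper limit on the number of events, although counts far above λ quickly become very unlikely.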
