What Is Goodman And Kruskal's Gamma?
Goodman and Kruskal's gamma (also called the gamma statistic or gamma coefficient) is a non-parametric statistical measure of the association between two ordinal variables. It quantifies the direction and strength of the relationship between the paired observations, which helps researchers judge how well the ordering on one variable predicts the ordering on the other.
The gamma coefficient applies to ordinal data, including data sets with many tied ranks, whether the underlying variables are continuous or discrete. Its value ranges from -1 to 1, where -1 indicates a perfect negative association, 1 indicates a perfect positive association, and 0 indicates no association.
- Goodman and Kruskal's gamma, gamma statistics, or gamma coefficient, is a statistical measure that gauges the strength and direction of association between two ordinal variables.
- The formula used to determine the gamma coefficient value is: γ = (Nc - Nd) / (Nc + Nd), where Nc represents the number of concordant pairs, and Nd denotes the number of discordant pairs.
- The value of the Goodman-Kruskal gamma lies between -1 and 1. A value of -1 indicates a perfect negative association between the variables, 1 signifies a perfect positive association, and 0 means there is no association between the variables.
Goodman And Kruskal's Gamma Explained
Goodman and Kruskal's gamma is a statistical measure used to evaluate the strength and direction of association between two ordinal variables. It was introduced between 1954 and 1972 in a series of papers written by Leo Goodman and William Kruskal as a gauge that enables researchers to quantify the relationship between variables measured on an ordinal scale. Because gamma is computed only from concordant and discordant pairs, tied pairs are left out of the calculation, which makes it convenient for ordinal data with many tied ranks. It is closely related to Kendall's tau and is particularly useful when working with non-parametric data.
Gamma statistics are widely utilized in business, social sciences, epidemiology, and market research, where researchers need to analyze the relationships between variables. Further, Goodman and Kruskal's gamma results can be effortlessly interpreted and communicated. Hence, it aids researchers in understanding patterns and relationships within their data, facilitating informed decision-making and meaningful conclusions. For instance, the gamma coefficient can be employed in business to assess the relationship between customer satisfaction and loyalty.
However, the gamma statistic is designed explicitly for ordinal variables and may not be suitable for other types of data, such as variables measured on a nominal scale. Also, a large number of tied ranks can affect the accuracy of the measure and result in less reliable outcomes. The measure does not describe the form of the relationship between the variables, e.g., whether it is linear or nonlinear. Moreover, it cannot handle multiple independent variables since it is a bivariate measure. Finally, a small sample size may yield biased and imprecise results, limiting the statistical power of the measure.
Assumptions
Two fundamental assumptions must hold before the gamma coefficient is applied to a data set. If the data fail to meet either of them, an alternative statistical measure should be used for the analysis:
- The paired data sets should comprise ordinal variables. Ordinal variables possess categories or levels with natural order but lack specific numerical values. The examples include education level (high school, college, graduate) or Likert scale responses (strongly agree, agree, neutral, disagree, strongly disagree).
- The paired variables should exhibit a monotonic relationship, whereby an increase in one variable is consistently accompanied by either an increase or a decrease in the rank of the other variable.
How To Calculate?
Let us split the use of Goodman and Kruskal's gamma into two parts: calculation and interpretation.
Calculation
Although there are various Goodman and Kruskal gamma calculators available online, one can use the following steps to find the value of the gamma coefficient:
Step 1 - Create a contingency table
Construct a contingency table that presents the frequencies of the joint distribution of the two ordinal variables. The rows of the table represent the levels of one variable, while the columns represent the levels of the other variable.
Step 2 - Determine the number of concordant pairs (Nc)
A pair of observations is concordant when the observation that ranks higher on one variable also ranks higher on the other, i.e., both variables order the pair the same way. Count the number of concordant pairs in the contingency table.
Step 3 - Find the number of discordant pairs (Nd)
A pair of observations is discordant when the observation that ranks higher on one variable ranks lower on the other, i.e., the two variables order the pair in opposite ways. Count the number of discordant pairs in the contingency table.
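The definitions in Steps 2 and 3 can be sketched in Python for raw paired observations. This is a minimal illustration, not a reference implementation: the function name `count_pairs`, the integer coding of the ordinal levels (1 = lowest), and the sample data are all assumptions made for the example.

```python
from itertools import combinations


def count_pairs(x, y):
    """Count concordant, discordant, and tied pairs of observations.

    x and y hold the ordinal levels of the two variables, coded as
    integers that preserve the natural order (an assumed convention).
    """
    concordant = discordant = tied = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:      # both variables order the pair the same way
            concordant += 1
        elif product < 0:    # the variables order the pair in opposite ways
            discordant += 1
        else:                # tied on at least one variable
            tied += 1
    return concordant, discordant, tied


# Hypothetical data: five respondents rated on two 3-level scales.
satisfaction = [1, 2, 2, 3, 3]
loyalty = [2, 1, 3, 2, 3]
print(count_pairs(satisfaction, loyalty))  # (4, 2, 4)
```

Of the ten possible pairs in this sample, four are tied on at least one variable and do not count as either concordant or discordant.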
Step 4 - Evaluate the total number of pairs (Nc + Nd)
Add the number of concordant pairs to the number of discordant pairs. Pairs that are tied on either variable are not included in this total.
Step 5 - Calculate Goodman and Kruskal's gamma
Compute gamma by subtracting the number of discordant pairs from the number of concordant pairs and dividing the result by the sum of concordant and discordant pairs. The gamma coefficient formula is as follows:
γ = (Nc - Nd) / (Nc + Nd)
Where:
- Nc denotes the number of concordant pairs; and
- Nd represents the number of discordant pairs.
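The five steps can be sketched in Python as follows. This is one possible way to do the counting, under the assumption that the rows and columns of the table are both listed in the same order of their ordinal levels (both lowest to highest, or both highest to lowest). The function name `gamma_from_table` and the 3x3 table are hypothetical.

```python
def gamma_from_table(table):
    """Goodman and Kruskal's gamma from a contingency table (list of rows).

    Assumes rows and columns are both arranged in the same order of
    their ordinal levels.
    """
    n_rows, n_cols = len(table), len(table[0])
    concordant = discordant = 0
    for i in range(n_rows):
        for j in range(n_cols):
            # Pair every cell with the cells in the rows below it.
            for k in range(i + 1, n_rows):
                for l in range(n_cols):
                    if l > j:    # below and to the right: same ordering
                        concordant += table[i][j] * table[k][l]
                    elif l < j:  # below and to the left: opposite ordering
                        discordant += table[i][j] * table[k][l]
                    # l == j: tied on the column variable, ignored
    total = concordant + discordant
    if total == 0:
        raise ValueError("gamma is undefined without concordant or discordant pairs")
    return (concordant - discordant) / total


# Hypothetical 3x3 table of frequencies.
table = [
    [10, 5, 2],
    [4, 8, 6],
    [1, 3, 12],
]
print(round(gamma_from_table(table), 4))  # 0.7152
```

For this table, Nc = 536 and Nd = 89, so gamma = 447 / 625 = 0.7152.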
Interpretation
The value of the gamma coefficient lies between -1 and +1. The closer the value is to either extreme, i.e., -1 or 1, the stronger the relationship between the data pairs. The values are read as follows:
- -1 indicates a perfect negative association;
- 1 represents a perfect positive association; and
- 0 signifies no association between the variables.
Further, a significance test can be used to determine whether the computed gamma value differs significantly from zero, typically by checking whether the test's p-value falls below the chosen significance level.
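One way to obtain such a p-value is a permutation test, sketched below in Python. The article does not prescribe a particular test, so this is a swapped-in technique chosen because it needs no distributional formula. The function names, the default of 2,000 permutations, the fixed seed, and the sample data are all assumptions.

```python
import random
from itertools import combinations


def gamma_from_pairs(x, y):
    """Gamma from raw paired ordinal observations coded as integers."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    total = concordant + discordant
    # Gamma is undefined when every pair is tied; 0.0 is used here only
    # as a convenience for the permutation loop.
    return (concordant - discordant) / total if total else 0.0


def permutation_p_value(x, y, n_permutations=2000, seed=0):
    """Two-sided permutation p-value for the null hypothesis gamma = 0."""
    rng = random.Random(seed)
    observed = abs(gamma_from_pairs(x, y))
    shuffled = list(y)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # break any link between x and y
        if abs(gamma_from_pairs(x, shuffled)) >= observed:
            extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical sample of 12 paired observations on two 4-level scales.
x = [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4]
y = [1, 1, 2, 1, 2, 2, 2, 3, 3, 3, 4, 4]
print(gamma_from_pairs(x, y), permutation_p_value(x, y))
```

A small p-value suggests that a gamma as far from zero as the observed one would rarely arise if the two variables were unrelated.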
Examples
Let us consider the following examples to understand the concept better:
Example #1
The gamma coefficient can be used to determine the association between students' nervousness in tests and their performance. By evaluating the gamma value for the level of fear and the test result (pass or fail), the researcher can identify the strength (from very weak to very strong) and the direction (negative or positive) of the association between these two variables.
Example #2
A business analyst wants to determine the association between consumers' income class and their demand for luxury cars. Assuming the analyst considers two income groups (upper and middle) and two demand levels (high and low), we can use a special case of the Goodman-Kruskal gamma, Yule's Q, to find the gamma value. The contingency table is as follows:
Demand | Income Class: Upper | Income Class: Middle |
High | 89 | 2 |
Low | 11 | 98 |
Finding Nc and Nd:
- Nc = 89 * 98 = 8722
- Nd = 2 * 11 = 22
γ = (Nc - Nd) / (Nc + Nd)
γ = (8722 - 22) / (8722 + 22) = 0.99
The value of the gamma coefficient is very close to 1, i.e., 0.99. Hence, a strong positive association exists between income class and demand for luxury cars.
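The arithmetic in this example can be checked with a short Python sketch. The function name `yules_q` and the cell labels a, b, c, d (read row by row from the 2x2 table) are assumptions for illustration.

```python
def yules_q(a, b, c, d):
    """Yule's Q for the 2x2 table [[a, b], [c, d]].

    This is gamma for a 2x2 table, with Nc = a * d and Nd = b * c.
    """
    concordant = a * d
    discordant = b * c
    return (concordant - discordant) / (concordant + discordant)


# Cells from the income class vs. demand table above:
# High demand: 89 (upper), 2 (middle); Low demand: 11 (upper), 98 (middle).
q = yules_q(89, 2, 11, 98)
print(f"{q:.2f}")  # 0.99
```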
Frequently Asked Questions (FAQs)
A special case of the gamma coefficient arises when each of the two ordinal variables paired for analysis has only two categories, i.e., when a 2x2 contingency table is formed. In such a case, gamma is equivalent to Yule's Q, which gauges the strength and direction of the association between the two variables.
When interpreting the gamma coefficient, a value between 0.1 and 0.2 or between -0.1 and -0.2 is considered weak. In such a case, the two variables have only a slight positive or negative relationship.
As with other statistical measures used in hypothesis testing, the null hypothesis for gamma is that there is no association between the variables, i.e., gamma equals zero. When the computed gamma differs significantly from zero, the null hypothesis is rejected, indicating a positive or negative association between the data pairs.