Kruskal Wallis Test

Published on: 21 Aug, 2024

Blog Author: N/A

Edited by: Ashish Kumar Srivastav

Reviewed by: Dheeraj Vaidya

What Is Kruskal Wallis Test? 

Kruskal Wallis Test refers to a method of comparing the medians of more than two groups to ascertain whether the samples originate from the same population. It is a key tool for comparing three or more independent groups on a dependent variable measured at an ordinal or continuous level.

Kruskal Wallis Test Meaning

It is a distribution-free, or non-parametric, method for contrasting more than two distinct data samples, which need not be equal in size. The test examines the null hypothesis that the 'k' samples come from the same population and therefore have identical medians. A significant result indicates that at least one sample stochastically dominates another.

  • Kruskal Wallis Test is a method to determine whether a variable in one sample dominates the same variable in another sample when there are three or more groups, by comparing their medians in a non-parametric manner.
  • Its determinants include three or more conditions and the absence of normally distributed data.
  • It is used when one independent variable has three or more independent groups (categories) of data.
  • A p-value below 0.05 indicates a significant difference among the groups. Because the test relies on ranks, it generally has less power than a parametric test when the parametric assumptions hold.

Kruskal Wallis Test Explained 

Kruskal Wallis H test is a rank-based alternative to the one-way ANOVA test that compares the medians of more than two groups. One can apply it to any distribution in which the dependent variable is ordinal or continuous. However, although it can establish that a difference exists among the groups, it cannot explain the reason for that difference.

The test in consideration is applicable when the following conditions hold:

  • One has three or more conditions for comparison.
  • Every condition is carried out by a distinct participant group, that is, the study uses an independent-measures design with three or more conditions.
  • The data are not normally distributed.
  • The data have markedly different variances across conditions.

After the above conditions are met, the test is performed to check whether the outcomes differ systematically from one group to another.

The test ranks the pooled observations from all data sets and compares how those ranks are distributed across the groups. The null hypothesis states that the medians are equal, whereas the alternative hypothesis states that at least one of the samples differs. Machine learning practitioners also use the test to ascertain whether two or more groups differ, although it does not give the reason for the difference.
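
In symbols, the two hypotheses can be sketched as follows. The notation is an assumption made here for illustration: θ_i denotes the median of group i, and k is the number of groups.

```latex
H_0:\ \theta_1 = \theta_2 = \cdots = \theta_k
\qquad \text{versus} \qquad
H_1:\ \theta_i \neq \theta_j \ \text{for at least one pair } i \neq j
```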

Factors

Let us understand the concept by learning how test anxiety influences test results. Three possible values exist for the independent variable "test anxiety":

  1. No anxiety, 
  2. Low-medium anxiety, and 
  3. High anxiety. 

The dependent variable is the exam score, scaled from 0 to 100%.

Another example is learning how socioeconomic status influences perceptions of increasing sales tax. Socioeconomic status has three tiers: 

  1. Working class
  2. Middle class, and 
  3. Wealthy

These tiers are the levels of the independent variable. The dependent variable is measured on a 5-point Likert scale, from strongly agree to strongly disagree.

Assumptions

The researcher's variables ought to satisfy the following Kruskal Wallis test assumptions:

  • One independent variable with two or more levels (independent groups). The test is more frequently administered when statisticians have three or more levels. Consider utilizing the Mann-Whitney U Test for two levels instead.
  • A dependent variable measured on an interval, ratio, or ordinal scale.
  • Observations ought to be independent. Put another way, there should be no relationship between the individuals within each group or between groups.
  • The distributions of all groups ought to have the same shape. Statistical tools such as SPSS and Minitab can be used to check this condition.

Formula

Let us determine the Kruskal Wallis test formula for comparing medians of more than two groups. The formula for the test statistic H is given below:

$$H = \frac{12}{N(N+1)} \sum_{i=1}^{K} \frac{R_i^2}{n_i} - 3(N+1)$$

Where, 

  • K = number of groups used for comparison
  • N = total size of the sample
  • ni = i-th group's sample size
  • Ri = total of the ranks related to i-th group

The above formula is valid only for samples that meet the following conditions:

  • Each group has a minimum of five observations.
  • No population parameters are estimated for the groups.
  • No assumption is made about the distribution of the population.
  • Sample groups are independent.
  • Data in every group are selected at random.
  • The data are measured on at least an ordinal scale.

Moreover, certain websites and software provide Kruskal Wallis test calculators for solving the formula.

Calculation Example 

Let us use a Kruskal Wallis test example to understand the concept easily. Let there be three machines, sewing machine 1, sewing machine 2, and sewing machine 3, with the following details and ranking:

Machine   Time    Rank
M1        23.74   11
M1        24.10   12
M1        25.10   13
M1        25.40   14
M1        26.31   15
M2        21.60   5.5
M2        21.80   7
M2        22.20   8
M2        22.75   9
M2        23.40   10
M3        19.75   1
M3        20.00   2
M3        20.40   3
M3        20.60   4
M3        21.60   5.5

The 15 times are ranked jointly from smallest (rank 1) to largest (rank 15). The two tied values of 21.60 share the average of ranks 5 and 6, which is 5.5.

Let us apply the Kruskal Wallis Test to the sample data:

Kruskal-Wallis Test
Sum of Ranks, Group 1                        65
Sample Size, Group 1                         5
Sum of Ranks, Group 2                        39.5
Sample Size, Group 2                         5
Sum of Ranks, Group 3                        15.5
Sample Size, Group 3                         5
Total Sample Size (N)                        15
Total Sum of Ranks                           120
Total Sums Check                             Passed (120 = N(N + 1)/2)
α                                            0.05
Sum of (Squared Rank Sum / Sample Size)      1205.1
H                                            12.255
Number of Groups                             3
Critical Value                               5.991476357
p-value                                      0.002182
Decision                                     Reject

Hence, since H = 12.255 exceeds the critical value of 5.991 (equivalently, the p-value of about 0.0022 is below 0.05), the null hypothesis is rejected: at least one sewing machine's times differ from the others.

When To Use Kruskal Wallis Test? 

This rank-based nonparametric test may be applied to a continuous or ordinal dependent variable to evaluate whether statistically significant differences exist between two or more groups of an independent variable. Scenarios where the Kruskal Wallis test can be used include:

  • One desires to know whether various groups differ on a key variable.
  • The variable of interest is continuous or ordinal.
  • There are three or more categories.
  • There is one independent variable containing more than two independent groups.

Frequently Asked Questions (FAQs)

1. How to interpret Kruskal Wallis test?

If the p-value is small, typically below 0.05, there is evidence against the null hypothesis. One then rejects the null hypothesis and concludes that at least one of the groups most likely comes from a different distribution than the others.

2. How to do the Kruskal Wallis test in SPSS?

· Choose "Analyze -> Nonparametric Tests -> Independent Samples".
· On the Fields tab, place the grouping factor inside the "Groups" box and the dependent variable within the "Test Field" section.
· Even if the dependent variable is ordinal, this approach works only when SPSS has it classified as a "Scale" variable.
· On the Settings tab, pick the Kruskal-Wallis analysis under the Customize Tests option, and run it with the multiple comparisons set to "All pairwise".

3. How to do Kruskal Wallis test in Excel?

· First, enter the data for each group.
· Second, rank all the data jointly.
· Third, use the ranks to determine the overall test statistic and the associated p-value.
· Finally, report the results.

4. What is Kruskal Wallis test used for?

The Kruskal Wallis test is a distribution-free test used to compare three or more groups of sample data.
