A package for Kruskal-Wallis that shows pairwise comparison details
- by dalloliogm
The standard stats::kruskal.test function computes the Kruskal-Wallis test on a dataset:
> library(ggplot2)  # provides the diamonds dataset
> data(diamonds)
> kruskal.test(price~carat, data=diamonds)
Kruskal-Wallis rank sum test
data: price by carat by color
Kruskal-Wallis chi-squared = 50570.15, df = 272, p-value < 2.2e-16
This is fine: it gives me a p-value for the null hypothesis that all the groups in the data come from the same distribution.
However, I would like the details for each pairwise comparison, e.g. whether diamonds of colors D and E differ in price, as some other software (e.g. SPSS) reports when you ask for a Kruskal-Wallis test.
I have found kruskalmc in the pgirmess package, which does what I want:
> library(pgirmess)
> kruskalmc(diamonds$price, diamonds$color)
Multiple comparison test after Kruskal-Wallis
p.value: 0.05
Comparisons
      obs.dif critical.dif difference
D-E  571.7459     747.4962      FALSE
D-F 2237.4309     751.5684       TRUE
D-G 2643.1778     726.9854       TRUE
D-H 4539.4392     774.4809       TRUE
D-I 6002.6286     862.0150       TRUE
D-J 8077.2871    1061.7451       TRUE
E-F 2809.1767     680.4144       TRUE
E-G 3214.9237     653.1587       TRUE
E-H 5111.1851     705.6410       TRUE
E-I 6574.3744     800.7362       TRUE
E-J 8649.0330    1012.6260       TRUE
F-G  405.7470     657.8152      FALSE
F-H 2302.0083     709.9533       TRUE
F-I 3765.1977     804.5390       TRUE
F-J 5839.8562    1015.6357       TRUE
G-H 1896.2614     683.8760       TRUE
G-I 3359.4507     781.6237       TRUE
G-J 5434.1093     997.5813       TRUE
H-I 1463.1894     825.9834       TRUE
H-J 3537.8479    1032.7058       TRUE
I-J 2074.6585    1099.8776       TRUE
However, kruskalmc only accepts a single categorical variable (e.g. I can't study prices grouped by both color and carat, as I can with kruskal.test). I also know nothing about the pgirmess package: whether it is maintained, or whether it is tested.
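To make the two-variable case concrete, here is a minimal sketch of one possible workaround that uses only base R. It rests on two assumptions: first, that collapsing the two categorical variables into a single grouping factor with interaction() is acceptable; second, that pairwise Wilcoxon rank-sum tests with a Holm correction (stats::pairwise.wilcox.test) are an acceptable stand-in for per-pair details. This is not the same procedure as kruskalmc. The built-in warpbreaks dataset is used so the sketch runs without extra packages; its wool and tension factors stand in for the two grouping variables.

```r
# Built-in dataset: 54 observations of `breaks`, with two factors,
# `wool` (A, B) and `tension` (L, M, H).
data(warpbreaks)

# Collapse the two factors into a single grouping factor with 6 levels
# (A.L, B.L, A.M, B.M, A.H, B.H).
grp <- interaction(warpbreaks$wool, warpbreaks$tension)

# Omnibus Kruskal-Wallis test across the 6 combined groups.
kw <- kruskal.test(warpbreaks$breaks, grp)

# Pairwise Wilcoxon rank-sum tests with Holm-adjusted p-values.
# exact = FALSE requests the normal approximation, since the data contain ties.
pw <- pairwise.wilcox.test(warpbreaks$breaks, grp,
                           p.adjust.method = "holm", exact = FALSE)

print(kw)
print(pw$p.value)  # lower-triangular matrix of adjusted p-values
```

Any single-factor pairwise procedure can take the combined factor grp in place of one categorical variable, so the same trick should in principle apply to kruskalmc as well, though that is untested here.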