Rank Biserial Correlation

Explanation

Two variations on this coefficient are in circulation. One is used with a one-sample (os) or paired-samples (ps) Wilcoxon signed rank test. The other is used with the Wilcoxon rank sum test for independent samples (is), which is equivalent to the Mann-Whitney U test. I'll refer to this second version as the Glass rank biserial coefficient; it is the same as Cliff's delta.

Both are, in my opinion, effect size measures and not really correlations. The os and ps version expresses the difference between the sums of the positive and negative ranks as a proportion of the sum of all ranks (the maximum possible rank sum). The is version is twice the difference in average ranks, divided by the total sample size. The measure is derived using an approach similar to that of the Spearman correlation.

Table 1 shows rules of thumb for classifying the size of the coefficient.

Table 1
Interpretation for rank biserial
Source	Negligible	Small	Medium	Large
Cohen (1988, p. 82)	0.00 to < 0.125	0.125 to < 0.304	0.304 to < 0.465	≥ 0.465
Vargha and Delaney (2000, p. 106)	0.00 to < 0.11	0.11 to < 0.28	0.28 to < 0.43	≥ 0.43

Obtaining the Coefficient

for one-sample

with Excel

Excel file from video: ES - Rank Biserial (One-Sample) (E).xlsm

with stikpetE

without stikpetE

with Flowgorithm

Flowgorithm file: FL-ESrankbisOS.fprg

with Python

Notebook from video: ES - Rank Biserial (One-Sample) (P).ipynb

with stikpetP

without stikpetP, with scipy or pandas

without libraries

with R

Notebook from video: ES - Rank Biserial (One-Sample) (R).ipynb

with stikpetR

without stikpetR

with SPSS

Formulas

The rank biserial correlation can be calculated using (Cureton, 1956, p. 288; King & Minium, 2008, p. 403):

\(r_{rb} = \frac{4\times\left|R_{min} - \frac{R_{pos} + R_{neg}}{2}\right|}{n\times\left(n+1\right)} =\frac{\left|R_{pos} - R_{neg}\right|}{R} \)

Where \(n\) is the sample size, \(R\) the sum of all ranks, \(R_{neg}\) the sum of ranks of scores below the hypothesized median, \(R_{pos}\) the sum of ranks of scores above it, and \(R_{min}\) the smaller of those two.
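The one-sample formula can be sketched in plain Python, without any libraries. This is only a minimal illustration under a few assumptions that the formula above does not spell out: the ranks are taken over the absolute differences from the hypothesized median, scores equal to the hypothesized median are dropped, and tied values receive the average of their ranks. The function names `average_ranks` and `rank_biserial_one_sample` are made up for this example.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values get the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        # Extend 'end' over the run of values tied with the one at 'start'.
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def rank_biserial_one_sample(scores, hyp_median):
    """Return the coefficient computed with both forms of the formula."""
    # Assumption: scores equal to the hypothesized median are removed.
    diffs = [x - hyp_median for x in scores if x != hyp_median]
    n = len(diffs)
    if n == 0:
        raise ValueError("no scores differ from the hypothesized median")

    # Assumption: ranks are assigned to the absolute differences.
    ranks = average_ranks([abs(d) for d in diffs])
    r_pos = sum(r for r, d in zip(ranks, diffs) if d > 0)
    r_neg = sum(r for r, d in zip(ranks, diffs) if d < 0)
    r_min = min(r_pos, r_neg)
    r_total = n * (n + 1) / 2  # sum of all ranks

    via_min = 4 * abs(r_min - (r_pos + r_neg) / 2) / (n * (n + 1))
    via_diff = abs(r_pos - r_neg) / r_total
    return via_min, via_diff


scores = [1, 2, 5, 6, 7, 8, 9]
via_min, via_diff = rank_biserial_one_sample(scores, hyp_median=4)
print(via_min, via_diff)
```

With these scores and a hypothesized median of 4, the negative ranks sum to 7 and the positive ranks to 21, out of a total of 28, so both forms give 14 / 28 = 0.5.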

for independent-samples

with Excel

Excel file from video: ES - Rank-Biserial (ind samp) (E).xlsm

with stikpetE

TO BE UPLOADED

without stikpetE

TO BE UPLOADED

with Python

Notebook from video: ES - Rank-Biserial (ind samp) (P).ipynb

with stikpetP

TO BE UPLOADED

without stikpetP

TO BE UPLOADED

with R

Notebook from video: ES - Rank-Biserial (ind samp) (R).ipynb

with stikpetR

TO BE UPLOADED

without stikpetR

TO BE UPLOADED

Formulas

The rank biserial correlation can be calculated using (Glass, 1965, p. 91; Glass, 1966, p. 626; Cliff, 1993, p. 495):

\(r_{rb} = \frac{2\times\left|\bar{R}_{1} - \bar{R}_{2}\right|}{n} \)

Where \(n\) is the total sample size (both categories combined), \(\bar{R}_{1}\) the average rank of scores in category 1, and \(\bar{R}_{2}\) the average rank of scores in category 2. The ranks are based on all scores together. The average ranks can be determined using:

\(\bar{R}_{i} = \frac{R_i}{n_i}\)

Where \(R_i\) is the sum of ranks in category \(i\) and \(n_i\) the number of scores in category \(i\).
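The independent-samples formula can be sketched in plain Python in the same spirit. This is an illustration rather than a definitive implementation: it assumes the scores of both categories are ranked together and that ties receive the average of their ranks. The function names are made up for this example, and `cliff_delta` is included only as a cross-check of the statement that the coefficient equals Cliff's delta (here compared in absolute value, since the formula above uses the absolute difference).

```python
def average_ranks(values):
    """Return 1-based ranks; tied values get the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        # Extend 'end' over the run of values tied with the one at 'start'.
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def rank_biserial_independent(group1, group2):
    """Twice the absolute difference in average ranks, divided by n."""
    n1, n2 = len(group1), len(group2)
    if n1 == 0 or n2 == 0:
        raise ValueError("both categories need at least one score")
    # Rank all scores together; the first n1 ranks belong to category 1.
    ranks = average_ranks(list(group1) + list(group2))
    mean_rank_1 = sum(ranks[:n1]) / n1
    mean_rank_2 = sum(ranks[n1:]) / n2
    return 2 * abs(mean_rank_1 - mean_rank_2) / (n1 + n2)


def cliff_delta(group1, group2):
    """Proportion of pairs with x > y minus proportion with x < y."""
    greater = sum(1 for x in group1 for y in group2 if x > y)
    smaller = sum(1 for x in group1 for y in group2 if x < y)
    return (greater - smaller) / (len(group1) * len(group2))


cat1 = [1, 4, 6]
cat2 = [2, 3, 5, 7]
print(rank_biserial_independent(cat1, cat2))
print(abs(cliff_delta(cat1, cat2)))
```

For these scores the average ranks are 11/3 and 17/4, so the coefficient is 2 × (7/12) / 7 = 1/6, which matches the absolute value of Cliff's delta, (5 − 7) / 12.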
