Fligner-Policello Test
Explanation
This test can be used if you have a binary and an ordinal variable. It is an alternative to the better-known Mann-Whitney U test. If we want to test whether the medians are equal, the MWU test assumes that the scores in the two categories have the same shape (Fong & Huang, 2019). The Fligner-Policello test does not require this, although it does assume that, in the population, the distributions are symmetric around their medians (Hollander & Wolfe, 1999, p. 135) and that the data are continuous (Fligner & Policello, 1981, p. 162); Boden (2011, p. 385) does, however, list it as a test for ordinal data.
The Fligner-Policello test tests whether there is a shift in location, rather than strictly whether the medians are equal. For example, if the scores in category A are 10, 20, 30 and in category B are 5, 20, 100, the medians are both 20, yet the two distributions are clearly not identical.
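A quick check of this example (a minimal sketch using only the Python standard library; the values are simply the ones from the paragraph above):

```python
# Quick check of the example above: equal medians, clearly different spread.
from statistics import median

a = [10, 20, 30]   # category A
b = [5, 20, 100]   # category B

print(median(a), median(b))              # 20 20 -> same median
print(max(a) - min(a), max(b) - min(b))  # 20 95 -> very different spread
```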
Not to be confused with the Fligner-Killeen test, which can be used to test for homogeneity of variances.
Performing the Test
with Excel
Excel file: TS - Fligner-Policello (E).xlsm
with stikpetE
To Be Made
without stikpetE
To Be Made
with Python
Jupyter Notebook: TS - Fligner-Policello (P).ipynb
with stikpetP
To Be Made
without stikpetP
To Be Made
with R
Jupyter Notebook: TS - Fligner-Policello (R).ipynb
with stikpetR
To Be Made
without stikpetR
To Be Made
with SPSS
To Be Made
Formulas
The formula for the z-statistic is (Fligner & Policello, 1981):
\(z = \frac{N_Y - N_X}{2\times\sqrt{SS_X + SS_Y + M_X\times M_Y}}\)
If a continuity correction is used, half is subtracted from the absolute value of the numerator in the formula for the z-value:
\(z = \frac{\left|N_Y - N_X\right| - 0.5}{2\times\sqrt{SS_X + SS_Y + M_X\times M_Y}}\)
With:
\(SS_X = \sum_{x\in X} \left(N\left(x\right) - M_X\right)^2, SS_Y = \sum_{y\in Y} \left(N\left(y\right) - M_Y\right)^2\)
\(M_X = \frac{N_X}{n_x}, M_Y = \frac{N_Y}{n_y}\)
\(N_X = \sum_{x \in X} N\left(x\right), N_Y = \sum_{y \in Y} N\left(y\right)\)
\(N\left(x\right) = \sum_{y\in Y} f\left(x, y\right)\)
\(N\left(y\right) = \sum_{x\in X} f\left(y, x\right)\)
\(f\left(a, b\right) = \begin{cases} 1 & \text{ if } a > b \\ 0 & \text{ if } a\leq b \end{cases}\)
or, if a ties correction is used (Hollander et al., 2014, p. 146):
\(f\left(a, b\right) = \begin{cases} 1 & \text{ if } a > b \\ 0.5 & \text{ if } a = b \\ 0 & \text{ if } a < b \end{cases}\)
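To make these definitions concrete, here is a minimal Python sketch that translates the formulas above directly. This is only an illustration, not the stikpetP function; the function name and the ties/continuity options are assumptions for this sketch.

```python
# Direct translation of the formulas above (a sketch, not the stikpetP function).
import math

def fligner_policello_z(x, y, ties=True, continuity=False):
    # f(a, b): 1 if a > b, 0.5 for a tie (when the ties correction is used), 0 otherwise
    def f(a, b):
        if a > b:
            return 1.0
        if a == b and ties:
            return 0.5
        return 0.0

    n_of_x = [sum(f(xi, yj) for yj in y) for xi in x]   # N(x) for each x in X
    n_of_y = [sum(f(yj, xi) for xi in x) for yj in y]   # N(y) for each y in Y
    NX, NY = sum(n_of_x), sum(n_of_y)                   # N_X and N_Y
    MX, MY = NX / len(x), NY / len(y)                   # M_X and M_Y
    SSX = sum((v - MX) ** 2 for v in n_of_x)            # SS_X
    SSY = sum((v - MY) ** 2 for v in n_of_y)            # SS_Y
    num = abs(NY - NX) - 0.5 if continuity else NY - NX # optional continuity correction
    return num / (2 * math.sqrt(SSX + SSY + MX * MY))

# the small example from the explanation: equal medians, no location shift
print(fligner_policello_z([10, 20, 30], [5, 20, 100]))  # 0.0
```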
Interpreting the Result
The assumption about the population for this test (the null hypothesis) is that the medians are equal.
The test provides a p-value, which is the probability of a test statistic as extreme as the one from the sample, or even more extreme, if the assumption about the population were true. If this p-value (significance) is below a pre-defined threshold (the significance level \(\alpha\)), the assumption about the population is rejected. We then speak of a (statistically) significant result. The threshold is usually set at 0.05; anything below it is then considered low.
If the assumption is rejected, we conclude that the medians in the population are different.
Note that if we do not reject the assumption, it does not mean we accept it; we simply state that there is insufficient evidence to reject it.
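As a small illustration, the p-value can be obtained from the z-value with the usual standard-normal approximation (a sketch; SciPy is assumed to be available, and the z-value is taken from the example report in the next section):

```python
# Two-sided p-value for a z-value, using the standard-normal approximation.
from scipy.stats import norm

z = 2.845                # example z-value (see the write-up below)
p = 2 * norm.sf(abs(z))  # two-sided p-value
print(round(p, 3))       # 0.004 -> below 0.05, so reject the assumption
```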
Writing the Results
Writing up the results of the test uses the following format (APA, 2019, p. 182):
z(n1 = <number of cases in 1st category>, n2 = <number of cases in 2nd category>) = <Z-value>, p = <p-value>
So for example:
A Fligner-Policello test indicated that the mean ranks for males and females were significantly different, z(n1 = 11, n2 = 34) = 2.845, p = .004.
A few notes about reporting statistical results with APA:
- The p-value is shown with three decimal places, and no 0 before the decimal sign. If the p-value is below .0005, it can be reported as p < .001.
- Both U and z are standard abbreviations in APA style, for the Mann-Whitney U test statistic and the standardized score respectively (see APA, 2019, table 6.5), so they do not need to be explained.
- APA does not require references or formulas for statistical analyses that are in common use (2019, p. 181).
- APA (2019, p. 88) also recommends reporting an effect size measure.
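A small sketch of how the template and the notes above could be put together (the values are simply those from the example report; this is only an illustration):

```python
# Sketch: formatting a result according to the APA-style template above.
n1, n2, z, p = 11, 34, 2.845, 0.004
p_txt = "< .001" if p < 0.0005 else "= " + f"{p:.3f}".lstrip("0")  # no leading zero, per APA
print(f"z(n1 = {n1}, n2 = {n2}) = {z:.3f}, p {p_txt}")
# prints: z(n1 = 11, n2 = 34) = 2.845, p = .004
```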
Next...
The next step is to determine an effect size measure. Vargha-Delaney A, a Rosenthal correlation, or a (Glass) rank biserial correlation (Cliff's delta) could be suitable for this.
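As an illustration of the first option, a minimal sketch of the Vargha-Delaney A (a direct translation of its definition as the proportion of pairs won, counting ties as half a win; not the stikpetP/stikpetR functions):

```python
# Minimal sketch of the Vargha-Delaney A effect size.
def vargha_delaney_a(x, y):
    # proportion of (x, y) pairs in which x wins, counting ties as half a win
    wins = sum(1.0 if xi > yj else 0.5 if xi == yj else 0.0
               for xi in x for yj in y)
    return wins / (len(x) * len(y))

print(vargha_delaney_a([10, 20, 30], [5, 20, 100]))  # 0.5 -> no effect
```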
Alternatives
Alternatives for testing stochastic equivalence:
- Mann-Whitney U, although Chung and Romano (2011, p. 5) note that it fails to control the type I error rate
- the Brunner-Munzel test
- the Brunner-Munzel studentized permutation test
- Cliff's delta, which according to Delaney and Vargha (2002) performs similarly to the Brunner-Munzel test
- C-square test, which is an improvement on the Brunner-Munzel test
If you only want to test whether the medians are equal:
- Mann-Whitney U, assuming distributions have the same shape
- Mood's median test, although according to Schlag (2015) this actually tests quantiles and can lead to over-rejection
- Schlag's test, which can only be used to accept or reject, without a p-value