Fligner-Policello Test
Explanation
This test can be used if you have a binary and an ordinal variable. It is an alternative to the better-known Mann-Whitney U test. If we want to test whether the medians are equal, the MWU test assumes that the scores in the two categories have the same shape (Fong & Huang, 2019). The Fligner-Policello test does not require this, although it does assume that, in the population, the distributions are symmetric around their medians (Hollander & Wolfe, 1999, p. 135) and that the data are continuous (Fligner & Policello, 1981, p. 162); Boden (2011, p. 385) does, however, list it as a test for ordinal data.
The Fligner-Policello test tests whether there is a shift in location, rather than strictly whether the medians are equal. For example, if the scores in category A are 10, 20, 30 and in category B are 5, 20, 100, the medians are both 20, yet the two distributions are clearly not identical.
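A quick check of this example (a minimal sketch using only the Python standard library; the values are simply the ones from the paragraph above):

```python
# Quick check of the example above: equal medians, clearly different spread.
from statistics import median

a = [10, 20, 30]   # category A
b = [5, 20, 100]   # category B

print(median(a), median(b))              # 20 20 -> same median
print(max(a) - min(a), max(b) - min(b))  # 20 95 -> very different spread
```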
Not to be confused with the Fligner-Killeen test, which can be used to test for homogeneity of variances.
Performing the Test
with Excel
Excel file: TS - Fligner-Policello (E).xlsm
with stikpetE
To Be Made
without stikpetE
To Be Made
with Python
Jupyter Notebook: TS - Fligner-Policello (P).ipynb
with stikpetP
To Be Made
without stikpetP
To Be Made
with R
Jupyter Notebook: TS - Fligner-Policello (R).ipynb
with stikpetR
To Be Made
without stikpetR
To Be Made
with SPSS
To Be Made
Formulas
The formula for the z-statistic is (Fligner & Policello, 1981):
\(z = \frac{N_Y - N_X}{2\times\sqrt{SS_X + SS_Y + M_X\times M_Y}}\)
If a continuity correction is used, half is subtracted from the absolute value of the numerator in the formula for the z-value:
\(z = \frac{\left|N_Y - N_X\right| - 0.5}{2\times\sqrt{SS_X + SS_Y + M_X\times M_Y}}\)
With:
\(SS_X = \sum_{x\in X} \left(N\left(x\right) - M_X\right)^2, SS_Y = \sum_{y\in Y} \left(N\left(y\right) - M_Y\right)^2\)
\(M_X = \frac{N_X}{n_x}, M_Y = \frac{N_Y}{n_y}\)
\(N_X = \sum_{x \in X} N\left(x\right), N_Y = \sum_{y \in Y} N\left(y\right)\)
\(N\left(x\right) = \sum_{y\in Y} f\left(x, y\right)\)
\(N\left(y\right) = \sum_{x\in X} f\left(y, x\right)\)
\(f\left(a, b\right) = \begin{cases} 1 & \text{ if } a > b \\ 0 & \text{ if } a\leq b \end{cases}\)
or, if a ties correction is used (Hollander et al., 2014, p. 146):
\(f\left(a, b\right) = \begin{cases} 1 & \text{ if } a > b \\ 0.5 & \text{ if } a = b \\ 0 & \text{ if } a < b \end{cases}\)
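To make these definitions concrete, here is a minimal Python sketch that translates the formulas above directly. This is only an illustration, not the stikpetP function; the function name and the ties/continuity options are assumptions for this sketch.

```python
# Direct translation of the formulas above (a sketch, not the stikpetP function).
import math

def fligner_policello_z(x, y, ties=True, continuity=False):
    # f(a, b): 1 if a > b, 0.5 for a tie (when the ties correction is used), 0 otherwise
    def f(a, b):
        if a > b:
            return 1.0
        if a == b and ties:
            return 0.5
        return 0.0

    n_of_x = [sum(f(xi, yj) for yj in y) for xi in x]   # N(x) for each x in X
    n_of_y = [sum(f(yj, xi) for xi in x) for yj in y]   # N(y) for each y in Y
    NX, NY = sum(n_of_x), sum(n_of_y)                   # N_X and N_Y
    MX, MY = NX / len(x), NY / len(y)                   # M_X and M_Y
    SSX = sum((v - MX) ** 2 for v in n_of_x)            # SS_X
    SSY = sum((v - MY) ** 2 for v in n_of_y)            # SS_Y
    num = abs(NY - NX) - 0.5 if continuity else NY - NX # optional continuity correction
    return num / (2 * math.sqrt(SSX + SSY + MX * MY))

# the small example from the explanation: equal medians, no location shift
print(fligner_policello_z([10, 20, 30], [5, 20, 100]))  # 0.0
```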
Interpreting the Result
The assumption about the population for this test (the null hypothesis) is that the medians are equal.
The test provides a p-value, which is the probability of a test statistic as extreme as the one from the sample, or even more extreme, if the assumption about the population were true. If this p-value (significance) is below a pre-defined threshold (the significance level \(\alpha\)), the assumption about the population is rejected. We then speak of a (statistically) significant result. The threshold is usually set at 0.05; anything below it is then considered low.
If the assumption is rejected, we conclude that the medians in the population are different.
Note that if we do not reject the assumption, it does not mean we accept it; we simply state that there is insufficient evidence to reject it.
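As a small illustration, the p-value can be obtained from the z-value with the usual standard-normal approximation (a sketch; SciPy is assumed to be available, and the z-value is taken from the example report in the next section):

```python
# Two-sided p-value for a z-value, using the standard-normal approximation.
from scipy.stats import norm

z = 2.845                # example z-value (see the write-up below)
p = 2 * norm.sf(abs(z))  # two-sided p-value
print(round(p, 3))       # 0.004 -> below 0.05, so reject the assumption
```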
Writing the Results
Writing up the results of the test uses the following format (APA, 2019, p. 182):
z(n1 = <number of cases in 1st category>, n2 = <number of cases in 2nd category>) = <Z-value>, p = <p-value>
So for example:
A Fligner-Policello test indicated that the mean ranks for males and females were significantly different, z(n1 = 11, n2 = 34) = 2.845, p = .004.
A few notes about reporting statistical results with APA:
- The p-value is shown with three decimal places, and no 0 before the decimal sign. If the p-value is below .0005, it can be reported as p < .001.
- Both U and z are standard abbreviations in APA style, for the Mann-Whitney U test statistic and the standardized score respectively (see APA, 2019, table 6.5), so they do not need to be explained.
- APA does not require references or formulas for statistical analyses that are in common use (2019, p. 181).
- APA (2019, p. 88) also recommends reporting an effect size measure.
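A small sketch of how the template and the notes above could be put together (the values are simply those from the example report; this is only an illustration):

```python
# Sketch: formatting a result according to the APA-style template above.
n1, n2, z, p = 11, 34, 2.845, 0.004
p_txt = "< .001" if p < 0.0005 else "= " + f"{p:.3f}".lstrip("0")  # no leading zero, per APA
print(f"z(n1 = {n1}, n2 = {n2}) = {z:.3f}, p {p_txt}")
# prints: z(n1 = 11, n2 = 34) = 2.845, p = .004
```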
Next...
The next step is to determine an effect size measure. Vargha-Delaney A, a Rosenthal correlation, or a (Glass) rank biserial correlation (Cliff's delta) could be suitable for this.
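As an illustration of the first option, a minimal sketch of the Vargha-Delaney A (a direct translation of its definition as the proportion of pairs won, counting ties as half a win; not the stikpetP/stikpetR functions):

```python
# Minimal sketch of the Vargha-Delaney A effect size.
def vargha_delaney_a(x, y):
    # proportion of (x, y) pairs in which x wins, counting ties as half a win
    wins = sum(1.0 if xi > yj else 0.5 if xi == yj else 0.0
               for xi in x for yj in y)
    return wins / (len(x) * len(y))

print(vargha_delaney_a([10, 20, 30], [5, 20, 100]))  # 0.5 -> no effect
```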
Alternatives
Alternatives for testing stochastic equivalence:
- Mann-Whitney U, although Chung and Romano (2011, p. 5) note that it fails to control the type I error rate
- the Brunner-Munzel test
- the Brunner-Munzel studentized permutation test
- Cliff's delta, which according to Delaney and Vargha (2002) performs similarly to the Brunner-Munzel test
- C-square test, which is an improvement on the Brunner-Munzel test
If you only want to test whether the medians are equal:
- Mann-Whitney U, assuming distributions have the same shape
- Mood's median test, although according to Schlag (2015) this actually tests quantiles and can lead to over-rejection
- Schlag's test, which can only be used to accept or reject, without a p-value