One-Sample Proportion Test
Introduction
A one-sample proportion test is an approximation of the one-sample binomial test. The binomial test uses the binomial distribution, which can be approximated with the standard normal distribution if the sample size is large enough.
The advantage of using the normal distribution over the binomial distribution lies mostly in the computation. The binomial distribution uses factorials, which quickly become extremely large. Agresti and Coull (1998) also argue that an approximation is sometimes better than an 'exact' version.
A small issue is that to approximate a binomial distribution with the standard normal distribution, we need to calculate a mean and standard deviation based only on the sample size and the sample or expected population proportion. For calculating the mean the expected proportion is used, but for the standard deviation one variation uses the expected proportion, while another uses the sample proportion. SPSS refers to the first as a score test, and to the second as a Wald test (IBM, 2021, p. 997). For the Wald test SPSS refers to Agresti (2013, p. 10), who in turn refers to Wald (1943). The score test is also referred to as a Wilson test (Wilson, 1927, p. 212).
Performing the Test
with Excel
Excel file from video: TS - Proportion (one-sample) (E).xlsm.
with stikpetE
without stikpetE
with Flowgorithm
A basic implementation of a one-sample proportion test is shown in the flowchart in figure 2.
Figure 2
Flowgorithm for one-sample proportion test
It takes as input the frequency of one of the categories (k), the sample size (n), whether the Wald test should be used, and whether the Yates correction should be applied. It makes use of the cumulative distribution function of the standard normal distribution.
Flowgorithm file: TS - Proportion test (one-sample).fprg.
with Python
Jupyter Notebook from videos: TS - Proportion (one-sample) (P).ipynb.
with stikpetP
without stikpetP
with R
Jupyter Notebook from videos: TS - Proportion (one-sample) (R).ipynb.
with stikpetR
without stikpetR
Datafile used in video: StudentStatistics.sav
Formulas
The normal approximation of the binomial is usually done with:
\(z=\frac{x-\mu}{\sigma}\)
where \(\mu = n\times p_0\) and \(\sigma=\sqrt{\mu\times\left(1-p_0\right)}\)
Here \(p_0\) is the expected proportion (the proportion according to the null hypothesis), and \(x\) the observed count in the sample.
This will then follow a standard normal distribution, from which p-values can easily be calculated.
For the Wald test we can do the exact same, but only change our \(\sigma\) to:
\(s = \sqrt{x\times\left(1 - \frac{x}{n}\right)}\)
For the Yates correction we take the absolute value of the numerator in the z-formula and subtract 0.5 from it, i.e.
\(z_{Yates} = \frac{\left|x - \mu\right| - 0.5}{\sigma}\)
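The formulas above can be sketched in a few lines of Python. This is a hypothetical helper (not the stikpetP implementation), using only the standard library:

```python
from statistics import NormalDist  # standard library, Python 3.8+

def proportion_test(k, n, p0=0.5, wald=False, yates=False):
    """Approximate one-sample proportion test.
    k: observed count in the category, n: sample size,
    p0: expected proportion under the null hypothesis.
    Returns the z-statistic and a two-sided p-value."""
    mu = n * p0                            # mean of the binomial
    if wald:
        sigma = (k * (1 - k / n)) ** 0.5   # Wald: uses the sample proportion
    else:
        sigma = (mu * (1 - p0)) ** 0.5     # score (Wilson): uses p0
    num = k - mu
    if yates:
        num = abs(num) - 0.5               # Yates continuity correction
    z = num / sigma
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p

# example: 20 out of 30 in one category, testing p0 = 0.5
z, p = proportion_test(20, 30)
```

For the example, the score test gives z ≈ 1.83 with a two-sided p ≈ .068; setting `yates=True` shrinks the numerator from 5 to 4.5 and gives a slightly larger p-value.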
For the chi-square tests, please refer to the formulas at 'analyzing a single variable' and then the tests section.
Interpreting the Result
As mentioned earlier, the normal approximation is appropriate if the sample size is large enough and the sample proportion is not too small. See the normal approximation on the binomial distribution page for some rules of thumb that can be used to determine if the conditions are met.
The assumption about the population for this test (the null hypothesis) is that the proportion for one category is X, where X can be any value between 0 and 1. Very often X is set to 0.5.
The test provides a p-value, which is the probability of obtaining a test statistic as extreme as, or more extreme than, the one from the sample, if the assumption about the population were true. If this p-value (significance) is below a pre-defined threshold (the significance level \(\alpha\)), the assumption about the population is rejected. We then speak of a (statistically) significant result. The threshold is usually set at 0.05; anything below that is considered low.
If the assumption is rejected, we conclude that the proportion of the category in the population is different from the one used in the test.
Note that if we do not reject the assumption, this does not mean we accept it; we simply state that there is insufficient evidence to reject it.
Writing the results
Since the one-sample proportion test uses a standard normal distribution, you can report the results as follows:
z = <z-value>, p = <p-value>
So for example:
A one-sample score proportion test, with continuity correction, indicated that the percentages were significantly different, z = 3.10, p < .001.
The p-value is shown with three decimal places, and no 0 before the decimal sign. If the p-value is below .0005, it can be reported as p < .001.
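As an illustration of this reporting convention, here is a small hypothetical helper (not part of any of the packages mentioned above) that formats a p-value in APA style:

```python
def format_p(p):
    """Format a p-value APA-style: three decimal places,
    no zero before the decimal sign; values below .0005
    are reported as 'p < .001'."""
    if p < 0.0005:
        return "p < .001"
    # lstrip removes the leading zero, e.g. '0.068' -> '.068'
    return "p = " + f"{p:.3f}".lstrip("0")

# example: format_p(0.0679) -> 'p = .068'
```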
Corrections
Because the standard normal distribution is a continuous distribution, while the binomial is a discrete distribution, a small correction is often applied. This is known as a Yates continuity correction. It simply subtracts 0.5 from the absolute difference between the number of successes and its expected value.
Next step and Alternatives
APA (2019, p. 88) states to also report an effect size measure. If the assumption was 0.5, one possible option could be Cohen g; otherwise Cohen h2, the Alternative Ratio, or a Rosenthal correlation coefficient could be used.
An alternative test could be the one-sample binomial test.