Multinomial Distribution
Introduction
As the name implies, the multinomial distribution is an extension of the binomial distribution. Where the binomial distribution is applied with two categories, a multinomial distribution can have any number of categories.
Given the observed counts (\(F\)) of \(k\) categories and the expected probability for each category \(P\), the multinomial probability mass function (mpmf) will determine the probability of having those observed counts, given those probabilities.
For example, I have picked 9 people: three are local, two are neighbouring and four are from far away. I wanted to be as fair as possible and give each category an equal chance, i.e. my expected probabilities are 1/3 for each category. The mpmf then gives the probability of getting exactly this 3, 2, 4 split if the expected probabilities are indeed 1/3 for each category. Note that the expected probabilities are the same for each category in this example, but they don't have to be.
The cumulative distribution function (cdf) will give the probability of the observed counts or of counts that are more extreme, i.e. less likely. It looks at all possible arrangements of the \(n\) items over the \(k\) categories, determines the pmf for each, and adds it to the cdf if it is less than or equal to the pmf of the observed counts.
The distribution depends on the number of categories, the counts of each of those categories and the probability for each category. That is a lot of different possible combinations, and because of this, to my knowledge, no distribution tables exist. To determine the pmf or cdf, either use software, or do the math.
Use some software (easy)
Excel
Excel file: DI - Multinomial (E).xlsm
Python
Jupyter Notebook: DI - Multinomial (P).ipynb
R
Jupyter Notebook: DI - Multinomial (R).ipynb
SPSS (not possible)
To my knowledge there is no GUI method in SPSS to see/determine the multinomial distribution. There is a macro available here that can generate a random multinomial value, but that's the closest I've seen.
Do the math (hard core)
The basic formula for the pmf, with \(n = \sum_{i=1}^{k} F_i\) the total of all counts, is:
\(mpmf\left(F, P\right) = \frac{n!}{\prod_{i=1}^{k} \left(F_i!\right)} \times \prod_{i=1}^{k} P_i^{F_i}\)
This formula was most likely already used by, for example, Edgeworth (1905), and can also be found in Berry and Mielke (1995, p. 769).
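As a quick check of the basic formula, here is a minimal Python sketch applied to the 3, 2, 4 example from the introduction. The function name `multinomial_pmf` is only an illustrative choice and is not taken from the notebooks linked above.

```python
from math import factorial, prod

def multinomial_pmf(counts, probs):
    """Multinomial pmf via the basic factorial formula (illustrative name)."""
    n = sum(counts)
    # n! / (F_1! * F_2! * ... * F_k!); every intermediate quotient is an integer
    coefficient = factorial(n)
    for f in counts:
        coefficient //= factorial(f)
    # product of P_i ** F_i over all categories
    return coefficient * prod(p ** f for p, f in zip(probs, counts))

# 9 people: 3 local, 2 neighbouring, 4 far away, each category expected 1/3
print(multinomial_pmf([3, 2, 4], [1/3, 1/3, 1/3]))
```

For this example the coefficient is 9!/(3! × 2! × 4!) = 1260, so the result is 1260/3^9 = 1260/19683, roughly 0.064.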
It can also be expressed using the gamma function as:
\(mpmf\left(F, P\right) = \frac{\Gamma\left(1+n\right)}{\prod_{i=1}^{k} \Gamma\left(1+F_i\right)} \times \prod_{i=1}^{k} P_i^{F_i}\)
Or using the logarithm of the gamma function as (Arnold, 2018):
\(mpmf\left(F, P\right) = e^{\ln\left(mpmf\left(F, P\right)\right)}\)
with:
\(\ln\left(mpmf\left(F, P\right)\right) = \ln\left(\Gamma\left(n + 1\right)\right) + \sum_{i=1}^k \left(F_i\times\ln\left(P_i\right) - \ln\left(\Gamma\left(F_i + 1\right)\right)\right)\)
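The log-gamma version is mostly useful when the counts are large and the factorials would become unwieldy. A minimal Python sketch, assuming the standard library's `math.lgamma` for \(\ln\left(\Gamma\left(\cdot\right)\right)\), could look as follows. The function name and the handling of zero probabilities are my own choices for illustration.

```python
from math import exp, lgamma, log

def multinomial_pmf_log(counts, probs):
    """Multinomial pmf via the log-gamma formula (illustrative name)."""
    n = sum(counts)
    log_pmf = lgamma(n + 1)  # ln(Gamma(n + 1)) = ln(n!)
    for f, p in zip(counts, probs):
        if f > 0:
            if p == 0:
                # assumption: a positive count in a category with probability 0
                # is impossible, so the pmf is 0 (this also avoids ln(0))
                return 0.0
            log_pmf += f * log(p)
        log_pmf -= lgamma(f + 1)  # ln(F_i!)
    return exp(log_pmf)

print(multinomial_pmf_log([3, 2, 4], [1/3, 1/3, 1/3]))
```

Up to floating-point rounding this gives the same value as the basic formula, 1260/19683 for the 3, 2, 4 example.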
Another option is an algorithm from García-Pérez (1999):
1. Determine \(F^*\), the counts in descending order, and move the elements in \(P\) accordingly, creating \(P^*\).
2. Set \(pmf = 1\), \(t=P_1^*\), \(i=2\), \(x=0\), and \(m=F_1^*\)
3. Set \(l = F_i^*\). For \(r=1\) to \(l\) do:
   - update \(x = x + 1\)
   - if \(x > F_1^*\) then set \(t = 1\) (else nothing)
   - update \(pmf = pmf \times t \times P_i^* \times \frac{r + m}{r}\)
4. If \(i=k\), then go to step 5, otherwise update \(m = m + F_i^*\), then \(i = i + 1\), and go to step 3
5. If \(x < F_1^*\) then for \(r=x + 1\) to \(F_1^*\) update \(pmf = pmf \times P_1^*\)
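The steps above can be sketched in Python as shown below. This is my own reading of the procedure, not code from García-Pérez (1999), and the function name is illustrative. Note that the sketch folds \(t\) into the update inside the loop and adds \(F_i^*\) to \(m\) before moving on to the next category, which is what makes the result agree with the basic formula.

```python
def multinomial_pmf_gp(counts, probs):
    """Multinomial pmf following the stepwise procedure (illustrative name)."""
    # Step 1: sort counts in descending order, probabilities move along
    pairs = sorted(zip(counts, probs), key=lambda fp: fp[0], reverse=True)
    f_star = [f for f, _ in pairs]
    p_star = [p for _, p in pairs]

    # Step 2: initial values (the loop variable below plays the role of i)
    pmf = 1.0
    t = p_star[0]
    x = 0
    m = f_star[0]

    # Steps 3 and 4: loop over the second to the last category
    for i in range(1, len(f_star)):
        for r in range(1, f_star[i] + 1):
            x += 1
            if x > f_star[0]:
                t = 1.0  # all factors P_1* have been used up
            # t contributes one factor P_1* per iteration while x <= F_1*
            pmf *= t * p_star[i] * (r + m) / r
        m += f_star[i]  # m is now the sum of the counts handled so far

    # Step 5: multiply by any factors P_1* that are still missing
    for _ in range(x + 1, f_star[0] + 1):
        pmf *= p_star[0]
    return pmf

print(multinomial_pmf_gp([3, 2, 4], [1/3, 1/3, 1/3]))
```

Interleaving the multiplications by \(P_1^*\) with the growing coefficient keeps the running product moderate in size, instead of first building a very large coefficient and then multiplying it by a very small probability.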
The cumulative probability (i.e. as extreme or more extreme) is found by listing all possible arrangements with the same sum and number of categories, and adding the pmf of those with a pmf less than or equal to that of the observed counts:
\(\sum_{f_1=0}^n \sum_{f_2=0}^{n-f_1} \sum_{f_3=0}^{n-f_1-f_2} \cdots \sum_{f_k=0}^{n-\sum_{i=1}^{k-1} f_i} \begin{cases} \text{pmf}((f_1, f_2, \ldots, f_k); P), & \text{if } \text{pmf}((f_1, f_2, \ldots, f_k); P) \leq \text{pmf}(F; P), \\ 0, & \text{otherwise} \end{cases}\)
or written without nested sums:
\(\sum_{i \in S} \text{pmf}\left(i; P\right)\)
\(S = \left\{ (f_1, f_2, \ldots, f_k) \ \middle| \ \sum_{i=1}^k f_i = n, \text{pmf}\left((f_1, f_2, \ldots, f_k); P\right) \leq \text{pmf}\left(F; P\right)\right\}\)
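A brute-force Python sketch of this sum is given below. It enumerates every arrangement of \(n\) items over \(k\) categories, so it is only practical for small \(n\) and \(k\). The function names and the small relative tolerance `tol` are my own assumptions; the tolerance is there so that arrangements which should tie with the observed pmf are not dropped because of floating-point rounding.

```python
from math import factorial, prod

def multinomial_pmf(counts, probs):
    """Multinomial pmf via the basic factorial formula."""
    coefficient = factorial(sum(counts))
    for f in counts:
        coefficient //= factorial(f)
    return coefficient * prod(p ** f for p, f in zip(probs, counts))

def arrangements(n, k):
    """Yield every tuple of k non-negative integers that sums to n."""
    if k == 1:
        yield (n,)
        return
    for first in range(n + 1):
        for rest in arrangements(n - first, k - 1):
            yield (first,) + rest

def multinomial_cdf(counts, probs, tol=1e-12):
    """Sum the pmf of all arrangements that are as likely or less likely."""
    n, k = sum(counts), len(counts)
    observed = multinomial_pmf(counts, probs)
    total = 0.0
    for arrangement in arrangements(n, k):
        p = multinomial_pmf(arrangement, probs)
        # assumption: tol guards against rounding differences between ties
        if p <= observed * (1 + tol):
            total += p
    return total

print(multinomial_cdf([3, 2, 4], [1/3, 1/3, 1/3]))
```

In the 3, 2, 4 example with equal probabilities, the only arrangement that is more likely than the observed one is 3, 3, 3 (coefficient 1680 against 1260), so the result is 1 − 1680/19683 = 18003/19683, roughly 0.915.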