Module stikpetP.effect_sizes.eff_size_tschuprow_t
def es_tschuprow_t(chi2, n, r, c, cc=None):
'''
Tschuprow T
-----------
Tschuprow T is one possible effect size when using a chi-square test.
It gives an estimate of how well the data fit the expected values, where 0 indicates that they are exactly equal. If the expected values are equally distributed the maximum value is 1; otherwise it can exceed 1.
A Bergsma correction is also possible.
Parameters
----------
chi2 : float
the chi-square test statistic
n : int
the sample size
r : int
the number of rows
c : int
the number of columns
cc : string, optional
set to "bergsma" to apply the Bergsma correction (default is None, i.e. no correction)
Returns
-------
es : float
Tschuprow's T value
Notes
-----
The formula used is (Tschuprow, 1939, p. 53):
$$T = \\sqrt{\\frac{\\chi^2}{n\\times\\sqrt{\\left(r - 1\\right)\\times\\left(c - 1\\right)}}}$$
*Symbols used:*
* \\(n\\), the total sample size
* \\(r\\), the number of rows
* \\(c\\), the number of columns
* \\(\\chi^2\\), the chi-square value of a test of independence.
Tschuprow (1939) is an English translation of Tschuprow (1925).
The Bergsma correction uses a different formula (Bergsma, 2013, pp. 324-325):
$$\\tilde{T} = \\sqrt{\\frac{\\tilde{\\varphi}^2}{\\sqrt{\\left(\\tilde{r} - 1\\right)\\times\\left(\\tilde{c} - 1\\right)}}}$$
With:
$$\\tilde{\\varphi}^2 = \\max\\left(0, \\varphi^2 - \\frac{\\left(r - 1\\right)\\times\\left(c - 1\\right)}{n - 1}\\right)$$
$$\\tilde{r} = r - \\frac{\\left(r - 1\\right)^2}{n - 1}$$
$$\\tilde{c} = c - \\frac{\\left(c - 1\\right)^2}{n - 1}$$
$$\\varphi^2 = \\frac{\\chi^{2}}{n}$$
References
----------
Bergsma, W. (2013). A bias-correction for Cramér’s V and Tschuprow’s T. *Journal of the Korean Statistical Society, 42*(3), 323–328. doi:10.1016/j.jkss.2012.10.002
Tschuprow, A. A. (1925). *Grundbegriffe und Grundprobleme der Korrelationstheorie*. B.G. Teubner.
Tschuprow, A. A. (1939). *Principles of the mathematical theory of correlation* (M. Kantorowitsch, Trans.). W. Hodge.
Author
------
Made by P. Stikker
Companion website: https://PeterStatistics.com
YouTube channel: https://www.youtube.com/stikpet
Donations: https://www.patreon.com/bePatron?u=19398076
'''
if cc == "bergsma":
    # Bergsma (2013) correction: bias-adjusted phi-squared and adjusted
    # numbers of rows and columns
    phi2 = chi2/n
    rHat = r - (r - 1)**2/(n - 1)
    cHat = c - (c - 1)**2/(n - 1)
    df = (r - 1)*(c - 1)
    phi2 = max(0, phi2 - df/(n - 1))
    es = (phi2/((rHat - 1)*(cHat - 1))**0.5)**0.5
else:
    # uncorrected Tschuprow T
    es = (chi2/(n*((r - 1)*(c - 1))**0.5))**0.5
return es
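A minimal usage sketch (the import path follows the module name above; the values chi2 = 6.0 for a 2 x 3 table with n = 100 are made up for illustration):

from stikpetP.effect_sizes.eff_size_tschuprow_t import es_tschuprow_t

chi2, n, r, c = 6.0, 100, 2, 3
t_plain = es_tschuprow_t(chi2, n, r, c)                # uncorrected, approx. 0.206
t_berg = es_tschuprow_t(chi2, n, r, c, cc="bergsma")   # Bergsma-corrected, approx. 0.169
print(t_plain, t_berg)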
Functions
def es_tschuprow_t(chi2, n, r, c, cc=None)
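For context, a hedged cross-check sketch: it derives the chi-square statistic for a small made-up contingency table with SciPy and compares the result of es_tschuprow_t with scipy.stats.contingency.association, which also offers a Tschuprow option in reasonably recent SciPy versions; NumPy and SciPy are assumed to be installed.

import numpy as np
from scipy.stats import chi2_contingency
from scipy.stats.contingency import association
from stikpetP.effect_sizes.eff_size_tschuprow_t import es_tschuprow_t

# made-up 2 x 3 table of observed counts
obs = np.array([[20, 15, 25],
                [30, 10, 20]])

chi2, p, dof, expected = chi2_contingency(obs, correction=False)
t_pkg = es_tschuprow_t(chi2, n=obs.sum(), r=obs.shape[0], c=obs.shape[1])
t_scipy = association(obs, method="tschuprow")
print(round(t_pkg, 6), round(t_scipy, 6))  # the two uncorrected values should match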