Module stikpetP.effect_sizes.eff_size_tschuprow_t

Source code
def es_tschuprow_t(chi2, n, r, c, cc=None):
    '''
    Tschuprow T
    -----------
    
    Tschuprow T is one possible effect size when using a chi-square test. 
    
    It gives an estimate of how well the observed counts fit the expected counts, where 0 indicates that they are exactly equal. The maximum value is 1, which can only be reached if the number of rows equals the number of columns.
    
    A Bergsma correction is also possible.
    
    Parameters
    ----------
    chi2 : float
        the chi-square test statistic
    n : int
        the sample size
    r : int
        the number of rows
    c : int
        the number of columns
    cc : string, optional 
        correction to use; either None (default, no correction) or "bergsma" for the Bergsma correction
        
    Returns
    -------
    es : float
        Tschuprow's T value
   
    Notes
    -----
    The formula used is (Tschuprow, 1939, p. 53):
    $$T = \\sqrt{\\frac{\\chi^2}{n\\times\\sqrt{\\left(r - 1\\right)\\times\\left(c - 1\\right)}}}$$
    
    *Symbols used:*
    
    * \\(n\\), the total sample size
    * \\(r\\), the number of rows
    * \\(c\\), the number of columns
    * \\(\\chi^2\\), the chi-square value of a test of independence.
    
    Tschuprow (1939) is a translation of Tschuprow (1925). 
    
    The Bergsma correction uses a different formula (Bergsma, 2013, pp. 324-325):    
    $$\\tilde{T} = \\sqrt{\\frac{\\tilde{\\varphi}^2}{\\sqrt{\\left(\\tilde{r} - 1\\right)\\times\\left(\\tilde{c} - 1\\right)}}}$$
    
    With:
    $$\\tilde{\\varphi}^2 = \\max\\left(0, \\varphi^2 - \\frac{\\left(r - 1\\right)\\times\\left(c - 1\\right)}{n - 1}\\right)$$
    $$\\tilde{r} = r - \\frac{\\left(r - 1\\right)^2}{n - 1}$$
    $$\\tilde{c} = c - \\frac{\\left(c - 1\\right)^2}{n - 1}$$
    $$\\varphi^2 = \\frac{\\chi^{2}}{n}$$
    
    References
    ----------
    Bergsma, W. (2013). A bias-correction for Cramér’s V and Tschuprow’s T. *Journal of the Korean Statistical Society, 42*(3), 323–328. doi:10.1016/j.jkss.2012.10.002
    
    Tschuprow, A. A. (1925). *Grundbegriffe und Grundprobleme der Korrelationstheorie*. B.G. Teubner.
    
    Tschuprow, A. A. (1939). *Principles of the mathematical theory of correlation* (M. Kantorowitsch, Trans.). W. Hodge.
    
    Author
    ------
    Made by P. Stikker
    
    Companion website: https://PeterStatistics.com  
    YouTube channel: https://www.youtube.com/stikpet  
    Donations: https://www.patreon.com/bePatron?u=19398076
    
    '''
    
    if cc=="bergsma":
        phi2 = chi2/n
        rHat = r - (r - 1)**2/(n - 1)
        cHat = c - (c - 1)**2/(n - 1)
        df = (r - 1)*(c - 1)
        phi2 = max(0, phi2 - df/(n - 1))
        es = (phi2/((rHat - 1)*(cHat - 1))**0.5)**0.5
        
    else:
        es = (chi2/(n*((r - 1)*(c - 1))**0.5))  **0.5
        
    return es
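
A minimal usage sketch (not part of the original documentation): the chi-square statistic could first be obtained from a cross table, for instance with scipy.stats.chi2_contingency, and then passed to the function above. The use of scipy and the shown import path are assumptions based on the module name in the header.

# Usage sketch with hypothetical data; scipy is an assumed, optional dependency.
from scipy.stats import chi2_contingency
from stikpetP.effect_sizes.eff_size_tschuprow_t import es_tschuprow_t

# a small 2x3 cross table of observed counts
table = [[10, 20, 30],
         [20, 25, 15]]

chi2, p, dof, expected = chi2_contingency(table, correction=False)
n = sum(sum(row) for row in table)
r, c = len(table), len(table[0])

es_tschuprow_t(chi2, n, r, c)                  # Tschuprow T
es_tschuprow_t(chi2, n, r, c, cc="bergsma")    # with Bergsma correction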

Functions

def es_tschuprow_t(chi2, n, r, c, cc=None)

Tschuprow T

Tschuprow T is one possible effect size when using a chi-square test.

It gives an estimate of how well the observed counts fit the expected counts, where 0 indicates that they are exactly equal. The maximum value is 1, which can only be reached if the number of rows equals the number of columns.

A Bergsma correction is also possible.

Parameters

chi2 : float
the chi-square test statistic
n : int
the sample size
r : int
the number of rows
c : int
the number of columns
cc : string, optional
correction to use; either None (default, no correction) or "bergsma" for the Bergsma correction

Returns

es : float
Tschuprow's T value

Notes

The formula used is (Tschuprow, 1939, p. 53):
T = \sqrt{\frac{\chi^2}{n\times\sqrt{\left(r - 1\right)\times\left(c - 1\right)}}}

Symbols used:

  • n, the total sample size
  • r, the number of rows
  • c, the number of columns
  • \chi^2, the chi-square value of a test of independence.

Tschuprow (1939) is a translation of Tschuprow (1925).
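
As an illustration with hypothetical numbers (not from the source), the formula can be evaluated directly:

chi2, n, r, c = 7.5, 100, 3, 4
T = (chi2 / (n * ((r - 1) * (c - 1)) ** 0.5)) ** 0.5   # about 0.175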

The Bergsma correction uses a different formula (Bergsma, 2013, pp. 324-325):
\tilde{T} = \sqrt{\frac{\tilde{\varphi}^2}{\sqrt{\left(\tilde{r} - 1\right)\times\left(\tilde{c} - 1\right)}}}

With:
\tilde{\varphi}^2 = \max\left(0, \varphi^2 - \frac{\left(r - 1\right)\times\left(c - 1\right)}{n - 1}\right)
\tilde{r} = r - \frac{\left(r - 1\right)^2}{n - 1}
\tilde{c} = c - \frac{\left(c - 1\right)^2}{n - 1}
\varphi^2 = \frac{\chi^2}{n}
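
With the same hypothetical numbers as above, the correction steps can be written out as a sketch mirroring these formulas:

chi2, n, r, c = 7.5, 100, 3, 4
phi2 = chi2 / n                                         # 0.075
phi2_t = max(0, phi2 - (r - 1) * (c - 1) / (n - 1))     # about 0.0144
r_t = r - (r - 1) ** 2 / (n - 1)                        # about 2.960
c_t = c - (c - 1) ** 2 / (n - 1)                        # about 3.909
T_t = (phi2_t / ((r_t - 1) * (c_t - 1)) ** 0.5) ** 0.5  # about 0.078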

References

Bergsma, W. (2013). A bias-correction for Cramér’s V and Tschuprow’s T. Journal of the Korean Statistical Society, 42(3), 323–328. doi:10.1016/j.jkss.2012.10.002

Tschuprow, A. A. (1925). Grundbegriffe und Grundprobleme der Korrelationstheorie. B.G. Teubner.

Tschuprow, A. A. (1939). Principles of the mathematical theory of correlation (M. Kantorowitsch, Trans.). W. Hodge.

Author

Made by P. Stikker

Companion website: https://PeterStatistics.com
YouTube channel: https://www.youtube.com/stikpet
Donations: https://www.patreon.com/bePatron?u=19398076
