Module stikpetP.effect_sizes.eff_size_cramer_v_ind


Functions

def es_cramer_v_ind(chi2, n, r, c, cc=None)

Cramer's V for Test of Independence

Cramér's V is one possible effect size when using a chi-square test of independence.

It gives an estimate of the strength of the association between the two nominal variables: 0 indicates that the observed counts are exactly equal to the counts expected under independence, and 1 is the maximum possible association.

For a qualitative classification of the effect size, Cramér's V can be converted to Cohen's w, for which Cohen provides rules of thumb (see the sketch below).
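
A minimal sketch of that conversion (not part of this module), assuming the standard relation w = sqrt(chi² / n), which for this formula equals V times the square root of min(r - 1, c - 1); the cut-off values in the comment are Cohen's commonly cited rules of thumb:

def cramer_v_to_cohen_w(v, r, c):
    # hypothetical helper, not part of stikpetP:
    # Cohen's w = sqrt(chi2 / n) = V * sqrt(min(r - 1, c - 1))
    return v * (min(r - 1, c - 1))**0.5

# Cohen's rules of thumb for w: roughly 0.1 small, 0.3 medium, 0.5 large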

A Bergsma correction is also possible.

Parameters

chi2 : float
the chi-square test statistic
n : int
the sample size
r : int
the number of rows
c : int
the number of columns
cc : string, optional
correction to use; "bergsma" applies the Bergsma correction (default is None, i.e. no correction)

Returns

v : float
Cramer's V value

Notes

The formula used is (Cramér, 1946, p. 282):
$$V = \sqrt{\frac{\chi^2}{n\times\min\left(r - 1, c - 1\right)}}$$

Symbols used:

  • n, the total sample size
  • r, the number of rows
  • c, the number of columns
  • \(\chi^2\), the chi-square value of a test of independence.

The Bergsma correction uses a different formula (Bergsma, 2013, pp. 324-325):
$$\tilde{V} = \sqrt{\frac{\tilde{\varphi}^2}{\min\left(\tilde{r} - 1, \tilde{c} - 1\right)}}$$

With:
$$\tilde{\varphi}^2 = \max\left(0, \varphi^2 - \frac{\left(r - 1\right)\times\left(c - 1\right)}{n - 1}\right)$$
$$\tilde{r} = r - \frac{\left(r - 1\right)^2}{n - 1}$$
$$\tilde{c} = c - \frac{\left(c - 1\right)^2}{n - 1}$$
$$\varphi^2 = \frac{\chi^2}{n}$$
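
To make the formulas concrete, a small worked sketch with made-up values (a chi-square of 16 from a 3-by-4 table of 100 cases; the numbers are purely illustrative):

chi2, n, r, c = 16.0, 100, 3, 4

# plain Cramér's V
v = (chi2 / (n * min(r - 1, c - 1)))**0.5            # sqrt(16 / 200) ≈ 0.283

# Bergsma-corrected version
phi2  = chi2 / n                                      # 0.16
phi2t = max(0, phi2 - (r - 1)*(c - 1)/(n - 1))        # ≈ 0.0994
rt    = r - (r - 1)**2/(n - 1)                        # ≈ 2.96
ct    = c - (c - 1)**2/(n - 1)                        # ≈ 3.91
vt    = (phi2t / min(rt - 1, ct - 1))**0.5            # ≈ 0.225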

References

Bergsma, W. (2013). A bias-correction for Cramér's V and Tschuprow's T. Journal of the Korean Statistical Society, 42(3), 323–328. doi:10.1016/j.jkss.2012.10.002

Cramér, H. (1946). Mathematical methods of statistics. Princeton University Press.

Author

Made by P. Stikker

Companion website: https://PeterStatistics.com
YouTube channel: https://www.youtube.com/stikpet
Donations: https://www.patreon.com/bePatron?u=19398076

Source code
def es_cramer_v_ind(chi2, n, r, c, cc=None):
    '''
    Cramer's V for Test of Independence
    -----------------------------------
    
    Cramér's V is one possible effect size when using a chi-square test of independence.
    
    It gives an estimate of the strength of the association between the two nominal variables: 0 indicates that the observed counts are exactly equal to the counts expected under independence, and 1 is the maximum possible association.
    
    For a qualitative classification of the effect size, Cramér's V can be converted to Cohen's w, for which Cohen provides rules of thumb.
    
    A Bergsma correction is also possible.
    
    Parameters
    ----------
    chi2 : float
        the chi-square test statistic
    n : int
        the sample size
    r : int
        the number of rows
    c : int
        the number of columns
    cc : string, optional
        correction to use; "bergsma" applies the Bergsma correction (default is None, i.e. no correction)
        
    Returns
    -------
    v : float
        Cramer's V value
   
    Notes
    -----
    The formula used is (Cramér, 1946, p. 282):
    $$V = \\sqrt{\\frac{\\chi^2}{n\\times\\min\\left(r - 1, c - 1\\right)}}$$
    
    *Symbols used:*
    
    * \\(n\\), the total sample size
    * \\(r\\), the number of rows
    * \\(c\\), the number of columns
    * \\(\\chi^2\\), the chi-square value of a test of independence.
    
    The Bergsma correction uses a different formula (Bergsma, 2013, pp. 324-325):    
    $$\\tilde{V} = \\sqrt{\\frac{\\tilde{\\varphi}^2}{\\min\\left(\\tilde{r} - 1, \\tilde{c} - 1\\right)}}$$
    
    With:
    $$\\tilde{\\varphi}^2 = \\max\\left(0,\\varphi^2 - \\frac{\\left(r - 1\\right)\\times\\left(c - 1\\right)}{n - 1}\\right)$$
    $$\\tilde{r} = r - \\frac{\\left(r - 1\\right)^2}{n - 1}$$
    $$\\tilde{c} = c - \\frac{\\left(c - 1\\right)^2}{n - 1}$$
    $$\\varphi^2 = \\frac{\\chi^{2}}{n}$$
    
    References
    ----------
    Bergsma, W. (2013). A bias-correction for Cramér's V and Tschuprow's T. *Journal of the Korean Statistical Society, 42*(3), 323–328. doi:10.1016/j.jkss.2012.10.002
    
    Cramér, H. (1946). *Mathematical methods of statistics*. Princeton University Press.
    
    Author
    ------
    Made by P. Stikker
    
    Companion website: https://PeterStatistics.com  
    YouTube channel: https://www.youtube.com/stikpet  
    Donations: https://www.patreon.com/bePatron?u=19398076
    
    '''
    m = min(r, c)
    
    if cc == "bergsma":
        # Bergsma (2013) bias-corrected version
        phi2 = chi2/n                        # phi squared
        mHat = m - (m - 1)**2/(n - 1)        # adjusted dimension for the smaller of r and c
        df = (r - 1)*(c - 1)
        phi2 = max(0, phi2 - df/(n - 1))     # bias-corrected phi squared
        
        es = (phi2/(mHat - 1))**0.5
    else:
        # Cramér (1946) original formula
        es = (chi2/(n*min(r - 1, c - 1)))**0.5
        
    return es
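
A brief usage sketch (assuming the package is installed so that the module path at the top of this page resolves; the chi-square value is made up for illustration):

from stikpetP.effect_sizes.eff_size_cramer_v_ind import es_cramer_v_ind

v  = es_cramer_v_ind(chi2=16.0, n=100, r=3, c=4)                  # plain Cramér's V
vb = es_cramer_v_ind(chi2=16.0, n=100, r=3, c=4, cc="bergsma")    # Bergsma-corrected
print(v, vb)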