Trimmed Mean t-Test for Independent Samples
Explanation
This test can be used to determine whether the (trimmed) means of two populations differ. For example, to test whether the average score of national students differs from that of international students.
As the name implies, this test uses a trimmed mean rather than the regular mean used by the Student t-test. A trimmed mean simply removes some of the highest and lowest scores. The test also uses a Winsorized variance, which replaces (rather than removes) some of the highest and lowest scores.
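As a quick illustration of the difference (with made-up scores), the sketch below uses SciPy's trim_mean and winsorize functions with 10% trimming on each side:

```python
# Quick illustration with made-up scores: trimming removes the extremes,
# Winsorizing replaces them with the nearest remaining score.
import numpy as np
from scipy import stats
from scipy.stats.mstats import winsorize

scores = np.array([2, 55, 56, 58, 60, 61, 63, 64, 65, 98])

print(np.mean(scores))                                # 58.2, influenced by the outliers 2 and 98
print(stats.trim_mean(scores, proportiontocut=0.10))  # 60.25, the 10% trimmed mean (2 and 98 removed)

winsorized = winsorize(scores, limits=(0.10, 0.10))
print(np.asarray(winsorized))                         # [55 55 56 58 60 61 63 64 65 65]
print(np.var(np.asarray(winsorized), ddof=1))         # variance of the Winsorized scores
```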
Two variations of this test exist. The first is more in line with the Student t-test and uses the same degrees of freedom, together with a pooled standard error, as proposed by Yuen and Dixon (1973, p. 374). The other is more in line with the Welch(-Satterthwaite) test, as proposed by Yuen (1974, p. 167), and is sometimes referred to as the Yuen(-Welch) t-test.
The advantage of trimming and using a Winsorized variance is that the test is more suitable when the data come from a non-normal distribution, while normality is often considered a criterion for using the Student t-test.
Although, strictly speaking, the test is about trimmed means (Fradette et al., 2003, p. 483), it is often interpreted as if it were about the regular arithmetic mean.
Performing the Test
with Excel
Excel file: TS - Trimmed Means (ind samples) (E).xlsm
with stikpetE
To Be Made
without stikpetE
To Be Made
with Python
Jupyter Notebook: TS - Trimmed Means (ind samples) (P).ipynb
with stikpetP
To Be Made
without stikpetP
To Be Made
with R
Jupyter Notebook: TS - Trimmed Means (ind samples) (R).ipynb
with stikpetR
To Be Made
without stikpetR
To Be Made
Formulas
The formula is (Yuen & Dixon, 1973, p. 374):
\(t = \frac{\bar{x}_{t,1} - \bar{x}_{t,2}}{SE}\)
\(sig = 2\times\left(1 - T\left(\left|t\right|, df\right)\right)\)
With:
\(SE = \sqrt{\frac{SSD_{w,1} + SSD_{w,2}}{m_1 + m_2 - 2}\times\left(\frac{1}{m_1} + \frac{1}{m_2}\right)}\)
\(df = m_1 + m_2 - 2\)
\(\bar{x}_{t,i} = \frac{\sum_{j=g_i+1}^{n_i - g_i}y_{i,j}}{m_i}\)
\(g_i = \lfloor n_i\times p_t\rfloor\)
\(m_i = n_i - 2\times g_i\)
\(SSD_{w,i} = g_i\times\left(y_{i,g_i+1} - \bar{x}_{w,i}\right)^2 + g_i\times\left(y_{i,n_i-g_i} - \bar{x}_{w,i}\right)^2 + \sum_{j=g_i+1}^{n_i - g_i} \left(y_{i,j} - \bar{x}_{w,i}\right)^2\)
\(\bar{x}_{w,i} = \frac{\bar{x}_{t,i}\times m_i + g_i\times\left(y_{i, g_i+1} + y_{i, n_i-g_i}\right)}{n_i}\)
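A minimal Python sketch of this version (the function names are my own, not from a particular package) could look as follows:

```python
# Sketch of the Yuen-Dixon (1973) variant, following the formulas above.
import math
from scipy.stats import t as t_dist

def winsorized_parts(scores, p_t):
    """Return the trimmed mean, the Winsorized sum of squared deviations, and m."""
    y = sorted(scores)
    n = len(y)
    g = math.floor(n * p_t)                          # g_i
    m = n - 2 * g                                    # m_i
    trimmed = y[g:n - g]
    x_t = sum(trimmed) / m                           # trimmed mean
    x_w = (x_t * m + g * (y[g] + y[n - g - 1])) / n  # Winsorized mean
    ssd_w = (g * (y[g] - x_w) ** 2                   # sum of squared deviations
             + g * (y[n - g - 1] - x_w) ** 2         # from the Winsorized mean
             + sum((v - x_w) ** 2 for v in trimmed))
    return x_t, ssd_w, m

def trimmed_t_pooled(scores1, scores2, p_t=0.10):
    x_t1, ssd1, m1 = winsorized_parts(scores1, p_t)
    x_t2, ssd2, m2 = winsorized_parts(scores2, p_t)
    se = math.sqrt((ssd1 + ssd2) / (m1 + m2 - 2) * (1 / m1 + 1 / m2))
    df = m1 + m2 - 2
    t_val = (x_t1 - x_t2) / se
    sig = 2 * (1 - t_dist.cdf(abs(t_val), df))
    return t_val, df, sig
```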
Yuen (1974, p. 167) suggested the same, except for the standard error and degrees of freedom:
\(SE = \sqrt{\frac{s_{w,1}^2}{m_1} + \frac{s_{w,2}^2}{m_2}}\)
\(s_{w,i}^2 = \frac{SSD_{w,i}}{m_i - 1}\)
\(df = \frac{1}{\frac{c^2}{m_1 - 1} + \frac{\left(1 - c\right)^2}{m_2 -1}}\)
\(c = \frac{\frac{s_{w,1}^2}{m_1}}{\frac{s_{w,1}^2}{m_1} + \frac{s_{w,2}^2}{m_2}}\)
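A similar sketch for this version, re-using the winsorized_parts helper from the previous block:

```python
# Sketch of the Yuen (1974) variant: same trimmed means, but a Welch-style
# standard error and degrees of freedom. Assumes winsorized_parts() from the
# previous sketch is in scope.
import math
from scipy.stats import t as t_dist

def trimmed_t_welch(scores1, scores2, p_t=0.10):
    x_t1, ssd1, m1 = winsorized_parts(scores1, p_t)
    x_t2, ssd2, m2 = winsorized_parts(scores2, p_t)
    s2_w1 = ssd1 / (m1 - 1)                          # Winsorized variances
    s2_w2 = ssd2 / (m2 - 1)
    se = math.sqrt(s2_w1 / m1 + s2_w2 / m2)
    c = (s2_w1 / m1) / (s2_w1 / m1 + s2_w2 / m2)
    df = 1 / (c ** 2 / (m1 - 1) + (1 - c) ** 2 / (m2 - 1))
    t_val = (x_t1 - x_t2) / se
    sig = 2 * (1 - t_dist.cdf(abs(t_val), df))
    return t_val, df, sig
```

Recent versions of SciPy also offer a trim argument in scipy.stats.ttest_ind, which performs Yuen's trimmed t-test and could serve as a cross-check.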
Symbols used:
- \(\bar{x}_{t,i}\), the trimmed mean of the scores in category i
- \(\bar{x}_{w,i}\), the Winsorized mean of the scores in category i
- \(SSD_{w,i}\), the sum of squared deviations from the Winsorized mean of category i
- \(m_i\), the number of scores in the trimmed data set from category i
- \(y_{i,j}\), the j-th score in category i, after the scores are sorted from low to high
- \(p_t\), the proportion of trimming on each side, which we can choose ourselves (the example below uses 10%)
- \(T\left(\dots\right)\), the cumulative distribution function of the Student t-distribution.
Interpreting the Result
The assumption about the population for this test (the null hypothesis) is that the (trimmed) means are equal in the two populations from which the samples are drawn.
The test provides a p-value, which is the probability of obtaining a test statistic as extreme as, or more extreme than, the one from the sample, if the assumption about the population is true. If this p-value (significance) is below a pre-defined threshold (the significance level \(\alpha\)), the assumption about the population is rejected. We then speak of a (statistically) significant result. The threshold is usually set at 0.05; anything below it is considered low.
If the assumption is rejected, we conclude that the means in the populations are different.
Note that if we do not reject the assumption, it does not mean we accept it, we simply state that there is insufficient evidence to reject it.
Writing the results
Writing up the results of the test uses the format (APA, 2019, p. 182):
t(<degrees of freedom>) = <t-value>, p = <p-value>
So for example:
The trimmed mean grade of the national students was 59.64 (\(n_{n}\) = 30), while for the international students it was 53.73 (\(n_{i}\) = 11). Using a 10% trimmed Yuen t-test, there was no significant difference, t(14.32) = 0.677, p = .509.
The p-value is shown with three decimal places and no 0 before the decimal point. If the p-value is below .0005, it can be reported as p < .001. The name of the test is explicitly mentioned, since it is a somewhat obscure test.
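As a small illustration of these reporting rules, a possible helper (my own sketch):

```python
# Possible helper for the p-value formatting described above (own sketch).
def apa_p(p):
    if p < 0.0005:
        return "p < .001"
    # three decimal places, no 0 before the decimal point
    return "p = " + f"{p:.3f}".lstrip("0")

print(apa_p(0.509))   # p = .509
print(apa_p(0.0002))  # p < .001
```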
APA (2019, p. 88) also recommends reporting an effect size measure.
Next...
After this test you might want an effect size measure. Various options are available for this: Common Language, Cohen d_s, Cohen U, Hedges g, Glass delta, biserial correlation, and point-biserial correlation.
Alternatives
For two independent samples, the following tests could be considered:
test | equal variance assumption | normality assumption |
---|---|---|
Student | yes | yes |
Welch | no | yes |
Trimmed | yes | no |
Yuen-Welch | no | no |