Trimmed Mean t-Test for Independent Samples
Explanation
This test can be used to determine whether the (trimmed) means of two populations differ. For example, to test whether the average score of national students differs from that of international students.
As the name implies, this test uses a trimmed mean rather than the regular mean used by the Student t-test. A trimmed mean simply removes some of the highest and lowest scores. The test also uses a Winsorized variance, which replaces (rather than removes) some of the highest and lowest scores.
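As a quick illustration of the difference (with made-up scores), the sketch below uses SciPy's trim_mean and winsorize functions with 10% trimming on each side:

```python
# Quick illustration with made-up scores: trimming removes the extremes,
# Winsorizing replaces them with the nearest remaining score.
import numpy as np
from scipy import stats
from scipy.stats.mstats import winsorize

scores = np.array([2, 55, 56, 58, 60, 61, 63, 64, 65, 98])

print(np.mean(scores))                                # 58.2, influenced by the outliers 2 and 98
print(stats.trim_mean(scores, proportiontocut=0.10))  # 60.25, the 10% trimmed mean (2 and 98 removed)

winsorized = winsorize(scores, limits=(0.10, 0.10))
print(np.asarray(winsorized))                         # [55 55 56 58 60 61 63 64 65 65]
print(np.var(np.asarray(winsorized), ddof=1))         # variance of the Winsorized scores
```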
Two variations of this test exist. The first is more in line with the Student t-test and uses the same degrees of freedom, together with a pooled standard error, as proposed by Yuen and Dixon (1973, p. 374). The other is more in line with the Welch(-Satterthwaite) test, as proposed by Yuen (1974, p. 167), and is sometimes referred to as the Yuen(-Welch) t-test.
The advantage of trimming and using a Winsorized variance is that the test is more suitable when the data come from a non-normal distribution, while normality is often considered a criterion for using the Student t-test.
Although, strictly speaking, the test is about trimmed means (Fradette et al., 2003, p. 483), it is often interpreted as if it were about the regular arithmetic mean.
Performing the Test
with Excel
Excel file: TS - Trimmed Means (ind samples) (E).xlsm
with stikpetE
To Be Made
without stikpetE
To Be Made
with Python
Jupyter Notebook: TS - Trimmed Means (ind samples) (P).ipynb
with stikpetP
To Be Made
without stikpetP
To Be Made
with R
Jupyter Notebook: TS - Trimmed Means (ind samples) (R).ipynb
with stikpetR
To Be Made
without stikpetR
To Be Made
Formulas
The formula is (Yuen & Dixon, 1973, p. 374):
\(t = \frac{\bar{x}_{t,1} - \bar{x}_{t,2}}{SE}\)
\(sig = 2\times\left(1 - T\left(\left|t\right|, df\right)\right)\)
With:
\(SE = \sqrt{\frac{SSD_{w,1} + SSD_{w,2}}{m_1 + m_2 - 2}\times\left(\frac{1}{m_1} + \frac{1}{m_2}\right)}\)
\(df = m_1 + m_2 - 2\)
\(\bar{x}_{t,i} = \frac{\sum_{j=g_i+1}^{n_i - g_i}y_{i,j}}{m_i}\)
\(g_i = \lfloor n_i\times p_t\rfloor\)
\(m_i = n_i - 2\times g_i\)
\(SSD_{w,i} = g_i\times\left(y_{i,g_i+1} - \bar{x}_{w,i}\right)^2 + g_i\times\left(y_{i,n_i-g_i} - \bar{x}_{w,i}\right)^2 + \sum_{j=g_i+1}^{n_i - g_i} \left(y_{i,j} - \bar{x}_{w,i}\right)^2\)
\(\bar{x}_{w,i} = \frac{\bar{x}_{t,i}\times m_i + g_i\times\left(y_{i, g_i+1} + y_{i, n_i-g_i}\right)}{n_i}\)
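A minimal Python sketch of this version (the function names are my own, not from a particular package) could look as follows:

```python
# Sketch of the Yuen-Dixon (1973) variant, following the formulas above.
import math
from scipy.stats import t as t_dist

def winsorized_parts(scores, p_t):
    """Return the trimmed mean, the Winsorized sum of squared deviations, and m."""
    y = sorted(scores)
    n = len(y)
    g = math.floor(n * p_t)                          # g_i
    m = n - 2 * g                                    # m_i
    trimmed = y[g:n - g]
    x_t = sum(trimmed) / m                           # trimmed mean
    x_w = (x_t * m + g * (y[g] + y[n - g - 1])) / n  # Winsorized mean
    ssd_w = (g * (y[g] - x_w) ** 2                   # sum of squared deviations
             + g * (y[n - g - 1] - x_w) ** 2         # from the Winsorized mean
             + sum((v - x_w) ** 2 for v in trimmed))
    return x_t, ssd_w, m

def trimmed_t_pooled(scores1, scores2, p_t=0.10):
    x_t1, ssd1, m1 = winsorized_parts(scores1, p_t)
    x_t2, ssd2, m2 = winsorized_parts(scores2, p_t)
    se = math.sqrt((ssd1 + ssd2) / (m1 + m2 - 2) * (1 / m1 + 1 / m2))
    df = m1 + m2 - 2
    t_val = (x_t1 - x_t2) / se
    sig = 2 * (1 - t_dist.cdf(abs(t_val), df))
    return t_val, df, sig
```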
Yuen (1974, p. 167) suggested the same, except for the standard error and degrees of freedom:
\(SE = \sqrt{\frac{s_{w,1}^2}{m_1} + \frac{s_{w,2}^2}{m_2}}\)
\(s_{w,i}^2 = \frac{SSD_{w,i}}{m_i - 1}\)
\(df = \frac{1}{\frac{c^2}{m_1 - 1} + \frac{\left(1 - c\right)^2}{m_2 -1}}\)
\(c = \frac{\frac{s_{w,1}^2}{m_1}}{\frac{s_{w,1}^2}{m_1} + \frac{s_{w,2}^2}{m_2}}\)
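A similar sketch for this version, re-using the winsorized_parts helper from the previous block:

```python
# Sketch of the Yuen (1974) variant: same trimmed means, but a Welch-style
# standard error and degrees of freedom. Assumes winsorized_parts() from the
# previous sketch is in scope.
import math
from scipy.stats import t as t_dist

def trimmed_t_welch(scores1, scores2, p_t=0.10):
    x_t1, ssd1, m1 = winsorized_parts(scores1, p_t)
    x_t2, ssd2, m2 = winsorized_parts(scores2, p_t)
    s2_w1 = ssd1 / (m1 - 1)                          # Winsorized variances
    s2_w2 = ssd2 / (m2 - 1)
    se = math.sqrt(s2_w1 / m1 + s2_w2 / m2)
    c = (s2_w1 / m1) / (s2_w1 / m1 + s2_w2 / m2)
    df = 1 / (c ** 2 / (m1 - 1) + (1 - c) ** 2 / (m2 - 1))
    t_val = (x_t1 - x_t2) / se
    sig = 2 * (1 - t_dist.cdf(abs(t_val), df))
    return t_val, df, sig
```

Recent versions of SciPy also offer a trim argument in scipy.stats.ttest_ind, which performs Yuen's trimmed t-test and could serve as a cross-check.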
Symbols used:
- \(\bar{x}_{t,i}\), the trimmed mean of the scores in category i
- \(\bar{x}_{w,i}\), the Winsorized mean of the scores in category i
- \(SSD_{w,i}\), the sum of squared deviations from the Winsorized mean of category i
- \(m_i\), the number of scores in the trimmed data set from category i
- \(y_{i,j}\), the j-th score in category i, after the scores are sorted from low to high
- \(p_t\), the proportion of trimming on each side, which we can choose ourselves (the example below uses 10%)
- \(T\left(\dots\right)\), the cumulative distribution function of the Student t-distribution.
Interpreting the Result
The assumption about the population for this test (the null hypothesis) is that the (trimmed) means are equal in the two populations from which the samples are drawn.
The test provides a p-value, which is the probability of obtaining a test statistic as extreme as, or more extreme than, the one from the sample, if the assumption about the population is true. If this p-value (significance) is below a pre-defined threshold (the significance level \(\alpha\)), the assumption about the population is rejected. We then speak of a (statistically) significant result. The threshold is usually set at 0.05; anything below it is considered low.
If the assumption is rejected, we conclude that the means in the populations are different.
Note that if we do not reject the assumption, it does not mean we accept it, we simply state that there is insufficient evidence to reject it.
Writing the results
Writing up the results of the test uses the format (APA, 2019, p. 182):
t(<degrees of freedom>) = <t-value>, p = <p-value>
So for example:
The trimmed mean grade of the national students was 59.64 (\(n_{n}\) = 30), while for the international students it was 53.73 (\(n_{i}\) = 11). Using a 10% trimmed Yuen t-test, there was no significant difference, t(14.32) = 0.677, p = .509.
The p-value is shown with three decimal places and no 0 before the decimal point. If the p-value is below .0005, it can be reported as p < .001. The name of the test is explicitly mentioned, since it is a somewhat obscure test.
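As a small illustration of these reporting rules, a possible helper (my own sketch):

```python
# Possible helper for the p-value formatting described above (own sketch).
def apa_p(p):
    if p < 0.0005:
        return "p < .001"
    # three decimal places, no 0 before the decimal point
    return "p = " + f"{p:.3f}".lstrip("0")

print(apa_p(0.509))   # p = .509
print(apa_p(0.0002))  # p < .001
```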
APA (2019, p. 88) also recommends reporting an effect size measure.
Next...
After this test you might want an effect size measure. Various options are available for this: Common Language, Cohen d_s, Cohen U, Hedges g, Glass delta, biserial correlation, and point-biserial correlation.
Alternatives
For two independent samples, the following tests could be considered:
test | equal variance assumption | normality assumption |
---|---|---|
Student | yes | yes |
Welch | no | yes |
Trimmed | yes | no |
Yuen-Welch | no | no |