Cross/Contingency Table
A contingency table can be defined as “tables arising when observations on a number of categorical variables are cross-classified” (Everitt, 2004, p.89). An example is shown in table 1.
| | Female | Male | Total |
|---|---|---|---|
| Married | 456 (52%) | 516 (48%) | 972 (50%) |
| Widowed | 58 (7%) | 123 (12%) | 181 (9%) |
| Divorced | 142 (16%) | 172 (16%) | 314 (16%) |
| Separated | 29 (3%) | 50 (5%) | 79 (4%) |
| Never married | 188 (22%) | 207 (19%) | 395 (20%) |
| Total | 873 (100%) | 1068 (100%) | 1941 (100%) |
Creating a cross table with SPSS
There are two ways to create a cross table with SPSS: using Crosstabs, or using Custom Tables. Both use the data file Holiday Fair.sav.
From the table we can tell, for example, that 456 respondents indicated that they were female and married.
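Outside SPSS, the same cross-classification can be sketched in a few lines of Python. The snippet below is only an illustration, not the SPSS procedure: the function name `cross_tabulate` is made up for this example, and because the raw data file is not reproduced here, the individual observations are rebuilt from the cell counts in the table above.

```python
from collections import Counter

# Cell counts copied from the example table: (marital status, gender) -> frequency.
TABLE_COUNTS = {
    ("Married", "Female"): 456, ("Married", "Male"): 516,
    ("Widowed", "Female"): 58, ("Widowed", "Male"): 123,
    ("Divorced", "Female"): 142, ("Divorced", "Male"): 172,
    ("Separated", "Female"): 29, ("Separated", "Male"): 50,
    ("Never married", "Female"): 188, ("Never married", "Male"): 207,
}


def cross_tabulate(observations):
    """Count each (row value, column value) pair and the marginal totals.

    Hypothetical helper for illustration only.
    """
    cells = Counter(observations)
    row_totals = Counter()
    column_totals = Counter()
    for (row, column), count in cells.items():
        row_totals[row] += count
        column_totals[column] += count
    return cells, row_totals, column_totals


# Assumption: one (marital status, gender) pair per respondent,
# rebuilt from the published counts rather than read from the data file.
observations = [pair for pair, n in TABLE_COUNTS.items() for _ in range(n)]
cells, row_totals, column_totals = cross_tabulate(observations)

print(cells[("Married", "Female")])  # 456
print(row_totals["Married"], column_totals["Female"], sum(cells.values()))  # 972 873 1941
```

Each cell holds the number of respondents who share one combination of categories, and the row and column totals are the sums over those cells.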
There are quite a few variations on the name for this type of table. Perhaps the oldest name is contingency table, which is the name Pearson (1904, p. 34) gave to them. Another popular name is cross tabulation (Upton & Cook, 2002, p. 79), but cross classification table (Zekeck, 2014, p. 71) and bivariate frequency table (Porkess, 1988, p. 48) are also used. The name I use is cross table, which can for example be found in Newbold et al. (2013, p. 9) or Sá (2007, p. 52).
Which variable in the rows and which in the columns?
When constructing a cross table like the one above, the first decision is which variable goes in the columns and which in the rows. Some textbooks decide this based on which variable is the independent and which the dependent variable. As the names imply, an independent variable does not depend on something else, while the dependent variable does (Porkess, 1991, p. 64). In the example table it is unlikely that your gender depends on your marital status, but your marital status could depend on your gender. Gender is therefore the independent variable in the example, and marital status the dependent variable. Demographic variables (age, gender, city, etc.) are often independent variables.
One problem, however, is that some textbooks say to place the independent variable in the columns (De Vaus, 2002, p. 243; Wrenn et al., 2002, p. 213), while others say to place it in the rows (Acock, 2008, p. 110; Huizingh, 2007, p. 246). In my experience more textbooks use the independent-in-columns convention, but I haven’t done any research on this.
Another consideration is to avoid wasting space on the page or screen. If you have 10 values for one variable and only two for the other, it might save some space to place the variable with 10 values in the columns, provided it fits on the page. If one variable has so many values that it doesn’t fit on the page when placed in the columns, then placing it in the rows might solve the problem.
In short, my suggestion would be as follows. If one variable will not fit into the columns, place it in the rows. Otherwise, if you have a clear independent and a clear dependent variable, place the independent variable in the columns. In all other cases, do what you want.
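The suggestion above can be written down as a small decision rule. This is only a sketch: the function `choose_layout` and its `max_columns` parameter are hypothetical, and "fits into the columns" is simplified here to "has no more categories than `max_columns`".

```python
def choose_layout(categories, max_columns, independent=None):
    """Return (row_variable, column_variable) for a two-variable cross table.

    categories  -- dict mapping each of the two variable names to its number of categories
    max_columns -- assumed number of categories that still fits across the page
    independent -- name of the independent variable, if there is a clear one
    """
    first, second = categories  # exactly two variable names expected

    # Rule 1: a variable that does not fit into the columns goes in the rows.
    too_wide = [name for name, k in categories.items() if k > max_columns]
    if len(too_wide) == 1:
        rows = too_wide[0]
        columns = second if rows == first else first
        return rows, columns

    # Rule 2: otherwise a clear independent variable goes in the columns.
    if independent in categories:
        columns = independent
        rows = second if columns == first else first
        return rows, columns

    # Rule 3: no constraint applies, so keep the order that was given.
    return first, second


sizes = {"marital status": 5, "gender": 2}
print(choose_layout(sizes, max_columns=8, independent="gender"))
# ('marital status', 'gender')
```

If both variables are too wide for the columns, the space rule cannot decide, so the sketch falls through to the independent-variable rule.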
Calculating Percentages
There are three ways to calculate percentages.
- Column totals
  Uses the column totals as 100%. For example, the married-female cell would be 456/873 × 100 = 52%. This indicates that 52% of those who indicated that they were female were married.
- Row totals
  Uses the row totals as 100%. For example, the married-female cell would be 456/972 × 100 = 47%, indicating that 47% of those who were married were female.
- Grand totals
  Uses the grand total as 100%. For example, the married-female cell would be 456/1941 × 100 = 23%, indicating that 23% of the respondents were married and female.
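The three calculations can be checked with a short Python sketch. The function name `cell_percentages` is made up for this illustration. The counts are the married-female cell and its totals from the example table.

```python
def cell_percentages(count, row_total, column_total, grand_total):
    """Express one cell count as a percentage of each of the three possible totals."""
    return {
        "column": 100 * count / column_total,
        "row": 100 * count / row_total,
        "grand": 100 * count / grand_total,
    }


# Married-female cell: 456 respondents; 972 married, 873 female, 1941 in total.
married_female = cell_percentages(count=456, row_total=972, column_total=873, grand_total=1941)

for base, value in married_female.items():
    print(f"{base}: {value:.0f}%")
# column: 52%
# row: 47%
# grand: 23%
```

The same count gives three different percentages, so a table should always make clear which total was used as 100%.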
Which one to use depends on the situation, but my suggestion would be that if you have an independent and a dependent variable, you use the totals of the independent variable. For example, if the independent variable is in the columns, use the column totals. This way, you can easily compare the categories of the independent variable with each other. See the previous section for the terms independent and dependent variable.