How to check t-test results/test for significance in Quantum/Quanvert/Dimensions (i.e. mrStudio, mrTables, Desktop Reporter)
You cannot tell if two percentages/means will be significantly different or not just by looking at the percentages/means themselves as these are only one factor in the equation.
Sometimes you may find that Quantum/Quanvert/mrStudio/mrTables/Desktop Reporter does not indicate significance when you think it should, or vice versa. For example, you may find that a 1% difference between two percentages is flagged as significant while a much larger difference is not. How can we check whether these findings are correct?
When carrying out statistical tests to see if there is a significant difference between two percentages or means, the difference between the percentages or means is only one of the factors involved in determining whether a difference is significant or not. Other factors include:
1) The size of the samples being compared. A difference is more likely to be significant when the sample sizes are larger. This is because the divisor in the formula for the T value (TVAL) includes the terms 1/e1 and 1/e2, where ei is the effective base for column i. The larger the value of ei, the smaller the value of 1/ei, so the smaller the divisor and the larger the value of T.
2) Weighting. Weighting affects the makeup of the sample and so may introduce significant differences where there are none. For this reason these products use a base called the effective base, which reduces the likelihood of the statistics producing spuriously significant results.
3) Overlapping data. Most statistical tests are based on comparing two independent samples, that is, no respondent is common to both samples. Due to the nature of market research questions, there may be occasions where this is not the case. For example, the two samples being compared are two products A and B, and one or more respondents have given both A and B as an answer. The overlap option makes an adjustment to the formulae used to ensure that respondents who are common to both columns are only counted once.
4) Level of significance. The statistic computed from a difference between two percentages or means is more likely to be flagged as significant when the test is run at a less strict significance level. That is, a statistic is more likely to be significant at the 10% significance (90% confidence) level than at the 5% significance (95% confidence) level.
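The effect of factors 1 and 2 can be illustrated with a short Python sketch. It assumes the commonly used effective-base formula (the squared sum of the weights divided by the sum of the squared weights) and a textbook pooled two-proportion statistic. The function names are hypothetical and the exact formulae used by Quantum or Dimensions may differ in detail, so treat this as an illustration rather than the products' implementation.

```python
import math


def effective_base(weights):
    """Assumed formula: (sum of weights)^2 / (sum of squared weights)."""
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    return total * total / total_sq


def pooled_t(p1, p2, e1, e2):
    """Textbook pooled two-proportion statistic (an assumption, not the vendor formula).

    e1 and e2 play the role of the effective bases; the divisor contains
    1/e1 + 1/e2, so larger bases give a smaller divisor and a larger T.
    """
    pooled = (p1 * e1 + p2 * e2) / (e1 + e2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / e1 + 1 / e2))
    return abs(p1 - p2) / se


# The same 2-point difference tested on small and on large bases.
small = pooled_t(0.10, 0.08, 200, 200)
large = pooled_t(0.10, 0.08, 5000, 5000)

# Unequal weights pull the effective base below the raw count of 4 respondents.
equal = effective_base([1, 1, 1, 1])
unequal = effective_base([0.5, 0.5, 1.5, 1.5])

print(f"T on bases of 200:  {small:.3f}")
print(f"T on bases of 5000: {large:.3f}")
print(f"effective base, equal weights:   {equal}")
print(f"effective base, unequal weights: {unequal}")
```

Under these assumptions the identical difference falls below the 10% critical value of 1.64 on the small bases and well above it on the large ones, and the weighted sample behaves like a smaller one.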
Because of these factors, you cannot decide whether the difference between two percentages or means is significant just by examining the table results. To determine whether a difference is significant, we need to use the figures written to the tstat.dmp file by the tstat option debug (Quantum/Quanvert) or to the diagnostics file from Dimensions. Please refer to the following DDL topics to produce this information for the Dimensions products.
mrtables.chm::/Tabug_view_diagnostics_howto.htm (mrTables/Web Reports for Surveys)
Reporter.chm::/Repug_tasks_stats_details.htm (Desktop Reporter/Reports for Surveys)
Let's work through an example using the table mode x class from the example Quanvert database, skidemo. The test that has been applied is a column proportions test, run at the 90% confidence level (10% significance level).
Table of mode analysed by class with tstat prop test run at 90% confidence level
                            Base      AB      C1      C2      DE
                                     (A)     (B)     (C)     (D)
Mode of transfer from Airport to Resort
Base                        3142     648     713     861     920
Helicopter                   273      59      59      89      65
                            8.7%    9.2%    8.3%  10.4%D    7.1%
Train                        939     198     218     234     288
                           29.9%   30.5%   30.6%   27.2%  31.3%C
Road (Car/Coach etc)        1930     391     436     537     566
                           61.4%   60.4%   61.1%   62.4%   61.6%
SPSS MR - for demonstration purposes only
Proportions: Columns Tested (10% risk level) - A/B/C/D
Excerpt of tstat.dmp file produced for Helicopter row in table above
TEST  P1        P2        CORR      S.E.      D.O.F.    RISK  TVAL      PVAL
A B   0.091502  0.082506  0.000000  0.015368  1345.862  0.10  0.585401  0.558377
A C   0.091502  0.103935  0.000000  0.015658  1477.012  0.10  0.794032  0.427305
A D   0.091502  0.071082  0.000000  0.013988  1541.659  0.10  1.459796  0.144550
B C   0.082506  0.103935  0.000000  0.014915  1545.774  0.10  1.436775  0.150984
B D   0.082506  0.071082  0.000000  0.013311  1610.421  0.10  0.858201  0.390909
C D   0.103935  0.071082  0.000000  0.013514  1741.571  0.10  2.431092  0.015154
Looking at the Helicopter row, we can see the letter D printed in the column for social class C2 (column C). This indicates that the proportion in column C (10.4%) is significantly different from that in column D (7.1%) at the 10% significance level.
If we look at columns A and D in the same row, we can see no letters, so at the 10% significance level, no significant difference was found.
In order to determine how Quantum arrived at these findings we need to look at the values in the tstat.dmp file.
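Before reading off the decision, it can help to confirm that the dump figures hang together. The Python sketch below parses the six rows of the excerpt and recomputes |P1 - P2| / S.E. for each pair. That TVAL matches this ratio is an observation about this particular excerpt (CORR is zero in every row), not a statement of the full formula the software uses; the function and variable names are hypothetical.

```python
# Rows copied from the tstat.dmp excerpt for the Helicopter row.
DUMP = """\
A B 0.091502 0.082506 0.000000 0.015368 1345.862 0.10 0.585401 0.558377
A C 0.091502 0.103935 0.000000 0.015658 1477.012 0.10 0.794032 0.427305
A D 0.091502 0.071082 0.000000 0.013988 1541.659 0.10 1.459796 0.144550
B C 0.082506 0.103935 0.000000 0.014915 1545.774 0.10 1.436775 0.150984
B D 0.082506 0.071082 0.000000 0.013311 1610.421 0.10 0.858201 0.390909
C D 0.103935 0.071082 0.000000 0.013514 1741.571 0.10 2.431092 0.015154
"""

FIELDS = ("p1", "p2", "corr", "se", "dof", "risk", "tval", "pval")


def parse_rows(text):
    """Turn each dump line into a dict keyed by the column headings."""
    rows = []
    for line in text.strip().splitlines():
        col1, col2, *numbers = line.split()
        row = dict(zip(FIELDS, map(float, numbers)))
        row["pair"] = (col1, col2)
        rows.append(row)
    return rows


def recomputed_t(row):
    """Absolute difference in proportions divided by the printed standard error."""
    return abs(row["p1"] - row["p2"]) / row["se"]


rows = parse_rows(DUMP)
for row in rows:
    print(row["pair"], f"recomputed={recomputed_t(row):.4f}", f"printed={row['tval']:.4f}")

# Largest gap between the recomputed and printed values; small gaps are
# expected because the dump prints P1, P2 and S.E. to six decimal places.
max_gap = max(abs(recomputed_t(row) - row["tval"]) for row in rows)
print(f"largest gap: {max_gap:.6f}")
```

If a recomputed value were far from the printed TVAL, that would suggest a transcription error in the figures copied from the dump.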
Let's take our first set of columns, C and D, and see how Quantum deduced that there was a significant difference between the two proportions in these columns. We can do one of two things:
1) Look at the PVAL column and compare it with the value in the RISK column. If the PVAL is less than the RISK value, there is a significant difference.
2) Compare the TVAL with the critical value, listed in the table below, for the significance level at which the test was applied. If the TVAL is greater than the critical value, there is a significant difference.
Critical values for a two-tail test
Significance Level    Critical Value
10%                   1.64
5%                    1.96
1%                    2.58
Note: In reality the TVAL can be positive or negative, depending on which of the two proportions or means is the larger. However, the value in tstat.dmp is always shown as a positive value, so you only need to compare it with the positive critical value to detect significance.
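The critical value of 1.64 quoted in this article for the 10% level can be reproduced with Python's standard library. This is a minimal sketch that assumes the normal approximation is adequate, which is reasonable when the degrees of freedom are large (over 1,000 in the excerpt above); with small bases the t distribution gives somewhat larger critical values. The function name is hypothetical.

```python
from statistics import NormalDist


def two_tail_critical(significance):
    """Two-tail critical value under the standard normal approximation.

    Half of the significance level sits in each tail, so the cut-off is the
    quantile at 1 - significance / 2.
    """
    return NormalDist().inv_cdf(1 - significance / 2)


for level in (0.10, 0.05, 0.01):
    print(f"{level:.0%} significance: {two_tail_critical(level):.3f}")
```

Because the dump always prints TVAL as a positive number, only the positive cut-off returned here is needed.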
If we look in the row C D, we can see the RISK value is 0.10 and the PVAL is 0.015154. As the PVAL is less than the RISK value, then the proportions in column C and D are significantly different to each other and Quantum reflects this by printing the letter of the column containing the smaller proportion (D) in that containing the larger (C). Looking at the TVAL we can see this is 2.431092. Comparing this with the critical value at the 10% significance level of 1.64, TVAL is greater than this critical value and so C and D are significantly different to each other.
For the row A D, the RISK value is still 0.10 but the PVAL is 0.144550. As the PVAL is greater than the RISK value, then the proportions for columns A and D are not significantly different to each other. Looking at the TVAL we can see this is 1.459796. Comparing this with the critical value at the 10% significance level of 1.64, TVAL is smaller than this critical value and so A and D are not significantly different to each other.
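The two checks described above can be applied to every pair in the excerpt at once. The Python sketch below uses the figures from the tstat.dmp excerpt and the 10% critical value of 1.64 quoted in this article; the function and variable names are hypothetical, and the letter placement follows the rule stated above (the letter of the column with the smaller proportion is printed in the column with the larger one).

```python
CRITICAL_10 = 1.64  # critical value quoted in this article for the 10% level

# (column 1, column 2, P1, P2, RISK, TVAL, PVAL) from the tstat.dmp excerpt
PAIRS = [
    ("A", "B", 0.091502, 0.082506, 0.10, 0.585401, 0.558377),
    ("A", "C", 0.091502, 0.103935, 0.10, 0.794032, 0.427305),
    ("A", "D", 0.091502, 0.071082, 0.10, 1.459796, 0.144550),
    ("B", "C", 0.082506, 0.103935, 0.10, 1.436775, 0.150984),
    ("B", "D", 0.082506, 0.071082, 0.10, 0.858201, 0.390909),
    ("C", "D", 0.103935, 0.071082, 0.10, 2.431092, 0.015154),
]


def significant_by_pval(pair):
    """Check 1: PVAL below RISK means a significant difference."""
    _, _, _, _, risk, _, pval = pair
    return pval < risk


def significant_by_tval(pair):
    """Check 2: TVAL above the critical value means a significant difference."""
    return pair[5] > CRITICAL_10


def letter_placement(pair):
    """Return (column that receives the letter, letter printed in it)."""
    col1, col2, p1, p2 = pair[:4]
    return (col1, col2) if p1 > p2 else (col2, col1)


by_pval = [letter_placement(p) for p in PAIRS if significant_by_pval(p)]
by_tval = [letter_placement(p) for p in PAIRS if significant_by_tval(p)]

print("flags from PVAL check:", by_pval)
print("flags from TVAL check:", by_tval)
```

Both checks single out the C D pair only, with the letter D placed in column C, which agrees with the 10.4%D entry in the Helicopter row of the table.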
To help you calculate the T statistic for the more commonly applied T-tests (column proportions, column means and paired preference), please use this Excel spreadsheet.
*Note: Grid tables cannot be checked using the attached spreadsheet. You can still check the PVAL against the RISK value using the tstat.dmp file.
**Note: If P1-P2 or MEAN1-MEAN2 is equal to zero (0) in the tstat.dmp file, the statistical test will not be applied.
If you are conducting statistical tests on manipulated rows, please refer to resolution 63381 in our Knowledgebase.
To assist you, we have made two Excel spreadsheets that allow you to enter the information from the table output and diagnostics file to verify that the t value computed by the software is correct.
The file to use for Quantum/Quanvert is the one called Quantum_Diagnostics_Checker.xls, while that for Dimensions/Data Collection is TOM_Diagnostics_Checker.xls.
These spreadsheets only include the more popular tests, such as column proportions and column means. Please read the information on the first page of each file for more details of when you can use these files and how to work with them.
|Business Analytics||SPSS Data Collection Heritage||Quantum||5.8, 5.7|
|Business Analytics||SPSS Data Collection Heritage||Quanvert||1.7|