Statistical Testing

Let’s look back at the data from the sorting experiment presented in the Numerical Data section:

Run	Bubble	Quick	Selection	Insertion	Merge
1	17384	24	3258	3	30
2	17559	21	3386	3	27
3	17795	19	3344	4	28
4	17484	20	3417	3	28
5	17642	19	3358	3	30
Average	17572.8	20.6	3352.6	3.2	28.6

In this example, it is quite clear that bubble sort is much slower than quicksort, but is quicksort faster than merge sort? This is less clear. The average values suggest that quicksort is faster, but the difference is quite small.
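To see why the averages alone do not settle the question, it can help to look at how much the individual runs vary. The following is a minimal sketch using only Python's standard library. The variable names `times` and `summary` are chosen here for illustration, and the values are copied from the table above.

```python
import statistics

# Execution times copied from the table above (units as in the source).
times = {
    "Quick": [24, 21, 19, 20, 19],
    "Merge": [30, 27, 28, 28, 30],
}

# Mean and sample standard deviation for each algorithm.
summary = {
    name: (statistics.mean(runs), statistics.stdev(runs))
    for name, runs in times.items()
}

for name, (mean, spread) in summary.items():
    print(f"{name}: mean = {mean:.1f}, standard deviation = {spread:.2f}")
```

The means match the Average row of the table. The standard deviations show the run-to-run variation that a statistical test takes into account.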

Statistical Test

To judge whether the difference is real, we must perform a statistical test. When comparing two average values, the most common test is a T-test. If we do a T-test between the execution times of quicksort and merge sort we get a P-value of 0.001. If the P-value is lower than or equal to 0.05, the difference is statistically significant. Since this is true in our example, we can conclude that quicksort is faster than merge sort.
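The calculation can be sketched in a few lines of Python. The text does not say which variant of the T-test was used, so this sketch assumes a paired T-test, that is, that each run used the same input for both algorithms. The helper names `t_pdf`, `two_sided_p` and `paired_t_test` are made up for this illustration, and the P-value is obtained by numerically integrating the t-distribution rather than by calling a statistics package.

```python
import math
import statistics

# Execution times from the table above (five runs each).
quick = [24, 21, 19, 20, 19]
merge = [30, 27, 28, 28, 30]


def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=20000):
    """Two-sided P-value via Simpson's rule on the density from 0 to |t|."""
    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return max(0.0, 1 - 2 * area)


def paired_t_test(a, b):
    """Assumption: a[i] and b[i] come from the same run, so they are paired."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    t = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))
    return t, two_sided_p(t, n - 1)


t_stat, p_value = paired_t_test(quick, merge)
print(f"t = {t_stat:.2f}, P = {p_value:.4f}")
```

Under the pairing assumption this gives a P-value of roughly 0.001, in line with the value quoted above. An unpaired T-test on the same numbers would give a different (here even smaller) P-value, so it is worth stating in your report which variant you used.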

If we want to compare three or more average values we must use another test called ANOVA. Note that both the T-test and ANOVA require that your data is normally distributed (approximately follows the normal distribution). If it is not normally distributed, a Wilcoxon test must be used for comparing two average values and the Kruskal-Wallis or Friedman test for three or more average values. An example of data that is typically not normally distributed is the Likert and rating scales used in questionnaires. Ask your supervisor if you are unsure which test to use for your data.
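The selection rule in the paragraph above can be written down as a small lookup. This is only a sketch of that rule: the function name `choose_test` and its parameters are hypothetical, and it does not check normality for you.

```python
def choose_test(num_groups, normally_distributed):
    """Return the name of the test suggested by the rule of thumb above.

    num_groups: how many average values are being compared.
    normally_distributed: True if the data approximately follows
        the normal distribution (this must be judged separately).
    """
    if num_groups < 2:
        raise ValueError("need at least two groups to compare")
    if num_groups == 2:
        return "T-test" if normally_distributed else "Wilcoxon test"
    return "ANOVA" if normally_distributed else "Kruskal-Wallis or Friedman test"


# Example: five sorting algorithms with normally distributed timings.
print(choose_test(5, True))
```

For questionnaire data on a Likert scale with two groups, `choose_test(2, False)` would point to a Wilcoxon test.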

All tests will output a P-value. If the P-value, as said before, is lower than or equal to 0.05, the difference is statistically significant and we can say that there is a difference between the average values. If you have three or more averages, a follow-up (post-hoc) analysis, which many statistics tools run together with the test, will also tell you which pairwise differences are statistically significant and which are not. If you, for example, have average execution times of three algorithms A, B and C, the difference between A and B can be statistically significant but not the difference between B and C.

If the P-value is above 0.05, the difference is so small that we cannot rule out that it might be caused by chance. In this case, we say that we have not found a statistically significant difference between the average values, which is not the same as showing that they are equal.

How to conduct a test?

EZ Statistics is a web service that supports most of the common statistical tests and is easy to use.

The T-test can also be performed in Excel. You can read a guide about it here.

Degree Projects in Computer Science
