Data Structures and Algorithms: Experimental Design

Data Structures and Algorithms

Experimental Design

Designing Experiments

Designing a good experiment is not a trivial task: it needs some thought and care if the results of your work are going to lead to valid conclusions. While there are many variations of good strategies, the following basic steps will generally lead to a good procedure:

Form a hypothesis which your experiment(s) will test,
Design some experiments to test this hypothesis,
Run the experiments and collect the data,
Analyze your data:
- Transform your raw experimental data into a form which permits readily verifying or disproving the hypothesis,
- Estimate the experimental error in the original data and transformed data,
- Compare the transformed data with expectations produced by the hypothesis.
If the experimental data and expected results are in accord (within the experiment's error limits), then you have a basis for claiming the original hypothesis to be true.
If there is a discrepancy, you have a number of strategies:
- Analyse the experiments for sources of error,
- Repeat the experiments to confirm the results (not usually very profitable for computer based experiments!)
- Re-design the experiments or
- Form a new hypothesis.

Of course, if you are doing original research (that is, you are testing some new hypothesis for the first time), then you now need to form alternative hypotheses and see whether the experimental data would match them also and repeat the experiments (or formulate new ones) to ensure that you can reproduce the original results. This cycle is potentially never-ending!

However, for the experiments you will need to do in this course, the results are well-known and one 'pass' through this process should suffice!

Rules

There are a few very important rules:

You must report all experimental results, unless they are clearly wrong (eg your program had an error).
You may not reject any measurement unless you can provide an iron-clad argument to justify its rejection.

An example of an acceptable reason for rejecting or ignoring a result is:

The computer's timer has a minimum resolution of 10ms, so all runs with times less than 1s were ignored because they could have had errors >1%.

Of course, a really careful experimenter would not discount these results either, but confirm that they also matched the hypothesis - making allowance for the known error. However, in the interests of efficiency, this level of precision will not be expected here!

Reporting results

It is usual to transform your results to some form which makes it easy to verify that they conform to the hypothesis. The most common strategy is linearisation - plotting results on a graph with axes designed to produce a straight line. Another good strategy is normalisation - reduction of all results to a constant value. This normalisation strategy is a good one to employ here: assume that you are verifying the time complexity of quicksort. This is usually a O(nlogn) algorithm.

Your hypothesis is that quicksort takes n logn time, or more formally: T(n) <= c n log n where T(n) is the time to sort n items and c is a constant.
So time a series of sorts of increasingly larger collections of items.
Divide the times by nlogn. If you have a set of constants, c +/- some suitably small variation, then the running time may be expressed as: T(n) = c n log n verifying your hypothesis.

Note that you will need to ensure the values of n chosen for the experiment span a suitably wide range - so that cn logn, the expected running time also spans a reasonable range. For the sort experiment, this will be satisfied if the largest value of n is 10 times the smallest one - giving times differing by a factor of ~30 ( 10log₂10 ).

However, if you were timing a search algorithm with O(logn) time complexity, then values of n ranging over one order of magnitude will only produce times varying by a factor of just over 3 (log₂n). This will generally be insufficient to provide a convincing verification (over such a small range, other functions of n will also fit quite well!), so you would need to try to have values of n ranging over, say, three orders of magnitude, to give times ranging over one order of magnitude (log₂1000 ~ 10).

Lecture slides

You will find some additional points and examples in the lecture slides.

Back to Course Management Back to the Table of Contents
© , 1998