Progress Log

Weekly journal entries.

March 12
First meeting with company representatives. Received more details about the project, including specifications and requirements.

March 21
First meeting with supervisor. Went over the details of the project and discussed its requirements. Developed an initial project plan.

March 28
Project plan revised and finalized.

April 17
Researched two possible data mining prediction models for this project: clustering and classification.

April 24
Prepared the project requirements document and continued work on the choice of prediction model.

May 5
Prepared for the introductory seminar and finalized cluster analysis as the prediction model.

May 8
Went over the introductory seminar with supervisor, who advised on improvements.

May 9
INTRODUCTORY SEMINAR
Requirements sent to Greg.

May 30
Finalized requirements with Greg and analysed the data entry requirements of the application. Began research on the K-Means clustering algorithm and started implementing it.

==========
Exam break
==========

July 3
Asked Greg to provide sample data and, in the meantime, revised the clustering algorithm using made-up data. Began research on cluster evaluation techniques.

July 14
Reviewed implementation with supervisor. Revised the algorithm and analysed the made-up data for relevance.

July 29
Sample data received and code adapted to suit the new information. Began implementing the cluster evaluation method, the silhouette measure.

July 30
Finalised the silhouette measure and changed the data to be imported instead of entered manually. Prepared for mid-year presentation.
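For reference, the silhouette measure mentioned above can be sketched as follows. This is only an illustrative sketch, not the project's code: the function name, the one-dimensional durations, and the sample clusters are all made up, and it assumes at least two non-empty clusters. Each point scores (b - a) / max(a, b), where a is its mean distance to the other points in its own cluster and b is the smallest mean distance to any other cluster.

```python
def silhouette_score_1d(clusters):
    """Mean silhouette value for 1-D clusters (hypothetical helper).

    Assumes `clusters` is a list of lists of numbers with at least two
    non-empty clusters. Points in singleton clusters score 0.
    """
    scores = []
    for idx, cluster in enumerate(clusters):
        for i, x in enumerate(cluster):
            if len(cluster) == 1:
                scores.append(0.0)
                continue
            # a: mean distance to the other points in the same cluster
            a = sum(abs(x - y) for j, y in enumerate(cluster) if j != i) / (len(cluster) - 1)
            # b: smallest mean distance to the points of any other cluster
            b = min(
                sum(abs(x - y) for y in other) / len(other)
                for k, other in enumerate(clusters)
                if k != idx and other
            )
            denom = max(a, b)
            scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Made-up durations: a well-separated grouping and a poor one
good = silhouette_score_1d([[1, 2, 3], [10, 11, 12]])
bad = silhouette_score_1d([[1, 2, 10], [3, 11, 12]])
print(round(good, 3), round(bad, 3))
```

Values close to 1 suggest compact, well-separated clusters; values near 0 or below suggest points assigned to the wrong cluster.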

August 1
Reviewed the revised mid-year presentation with supervisor.

August 3
Made final edits to the mid-year presentation slides and tried to improve the silhouette value of the clusters.

August 6
Decided to normalize the durations in the dataset to reduce the effect of possible outliers. Practised for presentation.

August 7
MID-YEAR SEMINAR

August 8
Began writing mid-year report.

August 12
Project web page created and mid-year report uploaded.

August 14
Sent Greg presentation slides and mid-year report.

August 28
Began validating the clusters and the output stop duration, and continued accuracy and performance tweaking.

September 11
Performed more data cleaning, including graphing the frequency of durations to identify outliers in the dataset.
Began implementation of the final prediction evaluation method: 10x10-fold cross-validation.

September 18
Continued with final prediction evaluation implementation with more data cleaning.
Decided to graph the SSE values against the number of clusters to find the optimal number for K.
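As an illustration of the SSE-against-K idea, a minimal sketch is given below. It is not the project's implementation: the one-dimensional K-Means helper, its deterministic initialisation, and the sample durations are all assumptions made for the example. The SSE (sum of squared distances from each point to its cluster centroid) is computed for several values of K; the point where the drop in SSE levels off is usually taken as a reasonable K.

```python
def kmeans_1d(values, k, iterations=100):
    """Very small 1-D K-Means (hypothetical helper, deterministic init)."""
    data = sorted(values)
    # Assumed initialisation: k evenly spaced points from the sorted data
    centroids = [data[(2 * i + 1) * len(data) // (2 * k)] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid
        clusters = [[] for _ in range(k)]
        for x in data:
            nearest = min(range(k), key=lambda j: abs(x - centroids[j]))
            clusters[nearest].append(x)
        # Update step: move each centroid to the mean of its cluster
        new_centroids = [
            sum(c) / len(c) if c else centroids[j] for j, c in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters


def sse(centroids, clusters):
    """Sum of squared distances from each point to its cluster centroid."""
    return sum((x - c) ** 2 for c, cluster in zip(centroids, clusters) for x in cluster)


# Made-up durations forming three obvious groups
durations = [1, 2, 3, 10, 11, 12, 20, 21, 22]
sse_by_k = {k: sse(*kmeans_1d(durations, k)) for k in range(1, 5)}
for k, value in sse_by_k.items():
    print(k, value)
```

With this sample data the SSE falls sharply up to K = 3 and changes little afterwards, which is the "elbow" such a graph is meant to reveal.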

September 25
Finished first implementation of the final prediction evaluation method.

October 3
Found errors in the final prediction evaluation method and began fixing them.
Began planning of the final user interface of the application.

October 20
Finalized the final prediction evaluation method and tested it for any remaining errors.
Began working on final presentation slides.

October 22
FINAL PRESENTATION

October 23
Began writing the final report.

October 28
Completed report and uploaded to website.

END OF PROJECT
