Progress Log
Weekly journal entries.
March 12
First meeting with company representatives.
Received more details about the project,
including specifications and requirements.
March 21
First meeting with supervisor.
Went over the details of the project and
discussed its requirements. Developed an initial project plan.
March 28
Project plan revised and finalized.
April 17
Researched two possible data mining prediction models
for use in this project: clustering and classification.
April 24
Prepared the project requirements document and further developed
the choice of prediction model.
May 5
Prepared for the introductory seminar and finalized
cluster analysis as the prediction model.
May 8
Went over the introductory seminar with supervisor, who advised on improvements.
May 9
INTRODUCTORY SEMINAR
Requirements sent to Greg.
May 30
Finalized requirements with Greg and analysed the data entry requirements of
the application. Began research on the K-Means clustering algorithm and began
implementation.
==========
Exam break
==========
July 3
Asked Greg to provide sample data and, in the meantime, revised the clustering algorithm
using made-up data. Began research on cluster evaluation techniques.
July 14
Reviewed the implementation with supervisor. Revised the algorithm and analysed
the made-up data for relevance.
July 29
Received sample data and adapted the code to suit the new information. Began
implementation of the cluster evaluation method, the silhouette measure.
July 30
Finalised the silhouette measure and set up the data to be imported instead
of entered manually. Prepared for the mid-year presentation.
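
For reference, the silhouette measure mentioned above can be sketched as follows. This is a minimal illustration only, not the project's code: the function name `silhouette_score`, the use of Euclidean distance, and the convention of scoring single-point clusters as 0 are all assumptions. It expects at least two clusters.

```python
import math

def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (hypothetical helper).

    points: list of equal-length tuples; labels: cluster label per point.
    Assumes Euclidean distance and at least two distinct clusters.
    """
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)

    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # assumed convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(math.dist(point, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(point, q) for q in other) / len(other)
            for key, other in clusters.items()
            if key != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)
```

Values close to 1 suggest compact, well-separated clusters; values near or below 0 suggest points that sit closer to a neighbouring cluster than to their own.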
August 1
Reviewed the revised mid-year presentation with supervisor.
August 3
Made final edits to the mid-year presentation slides and tried to improve the silhouette
value of the clusters.
August 6
Decided to normalize the duration values in the dataset to reduce the effect of possible
outliers. Practised for the presentation.
August 7
MID-YEAR SEMINAR
August 8
Began writing mid-year report.
August 12
Project web page created and mid-year report uploaded.
August 14
Sent Greg presentation slides and mid-year report.
August 28
Began validation of clusters and output stop duration, along with further accuracy and
performance tweaking.
September 11
More data cleaning performed including graphing the frequency of durations to
determine outliers in the dataset.
Began implementation of the final prediction evaluation method: 10x10-fold cross-validation.
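
As an illustration of the evaluation scheme named above, here is a minimal sketch of how the index splits could be generated, assuming "10x10-fold" means 10 repetitions of 10-fold cross-validation with a fresh shuffle per repetition. The function name `repeated_kfold` and its parameters are hypothetical and not taken from the project.

```python
import random

def repeated_kfold(n_samples, n_folds=10, n_repeats=10, seed=0):
    """Yield (train_indices, test_indices) for every fold of every repetition."""
    rng = random.Random(seed)
    for _repeat in range(n_repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)  # new random partition for each repetition
        # Deal the shuffled indices into n_folds nearly equal folds
        folds = [indices[i::n_folds] for i in range(n_folds)]
        for i, test in enumerate(folds):
            train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
            yield train, test
```

Each pair would be used to fit the model on `train` and score its predictions on `test`; averaging the 100 scores gives the overall estimate.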
September 18
Continued with final prediction evaluation implementation with more data cleaning.
Decided to graph the SSE values against the number of clusters to find the optimal value of K.
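
The SSE-versus-K comparison above can be sketched as below. This is an assumed, simplified K-Means (random initial centroids drawn from the data, a few restarts, squared Euclidean distance); `kmeans_sse` and `sse_curve` are hypothetical names, and the project's own implementation may differ.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans_sse(points, k, n_restarts=10, max_iter=100, seed=0):
    """Best (lowest) SSE found by a basic K-Means over several restarts."""
    rng = random.Random(seed)
    best = float("inf")
    for _restart in range(n_restarts):
        centroids = rng.sample(points, k)  # initial centroids picked from the data
        for _step in range(max_iter):
            groups = [[] for _ in range(k)]
            for p in points:
                nearest = min(range(k), key=lambda i: sq_dist(p, centroids[i]))
                groups[nearest].append(p)
            # Move each centroid to the mean of its group (keep it if the group is empty)
            updated = [
                tuple(sum(coord) / len(g) for coord in zip(*g)) if g else centroids[i]
                for i, g in enumerate(groups)
            ]
            if updated == centroids:
                break
            centroids = updated
        sse = sum(min(sq_dist(p, c) for c in centroids) for p in points)
        best = min(best, sse)
    return best

def sse_curve(points, k_values):
    """Map each candidate K to its SSE, ready to be plotted against K."""
    return {k: kmeans_sse(points, k) for k in k_values}
```

Plotting the returned values against K and looking for the point where the curve flattens (the "elbow") is one common way to choose K.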
September 25
Finished first implementation of the final prediction evaluation method.
October 3
Found errors in the final prediction evaluation method and began fixing them.
Began planning the final user interface of the application.
October 20
Finalized the final prediction evaluation method and tested for any remaining errors.
Began working on final presentation slides.
October 22
FINAL PRESENTATION
October 23
Began writing the final report.
October 28
Completed the report and uploaded it to the website.
END OF PROJECT