Homan Kin Cheung Ma, ME (Software Engineering), 2007

Using Variable Identifiers to Index the Java 1.4.2 API

Abstract
Having a collection of reusable software components, or a software reuse repository, available can be a great asset for a developer building a complex system. Software reuse, the software development methodology of building systems from existing software components, is well known to provide many benefits: reduced development time, more reliable software, less maintenance, and generally lower development costs. The Java Standard API is an example of such a software reuse repository and has grown enormously since Java's beginnings, now consisting of over 3,000 classes and 20,000 methods. The intent of this API is to provide high-quality components that can be easily reused and so increase the Java developer's productivity, but does it? In this thesis, I try to answer this question.

In order to use the components within the Java API, the programmer first has to locate them. The first half of my research, using an extensive corpus of open-source software, found that only about 50% of the classes within the Standard API are used at all, and only around 21% of the methods. Having such a large repository of reusable software components can be very advantageous in the development of any software system written in Java; however, locating relevant reusable components in the API is generally difficult, as reflected by these low usage statistics. Indexing tools have been seen as one way of reducing the complexity of locating components in the Java API. However, previous attempts to develop an indexing tool have shown little success. The second part of my research presented in this thesis focuses on the development and testing of a new implementation of an indexing tool that uses information found within Java source code, particularly variable identifiers. This research also offers a view on the idea that information found within source code is not as useful as information found within documentation. I also discuss the implications this has for future development of both the API itself and tools to support the API.