Title: Functional and structural descriptors for software component retrieval
Author: Yusof, Yuhanis
ISNI: 0000 0004 2750 214X
Awarding Body: Cardiff University
Current Institution: Cardiff University
Date of Award: 2007
Identifying appropriate software components in a repository is an important task in software reuse; after all, components must be found before they can be reused. Any program source code document written in a programming language can potentially serve as a software component. Program source code is a form of data containing both structure and function, so it is important to make use of this information when representing programs in a software repository. Existing approaches to software component retrieval focus on retrieving a component based on either its function or its structure. Such approaches may not suit users who require example programs that illustrate a particular function and structure together; there is therefore a need to combine the two kinds of information. The objective of this research is to build a software repository of Java programs that facilitates the search and selection of programs using information about a program's function and structure. The hypothesis is that retrieval of program source code is better undertaken using a combination of functional and structural descriptors than using functional descriptors alone. This thesis presents a program retrieval and indexing model that can be used in developing a source code retrieval system. The model shows how functional and structural descriptors are identified and combined into a single representation. The functional descriptors are identified by extracting selected terms from program source code, and a weighting scheme is adopted to differentiate the importance of terms. As the programs in the repository come from open-source applications, extracting information that does not rely on semantic terms is beneficial, because these programs are written by various developers with different programming backgrounds and experience.
Structural descriptors, which comprise information generated from structural relationships such as design patterns and software metrics, are extracted from a program and added to the program descriptor. The functional and structural descriptors are combined into a single index, known as a compound index, which is used as the program descriptor. The degree of similarity between a given query and the programs in a repository is measured using approaches based on the vector model and on data distribution. The experiments undertaken reveal that programs retrieved using the proposed method are less complex and easier to maintain. Furthermore, the results suggest that programs from different application domains exhibit different trends in their software metrics.
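As an illustration only, and not taken from the thesis, the idea of a compound index compared under the vector model could be sketched in Python as follows. The abstract does not name the weighting scheme, so TF-IDF is an assumption here; the two normalised metric values per program, the `structure_weight` factor, and the program names are likewise hypothetical.

```python
import math
from collections import Counter


def build_idf(token_lists):
    """Smoothed inverse document frequency for every term in the corpus."""
    n = len(token_lists)
    df = Counter(term for tokens in token_lists for term in set(tokens))
    return {term: math.log(n / count) + 1.0 for term, count in df.items()}


def functional_vector(tokens, vocab, idf):
    """TF-IDF weights over a fixed vocabulary (an assumed weighting scheme)."""
    tf = Counter(tokens)
    total = len(tokens) or 1
    return [tf[term] / total * idf[term] for term in vocab]


def compound_index(functional, structural, structure_weight=0.5):
    """Concatenate functional weights with scaled structural values."""
    return list(functional) + [structure_weight * value for value in structural]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank(query_tokens, query_structure, repository):
    """Return program names ordered by descending similarity to the query."""
    idf = build_idf([tokens for tokens, _ in repository.values()])
    vocab = sorted(idf)
    query = compound_index(
        functional_vector(query_tokens, vocab, idf), query_structure
    )
    scores = {
        name: cosine(
            query,
            compound_index(functional_vector(tokens, vocab, idf), structure),
        )
        for name, (tokens, structure) in repository.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical repository: terms taken from source code, plus two
# software-metric values already normalised to the range [0, 1].
REPOSITORY = {
    "StackDemo": (["stack", "push", "pop", "list"], [0.2, 0.1]),
    "QueueDemo": (["queue", "enqueue", "dequeue", "list"], [0.3, 0.1]),
    "SortDemo": (["sort", "compare", "swap", "array"], [0.9, 0.8]),
}

RANKING = rank(["stack", "push"], [0.2, 0.1], REPOSITORY)
print(RANKING[0])  # the program sharing both terms and metric profile with the query
```

In this sketch `structure_weight` controls how strongly the structural part influences the ranking relative to the term weights; a real system would need to choose and normalise its metrics with care, and the data-distribution measures mentioned in the abstract are not covered here.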
Supervisor: Not available
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available