Title: A data mining model to capture user web navigation patterns
Author: Cabral de Moura Borges, José Luis
ISNI:       0000 0001 3513 5106
Awarding Body: University of London
Current Institution: University College London (University of London)
Date of Award: 2000
Availability of Full Text: Full text unavailable from EThOS.
This thesis proposes a formal data mining model to capture user web navigation patterns. Information characterising the user interaction with the web is obtained from log files, which provide the necessary data to infer navigation sessions. We model a collection of sessions as a hypertext probabilistic grammar (HPG) whose higher-probability strings correspond to the navigation trails preferred by the user. A breadth-first search (BFS) algorithm is provided to find the set of strings with probability above a given cut-point; we call this set of strings the maximal set. The number of iterations performed by the BFS algorithm is shown to grow, on average, linearly with the grammar's number of states. By making use of results in the field of probabilistic regular grammars and Markov chains, the model is provided with a sound foundation which we use to study its properties. We also propose the use of entropy to measure the statistical properties of an HPG. Two heuristics are provided to enhance the model's analysis capabilities. The first heuristic implements an iterative deepening search wherein the set of rules is incrementally augmented by first exploring the trails with higher probability. A stopping parameter measures the distance between the current rule-set and its corresponding maximal set, providing the analyst with control over the number of induced rules. The second heuristic aims at finding a small set of longer rules composed of links with high probability on average. A dynamic threshold is provided whose value is set in such a way that it can be kept proportional to the length of the trail being evaluated. Finally, a set of binary operations on HPGs is defined, giving us the ability to compare the structure of two grammars. The operations defined are: intersection, difference, union, and sum.
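The core idea in the abstract, finding navigation trails whose probability exceeds a cut-point by breadth-first search, can be illustrated with a minimal sketch. It is not the thesis's implementation. It assumes a plain first-order Markov model estimated from session counts, where a trail's probability is the start probability of its first page multiplied by the transition probabilities along it. The names `build_grammar`, `trails_above`, and `max_len` are hypothetical, and the sample sessions are invented for illustration.

```python
from collections import defaultdict, deque


def build_grammar(sessions):
    """Estimate start and transition probabilities from navigation sessions.

    Assumption: a transition probability is count(src -> dst) divided by the
    number of visits to src, so the remaining mass is the chance that the
    session ends at src.
    """
    start_counts = defaultdict(int)
    visits = defaultdict(int)
    link_counts = defaultdict(lambda: defaultdict(int))
    for session in sessions:
        if not session:
            continue
        start_counts[session[0]] += 1
        for page in session:
            visits[page] += 1
        for src, dst in zip(session, session[1:]):
            link_counts[src][dst] += 1
    total = sum(start_counts.values())
    start = {page: n / total for page, n in start_counts.items()}
    trans = {
        src: {dst: n / visits[src] for dst, n in dsts.items()}
        for src, dsts in link_counts.items()
    }
    return start, trans


def trails_above(start, trans, cut_point, max_len=20):
    """Breadth-first search for trails with probability >= cut_point.

    Only trails that cannot be extended without dropping below the cut-point
    (or that reach max_len, a safety guard) are returned.
    """
    results = {}
    queue = deque(((page,), p) for page, p in start.items() if p >= cut_point)
    while queue:
        trail, prob = queue.popleft()
        extended = False
        if len(trail) < max_len:
            for nxt, p in trans.get(trail[-1], {}).items():
                if prob * p >= cut_point:
                    queue.append((trail + (nxt,), prob * p))
                    extended = True
        if not extended:
            results[trail] = prob
    return results


# Invented sample sessions, for illustration only.
sessions = [["A", "B", "C"], ["A", "B", "D"], ["A", "B", "C"], ["B", "C"]]
start, trans = build_grammar(sessions)
print(trails_above(start, trans, cut_point=0.3))
# prints {('A', 'B', 'C'): 0.5625}
```

Raising the cut-point shrinks the returned set and lowering it enlarges the set, which is the trade-off the heuristics in the abstract are meant to help the analyst control.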
Supervisor: Not available
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available