Web site link prediction and semantic relatedness of web pages
Relying solely on Web browsers to navigate large Web sites has created some navigation
problems for users. Many researchers have stressed the importance of improving site user
orientation and have suggested the use of information visualisation techniques, in
particular "site maps" or "overview diagrams", to address this issue. Link prediction and
the semantic relatedness of Web pages have been incorporated into such site maps.
This thesis addresses disorientation within Web sites by presenting a visualisation
of the site that answers one of the three fundamental questions which, according to Nielsen
and others, users ask when they become disoriented while navigating a Web site:
"Where am I now?", "Where have I been?" and "Where can I go next?"
A method for making link predictions, which is based on Markov chains, has been
developed and implemented in order to answer the third question, "Where can I go next?".
The method utilises information about the path already followed by the user. In addition to
link prediction, pages which are semantically similar to the "current" page are
automatically identified using an approach which is based on lexical chains.
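The idea of Markov-chain link prediction can be illustrated with a minimal sketch. It is not the method developed in the thesis: it assumes a first-order chain that conditions only on the last page of the path, whereas the thesis method uses information about the whole path followed so far. The function names, page names and session data are hypothetical.

```python
from collections import Counter, defaultdict


def build_transition_probs(sessions):
    """Estimate first-order transition probabilities from click paths.

    `sessions` is a list of page-name lists (hypothetical usage data).
    Returns {source_page: {destination_page: probability}}.
    """
    counts = defaultdict(Counter)
    for path in sessions:
        # Count each observed step from one page to the next.
        for src, dst in zip(path, path[1:]):
            counts[src][dst] += 1
    probs = {}
    for src, row in counts.items():
        total = sum(row.values())
        # Normalise each row so its probabilities sum to 1.
        probs[src] = {dst: n / total for dst, n in row.items()}
    return probs


def predict_next(probs, path, top_n=3):
    """Rank candidate next pages given the last page of `path`."""
    if not path:
        return []
    row = probs.get(path[-1], {})
    # Highest probability first; ties broken alphabetically.
    return sorted(row.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


# Hypothetical sessions, for illustration only.
sessions = [
    ["home", "courses", "staff"],
    ["home", "courses", "research"],
    ["home", "research"],
    ["courses", "staff"],
]
probs = build_transition_probs(sessions)
ranked = predict_next(probs, ["home", "courses"])
```

Here `ranked` lists the pages most often visited immediately after "courses" in the sample sessions, which is the kind of suggestion a site map could highlight in answer to "Where can I go next?".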
The proposed link prediction approach, which uses an exponentially smoothed
transition probability matrix incorporating site usage data over a time period, was evaluated
by comparing it with a similar approach developed by Sarukkai. The proposed semantic
relatedness approach, which uses weighted lexical chains, was empirically compared with an
earlier approach developed by Green, which uses synset weight vectors.
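One way to picture an exponentially smoothed transition probability matrix is sketched below. The abstract does not give the smoothing formula, so this sketch assumes the standard recurrence S_t = alpha * P_t + (1 - alpha) * S_(t-1) applied to one transition matrix per time period. The function name, the value of `alpha` and the weekly matrices are all hypothetical.

```python
def smooth_transition_matrices(period_matrices, alpha=0.5):
    """Exponentially smooth a sequence of transition matrices.

    `period_matrices` is ordered oldest first; each matrix is a list of
    rows with identical dimensions. `alpha` (assumed, in (0, 1]) is the
    weight given to the most recent period.
    """
    if not period_matrices:
        raise ValueError("at least one period matrix is required")
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    # Start from a copy of the oldest period.
    smoothed = [row[:] for row in period_matrices[0]]
    for current in period_matrices[1:]:
        # Blend the newer period with the running estimate, cell by cell.
        smoothed = [
            [alpha * c + (1.0 - alpha) * s for c, s in zip(cur_row, old_row)]
            for cur_row, old_row in zip(current, smoothed)
        ]
    return smoothed


# Hypothetical transition matrices for two consecutive weeks
# (rows and columns: home, courses, research).
week1 = [[0.0, 0.8, 0.2], [0.5, 0.0, 0.5], [1.0, 0.0, 0.0]]
week2 = [[0.0, 0.4, 0.6], [0.5, 0.0, 0.5], [1.0, 0.0, 0.0]]
smoothed = smooth_transition_matrices([week1, week2], alpha=0.5)
```

Because each smoothed row is a convex combination of rows that sum to 1, it still sums to 1, so the result can be used as a transition matrix in which recent usage carries more weight than older usage.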
In conclusion, this thesis argues that Web site link prediction and the identification
of semantically-related Web pages can be used to overcome disorientation. The approaches
proposed are demonstrated to be superior to earlier methods.