Title: Language identification using text, audio and video feature mapping
Author: Dai, Zhuoyi
ISNI: 0000 0004 7973 2000
Awarding Body: University of East Anglia
Current Institution: University of East Anglia
Date of Award: 2018
Unlike text language identification techniques, which are now quite mature, audio and video language identification techniques still face many challenges. One of the main challenges is that, for a variety of reasons, there are not enough audio and video datasets. Text data, however, are plentiful, and many text databases are free for research use. This leads to an interesting question: can we identify the language of an unknown video or audio recording from the relationships between known text languages? Answering this question requires examining two issues: language identification and language mapping. For language identification, we compare two methods: zipping classification and N-gram modelling. An advantage of zipping classification is that it tolerates short training data and can be applied to a wide variety of problems without modification. The N-gram model, however, offers high classification accuracy and efficiency, which makes it worth considering. We also evaluate a further audio classification method, based on MPEG compression, and compare it with the general-purpose zipping tools and the N-gram model. For language mapping, we first use the Robinson-Foulds tree distance to measure the distances between language trees. We then use Sammon mapping and Shepard's interpolation to map the language distance results from higher to lower dimensions, and try to find the optimal language relationships in a given dimension.
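The zipping classification idea mentioned in the abstract can be illustrated with a minimal sketch. This record does not say which compressor or scoring rule the thesis uses, so everything here is an assumption: Python's standard `zlib` stands in for the "general zipping tools", the score is the number of extra compressed bytes needed when the sample is appended to a reference text, and the names `compressed_size`, `zip_classify` and `toy_corpora` are hypothetical.

```python
import zlib
from typing import Mapping


def compressed_size(text: str) -> int:
    """Return the zlib-compressed size of `text` in bytes (UTF-8 encoded)."""
    return len(zlib.compress(text.encode("utf-8"), 9))


def zip_classify(sample: str, corpora: Mapping[str, str]) -> str:
    """Pick the language whose reference text best 'explains' the sample.

    Assumption: the score is the growth in compressed size when the sample
    is appended to the reference text. A smaller growth means the compressor
    found more shared substrings, so the languages are taken to be closer.
    """
    def extra_bytes(reference: str) -> int:
        return compressed_size(reference + sample) - compressed_size(reference)

    return min(corpora, key=lambda lang: extra_bytes(corpora[lang]))


# Hypothetical toy reference texts; real experiments would use much larger corpora.
toy_corpora = {
    "english": "the quick brown fox jumps over the lazy dog and "
               "the cat sat on the mat while the rain fell on the old town",
    "french": "le renard brun rapide saute par-dessus le chien paresseux et "
              "le chat dort sur le tapis pendant que la pluie tombe sur la ville",
}

if __name__ == "__main__":
    print(zip_classify("the cat sat on the mat while the rain fell", toy_corpora))
```

No training step or feature extraction is needed beyond concatenation and compression, which is consistent with the abstract's remark that the approach applies to many problems without modification.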
Supervisor: Not available
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available