The development of an enhanced electropalatography system for speech research
To understand how speech is produced by individual human beings, it is fundamentally important to be able to determine the three-dimensional shape of the vocal tract accurately. Because the vocal tract is largely inaccessible, its exact form is difficult to determine in live subjects. A wide variety of methods provide information on vocal tract shape. Among them, the technique of Electropalatography (EPG) is cheap, relatively simple, non-invasive and highly informative. Using EPG on its own, it is possible to deduce information about the shape, movement and position of tongue-palate contact during continuous speech. However, the data provided by EPG take the form of a two-dimensional representation in which all absolute positional information is lost. This thesis describes the development of an enhanced Electropalatography (eEPG) system, which retains most of the advantages of EPG while overcoming some of its disadvantages by representing the three-dimensional (3D) shape of the palate. The eEPG system uses digitised palate shape data to display the tongue-palate contact pattern in 3D. The 3D palate shape is displayed on a Silicon Graphics workstation as a surface made up of polygons represented by a quadrilateral mesh. EPG contact patterns are superimposed onto the 3D palate shape by displaying the relevant polygons in a different colour. With this system, differences in shape between individual palates that are apparent on visual inspection of the actual palates are also apparent in the image on screen. The contact patterns can be related more easily to articulatory features such as the alveolar ridge, since the ridge is visible on the 3D display. Further, methods have been devised for computing absolute distances along paths lying on the palate surface. Combining these methods with calibrated palate shape data allows measurements accurate to 1 mm to be made between contact locations on the palate shape. These measurements have been validated against manual measurements.
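As a minimal illustration of measuring an absolute distance along a path lying on the palate surface, the sketch below sums straight-line segment lengths between consecutive 3D points. It is not the method devised in the thesis: the function name `path_length_mm` and the sample coordinates are hypothetical, and the path is assumed to be supplied already as an ordered list of calibrated mesh points in millimetres.

```python
import math

def path_length_mm(points):
    """Length of a polyline through ordered (x, y, z) points given in mm.

    Assumption: the path over the palate surface is approximated by
    straight segments between consecutive points.
    """
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))

# Hypothetical path across three mesh vertices (coordinates in mm).
path = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (3.0, 4.0, 12.0)]
print(path_length_mm(path))  # 17.0
```

A polyline of this kind underestimates the true surface distance wherever the surface curves between points, so the spacing of the points limits the accuracy that can be reached.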
The sampling rate for EPG is 100 Hz and the data rate is equivalent to 62 bits per 10 ms. In the past few years, some coding (parameterization) methods have been introduced to reduce the amount of data while retaining its important aspects. Feature coding methods are proposed here and several parameters are investigated, expressed both in conventional measures such as row number and in absolute measures of distance and area (i.e. mm and mm²). Features studied include location of constriction and degree of constriction. Finally, in order to reduce the amount of data while retaining the spatial information, composite frames that represent a series of EPG frames are computed. Measures of goodness of the composite frames that do and do not use 3D data are described. Some examples are given in which fricative data have been processed by generating a composite frame for the entire fricative and computing an area estimate for each row of the composite frame, using the assumption of a flat tongue. This thesis demonstrates the current capability and inherent flexibility of the enhanced electropalatography system. In the future, the eEPG system can be extended to compute volume estimates, again using a flat tongue model. By incorporating information on the tongue surface provided by other imaging methods such as ultrasound, more accurate area and volume estimates can be obtained.
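The data-rate figure and the idea of a composite frame can be made concrete with a short sketch. The rule used here, marking an electrode as contacted when it is on in at least a given fraction of the frames, is an assumption chosen for illustration; the abstract does not state how the composite frames are computed. The names `composite_frame` and `threshold`, and the small 2 x 3 toy frames, are likewise hypothetical.

```python
BITS_PER_FRAME = 62   # one bit per electrode, as stated in the abstract
FRAME_RATE_HZ = 100   # one frame every 10 ms

# Raw data rate implied by the two figures above.
bits_per_second = BITS_PER_FRAME * FRAME_RATE_HZ
print(bits_per_second)  # 6200

def composite_frame(frames, threshold=0.5):
    """Collapse a series of binary EPG frames into a single frame.

    Assumed rule: an electrode is set in the composite when it is in
    contact in at least `threshold` of the input frames.
    """
    n = len(frames)
    return [
        [1 if sum(f[r][c] for f in frames) / n >= threshold else 0
         for c in range(len(frames[0][r]))]
        for r in range(len(frames[0]))
    ]

# Three toy frames (2 rows x 3 electrodes), not real EPG data.
frames = [
    [[1, 0, 1], [1, 1, 1]],
    [[1, 0, 0], [1, 0, 1]],
    [[1, 0, 0], [0, 0, 1]],
]
print(composite_frame(frames))  # [[1, 0, 0], [1, 0, 1]]
```

Raising `threshold` keeps only the contacts that persist across the series, while lowering it keeps any electrode that was touched at all, so the choice trades data reduction against loss of brief contacts.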