HIERARCHICAL CLUSTERING OF SPEAKERS INTO ACCENTS WITH THE ACCDIST METRIC

Mark Huckvale
University College London

ID 1204
[full paper]

Hierarchical clustering of speakers by their pronunciation patterns could be a useful technique for the discovery of accents and of the relationships between accents and sociological variables. However, it is first necessary to ensure that the clustering is not influenced by the physical characteristics of the speakers. In this study, a number of approaches to agglomerative hierarchical clustering of 275 speakers from 14 regional accent groups of the British Isles are formally evaluated. The ACCDIST metric is shown to have superior performance, both in terms of accent purity in the cluster tree and in terms of the interpretability of the higher levels of the tree. Although it operates on robust spectral envelope features, the ACCDIST measure also showed the least sensitivity to speaker gender. The conclusion is that, if performed with care, hierarchical clustering could be a useful technique for the discovery of accent groups from the bottom up.
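As an illustration only, the general idea of agglomerative clustering from a precomputed speaker-to-speaker distance matrix, followed by a purity check against known accent labels, might be sketched as below. This is not the ACCDIST metric and not the evaluation procedure of the paper: the distances are a toy stand-in (absolute differences of made-up one-dimensional scores), average linkage is an assumed choice, and the names `average_linkage` and `purity` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def average_linkage(dist, n_clusters):
    """Agglomerative clustering on a precomputed distance matrix.

    Starts with one cluster per speaker and repeatedly merges the two
    clusters with the smallest mean pairwise distance (average linkage,
    an assumed choice) until n_clusters remain.
    """
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > n_clusters:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            pair_sum = sum(dist[i][j] for i in clusters[a] for j in clusters[b])
            d = pair_sum / (len(clusters[a]) * len(clusters[b]))
            if best is None or d < best[0]:
                best = (d, a, b)
        _, a, b = best
        # combinations() guarantees a < b, so deleting b leaves a in place.
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


def purity(clusters, labels):
    """Fraction of speakers carrying the majority label of their cluster."""
    total = sum(len(c) for c in clusters)
    majority = sum(
        Counter(labels[i] for i in c).most_common(1)[0][1] for c in clusters
    )
    return majority / total


# Toy data: six hypothetical speakers with invented 1-D scores and labels.
scores = [0, 1, 3, 10, 12, 20]
labels = ["A", "A", "A", "B", "B", "B"]
dist = [[abs(x - y) for y in scores] for x in scores]

clusters = average_linkage(dist, n_clusters=2)
cluster_purity = purity(clusters, labels)
```

In practice the distance matrix would come from whichever speaker-distance measure is under evaluation, and purity could be inspected at several levels of the tree rather than at a single cut.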