April/May 2008, Vol 5, No 2  




Language and mathematics converge to provide insight

  
Prof Etienne Barnard

  
When analysing the duration of phonemes, one finds that the vowels and consonants behave in quite distinct ways.

In the latest edition of the CSIR Science Scope, Prof Etienne Barnard, co-leader of the Digital@SERA Human Language Technologies research group, contributes to the publication's theme of "the art and science of modeling and simulation".

The divide between the 'two cultures' - one humanist and language oriented, the other scientific, with mathematics as its basic tool - has always been suspect to many. Great scientists often display a strong love of language (Einstein and Gell-Mann are examples), whereas JM Coetzee is one of a long line of famous writers who underwent advanced training in the mathematical sciences. In fact, research is casting doubt on the very basis of this division, as the linkages between language and mathematics become increasingly clear. As a case in point, CSIR researchers in human language technologies (HLT) are using mathematical models to come to terms with spoken and written language.

In a joint project with the University of the Witwatersrand, a team of researchers from the Meraka Institute of the CSIR is deriving mathematical descriptors for the expression of tone in languages such as Northern Sotho. In these languages, the prosody (or 'melody') of a spoken phrase conveys information on many factors, ranging from stress and emotion to semantics (meaning).

This poses a puzzle: How do listeners know what the purpose of a particular tonal pattern is? We analyse recordings of several speakers saying carefully selected phrases, and are starting to understand that tonal 'gestures' correspond to speakers' intent in highly predictable ways. Interestingly, these tonal patterns are quite different from those found in other tonal languages such as Mandarin Chinese. Whereas 'level tones' (where the pitch takes on a certain steady value) and 'contour tones' (where the pitch changes in a particular pattern) are quite distinct in Mandarin, speakers of Northern Sotho apparently use levels and contours interchangeably to indicate a certain tonal value. Tone also spreads across adjacent syllables in the African languages according to a definite set of rules that have no counterparts in the better-researched tone languages.

Another application of mathematical modelling in HLT relates to the durations of the basic units of speech (known as 'phonemes'). Although we rarely think of this when speaking, the durations of these units are highly meaningful. Factors such as word and sentence stress, emphasis and emotion clearly all play a role in the determination of phoneme durations, but how do these interact? And are the effects (and their interactions) different for different types of phonemes?

HLT Master's student Charl van Heerden is building sophisticated models that make it possible to answer questions of this nature. He has found that phoneme durations vary in subtle ways between speakers - to such an extent that these durations are actually useful in determining who the speaker is - and also that the broad classes of phonemes (e.g. vowels and consonants) behave in systematically different ways. Van Heerden's models have to address one of the fundamental challenges facing many mathematical models in the domain of language technology: Since the range of phenomena that occur in a particular language is very large, it is not feasible to measure all the combinations of factors that may occur in practice. He therefore develops models that are able to generalise to unseen circumstances.

Mathematical models with the ability to generalise also play a key role in one of the fundamental tools used in HLT development, namely pronunciation modelling. Pronunciation models allow computer systems to predict the acoustic properties of words based on how they are written and are crucial in applications such as speech recognition. Dr Marelie Davel, HLT research group leader, has developed a succession of increasingly accurate methods for learning pronunciation models from a limited set of example pronunciations in a given language. Her models are now widely used in speech-processing systems for many different languages, and have been shown to be more accurate than alternative approaches when generalising from small data sets.
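To make the idea of learning pronunciations from a handful of examples more concrete, here is a minimal sketch in Python. It is not the method developed by the CSIR researchers, which the article does not describe. It simply learns, for each letter, the most frequent sound in a given context and falls back to broader contexts for letter combinations it has never seen. The toy lexicon, the phoneme symbols and the function names (`contexts`, `train`, `predict`) are all invented for illustration, and the sketch assumes each letter has already been aligned with exactly one sound.

```python
from collections import Counter, defaultdict

# Hypothetical, hand-aligned toy lexicon (not data from the article):
# one phoneme symbol per letter, with "_" marking a silent letter.
LEXICON = {
    "cat": ["k", "a", "t"],
    "cot": ["k", "o", "t"],
    "cit": ["s", "i", "t"],
    "ice": ["ai", "s", "_"],
    "can": ["k", "a", "n"],
}


def contexts(word, i):
    """Yield contexts for letter i, from most specific to most general.

    "#" marks a word boundary and "*" matches any neighbouring letter.
    """
    left = word[i - 1] if i > 0 else "#"
    right = word[i + 1] if i + 1 < len(word) else "#"
    yield (left, word[i], right)
    yield ("*", word[i], right)
    yield (left, word[i], "*")
    yield ("*", word[i], "*")


def train(lexicon):
    """Map every observed context to its most frequent phoneme."""
    counts = defaultdict(Counter)
    for word, phones in lexicon.items():
        for i, phone in enumerate(phones):
            for ctx in contexts(word, i):
                counts[ctx][phone] += 1
    return {ctx: c.most_common(1)[0][0] for ctx, c in counts.items()}


def predict(rules, word):
    """Predict phonemes for a word, backing off to broader contexts."""
    out = []
    for i in range(len(word)):
        for ctx in contexts(word, i):
            if ctx in rules:
                out.append(rules[ctx])
                break
        else:
            out.append("?")  # letter never seen in training
    return [p for p in out if p != "_"]  # drop silent letters


rules = train(LEXICON)
print(predict(rules, "con"))  # unseen word, handled by backing off
print(predict(rules, "cin"))  # "c" before "i" follows the "cit" example
```

Neither "con" nor "cin" appears in the toy lexicon, yet the fallback contexts still produce a pronunciation for each. A practical system would need far more examples and a proper letter-to-sound alignment step.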

Alta de Waal, senior HLT researcher, uses the conceptual framework known as Bayesian statistics to create mathematical models of written documents. The internet has stimulated an explosive growth in the number of documents available for reading - a deluge that would be impossible to navigate if we did not have automated tools to process the contents of these documents.

Search engines and PC tools generally base their findings on specific words in a document, but more sophisticated techniques that delve more deeply into the content of a document are a growing trend.

The most successful techniques still fall far short of human understanding: The main successes to date rely on mathematical models that discover underlying topics in documents through the repeated usage of key words related to the topic.
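As a rough illustration of how a model can discover topics from repeated key words, the sketch below implements a small version of latent Dirichlet allocation (LDA), a well-known Bayesian topic model, using collapsed Gibbs sampling. The article does not say which model the CSIR researchers use, so LDA is an assumption here. The four toy documents, the parameter values and the function name `lda_gibbs` are likewise invented for illustration.

```python
import random

# Hypothetical toy corpus, invented for illustration only.
DOCS = [
    "tone pitch syllable tone pitch".split(),
    "pitch tone speaker syllable".split(),
    "search engine document search".split(),
    "document engine search internet".split(),
]


def lda_gibbs(docs, n_topics=2, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Fit a tiny LDA model with collapsed Gibbs sampling.

    Returns per-document topic proportions and the top words per topic.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    n_vocab = len(vocab)

    # Count tables: topics per document, words per topic, topic sizes.
    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [dict.fromkeys(vocab, 0) for _ in range(n_topics)]
    topic_total = [0] * n_topics

    # Start from a random topic assignment for every word occurrence.
    assign = []
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(n_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assign.append(zs)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the current assignment from the counts.
                z = assign[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # A topic is likely if this document already uses it and
                # if the topic already contains this word.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + n_vocab * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1

    theta = [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in row]
        for row, doc in zip(doc_topic, docs)
    ]
    top_words = [
        sorted(vocab, key=lambda w: -topic_word[k][w])[:3]
        for k in range(n_topics)
    ]
    return theta, top_words


theta, top_words = lda_gibbs(DOCS)
for k, words in enumerate(top_words):
    print("topic", k, words)
```

Words that keep appearing together in the same documents tend to end up in the same topic. The result depends on the random seed and on the assumed values of `alpha` and `beta`, so the output should be read as indicative only.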

These HLT research projects illustrate that mathematics and language are deeply related - mathematical models, combined with careful linguistic analysis, yield both new insights and practical applications.

Enquiries: Professor Etienne Barnard

Source: CSIR's Science Scope - March 2008