Acoustic typology of vowel inventories and Dispersion Theory: Insights from a large cross-linguistic corpus
This dissertation examines the relationship between the structural, phonemic properties of vowel inventories and their acoustic phonetic realization, with particular focus on the adequacy of Dispersion Theory, which maintains that inventories are structured so as to maximize perceptual contrast between their component vowels.
In order to assess this relationship between structure and realization of vowel inventories, formant frequency data were collected from 320 studies describing the acoustic properties of 555 inventories of a wide variety of structures from 230 different languages (many represented by multiple dialects). The formant data of the different inventories in the corpus were normalized with respect to vocal tract and elicitation method differences, and data from multiple instantiations of the same inventory structure in the same language were pooled. The result of this process is a corpus of normalized formant data from 304 inventories representing unique language-structure combinations. The distribution of structures in this corpus is similar to the distribution attested in studies of structural typology of inventories.
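The pooling step described above can be illustrated with a minimal sketch. It assumes the formant values have already been normalized (the normalization procedure itself is not specified here) and that pooling is a simple arithmetic mean. The record layout, the function name `pool_inventories`, and all data values are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def pool_inventories(records):
    """Average F1/F2 over records sharing (language, structure, vowel).

    Assumption: each record is (language, structure, vowel, f1_hz, f2_hz)
    and the values are already normalized.
    """
    groups = defaultdict(list)
    for language, structure, vowel, f1, f2 in records:
        groups[(language, structure, vowel)].append((f1, f2))
    pooled = {}
    for key, values in groups.items():
        # One pooled (F1, F2) pair per unique language-structure-vowel key
        pooled[key] = (mean(v[0] for v in values), mean(v[1] for v in values))
    return pooled

# Hypothetical data: two studies of the same inventory in LangA, one in LangB
records = [
    ("LangA", "i a u", "i", 300.0, 2300.0),
    ("LangA", "i a u", "i", 320.0, 2200.0),
    ("LangB", "i a u", "i", 280.0, 2400.0),
]
pooled = pool_inventories(records)
```

Here the two LangA entries collapse into a single pooled entry, while LangB remains separate because it is a different language-structure combination.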
By averaging data from same-structure inventories of different languages, prototypical acoustic realizations of many of the more universally common inventory structures are established, in terms of well-defined ranges of formant frequencies for the component vowels in each structure. In some cases, the emerging acoustic patterns challenge certain previous typological findings. For example, the two six-vowel structures [i e a o u ɨ] and [i e a o u ə] were assumed to be distinct, with the non-peripheral vowel sharing the height of either the high or the mid peripheral vowels; the data show that there is in fact only one structure, which can be broadly transcribed as [i e a o u ə]. Its non-peripheral vowel [ə] has a distinct height, and is not more variable than the other vowels in the structure.
Rigorous statistical analyses of the corpus data are used to test various principles of Dispersion Theory, and in most cases these principles are shown to operate consistently. Thus, there is a clear correlation between the number of vowels and the acoustic space sizes of inventories, with the number of peripheral vowels more strongly correlated with the F1 frequency spans of inventories and the number of non-peripheral vowels with the F2 frequency spans. In addition, instantiations of specific vowel categories shift acoustically as a function of structure so as to escape from crowded regions and fill less crowded ones. For example, low vowels vary horizontally between back, central and front when inventory peripheries are respectively front-crowded asymmetrical, symmetrical and back-crowded asymmetrical. This behavior of low vowels is one aspect of a more general push chain shift process, which is initiated by increased local pressure upon the addition of a peripheral vowel, and propagates with gradual decay through the entire inventory periphery. Moreover, vowel height levels in the inventory periphery are evenly spaced perceptually, but only when the log(Hz) scale is used.
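The dependence of even spacing on the frequency scale can be illustrated with a small sketch. It assumes that evenness is measured as the coefficient of variation of the steps between adjacent F1 height levels, which is an illustrative choice and not necessarily the measure used in the dissertation. The function name `spacing_variation` and the F1 values are hypothetical.

```python
import math
from statistics import mean, pstdev

def spacing_variation(f1_levels_hz, scale="log"):
    """Coefficient of variation of the steps between adjacent height levels.

    A value near 0 means the levels are evenly spaced on the chosen scale.
    Assumption: scale is "log" for log(Hz); anything else means raw Hz.
    """
    values = sorted(f1_levels_hz)
    if scale == "log":
        values = [math.log(v) for v in values]
    steps = [b - a for a, b in zip(values, values[1:])]
    return pstdev(steps) / mean(steps)

# Hypothetical F1 values for three height levels, constant ratio of 1.5
levels = [300.0, 450.0, 675.0]
cv_hz = spacing_variation(levels, scale="hz")
cv_log = spacing_variation(levels, scale="log")
```

With these values the steps are equal on the log(Hz) scale (`cv_log` is effectively zero) but unequal in raw Hz (150 Hz and 225 Hz), so the same inventory looks evenly spaced on one scale and not on the other.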
In order to accommodate non-peripheral vowels in the inventory, the periphery expands horizontally. However, this effect is sensitive to perceptual needs: when there is only one non-peripheral vowel, this vowel tends to occupy a distinct vowel height and thus contrast with peripheral vowels both horizontally and vertically, requiring only limited horizontal expansion of the periphery. Once a second non-peripheral vowel is added, the non-peripheral vowels are forced into the same height levels as peripheral vowels. Consequently, this contrast becomes predominantly horizontal, and peripheral vowels become substantially more extreme. However, in the lower-mid region, where non-peripheral vowels rarely appear, such horizontal expansion is not necessary, and is indeed avoided.
Unlike vertical spacing, horizontal spacing is not even, and non-peripheral vowels tend to be acoustically and perceptually fronter than the midline between front and back vowels. This tendency is significantly stronger in higher non-peripheral vowels, where it results in close proximity of F2 and F3, thus providing support for the Dispersion-Focalization Theory, a particular model that combines dispersion with a preference for vowels with proximate formants.
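As a rough illustration of what "fronter than the midline" means, the sketch below places a non-peripheral vowel on a 0 to 1 axis between the back and front vowels of the same height. It assumes a log(Hz) F2 axis, which is an illustrative choice; the function names and all formant values are hypothetical.

```python
import math

def relative_frontness(f2_vowel, f2_back, f2_front):
    """Position of a vowel between back (0.0) and front (1.0) on a log(Hz) F2 axis.

    A value above 0.5 means the vowel lies fronter than the midline.
    """
    span = math.log(f2_front) - math.log(f2_back)
    return (math.log(f2_vowel) - math.log(f2_back)) / span

def f2_f3_gap(f2, f3):
    """Distance between F2 and F3 in log(Hz) units; smaller means more proximate."""
    return math.log(f3) - math.log(f2)

# Hypothetical high vowels: front, back, and one non-peripheral vowel
front_f2, back_f2 = 2200.0, 800.0
central = {"f2": 1600.0, "f3": 2400.0}
position = relative_frontness(central["f2"], back_f2, front_f2)
gap = f2_f3_gap(central["f2"], central["f3"])
```

In this made-up case `position` exceeds 0.5, so the non-peripheral vowel sits on the front side of the midline, and `gap` gives a simple handle on how close its F2 is to F3.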
All these behaviors are consistent, but at the same time they are too gradient and subtle to surface in typological studies based on phonological, contrast-oriented descriptions. While the corpus data strongly support the concepts and principles underlying Dispersion Theory in general and Dispersion-Focalization Theory in particular, they also point to severe limitations of previous attempts to formalize these concepts in simulations of computational models, implying that accurate prediction of acoustic patterns of inventory structures would require a drastically revised Dispersion-Focalization Theory model. Possible modifications and additional components necessary for such an improved model are discussed.