We have seen that a filter is a system that alters the frequency content of an acoustic signal. The filter in speech production is the vocal tract, a tube of uneven cross section which is closed at one end (by the glottis) and measures around 17.5 cm for the average male. As with any other filter, this tube has a characteristic spectrum. Importantly, this spectrum changes as the shape of the vocal tract changes during speech production. The different qualities of the speech sounds are produced by changing the shape of the vocal tract to produce a particular set of filter characteristics.
To understand the effects of the vocal tract we can model it as a cylindrical tube, closed at one end, which makes the calculation of the filter spectrum much easier. The spectrum of this tube has a set of peaks which correspond to its resonant frequencies -- the frequencies of vibration whose wavelengths fit the tube best. For a full discussion of this model see Harrington and Cassidy Section 3.3.1. Resonance is the property which allows you to get a single note out of a half-full beer bottle by blowing over the top: the wavelength of the vibration is such that it fits within the tube. Change the length of the tube (by taking a swig of beer) and the note changes, because a different wavelength now fits the tube best. In fact you can get more than one note if you are well practised -- this is how a trumpet can play so many notes with only three `keys'. Each length of tube supports a number of different resonances, but the resonant frequencies are all determined by the length of the tube.
The end result, then, is that a cylindrical tube of this length, closed at one end, has a spectrum with a set of peaks at around 500 Hz, 1500 Hz, 2500 Hz and so on, which correspond to the resonant frequencies of the tube. These peaks are equally spaced because each resonance is an odd multiple of the lowest one. Now, if the shape of the tube is changed, for example by adding a constriction halfway along it, the positions of these resonances change too. In the case of a constriction, the effect is to move the resonant frequencies relative to each other, so that they are no longer equally spaced. Translating this back to the real vocal tract, the cylindrical tube corresponds approximately to a relaxed vocal tract, and the corresponding pattern of resonances matches that of the schwa vowel. Constrictions are introduced by raising the tongue and by moving the raised tongue backwards and forwards. This modifies the positions of the resonances relative to each other, producing the different vowel sounds.
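The figures of 500 Hz, 1500 Hz and 2500 Hz can be checked with a short calculation. The sketch below assumes the standard quarter-wavelength relation for a uniform tube closed at one end, F_n = (2n - 1)c / 4L, and a speed of sound of 350 m/s; neither the formula nor that value is stated in the text above, and the function name `tube_resonances` is purely illustrative.

```python
def tube_resonances(length_m, n_resonances=3, speed_of_sound=350.0):
    """Resonant frequencies (Hz) of a uniform tube closed at one end.

    Assumes the quarter-wavelength model F_n = (2n - 1) * c / (4 * L).
    The default speed of sound (350 m/s) is an assumed value for warm,
    moist air; it is not given in the text.
    """
    return [
        (2 * n - 1) * speed_of_sound / (4.0 * length_m)
        for n in range(1, n_resonances + 1)
    ]


# A 17.5 cm tube, the length quoted for the average male vocal tract.
for f in tube_resonances(0.175):
    print(round(f), "Hz")

# A shorter tube raises every resonance, as with the beer bottle.
for f in tube_resonances(0.15):
    print(round(f), "Hz")
```

This model only covers the uniform tube, that is, the schwa-like case; it does not predict how a constriction shifts the individual resonances.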
Another filter is involved in speech production, corresponding to the effect of the lips and the air beyond them on the acoustic signal. This lip-radiation filter has a characteristic spectrum which rises at 6 dB/octave -- that is, the gain increases by 6 dB for each doubling of frequency, so the lower frequencies are attenuated relative to the higher ones. This spectrum is multiplied by the spectrum of the vocal tract (or, equivalently, added to it when both are expressed in dB) to produce an overall spectrum for the system.
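As a rough illustration of the 6 dB/octave rise, the sketch below treats the lip-radiation filter as an idealised gain proportional to frequency, which gives 20 log10(2), or about 6.02 dB, per octave. The 1000 Hz reference point for 0 dB and the function names are assumptions made for the example, not part of the model described above.

```python
import math


def lip_radiation_gain_db(freq_hz, ref_hz=1000.0):
    """Gain (dB) of an idealised filter whose amplitude is proportional
    to frequency. ref_hz is an arbitrary, assumed 0 dB reference."""
    return 20.0 * math.log10(freq_hz / ref_hz)


def overall_level_db(tract_level_db, freq_hz):
    """Combine a vocal-tract level (dB) with the lip-radiation gain.

    Multiplying two linear spectra is the same as adding them in dB.
    """
    return tract_level_db + lip_radiation_gain_db(freq_hz)


# Each doubling of frequency adds roughly 6 dB.
for f in (250.0, 500.0, 1000.0, 2000.0, 4000.0):
    print(f, "Hz:", round(lip_radiation_gain_db(f), 2), "dB")
```

Working in dB keeps the combination a simple sum; on a linear amplitude scale the same step would be a multiplication of the two spectra at each frequency.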