Having digitised a speech signal we now have a vector of samples which can be manipulated in various ways. Our aim in this course is to be able to extract characteristic parameters from a signal, which can be used to identify different kinds of speech sounds. This session will discuss some parameters which can be extracted directly from the time domain signal.
When we analyse signals we tend to do so on only small portions at a time. The reason for this is that we typically assume that the signal is constant (e.g. has a constant frequency) over the time-span of our analysis. Since speech is actually changing rapidly, we have to cut it into small parts for this assumption to hold. Hence we typically window a speech signal before analysis, which just means that we cut out a small section.
Windowing can be seen as multiplying a signal by a window which is zero everywhere except for the region of interest, where it is one. Although we pretend that our signals are infinite, everything outside the window is now zero, so we can discard all of the resulting zeros and concentrate on just the windowed portion of the signal.
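The multiply-then-discard view of windowing can be sketched in a few lines of Python. This is only an illustration: the function name rectangular_window and the sample values are made up for the example, and in practice one would simply slice the signal.

```python
def rectangular_window(signal, start, length):
    """Cut out `length` samples beginning at index `start`.

    Written the long way round to mirror the description: build a window
    of zeros and ones, multiply, then throw away the zeros.
    """
    # Window is 1 inside the region of interest and 0 everywhere else.
    window = [1.0 if start <= n < start + length else 0.0
              for n in range(len(signal))]
    # Sample-by-sample multiplication of signal and window.
    product = [s * w for s, w in zip(signal, window)]
    # Discard the zeros outside the region of interest.
    return product[start:start + length]


# Hypothetical sample values, chosen only for illustration.
signal = [0.0, 0.5, 1.0, 0.5, 0.0, -0.5, -1.0, -0.5]
frame = rectangular_window(signal, start=2, length=4)
print(frame)  # same samples as signal[2:6]
```

The result is identical to taking the slice signal[2:6] directly, which is why a rectangular window is usually implemented as a plain slice.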
The above window (zeros and ones) is known as a rectangular window because of its shape. One problem with this kind of window is the abrupt change at the edge, which can cause distortion in the signal being analysed (in fact any windowing operation causes distortion, since the signal is being modified by the window). To reduce this distortion we often use a smoother window shape called a Hamming window. This window is close to zero at the edges (about 0.08 in the standard definition) and rises gradually to 1 in the middle. When this window is used the edges of the signal are de-emphasised and the edge effects are reduced.
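As a rough sketch, the window shapes can be computed from the usual textbook formulas, assuming the symmetric definitions w[n] = 0.54 - 0.46 cos(2 pi n / (N - 1)) for Hamming and w[n] = 0.5 - 0.5 cos(2 pi n / (N - 1)) for Hann. The function names below are chosen for this example only. Note that with these coefficients the Hamming window ends at 0.08 rather than exactly zero, whereas the Hann window does reach zero.

```python
import math


def hamming(length):
    """Symmetric Hamming window of the given length (textbook coefficients)."""
    if length < 2:
        return [1.0] * length
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (length - 1))
            for n in range(length)]


def hann(length):
    """Symmetric Hann window of the given length."""
    if length < 2:
        return [1.0] * length
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / (length - 1))
            for n in range(length)]


def apply_window(frame, window):
    """Multiply a frame by a window of the same length, sample by sample."""
    return [x * w for x, w in zip(frame, window)]


w = hamming(5)
# Edge values are small and the middle value is 1, so samples near the
# edges of a frame are de-emphasised relative to those in the middle.
print([round(v, 2) for v in w])
```

Multiplying a frame by one of these windows with apply_window leaves the middle samples almost unchanged and scales down the samples near the edges.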
It is important to use a Hamming window (or the similar Hann window) in some kinds of analysis, particularly the frequency domain methods we will look at later. For many time domain methods the rectangular window is preferable.
If we are analysing a long signal, then successive windows are taken along the length of the signal. Often we overlap the windows if a Hamming window is being used so that the de-emphasised part of one window becomes the middle of the next.
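A minimal sketch of taking successive overlapping windows along a signal is given below. It assumes a 50% overlap (a hop of half the frame length), which is one common choice rather than something fixed by these notes. The names split_into_frames and window_frames, and the toy signal, are illustrative only.

```python
import math


def hamming(length):
    """Symmetric Hamming window (textbook coefficients)."""
    if length < 2:
        return [1.0] * length
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (length - 1))
            for n in range(length)]


def split_into_frames(signal, frame_length, hop):
    """Cut the signal into frames whose starts are `hop` samples apart.

    Only complete frames are returned; a short tail at the end is dropped.
    """
    last_start = len(signal) - frame_length
    return [signal[s:s + frame_length]
            for s in range(0, last_start + 1, hop)]


def window_frames(frames):
    """Apply a Hamming window to every frame."""
    if not frames:
        return []
    w = hamming(len(frames[0]))
    return [[x * wn for x, wn in zip(frame, w)] for frame in frames]


# Toy signal: the sample value equals its index, so overlaps are easy to see.
signal = [float(n) for n in range(10)]
frames = split_into_frames(signal, frame_length=4, hop=2)  # 50% overlap
windowed = window_frames(frames)
for frame in frames:
    print(frame)
```

With a hop of half the frame length, the samples at the de-emphasised end of one frame fall in the middle of the next frame, so every part of the signal is given close to full weight in at least one window.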