Speech Processing: LPC Exercise in MATLAB

MATLAB Exercises

First, take a simple signal (e.g., one period of a sinusoid at some frequency) and plot its autocorrelation sequence for appropriate values of l. You may wish to use the xcorr MATLAB function to compare with your own version of this function. At what time shift l is r_ss[l] maximized and why? Is there any symmetry in r_ss[l]? What does r_ss[l] look like for periodic signals?
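As a starting point, the following is a minimal sketch of a direct autocorrelation computation, assuming the unnormalized definition r_ss[l] = sum over n of s[n] s[n+l], taken over the samples where both terms exist. The period length N and the variable names are illustrative choices, not part of the exercise.

```matlab
% One period of a sinusoid: N samples spanning exactly one cycle (N is assumed).
N = 64;
n = 0:N-1;
s = sin(2*pi*n/N);

% Direct, unnormalized autocorrelation for lags l = -(N-1) ... (N-1).
lags = -(N-1):(N-1);
r = zeros(size(lags));
for idx = 1:numel(lags)
    l = lags(idx);
    nn = max(1, 1-l):min(N, N-l);      % 1-based indices where s(nn) and s(nn+l) both exist
    r(idx) = sum(s(nn) .* s(nn + l));
end

stem(lags, r);
xlabel('lag l'); ylabel('r_{ss}[l]');
```

If the Signal Processing Toolbox is installed, xcorr(s) should return the same values over the same range of lags, which gives a quick check of the loop.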

Next, write your own version of the Levinson-Durbin algorithm in MATLAB. Note that MATLAB indexes arrays from 1 rather than 0. One way to work around this is to start the loop with i=2 and shift the variables k, E, α, and r_ss so that they start at i=1 and j=1. Be careful with indices such as i−j, since these could still evaluate to 0.
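One possible way to handle the indexing is sketched below. It is only one arrangement among several: here only r_ss is shifted (r(l+1) holds r_ss[l]), E is kept as a scalar, and the coefficients keep their natural indices. The autocorrelation values are made-up test data chosen so that the answer is easy to check by hand, and the sign convention assumes the predictor s[n] ≈ sum over j of a(j) s[n−j].

```matlab
% Assumed test data: r_ss[l] = 0.5^l, so the exact predictor is a = [0.5; 0; 0].
r = [1; 0.5; 0.25; 0.125];     % r(l+1) holds r_ss[l], l = 0..P
P = 3;

a = zeros(P, 1);               % predictor coefficients alpha_1..alpha_P
k = zeros(P, 1);               % reflection coefficients
E = r(1);                      % E_0 = r_ss[0]
for i = 1:P
    % k_i = (r_ss[i] - sum_{j=1}^{i-1} alpha_j r_ss[i-j]) / E_{i-1}
    acc = r(i+1);
    for j = 1:i-1
        acc = acc - a(j) * r(i-j+1);
    end
    k(i) = acc / E;

    % Update the coefficients from a copy of the order-(i-1) values.
    aPrev = a;
    a(i) = k(i);
    for j = 1:i-1
        a(j) = aPrev(j) - k(i) * aPrev(i-j);
    end

    E = (1 - k(i)^2) * E;      % prediction error energy at order i
end

% Cross-check: solve the normal equations directly (base MATLAB only).
aRef = toeplitz(r(1:P)) \ r(2:P+1);
disp([a, aRef]);
```

When comparing against the toolbox levinson or lpc functions, keep in mind that they return the polynomial [1, A(2), ..., A(P+1)], whose coefficients have the opposite sign to the a(j) used in this sketch.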

Apply your algorithm to a 20-30 ms segment of a speech signal. Use a microphone to record .wav audio files on the PC using Sound Recorder or a similar application. A sample rate of 8 kHz is typically a good choice for voice signals, which are approximately bandlimited to 4 kHz. You will use these audio files to test algorithms in MATLAB. The functions wavread, wavwrite, and sound will help you read, write, and play audio files in MATLAB (recent MATLAB releases replace wavread and wavwrite with audioread and audiowrite).
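The sketch below shows one way to cut out a segment and compute the autocorrelation values needed by the Levinson-Durbin recursion. The file name, the segment start, and the 25 ms length are assumptions for illustration; a synthetic tone stands in for the recording so that the code runs without an audio file.

```matlab
fs = 8000;                            % sample rate in Hz, as suggested above
% [x, fs] = audioread('speech.wav');  % hypothetical file name (wavread in older releases)
x = sin(2*pi*150*(0:fs-1)'/fs);       % placeholder signal so the sketch runs without a file

frameMs  = 25;                               % within the 20-30 ms range
frameLen = round(frameMs * 1e-3 * fs);       % 200 samples at 8 kHz
first    = 2001;                             % assumed start of the segment
seg      = x(first : first + frameLen - 1);

P = 10;                               % prediction order
r = zeros(P+1, 1);                    % r(l+1) holds r_ss[l]
for l = 0:P
    r(l+1) = sum(seg(1:frameLen-l) .* seg(1+l:frameLen));
end
```

The vector r can then be passed to your own recursion, and sound(seg, fs) plays the segment back as a sanity check.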

The output of the algorithm is the set of prediction coefficients a_k (about P=10 coefficients is usually sufficient), which together represent a speech segment containing significantly more samples. The LPC coefficients are thus a compressed representation of the original speech segment, and we take advantage of this by saving or transmitting the LPC coefficients instead of the speech samples. Compare the coefficients generated by your function with those generated by the levinson or lpc functions available in the MATLAB toolbox. Next, plot the frequency response of the IIR model represented by the LPC coefficients (see Speech Processing: Theory of LPC Analysis and Synthesis). What is the fundamental frequency of the speech segment? Is there any similarity in the prediction coefficients for different 20-30 ms segments of the same vowel sound? How could the prediction coefficients be used for recognition?
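The frequency response of the all-pole model can be evaluated with an FFT, as in the sketch below. It assumes the model H(z) = G / (1 − sum over k of a_k z^(−k)); the two coefficients, the gain G, and the FFT length are hypothetical values chosen only so that the example runs, and should be replaced by the output of your own algorithm.

```matlab
fs   = 8000;                  % sample rate in Hz
a    = [1.2; -0.8];           % hypothetical predictor coefficients (stable two-pole example)
G    = 1;                     % model gain (assumed)
Nfft = 512;                   % number of frequency samples around the unit circle

A = fft([1; -a], Nfft);       % A(e^{jw}) = 1 - sum_k a_k e^{-jwk}, zero-padded to Nfft
H = G ./ A;                   % all-pole frequency response

half = 1:Nfft/2+1;            % keep 0 ... fs/2
f = (half - 1) * fs / Nfft;   % frequency axis in Hz
plot(f, 20*log10(abs(H(half))));
xlabel('frequency (Hz)'); ylabel('|H| (dB)');
```

Peaks in this curve correspond to the resonances captured by the model. If you use coefficients returned by lpc or levinson instead, pass that vector to fft directly, since it already contains the leading 1 and the sign change.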

Implementation

The sample rate on the 6-channel DSP boards is fixed at 44.1 kHz, so decimate by a factor of 5 to achieve the sample rate of 8.82 kHz, which is more appropriate for speech processing.
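Before writing the DSP code, the decimation step can be simulated in MATLAB. The sketch below assumes a windowed-sinc lowpass filter followed by keeping every fifth sample; the filter length, the cutoff, and the 1 kHz test tone are illustrative assumptions rather than values prescribed by the lab.

```matlab
fsIn  = 44100;                 % board sample rate in Hz
M     = 5;                     % decimation factor
fsOut = fsIn / M;              % 8820 Hz

% Windowed-sinc lowpass, cutoff slightly below the new Nyquist frequency.
L  = 61;                       % filter length (assumed)
m  = (0:L-1) - (L-1)/2;        % tap positions centered on zero
fc = 0.9 * 0.5 / M;            % cutoff in cycles per input sample
h  = 2*fc*ones(1, L);          % value of the sinc at m = 0
nz = (m ~= 0);
h(nz) = sin(2*pi*fc*m(nz)) ./ (pi*m(nz));
h  = h .* (0.54 - 0.46*cos(2*pi*(0:L-1)/(L-1)));   % Hamming window
h  = h / sum(h);               % unity gain at DC

x = sin(2*pi*1000*(0:fsIn-1)/fsIn);   % one second of a 1 kHz test tone
y = filter(h, 1, x);           % anti-aliasing filter at the input rate
y = y(1:M:end);                % keep every M-th sample
```

On the DSP only every fifth filter output needs to be computed, which reduces the work by roughly the decimation factor.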

Compute the autocorrelation or autocovariance coefficients of 256-sample blocks of input samples from a function generator for time shifts l={0,1,…,15} (i.e., for P=15) and display these on the oscilloscope with a trigger. (You may zero out the other 240 output samples to fill up the 256-sample block). For computing the autocorrelation, you will have to use memory to record the last 15 samples of the input due to the overlap between adjacent blocks. Compare the output on the oscilloscope with simulation results from MATLAB.
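A MATLAB simulation of the block processing, useful for the comparison with the oscilloscope, might look like the sketch below. It assumes that each output block holds r[0..15] followed by 240 zeros and that the last 15 input samples are carried over between blocks; the sinusoidal input and the variable names are stand-ins.

```matlab
B = 256;                               % block length
P = 15;                                % largest time shift
x = sin(2*pi*0.05*(0:4*B-1));          % stand-in for the function-generator input

hist = zeros(1, P);                    % last P samples of the previous block
nBlocks = numel(x) / B;
out = zeros(nBlocks, B);               % one output block per row, zeros beyond lag P
for b = 1:nBlocks
    blk = x((b-1)*B + (1:B));
    ext = [hist, blk];                 % ext(P+n) equals blk(n)
    for l = 0:P
        % sum over n = 1..B of blk(n) * x(n-l), reaching into the saved history
        out(b, l+1) = sum(blk .* ext(P+1-l : P+B-l));
    end
    hist = blk(B-P+1 : B);             % remember the tail for the next block
end
```

Each row of out corresponds to one 256-sample frame as it would appear on the oscilloscope.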

The next step is to use a speech signal as the input to your system. Use a microphone as input to the original thru6.asm code and adjust the gains in your system until the output uses most of the dynamic range of the system without saturating. Now, to capture and analyze a small segment of speech, write code that determines the start of a speech signal in the microphone input, records a few seconds of speech, and computes the autocorrelation or autocovariance coefficients. The start of a speech signal can be determined by comparing the input to some noise threshold; experiment to find a good value. For recording large segments of speech, you may need to use external memory. Refer to Core File: Accessing External Memory on TI TMS320C54x for more information.
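The thresholding idea can be tried out in MATLAB before it is moved to the DSP. In the sketch below the signal is synthetic (low-level noise followed by a 200 Hz tone standing in for speech), and the threshold and recording length are assumed values that would need to be found by experiment on the real input.

```matlab
fs = 8820;                                   % sample rate after decimation
quiet  = 0.02 * (rand(1, fs) - 0.5);         % one second of low-level noise, |x| <= 0.01
voiced = 0.5 * sin(2*pi*200*(0:fs-1)/fs) ... % stand-in for speech
         + 0.02 * (rand(1, fs) - 0.5);
x = [quiet, voiced];

thresh   = 0.1;                              % assumed noise threshold
startIdx = find(abs(x) > thresh, 1, 'first');   % first sample above the threshold

recLen = round(0.5 * fs);                    % record half a second (assumed)
stopIdx = min(numel(x), startIdx + recLen - 1);
rec = x(startIdx : stopIdx);
```

A single noisy sample can trigger this detector, so comparing a short-term average of abs(x) against the threshold is a common, more robust variant.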

Finally, integrate your code that computes the autocorrelation or autocovariance coefficients with the code that takes speech input, and compare the results seen on the oscilloscope to those generated by MATLAB.

Integer division (optional)

In order to implement the Levinson-Durbin algorithm, you will need to use integer division to do Step 1 of the algorithm. Refer to the Applications Guide [link] and the subc instruction for a routine that performs integer division.
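To get a feel for how such a routine works, the sketch below simulates a 16-step shift-and-subtract (conditional subtract) division in MATLAB. It reflects the general idea behind a routine of this kind for positive 16-bit operands; the exact behavior of subc and the handling of signs should be taken from the Applications Guide, and the operand values here are arbitrary examples.

```matlab
num = 12345;                  % example numerator, 0 <= num < 2^15 (assumed)
den = 67;                     % example denominator, 0 < den < 2^15 (assumed)

acc = num;                    % accumulator starts with the numerator in its low bits
for step = 1:16
    t = acc - den * 2^15;     % try subtracting the denominator aligned at bit 15
    if t >= 0
        acc = 2*t + 1;        % subtraction succeeded: shift in a quotient bit of 1
    else
        acc = 2*acc;          % subtraction failed: shift in a quotient bit of 0
    end
end

quotient  = mod(acc, 2^16);       % low 16 bits
remainder = floor(acc / 2^16);    % high bits
fprintf('%d / %d = %d remainder %d\n', num, den, quotient, remainder);
```

Comparing quotient and remainder with floor(num/den) and mod(num, den) for a range of operands is a simple way to check the fixed-point routine against MATLAB.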