Our goal is to create a dependable and robust system that can effectively identify an audio clip from a database of songs and return identification information such as the title.
We propose a system that stores “fingerprints” of audio files in a database. When a match request is initiated, the system takes a fingerprint of the file to be searched for and returns song information based on fingerprint comparisons.
To add a song to the database, the full-length high-quality (44.1 kHz, 16-bit) audio file is sent to the audio fingerprint generator (AFG). The output of the AFG is then stored in an array in the database.
To identify an audio clip, the audio clip file (any quality) is sent to the AFG and the resulting fingerprint is compared against every fingerprint in the database using the match recognition system (MRS).
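The comparison against every stored fingerprint can be sketched as a linear scan. This is only an illustration: the text does not describe the internals of the AFG or the MRS, so the fingerprint format (a short list of numbers), the cosine similarity measure, and the names similarity and identify are all assumptions made for the example.

```python
import math

def similarity(a, b):
    """Cosine similarity between two equal-length fingerprint vectors (assumed format)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def identify(clip_fp, database):
    """Compare clip_fp against every stored fingerprint; return (title, score) of the best."""
    best_title, best_score = None, -1.0
    for title, fp in database.items():
        score = similarity(clip_fp, fp)
        if score > best_score:
            best_title, best_score = title, score
    return best_title, best_score

# Hypothetical database: song title -> stored fingerprint.
database = {
    "Song A": [0.9, 0.1, 0.4, 0.7],
    "Song B": [0.2, 0.8, 0.6, 0.1],
}
title, score = identify([0.85, 0.15, 0.35, 0.75], database)
```

Because every stored fingerprint is visited once, the lookup time grows linearly with the number of songs in the database.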
[Figure 1]
[Figure: Full Correlation]
Now that we have tested both the time and frequency domains, we will look into ways the process can be optimized. The first method is similar to the FFT Method already discussed. However, to improve the robustness of the process, a Rotation Matrix can be added to the mix. The second process involves non-signal processing procedures and classifying images according to their physical properties.
Straightforward Matching through Signal Processing
The first method to improve the image matching process is the "FFT Method" already discussed, with one major addition: a Rotation Matrix. The basic FFT method accounts only for horizontal and vertical shifts between the two images. However, a scanned image will, more often than not, be oriented at some angle from the normal, so rotation has to be handled separately.
Improved Correlation
Using the library of Matlab files at our disposal, the scanned image is first "passed through" the rotation matrix to get an array of matrices. Each matrix in this array corresponds to a different angular orientation. (The range of angles can be set by the user.) Each matrix in the array is then compared to the database image matrix. As in the FFT Method, the maximum value of each result is taken as the point of highest correlation. The matrix with the largest correlation gives the angle that most closely matches the database image. The process follows these steps:
FFT Method w/ Rotation Matrix
- Pass the scanned image through a rotation matrix over a user-set range of angles
- Place each resultant orientation matrix into an array
- Perform the FFT Method on each matrix in the array, against the database image
- Identify the maximum value (highest correlation) and the matrix (orientation) that produced it; this is the candidate "match."
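The steps above can be sketched as a search loop over candidate angles. This is a minimal sketch under stated assumptions, not the authors' Matlab code: the function names, the nearest-neighbor rotation, and the tiny 4x4 test image are all hypothetical, and the matcher is passed in as score_fn. The demo uses circular_peak, a slow direct circular correlation that stands in for the FFT Method here only to keep the example short.

```python
import math

def rotate(image, degrees):
    """Rotate a square image about its center using nearest-neighbor sampling."""
    n = len(image)
    c = (n - 1) / 2.0
    cos_t, sin_t = math.cos(math.radians(degrees)), math.sin(math.radians(degrees))
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Map each output pixel back to the source pixel it comes from.
            y, x = i - c, j - c
            si = round(c + cos_t * y + sin_t * x)
            sj = round(c - sin_t * y + cos_t * x)
            if 0 <= si < n and 0 <= sj < n:
                out[i][j] = image[si][sj]
    return out

def normalize(m):
    """Divide a matrix by its Frobenius norm (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(v * v for row in m for v in row))
    return [[v / norm for v in row] for row in m] if norm else m

def circular_peak(a, b):
    """Stand-in matcher: peak of the circular cross-correlation, computed directly."""
    n, m = len(a), len(a[0])
    return max(
        sum(a[i][j] * b[(i - dy) % n][(j - dx) % m] for i in range(n) for j in range(m))
        for dy in range(n) for dx in range(m)
    )

def best_orientation(scan, template, angles, score_fn):
    """Score every rotated copy of scan against template; return (angle, score) of the best."""
    candidates = [(angle, rotate(scan, angle)) for angle in angles]  # array of orientations
    scores = [(score_fn(normalize(img), normalize(template)), angle)
              for angle, img in candidates]
    score, angle = max(scores)
    return angle, score

# Hypothetical database image (an L shape) and a scan of it turned by 90 degrees.
template = [[1, 0, 0, 0], [1, 0, 0, 0], [1, 1, 0, 0], [0, 0, 0, 0]]
scan = rotate(template, 90)
angle, score = best_orientation(scan, template, [0, 90, 180, 270], circular_peak)
```

The loop runs the matcher once per candidate angle, which is why the running time grows with the number of angle increments.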
Properties of the Rotation + FFT Method
- Advantages: Much more robust and more likely to find a match than the regular FFT method; takes less time than the Spatial Method.
- Disadvantages: Slower than the basic FFT method. The more angle increments in the range, the slower it gets, so with enough increments it can become slower than the spatial method.
Classification by Physical Properties
The second form of optimizing classification is a step in the opposite direction from ordinary signal processing. Whereas all the previous methods dealt with finding the correlation between two images, this type of classification is a minutiae-based approach. It separates fingerprints into different types, as seen below.
[Figure: Fingerprint Types]
[Figure: Properties of a Fingerprint]
[Figure: Minutiae-Based Matching]
CAVEAT:
The database must be large for this fusion of methods to produce faster results.
Properties of Minutiae-Based Method
- Advantages: Given the resources, this method is the best, with the highest matching capability; when used in conjunction with Fourier methods, it can speed up the identification process.
- Disadvantages: Depends heavily on the image quality and size of the scanned fingerprints; can be expensive in both processor time and money; the database must be large for this process to be effective.
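One plausible reading of the fusion of methods is to use the fingerprint type as a pre-filter, so that the correlation step runs only against database entries of the same type. The sketch below assumes that reading. The type labels ("loop", "whorl"), the record layout, the names, and the simple dot-product matcher are all hypothetical and are not taken from the text.

```python
def dot(a, b):
    """Hypothetical stand-in for the correlation score between two images."""
    return sum(x * y for x, y in zip(a, b))

def identify(scan, scan_type, database, score_fn):
    """Match scan only against database entries of the same fingerprint type."""
    candidates = [entry for entry in database if entry["type"] == scan_type]
    if not candidates:
        return None, 0
    best = max(candidates, key=lambda entry: score_fn(scan, entry["image"]))
    # Also report how many comparisons were actually performed.
    return best["name"], len(candidates)

# Hypothetical database with flattened toy images.
database = [
    {"name": "Alice", "type": "loop", "image": [1, 0, 1, 1]},
    {"name": "Bob", "type": "whorl", "image": [1, 0, 1, 1]},
    {"name": "Carol", "type": "loop", "image": [0, 1, 0, 0]},
]
name, compared = identify([1, 0, 1, 1], "loop", database, dot)
```

The saving comes from the entries that are skipped, which is consistent with the caveat that the database must be large before the pre-filter pays off.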
Additional Optimization
Additional optimization schemes have improved distinct performance features of our biometric authentication process:
- The frequency domain is the processing domain of choice because of its lower computational complexity and therefore greater computation speed.
- The dynamic rotation matrix handles orientation shifts rather than vertical and horizontal shifts. This feature was necessary because 2D convolution is not robust to orientation shifts in fingerprint placement. The method is more expensive, with complexity N*O(frequency domain), where O is the computational complexity of one frequency-domain match and N is the number of orientations tested, but it is more robust and ensures that carelessness by the end user does not cause unwanted errors.
- Noise is simulated with a point spread function using MATLAB motion blur.
- Deconvolution of the image and the simulated noise is then performed to obtain a “deblurred” image.
For the final page in this series of modules, continue to the module "You Are Cleared for Access..."
[Figure: Thumbprint Spectrum]
Optimization Through Signal Processing
Historically, there has always been a need for effective authentication solutions. Biometrics has been proven to be an effective answer to this problem because of the uniqueness that biometric keys possess. Therefore, the challenge now is to develop effective biometric solutions optimized to certain constraints depending on the needs and resources of the end user. Some will prefer solutions with real-time results while others will prefer solutions that are nearly error-free. Our research focuses on using the fundamentals of digital signal processing to bring about the optimized solutions.
Two Equivalent Processes, Two Different Domains
Fingerprint matching is a process that involves scanning an image, then testing that image against a database of known fingerprints to find out whether a match has occurred. This process of finding a "correlation" between two pictures can be executed in one of two ways using signal processing: in the Time/Spatial Domain or in the Frequency Domain. In addition to the written analysis, be sure to check out the Matlab .m files on the sidebar.
Time/Spatial Domain
The Time/Spatial domain process of matching uses 2D Convolution to find a correlation between two image matrices. The Matlab command "conv2" is our workhorse here. Once the two images are convolved, the maximum value in the resulting matrix is obtained using the command "max."
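As a rough illustration, the same two operations can be written out in plain Python. This is a sketch, not the authors' .m file: conv2 here is a hand-written full 2D convolution named after the Matlab command, and the 3x3 test images are hypothetical. The sketch also flips the template by 180 degrees before convolving so that the convolution behaves as a correlation (a matched filter); the text does not say whether the original code did this, so treat it as an assumption.

```python
def conv2(a, b):
    """Full 2D convolution of two matrices given as lists of rows."""
    ra, ca, rb, cb = len(a), len(a[0]), len(b), len(b[0])
    out = [[0.0] * (ca + cb - 1) for _ in range(ra + rb - 1)]
    for i in range(ra):
        for j in range(ca):
            for p in range(rb):
                for q in range(cb):
                    out[i + p][j + q] += a[i][j] * b[p][q]
    return out

def flip(m):
    """Rotate a matrix by 180 degrees so that convolution acts as correlation."""
    return [row[::-1] for row in m[::-1]]

def spatial_score(scan, template):
    """Peak of the correlation surface between scan and template."""
    surface = conv2(scan, flip(template))
    return max(max(row) for row in surface)

# Hypothetical 3x3 images: a plus shape and four corner dots.
template = [[0, 1, 0], [1, 1, 1], [0, 1, 0]]
other = [[1, 0, 1], [0, 0, 0], [1, 0, 1]]
same_score = spatial_score(template, template)
other_score = spatial_score(other, template)
```

The four nested loops make the cost visible: every pixel of one image is multiplied with every pixel of the other, which is why this method is slow for large images.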
Properties of the Spatial Convolution Method
- Advantages: Known Method that works, fairly robust
- Disadvantages: Calculations take up many clock cycles, i.e., the method is fairly expensive in time
Frequency Domain
Just like in the spatial domain, in the frequency domain we use a Matched Filter to determine the correlation between two images. The frequency domain counterpart to 2D convolution is multiplication of the Fourier transforms. That is, instead of spending a great many clock cycles convolving two large images (256x256), we can simply perform the FFT on the two image matrices and then multiply the results. To get good results, it is also necessary to normalize the matrices using Matlab's "norm" command. More specifically, we used the Frobenius Norm, also called the Euclidean Norm. Here is the process in full:
Frequency Domain Matching
- Normalize both images by their Frobenius Norm
- Perform the FFT on both images
- Multiply the resultant matrices together
- As in the spatial domain, find the maximum value, i.e., the highest correlation
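The steps above can be sketched in plain Python as follows. This is a minimal sketch under stated assumptions rather than the authors' Matlab code: fft and fft2 are small hand-written radix-2 routines standing in for Matlab's FFT commands (image sides must be powers of two), and the 4x4 test images are hypothetical. Two details go beyond the listed steps and are assumptions: the product uses the complex conjugate of the template's transform so that it yields a correlation, and an inverse FFT is applied before the maximum is taken.

```python
import cmath
import math

def fft(x, inverse=False):
    """Recursive radix-2 FFT (unscaled); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even, odd = fft(x[0::2], inverse), fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k], out[k + n // 2] = even[k] + t, even[k] - t
    return out

def fft2(m, inverse=False):
    """2D FFT: transform the rows, then the columns."""
    rows = [fft(r, inverse) for r in m]
    cols = [fft(list(c), inverse) for c in zip(*rows)]
    out = [list(r) for r in zip(*cols)]
    if inverse:
        scale = len(m) * len(m[0])
        out = [[v / scale for v in r] for r in out]
    return out

def normalize(m):
    """Divide a nonzero matrix by its Frobenius norm."""
    norm = math.sqrt(sum(v * v for r in m for v in r))
    return [[v / norm for v in r] for r in m]

def frequency_score(scan, template):
    """Peak of the circular cross-correlation, computed through the FFT."""
    a, b = fft2(normalize(scan)), fft2(normalize(template))
    product = [[x * y.conjugate() for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]
    surface = fft2(product, inverse=True)
    return max(v.real for r in surface for v in r)

# Hypothetical 4x4 images: a plus shape, the same shape shifted, and corner dots.
template = [[0, 1, 0, 0], [1, 1, 1, 0], [0, 1, 0, 0], [0, 0, 0, 0]]
shifted = [[0, 0, 0, 0], [0, 0, 1, 0], [0, 1, 1, 1], [0, 0, 1, 0]]
other = [[1, 0, 1, 0], [0, 0, 0, 0], [1, 0, 1, 0], [0, 0, 0, 0]]
shifted_score = frequency_score(shifted, template)
other_score = frequency_score(other, template)
```

With the normalization in place, a shifted copy of the template scores close to 1 while a different image scores lower, so the scores of different database images can be compared directly.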
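Since multiplication in the Fourier domain is equivalent to convolution in the time/spatial domain, we choose Fourier analysis for the sake of pure speed and shorter wait time. For two n x n images, direct 2D convolution takes on the order of n^4 multiplications, while the FFT route takes on the order of n^2 log n. In Matlab, the "FFT" command is also well optimized for fast calculation, and the difference is noticeable.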
Properties of the Fourier Method
- Advantages: Same as the spatial domain method, only much faster
- Disadvantages: Doesn't take into account the angular orientation of the fingerprint; handles only horizontal and vertical shifts.
Now that we have looked at the two different domains and tested their equivalence, we move on to further optimization of this classification process. Go to the module, "Classification of Images".