Muffler Algorithm Guide

This document contains a brief guide to speech enhancement algorithms.

Introduction to Speech Enhancement

The goal of speech is to convey a message from one person to another. This transmission can be described in two ways: as the communication of raw information, or as the acoustic waveform that carries that information. The acoustic waveform is generally known as a signal.

In short-range communication (e.g. two people talking unaided), transmitting information via speech is generally reliable: the message is successfully transferred from the speaker to the listener. In long-range communication, which relies on a third-party system to carry the information, transmitting information via speech can be less reliable. The usual cause is noise. Noise is the part of the acoustic waveform that (1) is unintentionally created and (2) does not pertain to the message information.

Noise can be created by two main sources. The first source is ambient noise, such as the background noise (vehicles, unrelated human conversations, etc.) in an urban setting. The second source is channel noise. Channel noise is divided into two categories based on the medium through which the sound signal is transmitted. In a medium like water, air, or a vacuum, the channel noise is classified as softwire noise. Some examples of softwire noise are reverberations, 'rippling' noises, and sound distortions caused by high pressure. If the sound signal is transmitted through a medium like coaxial cables, telephone lines, or satellite cables, the channel noise is classified as hardwire noise. Natural electric disturbances (lightning), satellite transponders, and static are all examples of hardwire noise.
As one can see, channel noise is wholly dependent on its medium of signal transfer. Beyond these two main sources, a third source of noise is frequency-dependent noise.

The goal of sound enhancement is to remove noise from a particular signal. Speech enhancement is a specific subfield of sound enhancement with a threefold aim: (1) remove extraneous noise, (2) improve the quality of the signal, and (3) improve the intelligibility of the signal. Quality can be defined as the pleasantness of the sound signal. It is important in speech enhancement because higher quality decreases listener fatigue and aids intelligibility. Intelligibility can be defined as the percentage of raw information successfully transmitted from the speaker to the listener.

Algorithms used in Speech Enhancement

There are five substantial algorithms currently used in speech enhancement. This section gives a brief description of each algorithm.

Lowpass / Highpass

A lowpass filter is a linear time-invariant (LTI) method that blocks high frequencies, i.e. those above a chosen cutoff. A highpass filter is an LTI method that blocks low frequencies, i.e. those below a chosen cutoff.

Bandpass / Bandstop

A bandpass filter is also an LTI method; it blocks all frequencies outside a certain range. A bandstop filter is an LTI method that blocks all frequencies inside a certain range.

Spectral Subtraction

Spectral subtraction works by taking the power spectral density of the signal when the speaker is talking (clean signal + noise) and when the speaker is not talking (noise only), then subtracting the second from the first. The difference is an estimate of the power spectral density of the clean signal.

Speech Enhancement via SSA

The signal subspace algorithm (SSA) is an altered version of the spectral subtraction method that has proved to be extremely effective in speech processing situations. Below is an illustration of a waveform shown before enhancement and after enhancement.
[Figure: waveform before enhancement and after enhancement]
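
The spectral subtraction idea described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method used by any particular system: the function names (`dft`, `idft`, `power_spectrum`, `spectral_subtract`) and the `floor` parameter are hypothetical, the noise power spectrum is assumed to come from a frame in which the speaker is silent, and the phase of the noisy frame is reused unchanged. A naive O(n^2) DFT is used only to keep the sketch free of third-party libraries; practical code would normally use an FFT routine and process overlapping windowed frames.

```python
import cmath
import math


def dft(frame):
    """Naive discrete Fourier transform; adequate for short demo frames."""
    n = len(frame)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse of dft(); keeps only the real part of each output sample."""
    n = len(spectrum)
    return [
        sum(c * cmath.exp(2j * math.pi * k * t / n) for k, c in enumerate(spectrum)).real / n
        for t in range(n)
    ]


def power_spectrum(frame):
    """Squared magnitude of each DFT bin."""
    return [abs(c) ** 2 for c in dft(frame)]


def spectral_subtract(noisy_frame, noise_power, floor=0.0):
    """Subtract an estimated noise power spectrum from one noisy frame.

    noise_power is assumed to be measured while the speaker is not talking.
    Negative differences are clamped to `floor` (an illustrative parameter).
    """
    cleaned = []
    for coeff, n_pow in zip(dft(noisy_frame), noise_power):
        clean_pow = max(abs(coeff) ** 2 - n_pow, floor)
        # Rebuild the bin from the reduced magnitude and the original noisy phase.
        cleaned.append(cmath.rect(math.sqrt(clean_pow), cmath.phase(coeff)))
    return idft(cleaned)


# Synthetic demo: a low tone stands in for speech, a higher tone for noise.
n = 32
speech = [math.sin(2 * math.pi * 2 * t / n) for t in range(n)]
noise = [0.5 * math.sin(2 * math.pi * 10 * t / n) for t in range(n)]
noisy = [s + v for s, v in zip(speech, noise)]

noise_power = power_spectrum(noise)  # estimated from a speech-free frame
enhanced = spectral_subtract(noisy, noise_power)

print("max error before:", max(abs(a - b) for a, b in zip(noisy, speech)))
print("max error after: ", max(abs(a - b) for a, b in zip(enhanced, speech)))
```

The demo is deliberately idealized: the two tones occupy different frequency bins and the noise estimate is exact, so the subtraction recovers the speech tone almost perfectly. With real recordings the noise estimate is only approximate, and clamping negative values tends to leave residual artifacts.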
