Abstract: This paper presents several hardware-related issues that should be considered when designing high-performance speech recognition products using the RSC-3x: noise reduction, circuit design, PCB design, microphone selection, microphone placement, and power supply design.
Keywords : RSC-3x, recognition rate, noise reduction, circuit design, PCB design, microphone, power supply
The RSC-3x is an interactive speech product from Sensory Corporation of the United States. Like the other RSC series products, it uses a neural network algorithm to perform speech recognition. Under ideal conditions, its recognition rate can exceed 97%. It also provides speech processing functions such as speech synthesis, record and playback, and four-channel music synthesis. Because it is built around an 8-bit processor, the RSC-3x can also handle general-purpose system control. Its high performance and moderate price make it best suited to consumer electronics and price-sensitive home appliances.
How, then, can a good speech recognition product be developed with the RSC-3x series? This paper presents several hardware issues that should be considered when designing such products.
First, noise reduction
The accuracy of speech recognition (referred to here as the recognition rate) can be reduced by many factors. One of the most common is noise: electronic noise from inside the system and acoustic noise picked up by the microphone. One of the main innovations of the RSC-3x is an on-chip audio preamplifier. The voltage signal from a typical electret microphone is only a few millivolts, and it must be amplified by a total gain of more than 200 before the RSC-3x can use it. With the built-in preamplifier, this amplification can be achieved with just a few passive components. Good grounding and the elimination of crosstalk in the analog circuits further help to ensure a good recognition rate. Encouraging users to speak loudly and close to the microphone also helps to achieve a good signal-to-noise ratio.
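To put the gain figure in perspective, the following minimal sketch converts a linear voltage gain of 200 to decibels and scales a few input amplitudes by it. The input amplitudes used here (1 mV to 5 mV) are illustrative assumptions, not measured values for any particular microphone, and the function names are hypothetical.

```python
import math

PREAMP_GAIN = 200.0  # minimum total voltage gain stated in the text


def gain_db(voltage_gain: float) -> float:
    """Convert a linear voltage gain to decibels (20 * log10)."""
    return 20.0 * math.log10(voltage_gain)


def amplified_mv(input_mv: float, voltage_gain: float) -> float:
    """Return the amplified amplitude in millivolts for an ideal linear stage."""
    return input_mv * voltage_gain


if __name__ == "__main__":
    print(f"Gain of {PREAMP_GAIN:.0f}x is {gain_db(PREAMP_GAIN):.1f} dB")
    for mic_mv in (1.0, 2.0, 5.0):  # assumed microphone amplitudes
        out_mv = amplified_mv(mic_mv, PREAMP_GAIN)
        print(f"{mic_mv:.0f} mV in -> {out_mv:.0f} mV out")
```

A gain of 200 corresponds to roughly 46 dB, which is why any noise coupled into the microphone input is so damaging: it is amplified by the same factor as the speech signal.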
Second, circuit design
Figure 1 shows the reference circuit of the RSC-3x audio preamplifier. The microphone resistor (Rx) has a large influence on the system gain, so its value should be chosen according to the sensitivity of the microphone. The 1.5K shown in the figure is a typical value.
The recommended values for Rx and Cx are listed in the table below:
Rx | Cx |
1K | 0.01uF |
1.5K | 0.0068uF |
2.2K | 0.0047uF |
2.7K | 0.0033uF |
3.9K | 0.0027uF |
4.7K | 0.0022uF |
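One way to sanity-check a value pair against the table is to look at the product Rx × Cx. The sketch below only does arithmetic on the table values; the products come out between roughly 9 and 10.5 microseconds. Reading that product as a time constant of the preamplifier input network is an assumption, since Figure 1 is not reproduced here, and the helper names are hypothetical.

```python
# (Rx in ohms, Cx in microfarads), copied from the table above
RX_CX_TABLE = [
    (1000, 0.01),
    (1500, 0.0068),
    (2200, 0.0047),
    (2700, 0.0033),
    (3900, 0.0027),
    (4700, 0.0022),
]


def rc_product_us(rx_ohms: float, cx_uf: float) -> float:
    """Return Rx * Cx in microseconds (ohms times microfarads gives microseconds)."""
    return rx_ohms * cx_uf


def recommended_cx_uf(rx_ohms: int) -> float:
    """Look up the table's Cx for an Rx listed in the table; KeyError otherwise."""
    return dict(RX_CX_TABLE)[rx_ohms]


if __name__ == "__main__":
    for rx, cx in RX_CX_TABLE:
        print(f"Rx = {rx} ohm, Cx = {cx} uF, Rx*Cx = {rc_product_us(rx, cx):.2f} us")
```

When Rx is changed to suit a different microphone sensitivity, the matching Cx should be taken from the same table row rather than left at the value used with 1.5K.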
Third, PCB design
A double-sided PCB with a ground plane is recommended. The ground plane should cover the entire analog circuit area and be connected to ground at only one point, near the RSC-3x. To reduce crosstalk, the analog ground and digital ground should be physically separated as much as possible. It is important to keep high-speed digital lines (such as clock, address, and data lines) away from the microphone components and circuits.
Each digital IC must have a 0.1uF bypass capacitor placed next to its VDD pin, and each VDD/VSS pin pair of the RSC chip must have a bypass capacitor connected across it. The bypass capacitors should be ceramic capacitors rated at 50V. If a 3-terminal regulator (such as the 7805) is used, connect bypass capacitors from its input and output pins to ground, close to the regulator.
In battery-powered products, connect a diode in series with the battery to protect the circuit if the battery is inserted backwards.
If other modules in the product use a digital clock (such as a switching power supply or an LCD driver), take special care to prevent these signals from coupling into the audio circuit of the RSC.
Fourth, microphone selection
For most products, an inexpensive omnidirectional electret condenser microphone (with a minimum sensitivity of -60dB) is sufficient. In some applications, a directional microphone may be more suitable, namely when the speech signal and the acoustic noise arrive from different directions. Because the frequency response of a directional microphone depends on the distance between the microphone and the sound source, such a microphone should be used with caution. For best performance, the speech recognition product should be used in a quiet environment, with the speaker's mouth very close to the microphone. If the product is intended for noisy environments, the design should take the ambient noise into account. Any increase in the signal-to-noise ratio will contribute to the product's success.
Fifth, microphone placement
It is important to design a proper mounting method for the microphone and to select a microphone with consistent performance. Poor acoustic placement of the microphone reduces the recognition rate of the RSC-3x. There are many possible ways to mount the microphone element, but some perform better than others. Sensory recommends the following microphone placement guidelines:
First: The microphone element should sit as close as possible to the housing and should be completely inside the plastic housing. There should be no gap between the microphone element and the housing. Any gap produces echoes, which reduce the recognition rate.
Second: The area in front of the microphone element should be kept clear and free of dirt so that nothing interferes with recognition. The housing should have an opening of at least 5 mm in diameter in front of the microphone. If a plastic surface must be placed in front of the microphone, it should be as thin as possible, preferably no more than 0.7mm.
Third: If possible, the microphone should be acoustically isolated from the housing. The microphone can be wrapped in a resilient material such as rubber or foam. The purpose is to prevent acoustic noise generated by handling or shaking the product from being picked up by the microphone. This extraneous noise reduces the recognition rate.
If the microphone is moved from 15 cm to 30 cm from the speaker's mouth, the signal power falls to one quarter of its original value. The difference between a loud voice and a soft voice can also exceed a factor of four. The RSC-3x provides AGC (automatic gain control) to compensate for input signals that are too strong or too weak. The AGC acts on the microphone preamplifier. If the input falls outside the adjustment range of the AGC, the software should give the speaker spoken feedback, such as the prompt "please speak louder" or "please speak more softly".
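The distance figures above follow the inverse-square law: doubling the distance leaves one quarter of the signal power, a drop of about 6 dB. The sketch below works this out, assuming an idealized point source in free field with no reflections; a real room and product housing will deviate from this. The function names are hypothetical.

```python
import math


def relative_power(d_ref_cm: float, d_new_cm: float) -> float:
    """Power at d_new relative to power at d_ref (inverse-square law)."""
    if d_ref_cm <= 0 or d_new_cm <= 0:
        raise ValueError("distances must be positive")
    return (d_ref_cm / d_new_cm) ** 2


def level_change_db(d_ref_cm: float, d_new_cm: float) -> float:
    """Change in signal level, in dB, when moving from d_ref to d_new."""
    return 10.0 * math.log10(relative_power(d_ref_cm, d_new_cm))


if __name__ == "__main__":
    ratio = relative_power(15.0, 30.0)
    change = level_change_db(15.0, 30.0)
    print(f"15 cm -> 30 cm: power x {ratio:.2f}, level change {change:.1f} dB")
```

A further factor of four between a loud and a soft talker stacks on top of the distance effect, which is why the AGC range and the spoken prompts both matter.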
Sixth, power supply design
Since the RSC-3x draws approximately 10 mA while its speech recognition circuit is running, power supply design is particularly important. If the system listens continuously for a given vocabulary, it can drain a button cell in a few hours and alkaline batteries in a few days. Therefore, if the product requires the recognizer to run continuously, it should be powered from the mains. Conversely, if the product is battery powered, it should spend most of its time in a low-power "sleep" state and wake up only when recognition is needed. The RSC-3x can be woken by a button press, another I/O port event, or a timer countdown driven by oscillator 2, but it cannot be woken by the voice signal picked up by the microphone.
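The battery-life reasoning above can be sketched as a simple estimate. Only the 10 mA active current comes from the text; the battery capacities, the sleep current, and the duty cycle below are hypothetical placeholders that should be replaced with datasheet values. The estimate also ignores the loss of usable capacity at high drain, so real button cells will fare worse.

```python
ACTIVE_MA = 10.0  # recognition current stated in the text


def battery_life_hours(capacity_mah: float, avg_current_ma: float) -> float:
    """Idealized battery life: capacity divided by average current."""
    if avg_current_ma <= 0:
        raise ValueError("average current must be positive")
    return capacity_mah / avg_current_ma


def average_current_ma(active_ma: float, sleep_ma: float,
                       active_fraction: float) -> float:
    """Time-weighted average current for a sleep/wake duty cycle."""
    if not 0.0 <= active_fraction <= 1.0:
        raise ValueError("active_fraction must be between 0 and 1")
    return active_ma * active_fraction + sleep_ma * (1.0 - active_fraction)


if __name__ == "__main__":
    # Hypothetical capacities, for illustration only.
    for name, mah in (("button cell", 50.0), ("alkaline cell", 2000.0)):
        hours = battery_life_hours(mah, ACTIVE_MA)
        print(f"{name}: {hours:.0f} h ({hours / 24:.1f} days) of continuous listening")

    # Hypothetical sleep current (0.01 mA) and 1% active time.
    avg = average_current_ma(ACTIVE_MA, 0.01, 0.01)
    days = battery_life_hours(2000.0, avg) / 24
    print(f"duty-cycled average: {avg:.4f} mA, about {days:.0f} days")
```

Even with rough numbers, the comparison shows why a battery-powered product should keep the recognizer asleep and wake it with a button or timer instead of listening continuously.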
When the product is mains powered, the ripple between VDD and GND should not exceed 5mV. A voltage regulator (such as the 7805) should therefore be added to the power supply section to stabilize the voltage.
In summary, careful attention to the hardware design makes it possible to achieve a good signal-to-noise ratio and to develop a high-performance speech recognition product.