
…beginning to the end of the phoneme, and is not constant. The beginning of a "t" will produce different feature numbers than the end of a "t" [11].

The background noise and variability problems are solved by allowing a feature number to be used by more than just one phoneme. This can be done because a phoneme lasts for a relatively long time, 50 to 100 feature numbers, and it is likely that one or more sounds are predominant during that time. Hence, it is possible to predict what phoneme was spoken.

In the third step [11], a training tool is used to learn how a phoneme sounds. We use over a hundred samples of the phoneme. The tool analyzes these samples, extracts their feature numbers, and thus learns how likely a particular feature number is to appear in a specific phoneme. For example, for the phoneme "h", there might be a 55% chance of feature 52 appearing, a 30% chance of feature 189, and a 15% chance of feature 53.

The probability analysis done during training is used during recognition. Given a voice sample, the feature numbers corresponding to it are used to compute the probabilities of the various phonemes. The most probable phoneme is then chosen.
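As an illustration, the following MATLAB sketch picks the most probable phoneme for a run of observed feature numbers. The codebook size, the phoneme set, the observed sequence and all probability values other than the "h" example above are assumptions made for the example, not figures from the paper.

    phonemes = {'h', 'e', 'l'};                   % illustrative phoneme set
    P = ones(numel(phonemes), 256) * 1e-6;        % 256 feature numbers (assumed); small floor avoids log(0)
    P(1, [52 189 53]) = [0.55 0.30 0.15];         % "h" probabilities from the text
    P(2, [10 11 12])  = [0.50 0.30 0.20];         % assumed values for "e"
    P(3, [200 201])   = [0.60 0.40];              % assumed values for "l"

    observed = [52 52 189 53 52];                 % feature numbers from one voice sample
    logLik   = sum(log(P(:, observed)), 2);       % log-likelihood of each phoneme
    [~, best] = max(logLik);                      % pick the most probable phoneme
    fprintf('Most probable phoneme: %s\n', phonemes{best});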

It is important that the speech recognition system adapt to the user's voice and speaking style to improve accuracy. A word can be spoken with different pronunciations. However, after the user has spoken the word a number of times, the recognizer will have enough examples to determine which pronunciation the user used.

The communication between the software and the wheelchair platform is done through the serial port of the system.
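A minimal sketch of that serial link is shown below, assuming MATLAB's serialport interface; the port name, baud rate and single-byte commands are illustrative assumptions, since the text only states that a serial port is used.

    s = serialport("COM3", 9600);       % open the link to the control circuit (port and baud assumed)
    write(s, uint8('F'), "uint8");      % e.g. 'F' = drive forward (command byte assumed)
    write(s, uint8('S'), "uint8");      % e.g. 'S' = stop
    clear s                             % clearing the object closes the port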

V. OBSTACLE AVOIDANCE

The obstacle avoidance processing is done using MATLAB. Our obstacle detection system is purely based on the appearance of individual pixels captured by the camera. Any pixel that differs in appearance from the ground is classified as an obstacle. The method is based on three assumptions that are reasonable for a variety of indoor and outdoor environments [5]:
(i) Obstacles differ in appearance from the ground.
(ii) The ground is relatively flat.
(iii) There are no overhanging obstacles.

The first assumption enables us to distinguish obstacles from the ground, while the second and third assumptions enable us to estimate the distances between detected obstacles and the camera. The key feature of the algorithm used is global thresholding.

The algorithm consists of two stages, namely the vision stage and the decision stage. The vision stage deals with retrieving the image, and the decision stage deals with classifying the image and deciding how the wheelchair should respond. The vision stage includes a camera connected to the laptop; we have used a Logitech QuickCam Pro for Notebooks for this purpose. The camera feeds its output to the laptop, and the whole system is placed on the wheelchair. On the laptop, the MATLAB Image Processing Toolbox is used for processing the image. MATLAB processes the image and sends a signal to the control circuit through the serial port. The time interval between the vision and decision stages is the time taken to process the image, which is about 50 milliseconds. In an actual implementation, a microprocessor will be used instead of a laptop to do the processing. This was tested and works perfectly. The control circuit then drives the motors accordingly via H-bridges.

We have implemented the following algorithm (a code sketch of these steps follows the list):

1. Take input from the camera in real time. The size of the image is 320x240.
2. Convert the image to grayscale.
3. Adjust the contrast of the image.
4. Compute the global threshold of the image using Otsu's algorithm [12].
5. Convert the image to a binary image using the threshold obtained in step 4.
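The vision stage could be sketched in MATLAB as follows, assuming the webcam support package is used for frame acquisition; the camera index is an assumption.

    cam   = webcam(1);                      % Logitech camera (index assumed)
    frame = snapshot(cam);                  % step 1: capture a frame in real time
    frame = imresize(frame, [240 320]);     % keep the stated 320x240 size
    gray  = rgb2gray(frame);                % step 2: convert to grayscale
    gray  = imadjust(gray);                 % step 3: adjust the contrast
    level = graythresh(gray);               % step 4: Otsu's global threshold
    bw    = imbinarize(gray, level);        % step 5: binary image (1 = ground, 0 = discontinuity)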

The binary image is used to check for obstacles that may be present. The camera is tilted slightly towards the ground. After thresholding, the resulting image is white for plane surfaces and black for any discontinuity; the black points give the obstacle locations. We count the number of black pixels in the image and apply a threshold: if more than 10% of the pixels in the image are black, we have encountered an obstacle and a command is sent to the microcontroller for the wheelchair to stop.
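This decision rule can be sketched as below, continuing from the binary image bw and the serial object s of the earlier sketches; the 'S' stop byte is still an assumed command.

    blackFraction = nnz(~bw) / numel(bw);   % fraction of black (obstacle) pixels
    if blackFraction > 0.10                 % more than 10% black pixels: obstacle encountered
        write(s, uint8('S'), "uint8");      % tell the microcontroller to stop the wheelchair
    end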
