
VOICE CONTROLLED MOTORIZED WHEELCHAIR WITH REAL TIME OBSTACLE AVOIDANCE

Omair Babri, Saqlain Malik, Talal Ibrahim and Zeeshan Ahmed

Department of Electrical Engineering, University of Engineering and Technology, Lahore

Abstract: A voice controlled motorized wheelchair with real time obstacle avoidance is designed and implemented. It enables a disabled person to move around independently, using a joystick and a voice recognition application interfaced with the motors. The prototype of the wheelchair is built around a microcontroller, chosen for its low cost as well as its versatility and its performance in mathematical operations and in communication with other electronic devices. A camera is mounted on the chair for real time obstacle avoidance. The system has been designed and implemented cost-effectively so that, if the project is commercialized, needy users in developing countries will benefit from it.

I. INTRODUCTION

Low cost data processing chips and computers have made it possible to produce economically viable systems that provide various facilities and opportunities to the handicapped to lead meaningful lives. Engineers now design and manufacture listening devices for the hearing impaired [1, 2], seeing aids for the visually impaired [3, 4] and artificial limbs and joints to assist people with locomotive disabilities [5].

Producing such aids and devices at low cost is important because most of the handicapped are financially challenged individuals. In this paper, we describe our effort in designing and building an economical voice controlled wheelchair that automatically avoids obstacles in real time.


II. PROBLEM STATEMENT

A handicapped person with locomotive disabilities needs a wheelchair to perform functions that require him or her to move around. He or she can do so manually by pushing the wheelchair by hand. However, many individuals have weak upper limbs or find the manual mode of operation too tiring. Hence it is desirable to provide them with a motorized wheelchair that can be controlled by moving a joystick or through voice commands. Since the motorized wheelchair can move at a fair speed, it is important that it be able to avoid obstacles automatically in real time. All this should be achieved at a cost that is affordable for as many handicapped people as possible, as well as for the organizations that support them. With these requirements in mind, we propose an automated wheelchair with real-time obstacle avoidance capability.

III. WHEELCHAIR SYSTEM DESIGN AND HARDWARE IMPLEMENTATION

Figure 1 schematically shows the main components of the wheelchair system. DC motors M1 and M2 and 12 V, 12 A batteries B1 and B2 are mounted on a steel platform installed under the seat of the wheelchair. A camera is also mounted under the steel platform to provide an image of the scene in front of the chair. A joystick J and microphone V provide the control ports through which the handicapped user interacts with the wheelchair. The control circuitry, consisting of the microcontroller, relays, H-bridge circuit and voltage converters, is also mounted on this steel platform.


Figure 1. Auto Wheelchair System Design

The implementation of our project used the following major components:

1. A PIC16F877 microcontroller to control the speed and direction of the wheelchair.
2. Relays in an H-bridge circuit to control the direction of current through the motors.
3. DC motors to drive the wheels of the chair.
4. Batteries to supply the required power to the motors.
5. A microphone serving as the input device for speech commands.
6. A camera used for image processing for obstacle avoidance.

A belt synchronizes the motor's rotation with the wheel, coupling the motor's shaft to the axle of the wheel. A relay-based H-bridge is used to drive each motor in both the forward and backward directions. SPDT relays with a current rating of 30 A are used.

Figure 2. Hardware System Architecture

The microcontroller acts as the master, i.e., it controls all the activities of the system, as shown in Figure 2. It generates the correct signals by analyzing the data fed to it. To move the wheelchair in a specific direction, e.g. forward, the microcontroller generates control signals for the relays that ensure the motors drive the wheelchair in the forward direction.

For joystick control we use five push buttons, one each for forward, reverse, left, right and stop. Once pressed, a button sends the corresponding signal to the microcontroller, which processes it and sends the appropriate signals to the relays (in the H-bridge circuit) driving the motors.
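The paper does not include the firmware listing; as a minimal illustrative sketch (not the authors' code), the mapping from the five commands to H-bridge relay states could be expressed as follows, where all relay names and drive patterns are assumptions:

```python
# Illustrative sketch, not the authors' firmware. Each motor is driven
# by two relays in an H-bridge; energizing opposite pairs reverses the
# current through the motor. All names and patterns are hypothetical.

RELAY_STATES = {
    #           (M1_fwd, M1_rev, M2_fwd, M2_rev)
    "forward": (1, 0, 1, 0),  # both motors forward
    "reverse": (0, 1, 0, 1),  # both motors backward
    "left":    (0, 1, 1, 0),  # left motor back, right motor forward
    "right":   (1, 0, 0, 1),  # left motor forward, right motor back
    "stop":    (0, 0, 0, 0),  # all relays de-energized
}

def drive(command: str) -> tuple:
    """Return the relay pattern for a command; stop on unknown input."""
    return RELAY_STATES.get(command, RELAY_STATES["stop"])

print(drive("left"))  # (0, 1, 1, 0)
```

De-energizing every relay on an unknown command is a fail-safe choice: the chair stops rather than continuing on a misread input.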

IV. SPEECH RECOGNITION

For the speech recognition interface, the Microsoft Speech SDK's Speech Application Programming Interface (SAPI) is used. The microphone training wizard is used to train the SDK engine so that SAPI is able to recognize the commands. The speech is then converted into text by this application, and the computer matches the input with a template that has a known meaning. Speech recognition essentially converts PCM (Pulse Code Modulation) digital audio from a sound card into recognized speech.

First, the digital audio signal coming from the sound card is converted into a format that is representative of what a person hears. The digital audio is a stream of amplitudes. In this form, it is difficult to identify any patterns from which we can determine what was actually said. To make pattern recognition easier, the digital audio is transformed into the "frequency domain". The speech recognizer has a database of several thousand graphs (called a codebook) that identify different types of sounds that the human voice can make. A sound is identified by matching it to its closest entry in the codebook, producing a number that describes the sound. This number is called the feature number [11].
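The paper describes this stage only at a high level; the sketch below is an illustration under assumed parameters (a 256-sample frame and a randomly generated codebook stand in for real trained data), showing how a PCM frame could be mapped to a feature number by a frequency-domain transform followed by nearest-codebook matching:

```python
import numpy as np

def feature_number(frame: np.ndarray, codebook: np.ndarray) -> int:
    """Map one PCM audio frame to the index of its closest codebook
    entry, i.e. its feature number."""
    # Window the frame to reduce spectral leakage, then take the
    # magnitude spectrum: the "frequency domain" representation.
    spectrum = np.abs(np.fft.rfft(frame * np.hanning(len(frame))))
    # The feature number is the index of the nearest template.
    distances = np.linalg.norm(codebook - spectrum, axis=1)
    return int(np.argmin(distances))

# Toy usage with random data; a real codebook would be trained.
rng = np.random.default_rng(0)
frame = rng.standard_normal(256)             # one 256-sample PCM frame
codebook = rng.standard_normal((3000, 129))  # "several thousand graphs"
print(feature_number(frame, codebook))
```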

In the next step, each feature number is matched to a phoneme. If, for example, a segment of audio resulted in feature number 52, it could mean that the user made an "h" sound; feature 53 might be an "f" sound. However, a one-to-one matching does not work, because (i) every time a user speaks a word it sounds different; (ii) background noise from the microphone sometimes causes the recognizer to hear a different sound; (iii) the sound of a phoneme changes depending on what phonemes surround it (the "t" in "talk" sounds different than the "t" in "attack" and "mist"); and (iv) the sound produced by a phoneme changes from the beginning to the end of the phoneme, and is not constant (the beginning of a "t" produces different feature numbers than the end of a "t") [11].

The background noise and variability problems are solved by allowing a feature number to be used by more than just one phoneme. This can be done because a phoneme lasts for a relatively long time, 50 to 100 feature numbers, and it is likely that one or more sounds are predominant during that time. Hence, it is possible to predict what phoneme was spoken.

In the third step [11], a training tool is used to learn how a phoneme sounds. We use over a hundred samples of the phoneme. The tool analyzes these samples and produces their feature numbers; it thus learns how likely a particular feature number is to appear in a specific phoneme. For example, for the phoneme "h", there might be a 55% chance of feature 52 appearing, a 30% chance of feature 189, and a 15% chance of feature 53.

The probability analysis done during training is used during recognition. Given a voice sample, the feature numbers corresponding to it are used to compute the probabilities of the various phonemes, and the most probable phoneme is then chosen.
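As a toy illustration of this selection step, with the likelihoods for "h" taken from the example above and the values for "f" invented, the most probable phoneme for a run of feature numbers could be chosen like this:

```python
import math

# Hypothetical trained likelihoods P(feature | phoneme); the "h" row
# uses the figures from the text, the "f" row is invented.
LIKELIHOODS = {
    "h": {52: 0.55, 189: 0.30, 53: 0.15},
    "f": {53: 0.60, 189: 0.30, 52: 0.10},
}

def most_probable_phoneme(features: list) -> str:
    """Pick the phoneme whose likelihoods best explain a run of
    observed feature numbers (log-probabilities avoid underflow)."""
    def score(phoneme: str) -> float:
        probs = LIKELIHOODS[phoneme]
        # The small floor stands in for smoothing of unseen features.
        return sum(math.log(probs.get(f, 1e-6)) for f in features)
    return max(LIKELIHOODS, key=score)

# A phoneme spans roughly 50-100 feature numbers; a short toy run
# is enough to show the idea.
print(most_probable_phoneme([52, 52, 189, 52, 53]))  # -> "h"
```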

It is important that the speech recognition system adapt to the user's voice and speaking style to improve accuracy. A word can be spoken with different pronunciations; however, after the user has spoken the word a number of times, the recognizer has enough examples to determine which pronunciation the user spoke.

Communication between the software and the wheelchair platform is done through the serial port of the system.
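The paper does not specify the serial protocol; a minimal sketch of the PC side, assuming the pyserial library, a 9600-baud link and a hypothetical single-byte command encoding, might look like this:

```python
import serial  # pyserial; the paper does not name a library

# Hypothetical single-byte command encoding; the actual byte values
# used by the authors are not given in the paper.
COMMANDS = {"forward": b"F", "reverse": b"B", "left": b"L",
            "right": b"R", "stop": b"S"}

def send_command(port: str, command: str) -> None:
    """Open the serial port and transmit one command byte."""
    with serial.Serial(port, baudrate=9600, timeout=1) as link:
        link.write(COMMANDS[command])

# e.g. after SAPI recognizes the word "forward":
# send_command("COM3", "forward")
```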

V. OBSTACLE AVOIDANCE

The obstacle avoidance processing is done using MATLAB. Our obstacle detection system is purely based on the appearance of individual pixels captured by the camera: any pixel that differs in appearance from the ground is classified as an obstacle. The method is based on three assumptions that are reasonable for a variety of indoor and outdoor environments [5]:

(i) Obstacles differ in appearance from the ground.
(ii) The ground is relatively flat.
(iii) There are no overhanging obstacles.

The first assumption enables us to distinguish obstacles from the ground, while the second and third assumptions enable us to estimate the distances between detected obstacles and the camera. The key feature of the algorithm is global thresholding.

The algorithm consists of two stages, a vision stage and a decision stage. The vision stage deals with retrieving the image, and the decision stage deals with classifying the image and deciding how the wheelchair should respond. The vision stage uses a camera connected to a laptop; we used a Logitech QuickCam Pro for Notebooks for this purpose. The camera feeds its output to the laptop, and the whole system is placed on the wheelchair. On the laptop, the MATLAB Image Processing Toolbox is used to process the image; MATLAB processes the image and sends a signal to the control circuit through the serial port. The time interval between the vision and decision stages is the time taken to process the image, which is about 50 milliseconds. In an actual implementation, a microprocessor will be used instead of a laptop to do the processing; this was tested and works perfectly. The control circuit then drives the motors accordingly via the H-bridges.

We have implemented the following algorithm:

1. Take input from the camera in real time. The size of the image is 320x240.
2. Convert the image to grayscale.
3. Adjust the contrast of the image.
4. Compute the global threshold of the image using Otsu's algorithm [12].
5. Convert the image to a binary image using the threshold obtained in step 4.

The binary image is used to check for any obstacles that may be present. The camera is slightly tilted towards the ground. After thresholding, the resulting image is white for plane surfaces and black for any discontinuity; the black points give the obstacle locations. We count the number of black pixels in the image and apply a threshold: if more than 10% of the pixels in the image are black, an obstacle has been encountered and a command is sent to the microcontroller for the wheelchair to stop.
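The authors implemented this pipeline in MATLAB; the Python/OpenCV sketch below is an equivalent illustration only (the choice of histogram equalization for contrast adjustment and the camera setup are assumptions), covering steps 1-5 and the 10% black-pixel decision:

```python
import cv2

BLACK_FRACTION_LIMIT = 0.10  # stop if more than 10% of pixels are black

def obstacle_detected(frame) -> bool:
    """Steps 2-5 of the algorithm plus the black-pixel decision."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)  # step 2
    gray = cv2.equalizeHist(gray)                   # step 3 (assumed method)
    # Steps 4-5: Otsu's method picks the global threshold and
    # binarizes the image in one call.
    _, binary = cv2.threshold(gray, 0, 255,
                              cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return (binary == 0).mean() > BLACK_FRACTION_LIMIT

cap = cv2.VideoCapture(0)                # step 1: real-time capture
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 320)   # the paper's 320x240 size
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 240)
ok, frame = cap.read()
if ok and obstacle_detected(frame):
    print("obstacle: send stop command over serial")
cap.release()
```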


Figure 3 shows the steps by which the real time image taken from the camera is converted to the binary image used for the actual pixel calculation. The top picture shows the real time image taken from the camera. The next picture shows the image converted into a grayscale image. The third picture shows the same image after its contrast has been adjusted, and the final image is the required binary image.

Figure 3. Conversion of real time image into binary image


VI. CONCLUSIONS

We have successfully designed and implemented a motorized wheelchair controlled by a joystick or through voice recognition. The total cost was Rs. 25,000 (US $300), excluding the cost of the wheelchair.

The voice recognition system worked for most of the commands (over 95%); the system failed to recognize a command only when a word was not properly vocalized. However, the joystick can always be used as a foolproof backup in this case. Overall, users reported satisfaction with the system.

The obstacle avoidance system had satisfactory performance. Only very small objects like pencils, tennis balls or books were difficult to identify. Further work is needed to better identify small objects.

VII. RESULTS

After completion of our project, it was first tested indoors using easy to spot obstacles like chairs, flower pots, walls and people. With these objects the obstacle avoidance worked without any error. Next we tested our system on smaller objects like books, pencils, tennis balls and other similar small objects. Although the obstacle avoidance worked well with these objects too, in some rare cases (around 2-3%) the obstacles were not properly detected.

The voice recognition system was first tested in a quiet room with a single user. All words were correctly recognized. Next we tested it with a different user on whom the system was not trained. About 5% errors occurred in this case; for example, words like "right" were recognized as "write" because the recognizer heard a different pronunciation. However, after the user had spoken the word a number of times, the recognizer had enough examples to properly determine which pronunciation the user spoke. Next we tested the system in a noisy room by turning on some music. When the music was quiet there was no problem in correctly recognizing the words, but when we turned the volume up the recognizer found it difficult to recognize the user's voice and often took commands from what it heard in the song.

The joystick control was foolproof and worked perfectly in all cases.


VIII. REFERENCES

[1] Chin-Tuan Tan and Brian C. J. Moore, "Perception of nonlinear distortion by hearing-impaired people," International Journal of Audiology, vol. 47, no. 5, pp. 246-256, 2008.
[2] S. Oberle and A. Kaelin, "Recognition of acoustical alarm signals for the profoundly deaf using hidden Markov models," in IEEE International Symposium on Circuits and Systems (Hong Kong), pp. 2285-2288, 1995.
[3] A. Shawki and Z. J., "A smart reconfigurable visual system for the blind," Proceedings of the Tunisian-German Conference on Smart Systems and Devices, 2001.
[4] C. M. Higgins and V. Pant, "Biomimetic VLSI sensor for visual tracking of small moving targets," IEEE Transactions on Circuits and Systems, vol. 51, pp. 2384-2394, 2004.
[5] F. Daerden and D. Lefeber, "The concept and design of pleated pneumatic artificial muscles," International Journal of Fluid Power, vol. 2, no. 3, pp. 41-45, 2001.
[6] http://msdn.microsoft.com/en-us/library/default.aspx
[7] K. R. Castleman, Digital Image Processing, Pearson Education, 1996.
[8] M. A. Mazidi, PIC Microcontroller and Embedded Systems, 2008.
[9] http://electronics.howstuffworks.com/gadgets/high-tech-gadgets/speech-recognition.htm
[10] D. Murray and A. Basu, "Motion tracking with an active camera," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 16, no. 5, pp. 449-459, 1994.
[11] http://www.voicerecognition.com/
[12] N. Otsu, "A threshold selection method from gray-level histograms," IEEE Trans. Systems, Man, and Cybernetics, vol. 9, no. 1, pp. 62-66, 1979.

