
VOICE CONTROLLED MOTORIZED WHEELCHAIR WITH REAL TIME OBSTACLE AVOIDANCE

Omair Babri, Saqlain Malik, Talal Ibrahim and Zeeshan Ahmed

Department of Electrical Engineering, University of Engineering and Technology, Lahore

Abstract: A voice controlled motorized wheelchair with real time obstacle avoidance is designed and implemented. It enables a disabled person to move around independently, using a joystick and a voice recognition application interfaced with the motors. The prototype of the wheelchair is built around a microcontroller, chosen for its low cost as well as its versatility and its performance in mathematical operations and in communication with other electronic devices. A camera is mounted on the chair for real time obstacle avoidance. The system has been designed and implemented cost-effectively so that, if the project is commercialized, needy users in developing countries will benefit from it.

I. INTRODUCTION

Low cost data processing chips and computers have made it possible to produce economically viable systems that provide various facilities and opportunities to the handicapped to lead meaningful lives. Engineers now design and manufacture listening devices for the hearing impaired [1, 2], seeing aids for the visually impaired [3, 4] and artificial limbs and joints to assist people with locomotive disabilities [5].

Producing such aids and devices at low cost is important because most of the handicapped are financially challenged individuals. In this paper, we describe our effort in designing and building an economical voice controlled wheelchair that automatically avoids obstacles in real time.


II. PROBLEM STATEMENT

A handicapped person with locomotive disabilities needs a wheelchair to perform functions that require him or her to move around. He or she can do so manually by pushing the wheelchair by hand. However, many individuals have weak upper limbs or find the manual mode of operation too tiring. Hence it is desirable to provide them with a motorized wheelchair that can be controlled by moving a joystick or through voice commands. Since the motorized wheelchair can move at a fair speed, it is important that it be able to avoid obstacles automatically in real time. All this should be achieved at a cost that is affordable for as many handicapped people as possible, as well as for the organizations that support them. With these requirements in mind, we propose an automated wheelchair with real-time obstacle avoidance capability.

III. WHEELCHAIR SYSTEM DESIGN AND HARDWARE IMPLEMENTATION

Figure 1 schematically shows the main components of the wheelchair system. DC motors M1 and M2 and 12 V, 12 A batteries B1 and B2 are mounted on a steel platform installed under the seat of the wheelchair. A camera is also mounted under the steel platform to provide an image of the scene in front of the chair. A joystick J and microphone V provide the control ports through which the handicapped user interacts with the wheelchair. The control circuitry, consisting of the microcontroller, relays, H-bridge circuit and voltage converters, is also mounted on this steel platform.


Figure 1. Auto Wheelchair System Design

The implementation of our project used the following major components:

1. A PIC16F877 microcontroller to control the speed and direction of the wheelchair.
2. Relays in an H-bridge circuit to control the direction of current through the motors.
3. DC motors to drive the wheels of the chair.
4. Batteries to supply the required power to the motors.
5. A microphone serving as the input device for speech commands.
6. A camera used for image processing for obstacle avoidance.

A belt synchronizes the motor's rotation with the wheel, coupling the motor's shaft to the axle of the wheel. A relay-based H-bridge is used to drive each motor in both the forward and backward directions. SPDT relays with a current rating of 30 A are used.

Figure 2. Hardware System Architecture

The microcontroller acts as the master, i.e., it controls all the activities of the system, as shown in Figure 2. It generates the correct signals by analyzing the data fed to it. To move the wheelchair in a specific direction, e.g. forward, the microcontroller generates control signals for the relays that ensure the motors drive the wheelchair in the forward direction.

For joystick control we use five push buttons, one each for forward, reverse, left, right and stop. Once pressed, a button sends the corresponding signal to the microcontroller, which processes it and sends the appropriate signals to the relays (in the H-bridge circuit) driving the motors.
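The paper does not include the firmware listing; as a minimal illustrative sketch (not the authors' code), the mapping from the five commands to H-bridge relay states could be expressed as follows, where all relay names and drive patterns are assumptions:

```python
# Illustrative sketch, not the authors' firmware. Each motor is driven
# by two relays in an H-bridge; energizing opposite pairs reverses the
# current through the motor. All names and patterns are hypothetical.

RELAY_STATES = {
    #           (M1_fwd, M1_rev, M2_fwd, M2_rev)
    "forward": (1, 0, 1, 0),  # both motors forward
    "reverse": (0, 1, 0, 1),  # both motors backward
    "left":    (0, 1, 1, 0),  # left motor back, right motor forward
    "right":   (1, 0, 0, 1),  # left motor forward, right motor back
    "stop":    (0, 0, 0, 0),  # all relays de-energized
}

def drive(command: str) -> tuple:
    """Return the relay pattern for a command; stop on unknown input."""
    return RELAY_STATES.get(command, RELAY_STATES["stop"])

print(drive("left"))  # (0, 1, 1, 0)
```

De-energizing every relay on an unknown command is a fail-safe choice: the chair stops rather than continuing on a misread input.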

IV. SPEECH RECOGNITION

For the speech recognition interface, the Microsoft Speech SDK's Speech Application Programming Interface (SAPI) is used. The microphone training wizard is used to train the SDK engine so that SAPI is able to recognize the commands. The speech is then converted into text by this application, and the computer matches the input with a template that has a known meaning. Speech recognition essentially converts PCM (Pulse Code Modulation) digital audio from a sound card into recognized speech.

First, the digital audio signal coming from the sound card is converted into a format that is representative of what a person hears. The digital audio is a stream of amplitudes. In this form, it is difficult to identify any patterns from which we can determine what was actually said. To make pattern recognition easier, the digital audio is transformed into the "frequency domain". The speech recognizer has a database of several thousand graphs (called a codebook) that identify different types of sounds that the human voice can make. A sound is identified by matching it to its closest entry in the codebook, producing a number that describes the sound. This number is called the feature number [11].
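The paper describes this stage only at a high level; the sketch below is an illustration under assumed parameters (a 256-sample frame and a randomly generated codebook stand in for real trained data), showing how a PCM frame could be mapped to a feature number by a frequency-domain transform followed by nearest-codebook matching:

```python
import numpy as np

def feature_number(frame: np.ndarray, codebook: np.ndarray) -> int:
    """Map one PCM audio frame to the index of its closest codebook
    entry, i.e. its feature number."""
    # Window the frame to reduce spectral leakage, then take the
    # magnitude spectrum: the "frequency domain" representation.
    spectrum = np.abs(np.fft.rfft(frame * np.hanning(len(frame))))
    # The feature number is the index of the nearest template.
    distances = np.linalg.norm(codebook - spectrum, axis=1)
    return int(np.argmin(distances))

# Toy usage with random data; a real codebook would be trained.
rng = np.random.default_rng(0)
frame = rng.standard_normal(256)             # one 256-sample PCM frame
codebook = rng.standard_normal((3000, 129))  # "several thousand graphs"
print(feature_number(frame, codebook))
```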

In the next step, each feature number is matched to a phoneme. If, for example, a segment of audio resulted in feature number 52, it could mean that the user made an "h" sound; feature 53 might be an "f" sound. However, a one-to-one matching does not work, because (i) every time a user speaks a word it sounds different; (ii) background noise from the microphone sometimes causes the recognizer to hear a different sound; (iii) the sound of a phoneme changes depending on what phonemes surround it (the "t" in "talk" sounds different than the "t" in "attack" and "mist"); and (iv) the sound produced by a phoneme changes from the beginning to the end of the phoneme, and is not constant (the beginning of a "t" produces different feature numbers than the end of a "t") [11].

The background noise and variability problems are solved by allowing a feature number to be used by more than just one phoneme. This can be done because a phoneme lasts for a relatively long time, 50 to 100 feature numbers, and it is likely that one or more sounds are predominant during that time. Hence, it is possible to predict what phoneme was spoken.

In the third step [11], a training tool is used to learn how a phoneme sounds. We use over a hundred samples of the phoneme. The tool analyzes these samples and produces their feature numbers; it thus learns how likely a particular feature number is to appear in a specific phoneme. For example, for the phoneme "h", there might be a 55% chance of feature 52 appearing, a 30% chance of feature 189, and a 15% chance of feature 53.

The probability analysis done during training is used during recognition. Given a voice sample, the feature numbers corresponding to it are used to compute the probabilities of the various phonemes, and the most probable phoneme is then chosen.
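As a toy illustration of this selection step, with the likelihoods for "h" taken from the example above and the values for "f" invented, the most probable phoneme for a run of feature numbers could be chosen like this:

```python
import math

# Hypothetical trained likelihoods P(feature | phoneme); the "h" row
# uses the figures from the text, the "f" row is invented.
LIKELIHOODS = {
    "h": {52: 0.55, 189: 0.30, 53: 0.15},
    "f": {53: 0.60, 189: 0.30, 52: 0.10},
}

def most_probable_phoneme(features: list) -> str:
    """Pick the phoneme whose likelihoods best explain a run of
    observed feature numbers (log-probabilities avoid underflow)."""
    def score(phoneme: str) -> float:
        probs = LIKELIHOODS[phoneme]
        # The small floor stands in for smoothing of unseen features.
        return sum(math.log(probs.get(f, 1e-6)) for f in features)
    return max(LIKELIHOODS, key=score)

# A phoneme spans roughly 50-100 feature numbers; a short toy run
# is enough to show the idea.
print(most_probable_phoneme([52, 52, 189, 52, 53]))  # -> "h"
```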

It is important that the speech recognition system adapt to the user's voice and speaking style to improve accuracy. A word can be spoken with different pronunciations; however, after the user has spoken the word a number of times, the recognizer has enough examples to determine which pronunciation the user spoke.

Communication between the software and the wheelchair platform is done through the serial port of the system.
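The paper does not specify the serial protocol; a minimal sketch of the PC side, assuming the pyserial library, a 9600-baud link and a hypothetical single-byte command encoding, might look like this:

```python
import serial  # pyserial; the paper does not name a library

# Hypothetical single-byte command encoding; the actual byte values
# used by the authors are not given in the paper.
COMMANDS = {"forward": b"F", "reverse": b"B", "left": b"L",
            "right": b"R", "stop": b"S"}

def send_command(port: str, command: str) -> None:
    """Open the serial port and transmit one command byte."""
    with serial.Serial(port, baudrate=9600, timeout=1) as link:
        link.write(COMMANDS[command])

# e.g. after SAPI recognizes the word "forward":
# send_command("COM3", "forward")
```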

V. OBSTACLE AVOIDANCE

The obstacle avoidance processing is done using MATLAB. Our obstacle detection system is purely based on the appearance of individual pixels captured by the camera: any pixel that differs in appearance from the ground is classified as an obstacle. The method is based on three assumptions that are reasonable for a variety of indoor and outdoor environments [5]:

(i) Obstacles differ in appearance from the ground.
(ii) The ground is relatively flat.
(iii) There are no overhanging obstacles.

The first assumption enables us to distinguish obstacles from the ground, while the second and third assumptions enable us to estimate the distances between detected obstacles and the camera. The key feature of the algorithm is global thresholding.

The algorithm consists of two stages, a vision stage and a decision stage. The vision stage deals with retrieving the image, and the decision stage deals with classifying the image and deciding how the wheelchair should respond. The vision stage uses a camera connected to a laptop; we used a Logitech QuickCam Pro for Notebooks for this purpose. The camera feeds its output to the laptop, and the whole system is placed on the wheelchair. On the laptop, the MATLAB Image Processing Toolbox is used to process the image; MATLAB processes the image and sends a signal to the control circuit through the serial port. The time interval between the vision and decision stages is the time taken to process the image, which is about 50 milliseconds. In an actual implementation, a microprocessor will be used instead of a laptop to do the processing; this was tested and works perfectly. The control circuit then drives the motors accordingly via the H-bridges.

We have implemented the following algorithm:

1. Take input from the camera in real time. The size of the image is 320x240.
2. Convert the image to grayscale.
3. Adjust the contrast of the image.
4. Compute the global threshold of the image using Otsu's algorithm [12].
5. Convert the image to a binary image using the threshold obtained in step 4.

The binary image is used to check for any obstacles that may be present. The camera is slightly tilted towards the ground. After thresholding, the resulting image is white for plane surfaces and black for any discontinuity; the black points give the obstacle locations. We count the number of black pixels in the image and apply a threshold: if more than 10% of the pixels in the image are black, an obstacle has been encountered and a command is sent to the microcontroller for the wheelchair to stop.
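The authors implemented this pipeline in MATLAB; the Python/OpenCV sketch below is an equivalent illustration only (the choice of histogram equalization for contrast adjustment and the camera setup are assumptions), covering steps 1-5 and the 10% black-pixel decision:

```python
import cv2

BLACK_FRACTION_LIMIT = 0.10  # stop if more than 10% of pixels are black

def obstacle_detected(frame) -> bool:
    """Steps 2-5 of the algorithm plus the black-pixel decision."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)  # step 2
    gray = cv2.equalizeHist(gray)                   # step 3 (assumed method)
    # Steps 4-5: Otsu's method picks the global threshold and
    # binarizes the image in one call.
    _, binary = cv2.threshold(gray, 0, 255,
                              cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return (binary == 0).mean() > BLACK_FRACTION_LIMIT

cap = cv2.VideoCapture(0)                # step 1: real-time capture
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 320)   # the paper's 320x240 size
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 240)
ok, frame = cap.read()
if ok and obstacle_detected(frame):
    print("obstacle: send stop command over serial")
cap.release()
```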


Figure 3 shows the steps by which the real time image taken from the camera is converted to the binary image used for the actual pixel calculation. The top picture shows the real time image taken from the camera. The next picture shows the image converted into a grayscale image. The third picture shows the same image after its contrast has been adjusted, and the final image is the required binary image.

Figure 3. Conversion of real time image into binary image


VI. CONCLUSIONS

We have successfully designed and implemented a motorized wheelchair controlled by a joystick or through voice recognition. The total cost was Rs. 25,000 (US $300), excluding the cost of the wheelchair.

The voice recognition system worked for most of the commands (over 95%); the system failed to recognize a command only when a word was not properly vocalized. However, the joystick can always be used as a foolproof backup in this case. Overall, users reported satisfaction with the system.

The obstacle avoidance system had satisfactory performance. Only very small objects like pencils, tennis balls or books were difficult to identify. Further work is needed to better identify small objects.

VII. RESULTS

After completion of our project, it was first tested indoors using easy to spot obstacles like chairs, flower pots, walls and people. With these objects the obstacle avoidance worked without any error. Next we tested our system on smaller objects like books, pencils, tennis balls and other similar small objects. Although the obstacle avoidance worked well with these objects too, in some rare cases (around 2-3%) the obstacles were not properly detected.

The voice recognition system was first tested in a quiet room with a single user. All words were correctly recognized. Next we tested it with a different user on whom the system was not trained. About 5% errors occurred in this case; for example, words like "right" were recognized as "write" because the recognizer heard a different pronunciation. However, after the user had spoken the word a number of times, the recognizer had enough examples to properly determine which pronunciation the user spoke. Next we tested the system in a noisy room by turning on some music. When the music was quiet there was no problem in correctly recognizing the words, but when we turned the volume up the recognizer found it difficult to recognize the user's voice and often took commands from what it heard in the song.

The joystick control was foolproof and worked perfectly in all cases.


VIII. REFERENCES

[1] Chin-Tuan Tan and Brian C. J. Moore, "Perception of nonlinear distortion by hearing-impaired people," International Journal of Audiology, vol. 47, no. 5, pp. 246-256, 2008.
[2] S. Oberle and A. Kaelin, "Recognition of acoustical alarm signals for the profoundly deaf using hidden Markov models," in IEEE International Symposium on Circuits and Systems (Hong Kong), pp. 2285-2288, 1995.
[3] A. Shawki and Z. J., "A smart reconfigurable visual system for the blind," Proceedings of the Tunisian-German Conference on Smart Systems and Devices, 2001.
[4] C. M. Higgins and V. Pant, "Biomimetic VLSI sensor for visual tracking of small moving targets," IEEE Transactions on Circuits and Systems, vol. 51, pp. 2384-2394, 2004.
[5] F. Daerden and D. Lefeber, "The concept and design of pleated pneumatic artificial muscles," International Journal of Fluid Power, vol. 2, no. 3, pp. 41-45, 2001.
[6] http://msdn.microsoft.com/en-us/library/default.aspx
[7] K. R. Castleman, Digital Image Processing, Pearson Education, 1996.
[8] M. A. Mazidi, PIC Microcontroller and Embedded Systems, 2008.
[9] http://electronics.howstuffworks.com/gadgets/high-tech-gadgets/speech-recognition.htm
[10] D. Murray and A. Basu, "Motion tracking with an active camera," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 16, no. 5, pp. 449-459, 1994.
[11] http://www.voicerecognition.com/
[12] N. Otsu, "A threshold selection method from gray-level histograms," IEEE Trans. Systems, Man, and Cybernetics, vol. 9, no. 1, pp. 62-66, 1979.

