
VISVESVARAYA TECHNOLOGICAL UNIVERSITY
Belgaum, Karnataka-590 018

A project dissertation
on
OCR Based Mapless Navigation Method of Robot
for the award of the degree of
MASTER OF TECHNOLOGY
in
INDUSTRIAL AUTOMATION AND ROBOTICS

by
Mr. SAYYAN N. SHAIKH
(USN: 4SN11MAR10)

Under the guidance of
Dr. NEELAKANTHA V. LONDHE, Professor, Dept. of Mechanical Engg., Srinivas Institute of Technology, Valachil, Mangalore-574 143
Mr. BASAVARAJ, Project Guide, BCS Innovations, Bangalore-560 054

DEPARTMENT OF MECHANICAL ENGINEERING
SRINIVAS INSTITUTE OF TECHNOLOGY
Mangalore, Karnataka, India-574 143
2012-2013


Srinivas Institute of Technology
Mangalore, Karnataka, India-574 143
(Affiliated to Visvesvaraya Technological University, Belgaum)
Department of Mechanical Engineering

CERTIFICATE

This is to certify that the report entitled "OCR Based Mapless Navigation Method of Robot" is a bonafide document of project work carried out by Mr. SAYYAN N. SHAIKH, bearing USN 4SN11MAR10, submitted in partial fulfillment for the award of Master of Technology in Industrial Automation and Robotics of the Visvesvaraya Technological University, Belgaum, during the year 2012-2013. It is certified that all corrections/suggestions indicated have been incorporated in the dissertation report deposited in the department library. The dissertation report has been approved, as it satisfies the academic requirements in respect of project work regulations prescribed for the said Degree.

Signature of Guide: Dr. Neelakantha V. Londhe
Signature of HOD: Dr. Thomas Pinto
Signature of Principal: Dr. Shrinivasa Mayya D

Name of the Examiners:                        Signature with date
1. _____________________________              -------------------------------------
2. _____________________________              -------------------------------------



DECLARATION

I, Sayyan N. Shaikh, bearing USN 4SN11MAR10, a student of M.Tech in the Department of Mechanical Engineering, Srinivas Institute of Technology, Mangalore, hereby declare that the project work entitled "OCR Based Mapless Navigation Method of Robot" embodies the report of my project work carried out under the guidance of Dr. Neelakantha V. Londhe, Professor, Department of Mechanical Engineering, Srinivas Institute of Technology, Mangalore, and Mr. Basavaraj C.H, Co-guide at BCS Innovations, Bangalore. This project has been submitted in partial fulfillment of the requirements for the award of Master of Technology in Industrial Automation and Robotics by the Visvesvaraya Technological University, Belgaum.

The work contained in this thesis has not been submitted in part or in full to any other university, institution or professional body for the award of any other degree, diploma or fellowship.

Date:                Sayyan N. Shaikh
Place: Mangalore     USN: 4SN11MAR10



ACKNOWLEDGEMENT

This project work was completed successfully with the help and guidance I received. The elation and gratification of this project would be incomplete without mentioning the people who helped to make it possible, and whose encouragement and support are valuable to me.

At the outset I would like to express my gratitude to my guide Dr. Neelakantha V. Londhe, Professor, Department of Mechanical Engineering, Srinivas Institute of Technology, Mangalore, for his guidance, encouragement and support towards the successful completion of this project work.

I offer my deepest thanks to Mr. Basavaraj C.H, Co-guide at BCS Innovations, for being such a patient and understanding project guardian. His foresight, intuition and care were instrumental in shaping this work. I want to thank him for his role as both a teacher and an advisor. He taught me how to dig deeper and provided me with the guidance needed to get started.

I am highly indebted to Dr. Thomas Pinto, Professor and HOD, Department of Mechanical Engineering, Srinivas Institute of Technology, Mangalore, for his excellent guidance, encouragement and support throughout the course. I consider it an honour to have worked under him.

I am immensely thankful to, and would like to express my deep sense of gratitude to, Mr. C. G. Ramachandra, Associate Professor, Department of Mechanical Engineering, Srinivas Institute of Technology, Mangalore, who has been a great source of inspiration, and I thank him for his help and encouragement.

My sincere thanks to Dr. Srinivasa Mayya D, Principal, Srinivas Institute of Technology, Mangalore, for providing me the necessary facilities and encouragement to carry out the project successfully.

I would like to thank our Management, the 'A. Shama Rao Foundation', for their cooperation and inspiration during my course.

I would also like to thank the Almighty, for always being there for me and guiding me to work on the right path of life. My greatest thanks are to my parents, who bestowed the ability and strength in me to complete this work.

To all my friends, thank you for your understanding and encouragement in my moments of crisis. Your friendship makes my life a wonderful experience. I cannot list all the names here, but you are always in my mind.

Finally, I would like to thank all my well-wishers who helped me directly and indirectly throughout this project.


(Sayyan N. Shaikh)


ABSTRACT

In the proposed method, a robot locates and tracks landmarks via a colour-based region segmentation algorithm within a proper distance. It extracts text or signs from the landmark regions and inputs them into the OCR engine for recognition. Simultaneously, a projection analysis of the signs or text on the landmark is conducted, and the semanteme of the arrows or text is identified. Finally, by combining the semanteme of the arrows and text extracted from the landmark, the robot can find the routes to its destinations automatically.

The hardware implementation of the system includes a navigator robot carrying an Android phone; the phone receives a signal through Bluetooth whenever the robot comes across an RF card in its path. On receiving this signal, the phone invokes its camera to capture a snapshot of the landmark and sends the image to the server through the internet. The server processes the received image through the OCR mechanism to find the actual meaning of the symbol or text, such as petrol pump, restaurant, left turn, right turn, etc. With OCR (Optical Character Recognition) it is shown how the text detection and recognition system, combined with several other ingredients, allows a robot to recognize named locations specified by a user. This meaning is returned to the phone, which speaks it aloud through its 'Text to Speech' synthesizer and transfers the signal to the navigator through Bluetooth for the next move.

The objective of this robot system is to show that this method can accomplish the mapless navigation task reliably in indoor or customised environments, navigating with real-life landmarks directly without generating maps or setting up new navigation signs for robots. In particular, this method can be applied rapidly in the field of service robots, which can enhance their adaptability and viability significantly.



CONTENTS

CHAPTER   DESCRIPTION                                                 PAGE NO.

          Acknowledgements.............................................. v
          Abstract...................................................... vi
          List of figures............................................... x
          List of tables................................................ xii

1   INTRODUCTION
    1.1 Introduction to Landmark Navigation............................. 1
    1.2 Problem Statement............................................... 3
    1.3 Existing System................................................. 3
    1.4 Proposed System................................................. 3
    1.5 Objectives of the Project....................................... 5
    1.6 Scope of the Project............................................ 5
    1.7 Optical Character Recognition................................... 5
        1.7.1 What Is OCR?.............................................. 5
        1.7.2 History of OCR............................................ 6

2   LITERATURE REVIEW AND SURVEY
    2.1 Research Paper Review........................................... 8
    2.2 Java............................................................ 10
        2.2.1 Features of Java.......................................... 10
    2.3 Apache Tomcat................................................... 11
        2.3.1 Benefits of Tomcat Server................................. 12
    2.4 Android......................................................... 12
        2.4.1 Android Versions.......................................... 13
        2.4.2 Features of Android....................................... 13
    2.5 Embedded C...................................................... 14
        2.5.1 Features of Embedded C.................................... 14
    2.6 Outcome of the Literature Survey................................ 15

3   DESIGN ANALYSIS AND METHODOLOGY
    3.1 Data Flow Diagram............................................... 16
    3.2 Flow Chart...................................................... 17

4   IMAGE EXTRACTION ALGORITHM
    4.1 Landmark Image Extraction....................................... 18
        4.1.1 Binarization.............................................. 18
        4.1.2 Smearing.................................................. 18
    4.2 Landmark Location Finder........................................ 19
    4.3 Segmentation.................................................... 19
        4.3.1 Filtering................................................. 19
        4.3.2 Dilation.................................................. 20
        4.3.3 Individual Image Separation............................... 20
        4.3.4 Normalization............................................. 20
    4.4 Template Matching............................................... 21

5   IMAGE RECOGNITION USING KOHONEN
    5.1 Introduction to the Network..................................... 22
    5.2 General Image Recognition Procedure............................. 22
    5.3 Image Recognition Procedures with Kohonen....................... 23
    5.4 Data Collection................................................. 24
    5.5 Image Pre-processing............................................ 25
        5.5.1 RGB to Grayscale Image Conversion......................... 26
        5.5.2 Grayscale to Binary Image Conversion...................... 26
    5.6 Feature Extraction.............................................. 27
        5.6.1 Pixel Grabbing From Image................................. 28
        5.6.2 Finding Probability of Making Square...................... 28
        5.6.3 Mapped To Sampled Area.................................... 29
        5.6.4 Creating Vector........................................... 30
        5.6.5 Representing Character with a Model Number................ 30
    5.7 Kohonen Neural Network.......................................... 31
        5.7.1 Introduction to Kohonen Network........................... 31
        5.7.2 The Structure of Kohonen Network.......................... 32
        5.7.3 Sample Input to Kohonen Network........................... 33
        5.7.4 Normalizing the Input..................................... 33
        5.7.5 Calculating Each Neuron's Output.......................... 34
        5.7.6 Mapping to Bipolar........................................ 34
        5.7.7 Choosing a Winner......................................... 35
        5.7.8 Kohonen Network Learning Procedure........................ 35
        5.7.9 Learning Algorithm Flowchart.............................. 36
        5.7.10 Learning Rate............................................ 36
        5.7.11 Adjusting Weight......................................... 37
        5.7.12 Calculating the Errors................................... 38
        5.7.13 Recognition with Kohonen Network......................... 38

6   HARDWARE DESCRIPTION AND IMPLEMENTATION
    6.1 Circuit Diagram................................................. 39
        6.1.1 RFID Reader............................................... 39
        6.1.2 Bluetooth Module.......................................... 41
        6.1.3 PIC16F873A Microcontroller................................ 42
        6.1.4 ULN 2003.................................................. 43
        6.1.5 Relay..................................................... 44
        6.1.6 DC Motor.................................................. 46

7   OVERALL DESCRIPTION & IMPLEMENTATION
    7.1 System Perspective.............................................. 48
    7.2 Operating Environment........................................... 48
    7.3 Design and Implementation Constraints........................... 48
    7.4 User Documentation.............................................. 49
    7.5 Hardware Interfaces............................................. 49
    7.6 Software Interfaces............................................. 49
    7.7 Software Requirements........................................... 49
    7.8 PC Requirements................................................. 50
    7.9 How to Execute.................................................. 50

8   RESULTS AND DISCUSSION
    8.1 Experimental Results............................................ 55
    8.2 Accuracy Rates.................................................. 55
    8.3 Drawbacks....................................................... 55

9   CONCLUSION
    9.1 Conclusion...................................................... 56

10  SCOPE FOR FUTURE WORK
    10.1 Future Work.................................................... 57

    REFERENCES.......................................................... 58
    Appendix A: Document Conventions.................................... 60


LIST OF FIGURES

FIGURE NO.   DESCRIPTION                                              PAGE NO.

1.1   Landmarks in Our Daily Life...........................................1
1.2   Proposed System.......................................................4
3.1   DFD for the Proposed System..........................................16
3.2   Flow Chart Model of the Proposed System..............................17
4.1   Captured Image
4.2   Binarized Image......................................................18
4.3   Landmark Involving Text..............................................19
4.4   Landmark Text Region.................................................19
4.5   Separated Text Images After Dilation Process.........................20
4.6   Individual Image Cut.................................................20
4.7   Image after Normalization............................................20
4.8   Database Template Images.............................................21
5.1   General Character Recognition Procedure..............................23
5.2   Image Recognition Procedure Using Kohonen Network....................24
5.3   Cropped Input Image from Landmark Region.............................25
5.4   Captured Input Image.................................................25
5.5   Computer Image.......................................................25
5.6   RGB Image............................................................26
5.7   Grayscale Image......................................................26
5.8   Grayscale Image with Histogram.......................................27
5.9   Binary Image with Histogram..........................................27
5.10  Sampled Image........................................................29
5.11  Vector Representation................................................30
5.12  Kohonen Neural Network...............................................32
5.13  Simple Kohonen Network with 2 I/p and 2 O/p Neurons..................32
5.14  Flow Chart Model for Learning Algorithm..............................36
6.1   Circuit Diagram of the Proposed System...............................39
6.2   RFID Reader..........................................................40
6.3   Block Diagram of LF DT125R Module....................................41
6.4   Bluetooth SMD Module - RN-42.........................................41
6.5   Pin Configuration of PIC16F873A......................................42
6.6   ULN 2003.............................................................43
6.7   Pin Configuration of ULN 2003........................................44
6.8   Relay................................................................44
6.9   Relay Control Circuit................................................45
6.10  Relay Energized (ON).................................................45
6.11  Relay De-Energized (OFF).............................................45
6.12  Relay Operation......................................................46
6.13  Working of DC Motor..................................................46
6.14  Hardware Implementation of Project...................................47
7.1   Snapshot of the Apache Tomcat Server.................................50
7.2   Snapshot of the Apache Tomcat Server under Execution.................51
7.3   Snapshot of the Apache Tomcat Server under Execution.................52
7.4   Snapshot of OCR Application waiting for the Input Data...............53
7.5   Snapshot of OCR Application Reading the Data.........................54


LIST OF TABLES

TABLE NO.   DESCRIPTION                                               PAGE NO.

2.1   A Brief History of Android Versions..................................13
5.1   Binary Converted Grid Values.........................................28
5.2   Aligned Marked Values................................................29
5.3   Sample Inputs to a Kohonen Neural Network............................33
5.4   Connection Weights in the Sample Kohonen Network.....................33
8.1   Accuracy Rates of the System.........................................55


Chapter 1

INTRODUCTION

This chapter gives a brief description of the OCR-based mapless navigation method of a robot. It also covers the need for this project, the existing system, and a brief introduction to the project, its objectives and its scope.

1.1 Introduction to Landmark Navigation

There are numerous landmarks in the environment we live in (Figure 1.1). They are set up to help people travel, and they often contain various text and direction signs. With these signs, people can locate their positions and find their destinations easily. Even in an utterly strange place, people can find their way with the landmark system. For example, in a public place like an airport or a station, people can reach their destinations successfully just by following the right landmarks, without being familiar with that place. Compared with robots, humans first label the environment with landmarks and then read those labels directly during the navigation process. As a result, humans are less dependent on maps. In other words, the human navigation system is partially transferred into the environment instead of being completely endogenous.

Fig. 1.1 Landmarks Used in Our Daily Life

A number of potential markets are slowly emerging for mobile robotic systems. Entertainment applications and household or office assistants are the primary targets in this area of development. These types of robots are designed to move around within an often highly unstructured and unpredictable environment. Existing and future applications for these types of autonomous systems have one key problem in common: navigation.


Vision is one of the most powerful and popular sensing methods used for autonomous navigation. Compared with other on-board sensing techniques, vision-based approaches to navigation continue to attract a great deal of attention from the mobile robot research community, due to their ability to provide detailed information about the environment which may not be available using combinations of other types of sensors. The past decade has seen the rapid development of vision-based sensing for indoor navigation tasks. For example, 20 years ago it would have been impossible for an indoor mobile robot to find its way in a cluttered hallway, and even now it remains a challenge; vision-based indoor navigation for mobile robots is still an open research area. Autonomous robots operating in an unknown and uncertain environment have to cope with dynamic changes in that environment, and for a robot to navigate successfully to its goals while avoiding both static and dynamic obstacles is a major challenge. Most current techniques are based on complex mathematical equations and models of the working environment; however, following a predetermined path may not require a complicated solution, and the proposed methodology should be more robust. [1]

Basically, vision-based navigation falls into three main groups depending on the localization method.

Map Based Navigation: This consists of providing the robot with a model of the environment. These models may contain different degrees of detail, varying from a complete CAD model of the environment to a simple graph of interconnections or interrelationships between the elements in the environment.

Map Building Based Navigation: In this approach a 2D or 3D model of the environment is first constructed by the robot using its on-board sensors, after which the model is used for navigation in the environment.

Mapless Navigation: This category contains all systems in which navigation is achieved without any prior description of the environment. The required robot motions are determined by observing and extracting relevant information about the elements in the environment, such as walls, desks, doorways, etc. It is not necessary to know the absolute positions of these objects for further navigation to be carried out.

1.2 Problem Statement

Traditional robot navigation techniques are mostly map-based methods, or mapless methods based on the VSRR (View Sequenced Route Representation) model. The map-based method needs to build the map first, which can be done either by a human or by the robot itself. The VSRR-based approach requires the robot to roam around the scene, extract and save features at each position, and localize itself by matching during navigation. Both approaches scale poorly to wider scenes and therefore cannot be applied in an unfamiliar environment. GPS is another widely used navigation tool, but it fails to function in indoor environments. Although we can label the environment for robots (e.g. with RFID tags) just as we set landmarks for ourselves, this is not efficient in terms of either time or money. To solve this problem, we propose a method that uses the landmark system intended for humans directly to navigate robots.

1.3 Existing System

Service robots need to have maps that support their tasks. Traditional robot mapping solutions are well suited to supporting navigation and obstacle avoidance tasks by representing occupancy information. However, it can be difficult to enable higher-level understanding of the world's structure using occupancy-based mapping solutions. One of the most important competencies for a service robot is to be able to accept commands from a human user. Many such commands will include instructions that reference objects, structures, or places, so a new mapping system should be designed with this in mind.

1.4 Proposed System

The proposed system allows a robot to discover a path automatically by detecting and reading textual information or signs located on a landmark using OCR. The text-extraction component developed in this method has been shown to be valuable on mobile robots. In particular, this system allows the robot to identify named locations or landmarks placed at the side of a road with high reliability, allowing it to satisfy requests from a user that refer to these places by name. Note, however, that OCR (Optical Character Recognition) is, as of now, an inexact science, and flawless transcription cannot be expected in all cases.



Fig. 1.2 Proposed System


OCR-based mapless navigation is an essential prerequisite for successful service robots. In contexts such as homes and offices, landmarks are placed at the side of the path and places are often identified by text or signs posted throughout the environment; using OCR, textual data can be extracted from the image of the landmark and used to navigate the robot. Landmarks such as signs make labeling particularly easy, as the appropriate label can be read directly from the landmark using Optical Character Recognition (OCR), without the need for human assistance.

The navigation hardware is connected to the Android phone through the Bluetooth module for the transfer of data for its ordered movement. The Android phone, in turn, is connected to the server through the internet (GPRS) via a socket connection to the server's IP address, for transferring the image to the server and later receiving the interpreted information conveyed by the image after its processing by the OCR module. The received data is spoken by the 'Text to Speech' module on the phone for the human interface, and the related byte code is sent to the robot, based upon which the robot navigates along the path.

Optical Character Recognition (OCR), also referred to as an Optical Image Reader, is a system that provides full alphanumeric recognition of printed or handwritten characters at electronic speed by simply scanning the form. Forms can be scanned through a scanner, and the recognition engine of the OCR system then interprets the images and turns images of handwritten or printed characters into ASCII data (machine-readable text).

This technology provides a complete form processing and document capture solution. The basic programming language used in the development of this project is Java, on the Android platform.

The Java APIs (application programming interfaces) used include Bluetooth, Android Text-to-Speech, and Sockets.
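To make this data flow concrete, the following is a minimal, illustrative Java sketch of the image-upload step between the phone and the OCR server. The host address, port and wire format (a length-prefixed JPEG answered by a single text label) are assumptions made only for illustration; the report does not fix these details, and the Bluetooth trigger and the Text-to-Speech call are omitted here.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.net.Socket;
import java.nio.file.Files;
import java.nio.file.Paths;

/**
 * Illustrative client-side sketch of the image upload step:
 * send one captured landmark image to the OCR server and
 * read back the recognized label (e.g. "LEFT TURN").
 * Host, port and framing are assumptions, not the report's specification.
 */
public class LandmarkUploadSketch {

    public static String sendImageForOcr(String host, int port, byte[] jpegBytes)
            throws IOException {
        try (Socket socket = new Socket(host, port);
             DataOutputStream out = new DataOutputStream(socket.getOutputStream());
             DataInputStream in = new DataInputStream(socket.getInputStream())) {

            // Length-prefixed frame so the server knows how many bytes to read.
            out.writeInt(jpegBytes.length);
            out.write(jpegBytes);
            out.flush();

            // The server replies with a single UTF string: the interpreted meaning.
            return in.readUTF();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] image = Files.readAllBytes(Paths.get("landmark.jpg")); // captured snapshot
        String meaning = sendImageForOcr("192.168.1.10", 5000, image);
        System.out.println("Server says: " + meaning);
        // On the phone this string would be passed to the Text-to-Speech engine
        // and the matching movement byte sent to the robot over Bluetooth.
    }
}
```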

1.5 Objectives of the Project

The main objective of the project is to locate and track landmarks via a colour-based region segmentation algorithm. Then, at a proper distance, the landmark region is found by colour and shape, text or signs are extracted from the landmark region, and these are input into the OCR engine. Simultaneously, a projection analysis of the signs on the landmark is conducted, identifying the semanteme of the signs. Finally, the semanteme of the signs and the text extracted from the landmark are combined to control the robot to march forward while tracking this region. All these procedures are repeated until the robot reaches the destination.
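As a rough illustration of the colour-based region segmentation step named above, the sketch below thresholds an image for one assumed landmark colour and returns the bounding box of the matching pixels. The specific RGB rule and file name are placeholders; the project's actual segmentation procedure is described in Chapter 4.

```java
import java.awt.Rectangle;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

/** Illustrative colour-threshold segmentation: find the bounding box of
 *  "red enough" pixels, used here as a stand-in for the landmark colour. */
public class ColourSegmentationSketch {

    // Assumed rule: strongly red pixels mark the landmark board.
    static boolean isLandmarkColour(int rgb) {
        int r = (rgb >> 16) & 0xFF, g = (rgb >> 8) & 0xFF, b = rgb & 0xFF;
        return r > 150 && g < 100 && b < 100;
    }

    static Rectangle findLandmarkRegion(BufferedImage img) {
        int minX = img.getWidth(), minY = img.getHeight(), maxX = -1, maxY = -1;
        for (int y = 0; y < img.getHeight(); y++) {
            for (int x = 0; x < img.getWidth(); x++) {
                if (isLandmarkColour(img.getRGB(x, y))) {
                    minX = Math.min(minX, x); minY = Math.min(minY, y);
                    maxX = Math.max(maxX, x); maxY = Math.max(maxY, y);
                }
            }
        }
        if (maxX < 0) return null;  // no landmark-coloured pixels found
        return new Rectangle(minX, minY, maxX - minX + 1, maxY - minY + 1);
    }

    public static void main(String[] args) throws IOException {
        BufferedImage img = ImageIO.read(new File("landmark.jpg"));
        Rectangle region = findLandmarkRegion(img);
        System.out.println(region == null ? "No landmark found" : "Landmark region: " + region);
        // The cropped region would then be binarized and passed to the OCR stage.
    }
}
```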

1.6 Scope of the Project

Mapless navigation robots are slowly finding their way into outdoor and open environments, and are currently receiving increasing attention from both the scientific community and industry. These robots have many potential applications in routine or dangerous environmental conditions, such as operations in a nuclear plant, delivery of supplies in hospitals, and cleaning of offices, labs, parks, zoos and hotels as service robots.

Compared with traditional robot navigation methods, this method has the following advantages:

1. It is mapless.
2. Direct utilization of existing landmarks; no need to set up new ones.
3. Adaptability to unfamiliar environments; no need to roam first.

1.7 Optical Character Recognition

1.7.1 What Is OCR?

OCR is the acronym for Optical Character Recognition. This technology allows a machine to automatically recognize characters through an optical mechanism. Human beings recognize many objects in this manner; our eyes are the "optical mechanism." But while the brain "sees" the input, the ability to comprehend these signals varies with each person according to many factors. By reviewing these variables, we can understand the challenges faced by the technologists developing an OCR system.

First, if we read a page in a language other than our own, we may recognize the various characters but be unable to understand the words. However, on the same page, we are usually able to interpret numerical statements, since the symbols for numbers are used universally. This explains why many OCR systems recognize numbers only, while relatively few understand the full alphanumeric character range.

Second, there is similarity between many numerical and alphabetical symbol shapes. For example, while examining a string of characters combining letters and numbers, there is very little visible difference between a capital letter "O" and the numeral "0". As humans, we can re-read the sentence or the entire paragraph to help determine the accurate meaning. This procedure, however, is much more difficult for a machine.

Third, we rely on contrast to help us recognize characters. We may find it very difficult to read text which appears against a very dark background, or is printed over other words or graphics. Again, programming a system to interpret only the relevant data and disregard the rest is a difficult task for OCR engineers. There are many other problems which challenge the developers of OCR systems. In this report, we review the history, advancements, abilities and limitations of existing systems. This analysis should help determine whether OCR is the correct application for a company's needs, and if so, which type of system to implement.

1.7.2 History of OCR

Engineering attempts at automated recognition of printed characters started prior to World War II, but it was not until the early 1950s that a commercial venture was identified that justified the necessary funding for research and development of the technology. The banking industry challenged all the major equipment manufacturers to come up with a "common language" to automatically process checks: after the war, check processing had become the single largest paper processing application in the world. Although the banking industry eventually chose Magnetic Ink Character Recognition (MICR), some vendors had proposed the use of an optical recognition technology. However, OCR was still in its infancy at the time and did not perform as acceptably as MICR. The advantage of MICR was that it is relatively impervious to change, fraudulent alteration and interference from non-MICR inks.


The "eye'' <strong>of</strong> early <strong>OCR</strong> equipment utilized lights, mirrors, fixed slits for the<br />

reflected light to pass through, and a moving disk with additional slits. The reflected<br />

image was broken into discrete bits <strong>of</strong> black and white data, presented in a photo-<br />

multiplier tube, and converted to electronic bits. The "brain's" logic required the presence<br />

or absence <strong>of</strong> "black'' or "white" data bits at prescribed intervals. This allowed it to<br />

recognize a very limited, specially designed image set. To accomplish this, the units<br />

required sophisticated transports for documents to be processed. The documents were<br />

required to run at a consistent speed and the printed data had to occur in a fixed location<br />

on each and every form.<br />

The next generation <strong>of</strong> equipment, introduced in the mid to late 1960's, used a<br />

cathode ray tube, a pencil <strong>of</strong> light, and photo multipliers in a technique called "curve<br />

following". These systems <strong>of</strong>fered more flexibility in both the location <strong>of</strong> the data and the<br />

font or design <strong>of</strong> the images that could be read. It was this technique that introduced the<br />

concept that handwritten images could be automatically read, particularly if certain<br />

constraints were utilized. This technology also introduced the concept <strong>of</strong> blue, non-<br />

reading inks as the system was sensitive to the ultraviolet spectrum. The third generation<br />

<strong>of</strong> recognition devices, introduced in the early 1970's, consisted <strong>of</strong> photo-diode arrays.<br />

These tiny little sensors were aligned in an array so the reflected image <strong>of</strong> a document<br />

would pass by at a prescribed speed. These devices were most sensitive in the infra-red<br />

portion <strong>of</strong> the visual spectrum so "red" inks were used as non-reading inks.<br />

General applications of OCR include:

Data entry for business documents, e.g. check clearing.
Automatic number plate recognition.
Importing business card information into a contact list.
Quickly making textual versions of printed documents, e.g. book scanning for Project Gutenberg.
Making electronic images of printed documents searchable, e.g. Google Books.
Converting handwriting in real time to control a computer (pen computing).
Defeating CAPTCHA anti-bot systems, though these are specifically designed to prevent OCR.

Chapter 2

LITERATURE REVIEW AND SURVEY

This chapter gives information about the literature review drawn from different websites, industry manuals, and IEEE papers related to OCR, landmark recognition and the mapless navigation method.

2.1 Research Paper Review

Current research on landmark recognition and mapless navigation methods mainly focuses on the areas of Intelligent Transportation Systems and neural networks, especially the application of Artificial Intelligence in DAS (Driver Assistance Systems). Some of the papers which are relevant and carry basic information for this work are highlighted briefly below.

F. Moutarde, A. Bargeton, A. Herbin, and L. Chanussot, in their paper "Robust on-vehicle real-time visual detection of American and European speed limit signs, with a modular traffic sign recognition system", discuss pattern-matching techniques to extract a specific landmark region, but not an OCR technique to identify text on the landmark for further reasoning. For example, the authors identify numbers on the landmark via a neural network and use the result to judge whether the landmark is a speed limit sign, without using these numbers to control the speed. [2]

C. Keller, C. Sprunk, C. Bahlmann, J. Giebel, and G. Baratoff, in their paper "Real-time recognition of US speed signs", discuss LDA, which is used to decide whether the number region on the landmark matches one stored beforehand, thus judging whether it is a speed limit sign or not. [3]

G. Qingji, Y. Yue, and Y. Guoqing, in their paper "Detection of public information sign in airport terminal based on multi-scales spatio-temporal vision information", discuss SIFT features used to identify a landmark in an airport, which also does not involve reasoning; besides, the SIFT approach needs to extract and save the feature points and match them during recognition. [4]


J. Maye, L. Spinello, R. Triebel, and R. Siegwart, in their paper "Inferring the semantics of directional signs in public places", propose a system similar to this one in the sense that both make use of the text and arrow information contained in the landmark. However, the authors apply HISM (Hierarchical Implicit Shape Models) to represent the landmark region and identify the sign via a pattern-matching method, without extraction, recognition or interpretation of the text in the region. [5]

T. Breuer, G. Giorgana Macedo, R. Hartanto, N. Hochgeschwender, D. Holz, F. Hegger, Z. Jin and G. Kraetzschmar, in their paper "Johnny: An autonomous service robot for domestic environments", present a technique in which text information extracted from scenes is combined with a robot platform application, something rarely seen in the area of navigation. In the service robot system they propose, the text information is extracted from the image as an auxiliary feature to interpret the semantic information of objects contained in the scene, mainly to reduce the ambiguity of interpretation. [6]

In contrast, I. Posner, P. Corke, and P. Newman, in their paper "Using text-spotting to query the world", point out that their system fully interprets the scene with text information and constructs a query system with a probabilistic model: once a string is input, the robot can locate the scenes related to that string. What is attractive is that the text in the scenes does not have to match the input string exactly; that is, when one inputs "lunch", the robot can return all the scenes that contain "restaurant", which resembles human behaviour. [7]

Balkenius C., in his paper "Spatial learning with perceptually grounded representations", proposes a spatial navigation system based on visual templates. Templates are created by selecting a number of high-contrast features in the image and storing them together with their relative spatial locations in the image. [8]

Franz, Matthias O., and co-authors, in their paper "Learning view graphs for robot navigation", develop a vision-based system for topological navigation in open environments. This system represents selected places by local 360° views of the surrounding scenes. A second approach uses objects of the environment as landmarks, with perception algorithms designed specifically for each object. [9]

Beccari, G. Caselli, S. Zanichelli, in their paper "Qualitative spatial representations from task-oriented perception and exploratory behaviors", describe a series of motor and perceptual behaviors used for indoor navigation of a mobile robot, where walls, doors and corridors are used as landmarks. [10]

Auranuch Lorsakul and Jackrit Suthakorn, in their paper "Traffic Sign Recognition for Intelligent Vehicle/Driver Assistance System Using Neural Network on OpenCV", discuss preprocessing techniques such as thresholding, Gaussian filtering, Canny edge detection, contour extraction and ellipse fitting, followed by a neural network stage to recognize the traffic sign patterns. They also propose two strategies to reduce complexity and decrease the computational cost in order to facilitate real-time implementation. [11]

2.2 Java

Java is a general-purpose, concurrent, class-based, object-oriented programming language that is specifically designed to have as few implementation dependencies as possible. Java was originally developed by James Gosling at Sun Microsystems and released in 1995. [13]

Java is intended to let application developers "Write Once, Run Anywhere" (WORA), meaning that code that runs on one platform does not need to be recompiled to run on another. Java applications are typically compiled to byte code (class files) that can run on any Java virtual machine (JVM) regardless of computer architecture.

Google and Android, Inc. chose Java as a key pillar in the creation of the Android operating system, an open-source smartphone operating system. Although the operating system, built on the Linux kernel, was written largely in C, the Android SDK uses Java for designing applications for the Android platform.

2.2.1 Features of Java

Java is a purely object-oriented language: all Java code is written inside classes and objects.

Java is platform independent, meaning its programs can run on any platform by using byte code. The Java compiler does not produce native executable code for a particular machine like a C compiler would; instead, it produces a special format called byte code, which can be transferred to another computer. Byte code still needs an interpreter to execute it on any given platform. The interpreter reads the byte code and translates it into the native language of the host machine on the fly. Since the byte code is completely platform independent, only the interpreter and a few native libraries need to be ported to get Java to run on a new computer or operating system.

The Java language is standardized enough that executable applications can run on any computer that contains a virtual machine (run-time environment). Virtual machines can be embedded in web browsers (such as Netscape Navigator, Microsoft Internet Explorer, and IBM Web Explorer) and in operating systems.

Java provides a standardized set of class libraries (packages), which support creating graphical user interfaces, controlling multimedia data and communicating over networks.

Java was designed with security in mind. As Java is intended to be used in networked/distributed environments, it implements several security mechanisms to protect against malicious code that might try to invade the file system.

Java is a distributed language, which means that programs can be designed to run on computer networks. Java provides an extensive library of classes for communicating using TCP/IP protocols such as HTTP and FTP. This makes creating network connections much easier than in C/C++.

Java supports multithreaded execution: a program is divided into smaller parts (threads) that are executed with their own sequence and timing. Java is also called interactive because Java code supports both CUI and GUI programs.

2.3 Apache Tomcat

Tomcat is one of the open-source projects from a larger group of projects collectively known as the Apache Jakarta Project. It was initially developed as a servlet reference implementation by James Duncan Davidson at Sun Microsystems. He later helped to make the project open source and played a key role in its donation by Sun Microsystems to the Apache Software Foundation (ASF).

Tomcat is an application, a product of the Apache Software Foundation, which enables a standalone PC to work as a server. This can help with many tasks, such as programming using Java Server Pages (JSP). By installing the Tomcat software, a PC can be used as a server and perform any related task that a server does.

Tomcat executes Java servlets and renders web pages that contain Java Server Page code. Described as a "reference implementation" of the Java Servlet and Java Server Pages (JSP) specifications from Sun Microsystems, Tomcat provides a "pure Java" HTTP web server environment for Java code to run in. Tomcat is the result of the open involvement of developers and is accessible from the Apache web site in both binary and source versions. Tomcat can be used either as a separate product with its own internal web server or together with other web servers, including Apache, Netscape Enterprise Server, Microsoft Internet Information Server (IIS), and Microsoft Personal Web Server. Tomcat requires a Java Runtime Environment that conforms to JRE 1.1 or above.

2.3.1 Benefits of Tomcat Server

The foremost benefit of the Tomcat server is its flexibility. For example, if you wanted to run Apache on one physical server but the Tomcat service and the actual Tomcat JSPs and servlets on another machine, you can. Some companies employ this method to offer an extra level of security, with the Tomcat server behind another firewall, accessible only from the Apache server. Stability is another advantage: if a significant failure within Tomcat caused it to fail completely, it would not render the entire Apache service unusable; only the servlets and JSP pages would be affected.
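For illustration only, the following is a minimal sketch of the kind of servlet Tomcat hosts, here imagined as an entry point that would hand an uploaded landmark image to the OCR stage. The /ocr mapping, the class name and the recognize() stub are hypothetical placeholders; the report does not give the server-side code.

```java
import java.io.IOException;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

/**
 * Illustrative servlet deployed under Tomcat: receives an uploaded landmark
 * image via HTTP POST and replies with the recognized label as plain text.
 * The URL mapping and the recognize() helper are hypothetical.
 */
@WebServlet("/ocr")
public class OcrServlet extends HttpServlet {

    @Override
    protected void doPost(HttpServletRequest request, HttpServletResponse response)
            throws IOException {
        // Read the raw image bytes sent by the phone.
        byte[] imageBytes = request.getInputStream().readAllBytes();

        // Placeholder for the actual OCR stage (Kohonen-network recognition in this project).
        String label = recognize(imageBytes);

        response.setContentType("text/plain");
        response.getWriter().write(label);
    }

    private String recognize(byte[] imageBytes) {
        // Hypothetical stub; the real implementation would run the OCR pipeline.
        return "LEFT TURN";
    }
}
```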

2.4 Android

Android is a mobile operating system based on a modified version of Linux. It is designed primarily for touchscreen mobile devices such as smartphones and tablet computers. It was originally developed by a startup of the same name, Android, Inc. In 2005, as part of its strategy to enter the mobile space, Google purchased Android and took over its development work (as well as its development team). Google wanted Android to be open and free; hence, most of the Android code was released under the open-source Apache License, which means that anyone who wants to use Android can do so by downloading the full Android source code. [17]

The main advantage of adopting Android is that it offers a unified approach to application development. Developers need only develop for Android, and their applications should be able to run on numerous different devices, as long as the devices are powered by Android. In the world of smartphones, applications are the most important part of the success chain. Device manufacturers therefore see Android as their best hope to challenge the iPhone, which already commands a large base of applications.
M-Tech [IAR], Dept. <strong>of</strong> Mechanical Eng., SIT, Mangalore Page | 12


2.4.1 Android Versions

Tab. 2.1 A Brief History of Android Versions

Android Version    Code Name              Release Date
1.5                Cupcake                30-Apr-09
1.6                Donut                  15-Sep-09
2.0-2.1            Eclair                 26-Oct-09
2.2                Froyo                  20-May-10
2.3-2.3.2          Gingerbread            6-Dec-10
2.3.3-2.3.7        Gingerbread            9-Feb-11
3.1                Honeycomb              10-May-11
3.2                Honeycomb              15-Jul-11
4.0.x              Ice Cream Sandwich     16-Dec-11
4.1.x              Jelly Bean             9-Jul-12
4.2.x              Jelly Bean             13-Nov-12

2.4.2 Features of Android

Android applications have a Linux core, which safeguards them against anomalies and prevents them from crashing, leading to robust and stable Android apps.

Simple and easy Android application development process.

Hassle-free application porting, with easy-to-use APIs and development tools.

It allows fast information gathering and testing of the application.

WebKit engine integration for a rich browser facility.

Low mobile application development cost due to its open-source nature.

Saves developers time in understanding their clients' requirements.

Android is based on the Linux kernel, so it offers high performance, stability and security.

Android app development is a tremendous platform for personal application development. It makes networking between applications easy and thereby offers the best experience between Android apps and end users.

Android is the best platform for various smartphones, facilitating developers in building world-class Android applications.

2.5 Embedded C

The most common programming languages for embedded systems are C, BASIC and assembly languages. C used for embedded systems is slightly different from C used for general purposes (on a PC platform). Programs for embedded systems are usually expected to monitor and control external devices and to directly manipulate and use the internal architecture of the processor, such as interrupt handling, timers, serial communications and other available features.

Two salient features of embedded programming are code speed and code size. Code speed is governed by the processing power and timing constraints, whereas code size is governed by the available program memory and the use of the programming language. The goal of embedded system programming is to get the maximum features in the minimum space and minimum time.

Most embedded C compilers (as well as ordinary C compilers) have been developed to support the ANSI (American National Standards Institute) standard, but compared to ordinary C they may differ in the outcome of some statements. A standard C compiler communicates with the hardware components via the operating system of the machine, but a C compiler for an embedded system must communicate directly with the processor and its components.

2.5.1 Features of Embedded C

Embedded C is small and reasonably simple to learn, understand, program and debug.

C compilers are available for almost all embedded devices in use today, and there is a large pool of experienced C programmers.

Unlike assembly, Embedded C has the advantage of processor independence and is not specific to any particular microprocessor/microcontroller or system. This makes it convenient to develop programs that can run on most systems.

Since Embedded C combines the functionality of assembly language with the features of high-level languages, it is treated as a 'middle-level language' or 'high-level assembly language'.

Embedded C is a very efficient programming language.

Embedded C supports access to I/O and eases the management of large embedded projects.


2.6 Outcome of the Literature Survey

From the literature survey, it is evident that no fully sophisticated mapless navigating service robot that uses its own intelligence for navigation has been demonstrated to date. In this project we propose the basic idea of mapless navigation using a robot that detects landmarks in an indoor or customized environment. In this method we implement OCR (optical character recognition) to read the landmark using a Kohonen Neural Network. The main reason for selecting a Kohonen Neural Network is to reduce the computational cost and thereby facilitate real-time implementation. After identifying a landmark, we extract the semantic information of the texts or arrows contained in those signs, and use the result to guide the robot to its destination and to produce speech output.

Implementation of this method results in mapless navigation of robots, similar to a human navigation system that uses landmarks to locate its position and reach its destination easily.

Chapter 3
DESIGN ANALYSIS AND METHODOLOGY

This chapter describes the design methodology and the flow of control in this project using data flow diagrams and flow chart analysis.

The design phase expands the details of an analysis model by taking into account all technical implementations and restrictions. The purpose of the design is to specify a working solution that can easily be translated into programming code and implementation models.

3.1 Data Flow Diagram

Fig. 3.1 DFD for the Proposed System

3.2 Flow Chart

Fig. 3.2 Flow Chart Model of the Proposed System

Chapter 4
IMAGE EXTRACTION ALGORITHM

This chapter describes the detailed technique for extracting images from a landmark using the image extraction algorithm.

4.1 Landmark Image Extraction

4.1.1 Binarization

Landmark extraction is the first stage of this algorithm. The image captured from the camera is first converted to a binary image consisting of only 1s and 0s (only black and white) by thresholding the pixel values: 0 (black) for all pixels in the input image with luminance less than the threshold value, and 1 (white) for all other pixels. The captured image (original image) and the binarized image are shown in Figures 4.1 and 4.2 respectively.

Fig. 4.1 Captured Image    Fig. 4.2 Binarized Image
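The thresholding rule above can be sketched in a few lines. The following Python/NumPy snippet is only an illustration of the rule as described, not the project's actual code, and the threshold value used here is an assumed example:

    import numpy as np

    def binarize(gray, threshold=128):
        # 1 (white) for pixels at or above the threshold, 0 (black) otherwise,
        # matching the binarization rule of Section 4.1.1.
        return (gray >= threshold).astype(np.uint8)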

4.1.2 Smearing

To find the landmark region, the smearing algorithm is used. Smearing is a method for extracting text or sign areas from a mixed image. With the smearing algorithm, the image is processed along vertical and horizontal runs (scan-lines): if the number of white pixels in a run is less than one desired threshold or greater than another desired threshold, the white pixels are converted to black. In this system the threshold values are selected as 10 and 100 for both horizontal and vertical smearing:

If the number of 'white' pixels in a run < 10: the pixels become 'black', removing noise and unwanted spots.
Else: no change.

If the number of 'white' pixels in a run > 100: the pixels become 'black'.
Else: no change.
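A minimal sketch of this run-length smearing rule, assuming a NumPy binary image with 1 for white and 0 for black (an illustration only, not the project's exact implementation):

    import numpy as np

    def smear_runs(binary, low=10, high=100):
        # Convert white runs shorter than `low` or longer than `high` to black,
        # along horizontal and then vertical scan-lines (Section 4.1.2).
        out = binary.copy()
        for img in (out, out.T):              # rows, then columns (T is a shared view)
            for line in img:
                start = None
                values = list(line) + [0]     # sentinel 0 closes a trailing white run
                for i, v in enumerate(values):
                    if v == 1 and start is None:
                        start = i
                    elif v != 1 and start is not None:
                        run = i - start
                        if run < low or run > high:
                            line[start:i] = 0
                        start = None
        return out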

4.2 Landmark Location Finder

After smearing, a morphological operation, dilation, is applied to the image to specify the landmark location. However, there may be more than one candidate region for the landmark location. To find the exact landmark region and eliminate the other regions, some criteria tests are applied to the image by smearing and filtering operations. The processed image of a landmark with text after this stage is shown in Figure 4.3.

Fig. 4.3 Landmark Involving Text

After obtaining the landmark location, the region containing only text or signs is cut from the landmark, as shown in Figure 4.4.

Fig. 4.4 Landmark Text Region

4.3 Segmentation

In the segmentation of the landmark image, the text/sign region of the landmark is segmented into its constituent parts, obtaining the individual images. It includes the following steps.

4.3.1 Filtering

First, the image is filtered to enhance it and to remove noise and unwanted spots. Images are often corrupted by random variations in intensity or illumination, or have poor contrast, and cannot be used directly. In filtering we transform the pixel intensity values to reveal certain image characteristics. Filtering is a necessary process because the presence of noise or unwanted spots would produce wrong results.

4.3.2 Dilation

The dilation operation is applied to the image to separate the individual images from each other when they are close together. After this operation, horizontal and vertical smearing is applied to find the image regions. The result of this segmentation is shown in Figure 4.5.

Fig. 4.5 Separated Text Images after Dilation Process

4.3.3 Individual Image Separation

The next step is to cut the individual images out of the landmark. This is done by finding the starting and end points of each individual image in the horizontal direction. The individual images cut from the landmark are shown in Figure 4.6.

Fig. 4.6 Individual Image Cut

4.3.4 Normalization

Normalization refines each image into a block containing no extra white space (pixels) on any of its four sides. Each image is then fitted to an equal size, as shown in Figure 4.7.

Fig. 4.7 Image after Normalization

This fitting approach is necessary for template matching: to match the images against the database, the input images must be the same size as the database images. Here the images are fitted to 36 x 18 pixels. The extracted images cut from the landmark and the images in the database are now of equal size. The next step is template matching.
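As an illustration of this normalization step (cropping away the surrounding white space and fitting the result to the fixed 36 x 18 block), the following NumPy sketch uses nearest-neighbour resampling; the orientation of the 36 x 18 size (height by width) is an assumption, and this is not the project's actual code:

    import numpy as np

    def normalize_glyph(binary, out_h=36, out_w=18):
        # Crop to the bounding box of the black (0) character pixels, then
        # resample the crop to a fixed block (Section 4.3.4).
        rows = np.where((binary == 0).any(axis=1))[0]
        cols = np.where((binary == 0).any(axis=0))[0]
        crop = binary[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]
        r_idx = (np.arange(out_h) * crop.shape[0]) // out_h
        c_idx = (np.arange(out_w) * crop.shape[1]) // out_w
        return crop[np.ix_(r_idx, c_idx)]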

4.4 Template Matching

Template matching is an effective algorithm for the recognition of images. The image is compared with the ones in the database and the best similarity is measured. Template matching is an image-processing technique for finding the small parts of an image which match a template image or database image. A basic method of template matching uses a convolution mask (template), tailored to the specific feature of the search image that we want to detect. This technique can easily be performed on grey images or edge images.

Fig. 4.8 Database Template Images
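As an illustrative sketch of template matching over equal-sized binary images (the `templates` dictionary below is hypothetical, standing in for the database templates of Figure 4.8):

    import numpy as np

    def best_match(candidate, templates):
        # `templates` maps a label (e.g. 'A', 'LEFT') to a 36 x 18 binary template.
        # The similarity score here is simply the fraction of agreeing pixels.
        best_label, best_score = None, -1.0
        for label, tpl in templates.items():
            score = np.mean(candidate == tpl)
            if score > best_score:
                best_label, best_score = label, score
        return best_label, best_score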

Chapter 5
IMAGE RECOGNITION USING KOHONEN

This chapter introduces the Kohonen Neural Network and describes the detailed techniques used to recognize images with it.

5.1 Introduction to the Network

Optical character recognition, abbreviated as OCR, means converting a text image into a computer-editable text format, for example ASCII code; in this thesis, Unicode is considered as the converted text. Many recognition systems are available in computer science, and OCR plays a prominent role in the field of character recognition. Such recognition systems work well for a simple language like English, which has only 26 image sets; for standard text there are 42 images including capital and small letters. However, OCR of a complex but organized sign language such as a landmark is still at a preliminary level. The reasons for its complexity are its image shapes and its top bars and end bars; moreover, it has some modified, vowel and compound images. In this project we recognize some landmarks and words that are useful in robot navigation by using a neural network. We use the Kohonen Neural Network for the training and recognition procedure, that is, for the classification stage. At the beginning, grayscale and then black-and-white (BW) image conversion take place to produce binary data; these pre-processing steps are described below. After that, the image containing the landmark needs to be converted into a trainable form by means of processing steps, which are also described below.

5.2 General Image Recognition Procedure

Like all other recognition procedures, character recognition is essentially a recognition process. A simple and general character recognition procedure, shown in Figure 5.1, is described below.

First of all, we need a large amount of raw or collected data which will be processed and later used to train the system. It is very important to collect specific data, because later on we need to compare it with similar kinds of data. We also have to consider the complexity level of the collected data, because the next steps depend on the input data type; it can be scanned documents or handwritten documents.

Secondly, we have to consider the pre-processing stage. Here mainly image-processing procedures take place, such as grayscale image conversion, binary image conversion and skew correction. The later processing stage depends on the pre-processing stage, so we need to design these pre-processing steps with great care.

Thirdly, the processing steps occur. Thinning, edge detection, chain code, pixel mapping and histogram analysis are some features of the processing stage. This stage basically converts raw data into trainable components.

Finally, training and recognition, in short the classification stage, take place. The pre-processed and processed data is used to train the system, i.e. to teach it about the incoming data, so that later on it can easily recognize an input.

Fig. 5.1 General Character Recognition Procedure

5.3 Image Recognition Procedure with Kohonen

So far we have described the general image recognition procedure. We now describe the procedure used in this image recognition system (Figure 5.2). The steps are as follows:

a. A landmark image is taken as the raw input data.

b. The extracted landmark image is converted to grayscale and then into a BW image in the pre-processing stage.

c. Pixels are grabbed and mapped into specific areas, and a vector is extracted from the image containing the given word or image. This part is considered the processing stage.

d. Lastly, the Kohonen Neural Network is used as the classification stage.

Fig. 5.2 Image Recognition Procedure Using Kohonen Neural Network

5.4 Data Collection

As mentioned, we have to choose the input data type with great care, because the system must be developed according to the raw or collected data. Here we are dealing with character and sign recognition on landmarks, so we obviously need landmark images as well as text characters; we also consider some given words. However, no word-level or character-level segmentation is considered here; rather, a whole single-character image or single-word image is taken as the raw input. Before entering the system, the image is resized to 250 x 250 pixels to satisfy the procedure, no matter whether it is a text character or a sign image. Computer-generated images were used to train the system, and no skew correction takes place, so when capturing an image we have to be careful about the image size and shape.

Here the whole text or sign is taken and trained, because landmark images contain many features, irregular shapes and curvatures, as we have seen, and no general formula has yet been derived for feature extraction from the landmark. So rather than extracting characters or individual images from the landmark, we take the whole text or the whole landmark image as input data in the first phase of the image recognition system.

Fig. 5.3 Cropped Input Image from Landmark Region

One thing is really important here: the character size in the image. A character should not be only partially present on the landmark image, and it should not be too small. In this proposed system we consider a bold font of size 36 or above for each individual text or image.

5.5 Image Pre-processing

A digital text image that contains characters is generally an RGB image. The figures below show two types of images containing the digital English character 'T'. The character in Figure 5.4 was captured with a camera and resized to 250 x 250 pixels, while Figure 5.5 is a digital computer image.

Fig. 5.4 Captured Input Image    Fig. 5.5 Computer Image


5.5.1 RGB to Grayscale Image Conversion

In the first pre-processing stage we convert the input RGB image into a grayscale image. Here Otsu's algorithm is used: it computes the threshold level that is applied in the grayscale-to-binary conversion of the next stage. The algorithm is given below:

1. Count the number of pixels of each colour level (256 levels) and save it in the matrix count.

2. Calculate the probability matrix P of each colour level: Pi = counti / sum of the counts, where i = 1, 2, ..., 256.

3. Find the matrix omega: omegai = cumulative sum of Pi, where i = 1, 2, ..., 256.

4. Find the matrix mu: mui = cumulative sum of Pi * i, where i = 1, 2, ..., 256, and mu_t = the total sum of Pi * i over all 256 levels.

5. Calculate the matrix sigma_b_squared, where sigma_b_squaredi = (mu_t * omegai - mui)^2 / (omegai * (1 - omegai)).

6. Find the location, idx, of the maximum value of sigma_b_squared. The maximum may extend over several bins, so the locations are averaged together.

7. If the maximum is not a number (meaning that sigma_b_squared is all NaN), then the threshold is 0.

8. If the maximum is a finite number, threshold = (idx - 1) / (256 - 1).
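The steps above correspond to Otsu's global thresholding. A compact NumPy sketch of the same computation (an illustration only, not the project's exact code) is:

    import numpy as np

    def otsu_level(gray):
        # gray: 2-D array of 0..255 intensities. Returns the threshold level in [0, 1],
        # following steps 1-8 of Section 5.5.1.
        counts = np.bincount(gray.ravel(), minlength=256).astype(float)   # step 1
        p = counts / counts.sum()                                         # step 2
        omega = np.cumsum(p)                                              # step 3
        mu = np.cumsum(p * np.arange(1, 257))                             # step 4
        mu_t = mu[-1]
        with np.errstate(divide='ignore', invalid='ignore'):
            sigma_b2 = (mu_t * omega - mu) ** 2 / (omega * (1.0 - omega)) # step 5
        if np.all(np.isnan(sigma_b2)):
            return 0.0                                                    # step 7
        idx = np.nanargmax(sigma_b2)        # step 6 (first maximum; ties could be averaged)
        return idx / 255.0                  # step 8, with a 0-based index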

Figure 5.6 below shows an RGB image and Figure 5.7 shows the grayscale-converted image.

Fig. 5.6 RGB Image    Fig. 5.7 Grayscale Image

5.5.2 Grayscale to Binary Image Conversion

In the second pre-processing stage we convert the grayscale image into a binary image. In a grayscale image there are 256 levels between black and white, where 0 means pure black and 255 means pure white. The image is converted to a binary image by checking whether or not each pixel value is greater than 255 * level (where level is found by Otsu's method). If the pixel value is greater than or equal to 255 * level, the value is set to 1, i.e. white; otherwise it is set to 0, i.e. black. Figure 5.8 shows a grayscale image with its 0-255 histogram and Figure 5.9 shows the BW or binary image with its two-level histogram. [14]

Fig. 5.8 Grayscale Image with Histogram    Fig. 5.9 Binary Image with Histogram

5.6 Feature Extraction

The next and most important part of the given character recognition is feature extraction. In this system we follow a few steps to extract a vector, since our main target is finding a vector from the image. The image has already been processed and a binary image created, so we have only two types of data in the image: 1 for white space and 0 for black space. We now have to go through the following steps to create a 625-element vector for a particular character or image: [15]

1. Pixel grabbing.

2. Finding the probability of making a square.

3. Mapping to the sampled area.

4. Creating the vector.

5. Representing the character with a model number.

5.6.1 Pixel Grabbing from the Image

Since we are working with a binary image of fixed size, we can easily read the 250 x 250 pixels of a particular image containing the given text character or sign. Note that we can grab and separate only the character portion of the digital image. Specifically, we take an image containing a given character, which is of course a binary image. As we specified, a pixel with the value 1 is a white spot and a pixel with the value 0 is a black one, so the 0-valued portions form the original character.

5.6.2 Finding the Probability of Making a Square

Now we sample the entire image into a specified number of cells so that we can obtain the vector easily. We specify a sampled area of 25 x 25 cells, so the 250 x 250 image has to be mapped onto a 25 x 25 grid; for each sampled cell we therefore take 10 x 10 pixels from the binary image. A short example: Table 5.1 shows an original binary image of 25 x 15 pixels, which we sample into 5 x 3 cells, so for each cell we consider 5 x 5 pixels of the binary image. Table 5.2 shows how the pixels are classified to find the probability of making a square.

Tab. 5.1 Binary Converted Grid Values

5.6.3 Mapping to the Sampled Area

Recalling the previous example, the same sample pixels separated from the binary image are shown in Table 5.2. For each 5 x 5 block of separated pixels we assign a unique number, and the number of blocks equals the 5 x 3 sampled cells. We do not need to ask whether a 5 x 5 block forms a perfectly black or white square; instead we take the majority of 0s or 1s within the block. If the 0s have the majority in the i-th block, we mark a black square at the i-th position of the sampled area. Table 5.2 shows the blocks with their unique numbers and whether each is covered black or white in this probabilistic manner.

Table 5.2 Aligned Marked Values

Here is an example of how the 250 x 250 pixels of the English character 'T' are sampled into the 25 x 25 sampled area.

Fig. 5.10 Sampled Image

5.6.4 Creating the Vector

Once we have sampled the binary image we have black squares and white squares. We now put a single 1 (one) for each black square and a 0 (zero) for each white square, so Figure 5.10 above is represented by the combination of 1s and 0s in Figure 5.11 below.

Fig. 5.11 Vector Representation

Now we collect each row and combine them together to make a vector. The vector for Figure 5.11 is given below.

111001001011111111111111111010000001111111111111111111111111101000000001
011111111111111111111110111110000000111111111011111111111111111010000000
000000000000000111100000000000000000000000000000011111000000000000000000
000000000000001111100000000000000010000000000000001111100000000000000000
000000000000000111110000000000000000000000000000001111100000000010000000
010000100000000011110000000000000100000000000000000001111100000000000000
000000000000000000111110000000100000000000000000001000011110100000000000
000000000000000000001111000000000000000000000000000000000011110000000000
000000000000000000000011101000000000010000001000000000000000111100000000
0001000000000000000000000011110000000000000001000000000000000.
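A sketch of steps 1-4 above (grabbing the 250 x 250 binary pixels, taking the majority of each 10 x 10 block, and flattening the 25 x 25 grid into a 625-element vector); this is an illustration under the conventions stated above, not the project's exact code:

    import numpy as np

    def image_to_vector(binary_250):
        # binary_250: 250 x 250 array with 1 = white, 0 = black (character pixels).
        blocks = binary_250.reshape(25, 10, 25, 10)
        # A sampled cell becomes "black" when 0s (character pixels) are the majority
        # of its 10 x 10 block; black cells are encoded as 1 in the vector (Section 5.6.4).
        black_majority = (blocks == 0).sum(axis=(1, 3)) > 50
        return black_majority.astype(np.uint8).flatten()     # length 625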

5.6.5 Representing a Character with a Model Number

One thing needs to be mentioned here: the system assigns a numerical model number (or special symbol) to each vector, together with the corresponding input word or text character for that particular model. This is because a given character has a fixed length, whereas the words we also consider have irregular lengths. When we need to train, we train with the model number, and the model number knows its corresponding input character. In short, a particular model has a unique vector of 625 values of 1s and 0s and a unique character.

5.7 Kohonen Neural Network

We have done a lot of work in the pre-processing and processing stages; the main idea is to make the data simple and acceptable to the Kohonen Neural Network. The Kohonen neural network contains no hidden layer. This network architecture is named after its creator, Teuvo Kohonen. The Kohonen neural network differs from the feed-forward back-propagation neural network in several important ways.

5.7.1 Introduction to the Kohonen Network

The Kohonen neural network differs considerably from the feed-forward back-propagation neural network, both in how it is trained and in how it recalls a pattern. The Kohonen neural network does not use any sort of activation function. Further, it does not use any sort of bias weight. [12]

Output from the Kohonen neural network does not consist of the output of several neurons. When a pattern is presented to a Kohonen network, one of the output neurons is selected as the "winner", and this winning neuron is the output of the Kohonen network. Often these winning neurons represent groups in the data presented to the network. For example, in this system we consider 10 digits, 5 vowels, 21 consonants and some signs from the total model. The most significant difference between the Kohonen neural network and the feed-forward back-propagation neural network is that the Kohonen network is trained in an unsupervised mode: the network is presented with data, but the correct output corresponding to that data is not specified. Using the Kohonen network, the data can be classified into groups. We will begin a review of the Kohonen network by examining the training process.

Since the vector length is 625, the input layer has 625 neurons, while the number of neurons in the output layer depends on the number of characters trained with the network. With 625 inputs and n output characters, the corresponding Kohonen Neural Network model can be drawn as shown in Figure 5.12.

Fig. 5.12 Kohonen Neural Network

5.7.2 The Structure of the Kohonen Network

The Kohonen neural network contains only an input and an output layer of neurons; there is no hidden layer in a Kohonen neural network. First we will examine the input and output of a Kohonen neural network. [12]

The input to a Kohonen neural network is given to the network through the input neurons. Each input neuron is given one of the floating-point numbers that make up the input pattern. A Kohonen neural network requires that these inputs be normalized to the range between -1 and 1. Presenting an input pattern to the network causes a reaction from the output neurons. In a Kohonen neural network only one of the output neurons actually produces a value, and this single value is either true or false. When a pattern is presented to the Kohonen neural network, one single output neuron is chosen as the output neuron. Therefore, the output of the Kohonen neural network is usually the index of the neuron that fired. The structure of a typical Kohonen neural network is shown in Figure 5.13.

Fig. 5.13 Simple Kohonen Network with 2 Input and 2 Output Neurons

5.7.3 Sample Input to the Kohonen Network

Now that we understand the structure of the Kohonen neural network, we will examine how the network processes information by stepping through the calculation process. For this example we consider a very simple Kohonen neural network with only two input and two output neurons. The input given to the two input neurons is shown in Table 5.3.

Table 5.3 Sample Inputs to a Kohonen Neural Network

We must also know the connection weights between the neurons. These connection weights are given in Table 5.4.

Table 5.4 Connection Weights in the Sample Kohonen Neural Network

Using these values we will now examine which neuron would win and produce output. We begin by normalizing the input.

5.7.4 Normalizing the Input

The requirements that the Kohonen neural network places on its input data are one of its most severe limitations. Input to the Kohonen neural network should be between the values -1 and 1. In addition, each of the inputs should fully use this range; if one or more of the input neurons were to use only the numbers between 0 and 1, the performance of the neural network would suffer. To normalize the input we must first calculate the "vector length" of the input data, which is done by summing the squares of the input vector. In this case it is (0.5 * 0.5) + (0.75 * 0.75), which gives a "vector length" of 0.8125. If the length were to become too small (less than some arbitrarily small value), the length would be set to that arbitrarily small value instead; in this case the "vector length" is a sufficiently large number. Using this length we can now determine the normalization factor, which is the reciprocal of the square root of the length. For this value the normalization factor works out to 1.1094. This normalization will be used in the next step, where the output layer is calculated.
will be used in the next step where the output layer is calculated.<br />

5.7.5 Calculating Each Neuron's Output

To calculate the output, both the input vector and the neuron connection weights must be considered. First the dot product of the input neurons and their connection weights must be calculated. To calculate the dot product between two vectors, the corresponding elements of the two vectors are multiplied and the products summed. We will now examine how this is done.

The Kohonen algorithm specifies that we must take the dot product of the input vector and the weights between the input neurons and the output neurons. For the first output neuron this is

[0.5  0.75] . [w1  w2] = (0.5 * w1) + (0.75 * w2)

where w1 and w2 are the connection weights from the two input neurons to the first output neuron (Table 5.4). As we can see from the above calculation, the dot product is 0.395. This calculation is performed for the first output neuron and has to be repeated for each of the output neurons; in this example we only examine the calculations for the first output neuron, and those for the second output neuron are carried out in the same way.

This output must now be normalized by multiplying it by the normalization factor that was determined in the previous step. Multiplying the dot product of 0.395 by the normalization factor of 1.1094 gives an output of 0.438213. Now that the output has been calculated and normalized, it must be mapped to a bipolar number.

5.7.6 Mapping to Bipolar

In the bipolar system, binary zero maps to -1 and binary one remains 1. Because the input to the neural network was normalized to this range, we must perform a similar mapping on the output of the neurons. To make this mapping we add one to the value and divide the result in half. For the output of 0.438213 this results in a final output of 0.7191065. The value 0.7191065 is the output of the first neuron; it will be compared with the outputs of the other neurons, and by comparing these values we can determine a "winning" neuron.
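Putting the last three subsections together for the first output neuron (the weight values come from Table 5.4, which is not reproduced here, so the dot product of 0.395 is simply taken as given); an illustrative calculation only:

    norm_factor = 1.1094                    # from Section 5.7.4
    dot = 0.395                             # dot product of the input vector and neuron 1's weights
    normalized = dot * norm_factor          # = 0.438213
    bipolar_output = (normalized + 1) / 2   # = 0.7191065, the output of neuron 1
    print(normalized, bipolar_output)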

5.7.7 Choosing a Winner

We have seen how to calculate the value for the first output neuron. To determine a winning output neuron we must also calculate the value for the second output neuron. We will now quickly review the process for the second neuron; for a more detailed description, refer to the previous section.

The second output neuron uses exactly the same normalization factor as was used to calculate the first output neuron, namely 1.1094. Taking the dot product of the weights of the second output neuron and the input vector gives a value of 0.45. This value is multiplied by the normalization factor of 1.1094 to give the value of 0.0465948. We can now calculate the final output for neuron 2 by converting the output of 0.0465948 to bipolar, which yields 0.49923.

We now have an output value for each of the neurons: the first neuron has an output value of 0.7191065 and the second neuron has an output value of 0.49923. To choose the winning neuron we choose the output that has the largest value. In this case the winning neuron is the first output neuron, with an output of 0.7191065, which beats neuron two's output of 0.49923.

We have now seen how the output of the Kohonen neural network is derived, and that the weights between the input and output neurons determine this output.

5.7.8 Kohonen Network Learning Procedure

The training process for the Kohonen neural network is competitive: for each training set one neuron will "win". The winning neuron has its weights adjusted so that it reacts even more strongly to that input the next time. As different neurons win for different patterns, their ability to recognize those particular patterns increases. We will first examine the overall process of training the Kohonen neural network.

5.7.9 Learning Algorithm Flowchart

Fig. 5.14 Flow Chart Model for the Learning Algorithm

5.7.10 Learning Rate

The learning rate is a constant used by the learning algorithm. It must be a positive number less than 1; typically it is a value such as 0.4 or 0.5. In the following sections the learning rate is denoted by the symbol alpha. Generally, setting the learning rate to a larger value causes training to progress faster, but setting it too large can cause the network never to converge, because the oscillations of the weight vectors become too great for the classification patterns ever to emerge. Another technique is to start with a relatively high learning rate and decrease it as training progresses, which allows initial rapid training of the neural network that is then "fine-tuned" as training continues. The learning rate is just a variable used as part of the algorithm that adjusts the weights of the neurons.

5.7.11 Adjusting the Weights

The entire memory of the Kohonen neural network is stored in the weighted connections between the input and output layers. The weights are adjusted in each epoch. An epoch occurs when one item of training data is presented to the Kohonen neural network and the weights are adjusted based on the result. The adjustments to the weights should produce a network that yields more favourable results the next time the same training data is presented. Epochs continue as more and more data is presented to the network and the weights are adjusted. Eventually the return on these weight adjustments diminishes to the point where it is no longer worthwhile to continue with this particular set of weights. When this happens, the entire weight matrix is reset to new random values, forming a new cycle. The final weight matrix used is the best weight matrix found across all of the cycles. We will now examine how the weights are transformed. The original method for calculating the changes to weights, proposed by Kohonen, is often called the additive method. This method uses the following equation:

w(t+1) = (w(t) + x) / || w(t) + x ||

The variable x is the training vector that was presented to the network, w(t) is the weight of the winning neuron, and w(t+1) is the new weight. The double vertical bars represent the vector length.

The additive method generally works well for Kohonen neural networks. However, in cases where the additive method shows excessive instability and fails to converge, an alternative, called the subtractive method, can be used. The subtractive method uses the following equations:

e = x - w(t)
w(t+1) = w(t) + (alpha * e)

These two equations show the basic transformation that occurs on the weights of the network.
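A minimal sketch of both update rules for the winning neuron's weight vector, following the equations above (the learning rate default below is an assumed example value, and this is an illustration rather than the project's code):

    import numpy as np

    def additive_update(w_t, x):
        # w(t+1) = (w(t) + x) / || w(t) + x ||   (additive method)
        new_w = w_t + x
        return new_w / np.linalg.norm(new_w)

    def subtractive_update(w_t, x, alpha=0.5):
        # e = x - w(t);  w(t+1) = w(t) + alpha * e   (subtractive method, alpha = learning rate)
        return w_t + alpha * (x - w_t)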

5.7.12 Calculating the Errors

Before we can understand how to calculate the error for a Kohonen neural network, we must first understand what the error means. The Kohonen neural network is trained in an unsupervised fashion, so the definition of the error is somewhat different from what we normally think of as an error. The purpose of the Kohonen neural network is to classify the input into several sets; the error for the Kohonen neural network must therefore measure how well the network is classifying these items.

5.7.13 Recognition with the Kohonen Network

For a given pattern we can easily find the vector and send it through the Kohonen Neural Network, and for that particular pattern one of the neurons will fire. Since the weights are normalized for all input patterns, the input pattern is calculated against the normalized weights; as a result, the neuron that fires is the best answer for that particular input pattern.
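The recognition step can therefore be sketched as a single forward pass: normalize the 625-element input vector, compute each output neuron's value, and report the index of the winning neuron as the recognized model number. The snippet below is an illustration only; `weights` is a hypothetical matrix of trained connection weights:

    import numpy as np

    def recognize(vector, weights):
        # vector: 625-element input pattern; weights: (n_outputs, 625) trained weights.
        length = np.sum(vector.astype(float) ** 2)
        norm_factor = 1.0 / np.sqrt(max(length, 1e-12))   # guard against a zero-length input
        outputs = (weights @ vector) * norm_factor        # normalized dot product per neuron
        outputs = (outputs + 1.0) / 2.0                    # map to the bipolar-style 0..1 range
        return int(np.argmax(outputs))                     # index of the "winning" neuron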

Chapter 6
HARDWARE DESCRIPTION AND IMPLEMENTATION

This chapter gives a brief description of the hardware implementation of the proposed system, the components used and their circuit logic.

6.1 Circuit Diagram

Fig. 6.1 Circuit Diagram of the Proposed System

The hardware implementation of this project includes the following components.

6.1.1 RFID Reader

RFID stands for Radio Frequency Identification. RFID is one member of the family of Automatic Identification and Data Capture (AIDC) technologies and is a fast and reliable means of identifying objects. There are two main components: the interrogator (RFID reader), which transmits and receives the signal, and the transponder (tag), which is attached to the object. An RFID tag is composed of a minuscule microchip and an antenna. RFID tags can be passive or active and come in a wide variety of sizes, shapes and forms. Communication between the RFID reader and the tags occurs wirelessly and generally does not require a line of sight between the devices.

An RFID reader can read through almost anything, with the exception of conductive materials like water and metal, but with modifications and positioning even these can be overcome. The RFID reader emits a low-power radio wave field which is used to power up the tag so that it can pass on any information contained on its chip. In addition, readers can be fitted with an additional interface that converts the radio waves returned from the tag into a form that can be passed on to another system, such as a computer or a programmable logic controller. Passive tags are generally smaller, lighter and less expensive than active tags; they can be applied to objects in harsh environments, are maintenance free and will last for years. These transponders are only activated when they are within the response range of an RFID reader. Active tags differ in that they incorporate their own power source, so the tag is a transmitter rather than a reflector of radio frequency signals, which enables a broader range of functionality such as programmable and read/write capabilities.

Fig. 6.2 RFID Reader

Fig. 6.3 Block Diagram of the LF DT125R Module

The LF DT125R reader consists of an RF front end interfaced with a baseband processor that operates from a +5 V power supply. An antenna interfaces with the RF front end and is tuned at 125 kHz to detect a tag (transponder) that comes into the vicinity of the reader field. The data read from the tag by the front end is detected and decoded by the baseband processor and is then sent to the UART interface. The DT125R is designed for a reading range of 50 mm to 100 mm. An LED and a beeper can be interfaced to indicate the tag-read status. The DT125R has built-in circuitry for noise reduction.

6.1.2 Bluetooth Module

The concept behind Bluetooth is to provide a universal short-range wireless capability. Using the 2.4 GHz band, available globally for unlicensed low-power use, two Bluetooth devices within 10 m of each other can share up to 720 Kbps of capacity. Bluetooth is designed to operate in an environment of many users: up to eight devices can communicate in a small network called a piconet. Networks are usually formed ad hoc from portable devices such as cellular phones, handhelds and laptops. Unlike the other popular wireless technology, Wi-Fi, Bluetooth offers higher-level service profiles, e.g. FTP-like file servers, file pushing, voice transport, serial line emulation, and more.

Fig. 6.4 Bluetooth SMD Module - RN-42

In this project we are using the Bluetooth SMD Module RN-42. This module from Roving Networks is powerful, small, and very easy to use. The Bluetooth module is designed to replace serial cables: the Bluetooth stack is completely encapsulated, and the end user just sees serial characters being transmitted back and forth. Press the 'A' character in a terminal program on the computer and an 'A' will be pushed out of the TX pin of the Bluetooth module.

The RN-42 is perfect for short-range, battery-powered applications. The RN-42 uses only 26 µA in sleep mode while still being discoverable and connectable. Multiple user-configurable power modes allow the user to dial in the lowest power profile for a given application.

6.1.3 PIC16F873A Microcontroller

PIC microcontrollers are quickly replacing computers when it comes to programming robotic devices. These microcontrollers are small, can be programmed to carry out a number of tasks, and are ideal for school and industrial projects. A simple program is written using a computer and then downloaded to the microcontroller, which in turn can control a robotic device. In this project we use the PIC16F873A microcontroller because it is very easy to use and employs FLASH memory technology, so it can be written and erased up to a thousand times. The superiority of this RISC microcontroller compared with other 8-bit microcontrollers lies especially in its speed and its code compression. The PIC16F873A has 40 pins with 33 I/O paths, and the 40 pins are divided into 5 ports.

Fig. 6.5 Pin Configuration of PIC16F873A


<strong>OCR</strong> <strong>Based</strong> <strong>Mapless</strong> <strong>Navigation</strong> <strong>Method</strong> Of <strong>Robot</strong><br />

The 40 pins make it easier to use the peripherals as the functions are spread out<br />

over the pins. This makes it easier to decide what external devices to attach without<br />

worrying too much if there enough pins to do the job. One <strong>of</strong> the main advantages <strong>of</strong> this<br />

is each pin is only shared between two or three functions so it‟s easier to decide what the<br />

pin functions.<br />

PIC16F873A perfectly fits many uses, from automotive industries and controlling<br />

home appliances to industrial instruments, remote sensors, electric door locks and safety<br />

devices. It is also ideal for smart cards as well as for battery supplied devices because <strong>of</strong><br />

its low consumption. EEPROM memory makes it easier to apply microcontrollers to<br />

devices where permanent storage <strong>of</strong> various parameters is needed (codes for transmitters,<br />

motor speed, receiver frequencies, etc.). Low cost, low consumption, easy handling and<br />

flexibility make PIC16F873A applicable even in areas where microcontrollers had not<br />

previously been considered (example: timer functions, interface replacement in larger<br />

systems, coprocessor applications, etc.).<br />

6.1.4 ULN2003

The ULN2003 is a high-voltage, high-current Darlington array IC. It contains seven open-collector Darlington pairs with common emitters. A Darlington pair is an arrangement of two bipolar transistors.

Fig. 6.6 ULN2003

The ULN2003AP/AFW series are high-voltage, high-current Darlington drivers comprising seven NPN Darlington pairs. The ULN2003 belongs to the ULN200X family of ICs, different versions of which interface to different logic families; the ULN2003 is intended for 5 V TTL and CMOS logic devices. These ICs are used for driving a wide range of loads and serve as relay drivers, display drivers, line drivers, etc. The ULN2003 is also commonly used for driving stepper motors (refer to stepper motor interfacing using the ULN2003).

Fig. 6.7 Pin Configuration of ULN2003

Each channel or Darlington pair in the ULN2003 is rated at 500 mA and can withstand a peak current of 600 mA. The inputs and outputs are located opposite each other in the pin layout. Each driver also contains a suppression diode to dissipate voltage spikes while driving inductive loads.

These versatile devices are useful for driving a wide range of loads including solenoids, relays, DC motors, LED displays, filament lamps, thermal print heads and high-power buffers.

6.1.5 Relay

A relay is an electrically operated switch. Many relays use an electromagnet to operate a switching mechanism mechanically, but other operating principles are also used. Relays are used where it is necessary to control a circuit with a low-power signal (with complete electrical isolation between the control and controlled circuits), or where several circuits must be controlled by one signal.

Fig. 6.8 Relay

All relays operate using the same basic principle. In this example we will use a commonly used 4-pin relay. Relays have two circuits: a control circuit (shown in GREEN) and a load circuit (shown in RED). The control circuit has a small control coil while the load circuit has a switch; the coil controls the operation of the switch.

Fig. 6.9 Relay Control Circuit

Current flowing through the control circuit coil (pins 1 and 3) creates a small magnetic field which causes the switch between pins 2 and 4 to close. The switch, which is part of the load circuit, is used to control an electric circuit that may be connected to it. When the relay is energized, current flows through pins 2 and 4 (shown in RED).

Fig. 6.10 Relay Energized (ON)

When current stops flowing through the control circuit (pins 1 and 3), the relay becomes de-energized. Without the magnetic field, the switch opens and current is prevented from flowing through pins 2 and 4. The relay is now OFF.

Fig. 6.11 Relay De-Energized (OFF)

M-Tech [IAR], Dept. <strong>of</strong> Mechanical Eng., SIT, Mangalore Page | 45


<strong>OCR</strong> <strong>Based</strong> <strong>Mapless</strong> <strong>Navigation</strong> <strong>Method</strong> Of <strong>Robot</strong><br />

When no voltage is applied to pin 1, there is no current flow through the coil. No<br />

current means no magnetic field is developed, and the switch is open. When voltage is<br />

supplied to pin 1, current flow through the coil creates a magnetic field needed to close<br />

the switch allowing continuity between pins 2 and 4.<br />

6.1.6 DC Motor<br />

Fig. 6.12 Relay Operation<br />

A DC motor is a mechanically commutated electric motor powered from direct current (DC). The stator is by definition stationary in space, and therefore so is its current. The current in the rotor is switched by the commutator so that it, too, is stationary in space. This is how the relative angle between the stator and rotor magnetic flux is maintained near 90 degrees, which generates the maximum torque.

Operation of a DC motor is based on the principle that when a current-carrying conductor is placed in a magnetic field, the conductor experiences a mechanical force. The direction of this force is given by Fleming's left-hand rule, and its magnitude is given by:

F = BIℓ newtons
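As an illustrative figure (not a value measured in this project), a conductor of active length ℓ = 0.1 m carrying I = 2 A in a flux density B = 0.5 T experiences a force F = 0.5 × 2 × 0.1 = 0.1 N.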

Fig. 6.13 Working of DC Motor


DC motors have a rotating armature winding (the winding in which a voltage is induced) but a non-rotating armature magnetic field, and a static field winding (the winding that produces the main magnetic flux) or a permanent magnet. Different connections of the field and armature windings provide different inherent speed/torque regulation characteristics. When the terminals of the motor are connected to an external source of direct-current supply:

(i) The field magnets are excited and develop alternate N and S poles.

(ii) The armature conductors carry currents. All conductors under the N-pole carry currents in one direction, while all conductors under the S-pole carry currents in the opposite direction.

Since each armature conductor carries current and is placed in the magnetic field, a mechanical force acts on it. Applying Fleming's left-hand rule, it is clear that the force on each conductor tends to rotate the armature in the anticlockwise direction. All these forces add together to produce a driving torque which sets the armature rotating. When a conductor moves from one side of a brush to the other, the current in that conductor is reversed and at the same time it comes under the influence of the next pole, which is of opposite polarity. Consequently, the direction of the force on the conductor remains the same.

The speed of a DC motor can be controlled by changing the voltage applied to the armature or by changing the field current; introducing a variable resistance in the armature circuit or the field circuit therefore allows speed control, as the relation below illustrates.
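From general DC machine theory (not derived in this report), the back e.m.f. is E_b = V − I_a·R_a and the speed is proportional to E_b/Φ, i.e. N ∝ (V − I_a·R_a)/Φ. This shows why raising the armature voltage or weakening the field flux raises the speed, while adding armature-circuit resistance lowers it.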

Fig. 6.14 Hardware Implementation of Project


Chapter 7

OVERALL DESCRIPTION & IMPLEMENTATION

7.1 System Perspective

This project provides a way to navigate the robot without any human intervention. A robot serves the purpose here, with the camera mounted on it. The communication between the robot and the PC is through GPRS (via the cell phone), so the distance between the control unit and the robot does not matter; the distance between the cell phone and the robot vehicle, however, should be kept small because they communicate through Bluetooth.

A Java application runs on the server side and an Android application runs on the mobile. Initially the robot moves in a particular direction. When the robot comes across an RF card, it stops immediately, takes a snapshot and sends it to the server. The server processes the image and sends instructions back to the robot.

As soon as the RF card reader gets the data, the microcontroller stops the robot and sends an instruction to the cell phone through Bluetooth to capture the image. The cell phone takes the image and sends it over GPRS to the server for processing. The server receives the image from the cell phone and applies OCR to extract the data. Based on the extracted data, the server sends an instruction to the robot, and the robot moves according to that instruction. If the data is informational text such as "Restaurant", "Petrol pump" or "Men at work", the server instructs the robot to speak the received data, after which the robot waits for the next instruction.
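As a minimal sketch of this server-side decision step, the Java fragment below maps a recognized landmark text to a single-character command for the robot. The method name, command characters and landmark strings are illustrative assumptions, not the exact code or command set used in the project.

```java
public class NavigationDecision {

    // Maps the text recognized by OCR to a one-character robot command.
    // 'V' stands for "voice": speak the landmark text and wait for the next instruction.
    static char decide(String landmark) {
        if ("LEFT".equalsIgnoreCase(landmark))  return 'L';  // turn left
        if ("RIGHT".equalsIgnoreCase(landmark)) return 'R';  // turn right
        if ("STOP".equalsIgnoreCase(landmark))  return 'S';  // halt
        return 'V';                                          // informational sign
    }

    public static void main(String[] args) {
        // Stand-in for the OCR result extracted from the uploaded image.
        String landmark = "Restaurant";
        System.out.println("Recognized \"" + landmark + "\" -> command " + decide(landmark));
    }
}
```

In the actual system the command produced by this step is sent back over GPRS to the mobile, which forwards it to the robot over Bluetooth.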

7.2 Operating Environment

The mobile application is created using Android and the server application is created using Java, which is platform independent; hence the server application runs on all platforms. An Android cell phone with Android OS 2.1 or above is needed, and it should be GPRS enabled.

7.3 Design and Implementation Constraints

This system is developed using the Android and Java programming environments. On the client side, the application uses the built-in text-to-speech facility to produce speech output. Almost all Android phones come with this feature.



7.4 User Documentation

This project consists of two parts.

1. Server: two applications run on the server.

a) Web application: receives the captured image from the mobile; the received image is stored on the local disk (a sketch of this step is given after this list).

b) Java application: applies OCR to the received image, identifies the particular landmark text/sign and sends the instructions to the mobile to navigate the robot.

2. Client (mobile): an Android application runs on the phone; when the data comes from the robot it takes a snapshot of the sign board, sends it to the server through GPRS for processing, and receives the instructions from the server for robot navigation.
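The following is a minimal sketch of the web-application part described in 7.4(a), assuming the mobile posts the raw image bytes in the HTTP request body; the class name, target file path and upload format are illustrative assumptions rather than the project's actual code.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Receives the image posted by the mobile and stores it on the local disk.
public class ImageUploadServlet extends HttpServlet {

    protected void doPost(HttpServletRequest req, HttpServletResponse resp)
            throws ServletException, IOException {
        InputStream in = req.getInputStream();
        OutputStream out = new FileOutputStream("C:/Auto_Navigation/received.jpg"); // assumed path
        byte[] buf = new byte[4096];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);          // copy the uploaded bytes to disk
        }
        out.close();
        resp.getWriter().write("OK");      // acknowledge the upload to the mobile
    }
}
```

The OCR application then picks up the stored file, recognizes the landmark and returns the corresponding command, as outlined in Section 7.1.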

7.5 Hardware Interfaces

1. Bluetooth- and GPRS-enabled Android cell phone
2. Robot vehicle model
3. Microcontroller
4. Regulators
5. Battery
6. Bluetooth module
7. RFID reader
8. RF cards

7.6 Software Interfaces

1. Application: Java, Android.
2. Network: the application depends on the internet (GPRS) and Bluetooth.
3. Mobile operating system: Android OS 2.1 or a higher version.
4. Text to speech: used to convert text to voice (see the sketch below).
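As a minimal sketch of this text-to-speech step, the Android fragment below speaks a landmark string using the platform's built-in TTS engine (available on Android 2.1 and above). The class name and the spoken string are illustrative assumptions.

```java
import android.app.Activity;
import android.os.Bundle;
import android.speech.tts.TextToSpeech;
import java.util.Locale;

// Speaks the landmark text received from the server using Android's built-in TTS engine.
public class SpeakLandmarkActivity extends Activity implements TextToSpeech.OnInitListener {

    private TextToSpeech tts;

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        tts = new TextToSpeech(this, this);   // asynchronous engine initialisation
    }

    @Override
    public void onInit(int status) {
        if (status == TextToSpeech.SUCCESS) {
            tts.setLanguage(Locale.US);
            // QUEUE_FLUSH discards anything queued earlier and speaks immediately.
            tts.speak("Restaurant", TextToSpeech.QUEUE_FLUSH, null);
        }
    }

    @Override
    protected void onDestroy() {
        if (tts != null) {
            tts.shutdown();                   // release the TTS engine
        }
        super.onDestroy();
    }
}
```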

7.7 Software Requirements

1. JDK 1.6.1_01 or above
2. Android SDK
3. Eclipse Galileo



7.8 PC Requirements

1. Dual-core processor
2. 40 GB HDD
3. 1 GB RAM
4. Static-IP GPRS connection

7.9 How to Execute

On the Server Part

Step 1:

1. Copy Auto_Navigation to "C:\Apache Tomcat 6.0.16\webapps".
2. Remove the read-only attribute for this folder.
3. Copy C:\Apache Tomcat 6.0.16\webapps\Auto_Navigation to the C:\ drive only.

Fig. 7.1 Snapshot of the Apache Tomcat Server in C drive


Step 2: Double-click run.bat in C:\Apache Tomcat 6.0.16\bin. The server will be started.

Step 3: Once the server is started, it will wait for the file from the mobile.

Fig. 7.2 Snapshot of the Apache Tomcat Server under Execution



Step 4: Then run the OCR application on the server.

Fig. 7.3 Snapshot of the OCR Server Side Application


Step 5: The application will be waiting for the image from the mobile. Once it receives the image, it applies OCR and identifies the landmark.

Fig. 7.4 Snapshot of the OCR Server Side Application Waiting for the Input Data from Mobile


Step 6: Once the sign has been identified, the server sends instructions to the mobile for navigation of the robot.

Fig. 7.5 Snapshot of the OCR Server Side Application Which Reads the Data

On the Mobile Part

We need to install the Android application (.apk) file on the mobile (client) and establish the connection by entering the Bluetooth MAC address and the server IP address in the application. The mobile device should be paired with the robot's Bluetooth module, and the handset should be GPRS and Bluetooth enabled for the transfer of data. A minimal sketch of the Bluetooth link to the robot is given below.
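The sketch below, assuming a standard serial-port-profile (SPP) Bluetooth module on the robot, opens an RFCOMM socket to the entered MAC address and writes a one-character command. The class name, the way the MAC address is obtained and the command characters are assumptions for illustration.

```java
import android.bluetooth.BluetoothAdapter;
import android.bluetooth.BluetoothDevice;
import android.bluetooth.BluetoothSocket;
import java.io.IOException;
import java.io.OutputStream;
import java.util.UUID;

// Opens an RFCOMM (serial port profile) link to the robot's Bluetooth module
// and sends a one-character command received from the server.
public class RobotLink {

    // Standard SPP UUID used by common serial Bluetooth modules.
    private static final UUID SPP_UUID =
            UUID.fromString("00001101-0000-1000-8000-00805F9B34FB");

    public static void sendCommand(String macAddress, char command) throws IOException {
        BluetoothAdapter adapter = BluetoothAdapter.getDefaultAdapter();
        BluetoothDevice robot = adapter.getRemoteDevice(macAddress);
        BluetoothSocket socket = robot.createRfcommSocketToServiceRecord(SPP_UUID);
        adapter.cancelDiscovery();        // discovery slows down the connection
        socket.connect();                 // blocking call; run off the UI thread
        OutputStream out = socket.getOutputStream();
        out.write(command);               // e.g. 'L', 'R' or 'S' from the server
        out.flush();
        socket.close();
    }
}
```

Because connect() blocks, the real application would invoke this from a background thread rather than the UI thread.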



Chapter 8

RESULTS AND DISCUSSION

8.1 Experimental Results

The objective of this project is to develop a mapless navigation method for service robots, and the project fulfils this purpose. For this system, a study was conducted on the accuracy level achieved for different input data; this is important because accuracy is an indicator of feasibility and efficiency. Both the accuracy rates and some drawbacks of the OCR-based mapless navigation method are considered.

8.2 Accuracy Rates

The experiments show consistent results, with accurate classification of landmark sign patterns and correct speech output. Table 8.2 shows the accuracy level for different input data. The data are grouped as trained text/signs, untrained but similar text/signs, and text with irregular font size.

Table 8.2 Accuracy rates of the system

Text/Sign              Accuracy rate
Trained                100%
Untrained              97%
Irregular font size    98%

8.3 Drawbacks

As this method is new and the optical character recognition is still at a preliminary level, the main drawback is that the system needs further refinement to become more accurate. As with other neural networks, the training time of the Kohonen Neural Network increases as the number of characters, words and landmark signs grows. In addition, a fixed picture size of 250 x 250 pixels is assumed, so the system does not work for images of other sizes and needs to be generalized. Finally, the system cannot handle small text or signs captured from a long distance: it must first grab the pixels from the original image and then map them, and for small text or sign images this pixel information cannot be recovered. This makes it difficult to capture landmarks from a distance, and the camera has to be focused manually for better input.



Chapter 9

CONCLUSION

This chapter provides a brief conclusion of the OCR-based mapless navigation method.

9.1 Conclusion

This report tries to present a mapless navigation method in the simplest possible manner. A method is proposed that applies the cue humans use for navigation, the landmark, to achieve mapless navigation of robots. In this project, OCR (optical character recognition) is implemented to track the landmark using a Kohonen Neural Network. There are many other ways to implement OCR that could be more efficient than a Kohonen Neural Network; the main reason for selecting the Kohonen Neural Network here is to reduce the computational cost and thereby facilitate real-time implementation. After locating and tracking the landmark, the semantic information of the texts and arrows contained in those signs is extracted, and the result is used to guide the robot to the destination.
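To illustrate why a Kohonen (self-organising map) classifier is computationally cheap at recognition time, the sketch below shows the winner-take-all step: recognising a downsampled input amounts to one distance computation per trained output neuron. The vector sizes and weights are assumed for illustration only and do not reproduce the project's actual network.

```java
// Minimal winner-take-all step of a Kohonen network: the recognised class is the
// output neuron whose weight vector is closest to the input vector.
public class KohonenRecognizer {

    static int classify(double[] input, double[][] weights) {
        int best = -1;
        double bestDist = Double.MAX_VALUE;
        for (int neuron = 0; neuron < weights.length; neuron++) {
            double dist = 0.0;
            for (int i = 0; i < input.length; i++) {
                double d = input[i] - weights[neuron][i];
                dist += d * d;               // squared Euclidean distance
            }
            if (dist < bestDist) {
                bestDist = dist;
                best = neuron;               // current best matching unit
            }
        }
        return best;                          // index of the recognised landmark class
    }

    public static void main(String[] args) {
        double[] input = {1, 0, 1, 1};                    // stand-in downsampled glyph
        double[][] weights = {{1, 0, 1, 1}, {0, 1, 0, 0}}; // two trained classes
        System.out.println("Recognised class: " + classify(input, weights));
    }
}
```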



Chapter 10

SCOPE FOR FUTURE WORK

This chapter gives the scope for future work in the research areas of the OCR procedure and the mapless navigation method.

10.1 Future Work

Currently this OCR-based mapless navigation method uses a single low-cost mobile camera to capture the image, and the camera is focused on the landmark manually. Visual sensing will, however, be essential for mobile robots to progress towards increased robustness, reliability, reduced cost and reduced power consumption. If robots can make use of computationally efficient algorithms and off-the-shelf cameras with extra features (e.g., high resolution, auto focus, auto detect, night vision, shake-free operation, capturing images during motion), then the opportunity exists for robots to be widely deployed in outdoor environments as well.

Considerably more work will need to be done in future to arrive at a complete OCR-based mapless navigation strategy. The limits of the system need to be extended by increasing the number of characters and signs it can recognize, so that a wide variety of landmarks is covered. A better machine-learning algorithm should be considered for robot self-learning, and in future artificial intelligence could be used to let the robot take its own decisions during navigation. Finally, it is hoped that the method can be applied in the field of service robots, so that they become more adaptive to the everyday human environment and offer better service.





APPENDIX A

DOCUMENT CONVENTIONS

The following is the list of conventions and acronyms used in this report.

AIDC: Automatic Identification and Data Capture.
ANDROID: Mobile operating system and platform on which the client application is built.
API: Application Programming Interface.
ASF: Apache Software Foundation.
DAS: Driver Assistance System.
Data Flow Diagram (DFD): Shows the data flow between the entities.
DC: Direct Current.
Eclipse: Tool (IDE) used to develop the applications.
GPRS: General Packet Radio Service.
IEEE: Institute of Electrical and Electronics Engineers.
JAVA: A widely used general-purpose programming language.
JSP: Java Server Pages.
JVM: Java Virtual Machine.
MICR: Magnetic Ink Character Recognition.
Mobile: The device on which the Android application runs and which guides the robot.
OCR: Optical Character Recognition.
Pixel: Smallest physical element in a raster image.
SDK: Software Development Kit, used to develop and run the Android application.
Server: The application running on the server PC that waits for the input data from the mobile.
TCP/IP: Transmission Control Protocol / Internet Protocol.
Text to Speech: An engine used for converting text to speech.
Tomcat Server: The web container in which the server application runs.
TTL: Transistor-Transistor Logic.
VSRR: View Sequenced Route Representation.

