
Application Guide

MVTec Software GmbH
Building Vision for Business

Application Guide for HALCON, Version 7.1.4.

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without prior written permission of the publisher.

Edition 1   June 2002      (HALCON 6.1)
Edition 1a  May 2003       (HALCON 6.1.2)
Edition 2   December 2003  (HALCON 7.0)
Edition 2a  July 2004      (HALCON 7.0.1)
Edition 2b  April 2005     (HALCON 7.0.2)
Edition 3   July 2005      (HALCON 7.1)
Edition 3a  April 2006     (HALCON 7.1.1)
Edition 3b  December 2006  (HALCON 7.1.2)

Copyright © 2002-2008 by MVTec Software GmbH, München, Germany

Microsoft, Windows, Windows NT, Windows 2000, Windows XP, Windows 2003, Windows Vista, Visual Studio, and Visual Basic are either trademarks or registered trademarks of Microsoft Corporation.

Linux is a trademark of Linus Torvalds.

All other nationally and internationally recognized trademarks and tradenames are hereby recognized.

More information about HALCON can be found at: http://www.halcon.com/


Contents

1. Application Note on Image Acquisition
   The Art of Image Acquisition
2. Application Note on Shape-Based Matching
   How to Use Shape-Based Matching to Find and Localize Objects
3. Application Note on 2D Data Codes
   How to Read 2D Data Codes
4. Application Note on 1D Metrology
   1D Metrology
5. Application Note on 3D Machine Vision
   Machine Vision in World Coordinates


HALCON Application Note

The Art of Image Acquisition

Provided Functionality
⊲ Connecting to simple and complex configurations of frame grabbers and cameras
⊲ Acquiring images in various timing modes
⊲ Configuring frame grabbers and cameras online

Involved Operators
open_framegrabber
info_framegrabber
grab_image, grab_image_async, grab_image_start
set_framegrabber_param, get_framegrabber_param
close_framegrabber, close_all_framegrabbers
gen_image1, gen_image3, gen_image1_extern



Overview

Obviously, the acquisition of images is a task to be solved in all machine vision applications. Unfortunately, this task mainly consists of interacting with special, non-standardized hardware in the form of a frame grabber board. To let you concentrate on the actual machine vision problem, HALCON already provides interfaces performing this interaction for a large number of frame grabbers (see section 1 on page 4).

Within your HALCON application, the task of image acquisition is thus reduced to a few lines of code, i.e., a few operator calls, as can be seen in section 2 on page 5. What's more, this simplicity is not achieved at the cost of limiting the available functionality: Using HALCON, you can acquire images from various configurations of frame grabbers and cameras (see section 3 on page 7) in different timing modes (see section 5 on page 16).

Unless specified otherwise, the example programs can be found in the subdirectory image_acquisition of the directory %HALCONROOT%\examples\application_guide. Note that most programs are preconfigured to work with a certain HALCON frame grabber interface; in this case, the name of the program contains the name of the interface. To use the program with another frame grabber, please adapt the parts which open the connection to the frame grabber. More example programs for the different HALCON frame grabber interfaces can be found in the subdirectory hdevelop\Image\Framegrabber of the directory %HALCONROOT%\examples.

Please refer to the Programmer's Guide, chapter 6 on page 57 and chapter 14 on page 107, for information about how to compile and link the C++ and C example programs; among other things, they describe how to use the example UNIX makefiles which can be found in the subdirectories c and cpp of the directory %HALCONROOT%\examples. Under Windows, you can use Visual Studio workspaces containing the examples, which can be found in the subdirectory i586-nt4 parallel to the source files.



Contents

1  The Philosophy Behind the HALCON Frame Grabber Interfaces   4
2  A First Example   5
3  Connecting to Your Frame Grabber   7
   3.1  Opening a Connection to a Specified Configuration   7
   3.2  Connecting to Multiple Boards and Cameras   8
   3.3  Requesting Information About the Frame Grabber Interface   11
4  Configuring the Acquisition   12
   4.1  General Parameters   13
   4.2  Special Parameters   14
   4.3  Fixed vs. Dynamic Parameters   15
5  The Various Modes of Grabbing Images   16
   5.1  Real-Time Image Acquisition   16
   5.2  Using an External Trigger   25
   5.3  Acquiring Images From Multiple Cameras   27
6  Miscellaneous   29
   6.1  Acquiring Images From Unsupported Frame Grabbers   29
   6.2  Error Handling   30
   6.3  Line Scan Cameras   34
A  HALCON Images   37
   A.1  The Philosophy of HALCON Images   37
   A.2  Image Tuples (Arrays)   38
   A.3  HALCON Operators for Handling Images   38
B  Parameters Describing the Image   40
   B.1  Image Size   40
   B.2  Frames vs. Fields   41
   B.3  Image Data   44



1 The Philosophy Behind the HALCON Frame Grabber Interfaces

From the point of view of a user developing software for a machine vision application, the acquisition of images is only a prelude to the actual machine vision task. Of course it is important that images are acquired at the correct moment or rate, and that the camera and the frame grabber are configured suitably, but these tasks seem to be elementary, or at least independent of the used frame grabber.

The reality, however, looks different. Frame grabbers differ widely regarding the provided functionality, and even if their functionality is similar, the SDKs (software development kits) provided by the frame grabber manufacturers do not follow any standard. Therefore, switching to a different frame grabber probably requires rewriting the image acquisition part of the application.

HALCON's answer to this problem is its frame grabber interfaces (HFGI), which are currently provided for more than 50 frame grabbers in the form of dynamically loadable libraries (Windows: DLLs; UNIX: shared libraries). HALCON frame grabber interfaces bridge the gap between the individual frame grabbers and the HALCON library, which is independent of the used frame grabber, computer platform, and programming language (see figure 1). In other words, they

• provide a standardized interface to the HALCON user in the form of 11 HALCON operators, and
• encapsulate details specific to the frame grabber, i.e., the interaction with the frame grabber SDK provided by the manufacturer.

Therefore, if you decide to switch to a different frame grabber, all you need to do is install the corresponding driver and SDK provided by the manufacturer and use different parameter values when calling the HALCON operators; the operators themselves stay the same.
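As a minimal sketch of this point (not taken from the example programs; the parameter values are placeholders, and 'IDS' is simply one of the interface names used by the examples of this note):

* the frame grabber is selected solely via parameter values, here the name 'IDS';
* switching to another supported board typically only means passing a different
* interface name (and possibly a different camera type or device), the operator
* calls themselves stay the same
FGName := 'IDS'
open_framegrabber (FGName, 1, 1, 0, 0, 0, 0, 'default', -1, 'gray', -1,
                   'false', 'default', 'default', -1, -1, FGHandle)
grab_image (Image, FGHandle)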

Figure 1: From the camera to a HALCON application (layers: camera → frame grabber → device driver & SDK → HALCON xyz acquisition interface, hAcqxyz.dll → HALCON image processing library, halcon.dll & halconc/cpp/dotnet/x.dll → HALCON application in HDevelop / C / C++ / C# / Visual Basic).

In fact, the elementary tasks of image acquisition are covered by two HALCON operators:

• open_framegrabber connects to the frame grabber and sets general parameters, e.g., the type of the used camera or the port the camera is connected to; then
• grab_image (or grab_image_async, see section 5.1 on page 16 for the difference) grabs images.

If a frame grabber provides additional functionality, e.g., on-board modification of the image signal, special grabbing modes, or digital output lines, it is available via the operator set_framegrabber_param (see section 4 on page 12).


Figure 2: a) Acquired image; b) processed image (automatic segmentation).

Note that for some frame grabbers the full functionality is not available within HALCON; please refer to the corresponding online documentation, which can be found in the directory %HALCONROOT%\doc\html\manuals or via the HALCON folder in the Windows start menu (if you installed the documentation). The latest information can be found under http://www.halcon.com/framegrabber.

If the frame grabber you want to use is not (yet) supported by HALCON, you can nevertheless use it together with HALCON. Please refer to section 6.1 on page 29 for more details.

2 A First Example

In this section we start with a simple image acquisition task, which uses the frame grabber in its default configuration and the standard grabbing mode. The grabbed images are then segmented. To follow the example actively, start the HDevelop program hdevelop\first_example_acquisition_ids.dev, then press Run once to initialize the application. Note that the program is preconfigured for the HALCON frame grabber interface IDS; to use it with a different frame grabber, please adapt the parts which open the connection.

Step 1: Connect to the frame grabber

open_framegrabber (FGName, 1, 1, 0, 0, 0, 0, 'default', -1, 'gray', -1,
                   'false', 'ntsc', 'default', -1, -1, FGHandle)

When opening the connection to your frame grabber using the operator open_framegrabber, the main parameter is the Name of the corresponding HALCON frame grabber interface. As a result, you obtain a so-called handle (FGHandle), by which you can access the frame grabber, e.g., in calls to the operator grab_image.

In the example, default values are used for most other parameters ('default' or -1); section 4.1 on page 13 takes a closer look at this topic. How to connect to more complex frame grabber and camera configurations is described in section 3 on page 7.




Step 2: Grab an image

grab_image (Image, FGHandle)

After successfully connecting to your frame grabber you can grab images by calling the operator grab_image with the corresponding handle FGHandle. More advanced modes of grabbing images are described in section 5 on page 16.

Step 3: Grab and process images in a loop

while (Button # 1)
    grab_image (Image, FGHandle)
    auto_threshold (Image, Regions, 4)
    connection (Regions, ConnectedRegions)
    get_mposition (WindowHandleButton, Row, Column, Button)
endwhile

In the example, the grabbed images are then automatically segmented using the operator auto_threshold (see figure 2). This is done in a loop which can be exited by clicking into a window with the left mouse button.
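Once the connection is no longer needed, it can be released again via close_framegrabber or close_all_framegrabbers (both are listed among the involved operators of this note); a minimal sketch:

* release the connection described by FGHandle
close_framegrabber (FGHandle)
* alternatively, close all frame grabber connections opened so far
* close_all_framegrabbers ()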


3 Connecting to Your Frame Grabber

In this section, we show how to connect to different configurations of frame grabber(s) and camera(s), ranging from the simple case of one camera connected to one frame grabber board to more complex ones, e.g., multiple synchronized cameras connected to one or more boards.

3.1 Opening a Connection to a Specified Configuration

With the operator open_framegrabber you open a connection to a frame grabber, or to be more exact, via a frame grabber to a camera. This connection is described by four parameters (see figure 3): First, you select a frame grabber (family) with the parameter Name. If multiple boards are allowed, you can select one with the parameter Device; depending on the frame grabber interface, this parameter can contain a string describing the board or simply a number (in the form of a string!).

Typically, the camera can be connected to the frame grabber at different ports, whose number can be selected via the parameter Port (in rare cases LineIn). The parameter CameraType describes the connected camera: For analog cameras, this parameter usually specifies the used signal norm, e.g., 'ntsc'; more complex frame grabber interfaces use this parameter to select a camera configuration file.

As a result, open_framegrabber returns a handle for the opened connection in the parameter FGHandle. Note that if you use HALCON's COM or C++ interface and call the operator via the classes HFramegrabberX or HFramegrabber, no handle is returned because the instance of the class itself acts as your handle.

Figure 3: Describing a connection with the parameters of open_framegrabber (Name selects the interface/SDK, Device the frame grabber board, Port the port, and CameraType the camera).

In HDevelop, you can quickly check an opened connection by double-clicking FGHandle in the Variable Window as shown in figure 4. A dialog appears which describes the status of the connection. If you check the corresponding box, images are grabbed online and displayed in the Graphics Window. This mode is very useful to set up your vision system (illumination, focus, field of view).

Figure 4: Online grabbing in HDevelop (double-click the handle to open the dialog; a check box starts online grabbing).

3.2 Connecting to Multiple Boards and Cameras

Most HALCON frame grabber interfaces allow you to use multiple frame grabber boards and cameras. However, there is more than one way to connect cameras and boards and to access these configurations from within HALCON. Below, we describe the different configurations; please check the online documentation of the HALCON interface for your frame grabber (see %HALCONROOT%\doc\html\manuals, the HALCON folder in the Windows start menu, or http://www.halcon.com/framegrabber) to find out which configurations it supports.

3.2.1 Single Camera

Figure 5a shows the simplest configuration: a single camera connected to a single board, accessible via a single handle. Some frame grabbers, especially digital ones, only support this configuration; as described in the following section, you can nevertheless use multiple cameras with such frame grabbers by connecting each one to an individual board.


Figure 5: a) single board with single camera; b) multiple boards with one camera each; c) multiple boards with one or more cameras; d) single board with multiple cameras and port switching; e) single board with multiple cameras and simultaneous grabbing; f) simultaneous grabbing with multiple boards and cameras.

3.2.2 Multiple Boards

Figure 5b shows a configuration with multiple cameras, each connected to a separate board. In this case you call the operator open_framegrabber once for each connection, as in the HDevelop example program hdevelop\multiple_boards_px.dev. Note that the program is preconfigured for the HALCON PX interface; to use it with a different frame grabber, please adapt the parts which open the connection.



open_framegrabber (FGName, 1, 1, 0, 0, 0, 0, 'default', -1, 'default', -1,
                   'default', 'default', Board0, -1, -1, FGHandle0)
open_framegrabber (FGName, 1, 1, 0, 0, 0, 0, 'default', -1, 'default', -1,
                   'default', 'default', Board1, -1, -1, FGHandle1)

In this example, the two calls differ only in the value for the parameter Device ('0' and '1'); of course, you can use different values for other parameters as well, and even connect to different frame grabber interfaces.

To grab images from the two cameras, you simply call the operator grab_image once for each of the two handles returned by the two calls to open_framegrabber:

grab_image (Image0, FGHandle0)
grab_image (Image1, FGHandle1)

3.2.3 Multiple Handles Per Board

Many frame grabbers provide multiple input ports and thus allow you to connect more than one camera to the board. Depending on the HALCON frame grabber interface, this configuration is accessed in different ways, which are described in this and the following sections.

The standard HALCON method to connect to the cameras is depicted in figure 5c: Each connection gets its own handle, i.e., open_framegrabber is called once for each camera with different values for the parameter Port, as in the HDevelop example program hdevelop\multiple_ports_px.dev (preconfigured for the HALCON PX interface; please adapt the parts which open the connection for your own frame grabber):

open_framegrabber (FGName, 1, 1, 0, 0, 0, 0, 'default', -1, 'default', -1,
                   'default', 'default', 'default', Port0, -1, FGHandle0)
open_framegrabber (FGName, 1, 1, 0, 0, 0, 0, 'default', -1, 'default', -1,
                   'default', 'default', 'default', Port1, -1, FGHandle1)
grab_image (Image0, FGHandle0)
grab_image (Image1, FGHandle1)

As figure 5c shows, you can also use multiple boards with multiple connected cameras.

3.2.4 Port Switching

Some frame grabber interfaces do not access the cameras via multiple handles, but by switching the input port dynamically (see figure 5d). Therefore, open_framegrabber is called only once, as in the HDevelop example program hdevelop\port_switching_inspecta.dev (preconfigured for the HALCON INSPECTA interface; please adapt the parts which open the connection for your own frame grabber):

open_framegrabber (FGName, 1, 1, 0, 0, 0, 0, 'default', -1, 'default', -1,
                   'default', MyCamType, 'default', 0, -1, FGHandle)

Between grabbing images you switch ports using the operator set_framegrabber_param (see section 4.2 on page 14 for more information about this operator):

set_framegrabber_param (FGHandle, 'port', Port0)
grab_image (Image0, FGHandle)
set_framegrabber_param (FGHandle, 'port', Port1)
grab_image (Image1, FGHandle)

Note that port switching only works for compatible (similar) cameras because open_framegrabber is only called once, i.e., the same set of parameter values is used for all cameras. In contrast, when using multiple handles as described above, you can specify different parameter values for the individual cameras (with some board-specific limitations).
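As an illustration, a hedged sketch of how such a port-switching configuration might be used in an inspection loop (the tuple Ports and the loop bounds are hypothetical, not part of the example program):

* cycle through two ports on the same handle and grab one image from each
Ports := [0,1]
for j := 0 to |Ports| - 1 by 1
    set_framegrabber_param (FGHandle, 'port', Ports[j])
    grab_image (Image, FGHandle)
    * ... process the image acquired from port Ports[j] ...
endfor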

3.2.5 Simultaneous Grabbing

In the configurations described above, images were grabbed from the individual cameras by multiple calls to the operator grab_image. In contrast, some frame grabber interfaces allow you to grab images from multiple cameras with a single call to grab_image, which then returns a multi-channel image (see figure 5e; appendix A.1 on page 37 contains more information about multi-channel images). This mode is called simultaneous grabbing (or parallel grabbing); like port switching, it only works for compatible (similar) cameras. For example, you can use this mode to grab synchronized images from a stereo camera system.

In this mode, open_framegrabber is called only once, as can be seen in the HDevelop example program hdevelop\simultaneous_grabbing_inspecta.dev (preconfigured for the HALCON INSPECTA interface; please adapt the parts which open the connection for your own frame grabber):

open_framegrabber (FGName, 1, 1, 0, 0, 0, 0, 'default', -1, 'default', -1,
                   'default', MyCamType, 'default', 0, -1, FGHandle)

You can check the number of returned images (channels) using the operator count_channels

grab_image (SimulImages, FGHandle)
count_channels (SimulImages, num_channels)

and extract the individual images, e.g., using decompose2, decompose3 etc., depending on the number of images:

if (num_channels = 2)
    decompose2 (SimulImages, Image0, Image1)
endif

Alternatively, you can convert the multi-channel image into an image array using image_to_channels and then select the individual images via select_obj.
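A short sketch of this alternative (object indices in HALCON start at 1):

* convert the multi-channel image into an array of single-channel images
image_to_channels (SimulImages, ImageArray)
* pick the individual images out of the array
select_obj (ImageArray, Image0, 1)
select_obj (ImageArray, Image1, 2)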

Note that some frame grabber interfaces also allow simultaneous grabbing for multiple boards (see figure 5f). Please refer to section 5.3.2 on page 28 for additional information.

3.3 Requesting Information About the Frame Grabber Interface

As already mentioned, the individual HALCON frame grabber interfaces are described in detail on HTML pages which can be found in the directory %HALCONROOT%\doc\html\manuals or in the HALCON folder in the Windows start menu (if you installed the documentation). Another way to access information about a frame grabber interface is to use the operator info_framegrabber.


Figure 6: An example result of the operator info_framegrabber.

In the HDevelop example program hdevelop\info_framegrabber_ids.dev (preconfigured for the HALCON IDS interface; please adapt the interface name for your own frame grabber) this operator is called multiple times to query the version number of the interface, the available boards, port numbers, camera types, and the default values for all parameters of open_framegrabber; the result, i.e., the values displayed in the HDevelop Variable Windows, is depicted in figure 6.

info_framegrabber (FGName, 'general', GeneralInfo, GeneralValue)
info_framegrabber (FGName, 'revision', RevisionInfo, RevisionValue)
info_framegrabber (FGName, 'info_boards', BoardsInfo, BoardsValue)
info_framegrabber (FGName, 'ports', PortsInfo, PortsValue)
info_framegrabber (FGName, 'camera_types', CamTypeInfo, CamTypeValue)
info_framegrabber (FGName, 'defaults', DefaultsInfo, DefaultsValue)

The operator info_framegrabber can be called before actually connecting to a frame grabber with open_framegrabber. The only condition is that the HALCON frame grabber interface and the frame grabber SDK and driver have been installed.

4 Configuring the Acquisition

As explained in section 1 on page 4, the intention of HALCON's frame grabber interfaces is to provide the user with a common interface for many different frame grabbers. This interface is kept as simple as possible; as shown, you can connect to your frame grabber and grab a first image using only two operators.

However, HALCON's second goal is to make the full functionality of a frame grabber available to the user. As frame grabbers differ widely regarding the provided functionality, this is a difficult task to realize within a simple, common interface. HALCON solves this problem by dividing the task of configuring a frame grabber connection into two parts: Those parameters which are common to most frame grabber interfaces (therefore called general parameters) are set when calling the operator open_framegrabber. In contrast, the functionality which is not generally available can be configured by setting so-called special parameters using the operator set_framegrabber_param.

4.1 General Parameters

When opening a connection via open_framegrabber, you can specify the following general parameters:

HorizontalResolution, VerticalResolution: spatial resolution of the transferred image in relation to the original size (see appendix B.1 on page 40)
ImageWidth, ImageHeight, StartRow, StartColumn: size and upper left corner of the transferred image in relation to the original size (see appendix B.1 on page 40)
Field: grabbing mode for analog cameras, e.g., interlaced-scan, progressive-scan, field grabbing (see appendix B.2 on page 41)
BitsPerChannel, ColorSpace: data contained in a pixel (number of bits, number of channels, color encoding; see appendix B.3 on page 44)
Gain: amplification factor for the video amplifier on the frame grabber board (if available)
ExternalTrigger: hooking the acquisition of images to an external trigger signal (see also section 5.2 on page 25)
CameraType, Device, Port, LineIn: configuration of frame grabber(s) and camera(s) from which images are to be acquired (see section 3.1 on page 7)

In section 3.1 on page 7, we already encountered the parameters describing the frame grabber / camera configuration. Most of the other parameters of open_framegrabber specify the image format; they are described in more detail in appendix B on page 40. The parameter ExternalTrigger activates a special grabbing mode which is described in detail in section 5.2 on page 25. Finally, the parameter Gain can be used to manipulate the acquired images on the frame grabber board by configuring the video amplifier.

Note that when calling open_framegrabber you must specify values for all parameters, even if your frame grabber interface does not support some of them or uses values specified in a camera configuration file instead. To alleviate this task, the HALCON frame grabber interfaces provide suitable default values, which are used if you specify 'default' or -1 for string or numeric parameters, respectively. The actually used default values can be queried using the operator info_framegrabber as shown in section 3.3 on page 11.

After connecting to a frame grabber, you can query the current value of general parameters using the operator get_framegrabber_param; some interfaces even allow you to modify general parameters dynamically. Please refer to section 4.3 on page 15 for more information about these topics.




4.2 Special Parameters

Even the functionality which is not generally available for all frame grabbers can be accessed and configured with a general mechanism: by setting corresponding special parameters via the operator set_framegrabber_param. Typical parameters are, for example:

'grab_timeout': timeout after which the operators grab_image and grab_image_async stop waiting for an image and return an error (see also section 5.2.1 on page 27 and section 6.2 on page 30)
'volatile': enable volatile grabbing (see also section 5.1.3 on page 18)
'continuous_grabbing': switch on a special acquisition mode which is necessary for some frame grabbers to achieve real-time performance (see also section 5.1.5 on page 22)
'trigger_signal': signal type used for external triggering, e.g., rising or falling edge
'image_width', 'image_height', 'start_row', 'start_column', 'gain', 'external_trigger', 'port': "duplicates" of some of the general parameters described in section 4.1 on page 13, allowing you to modify them dynamically, i.e., after opening the connection (see also section 4.3)

Depending on the frame grabber, various other parameters may be available, which allow you, e.g., to add an offset to the digitized video signal, modify the brightness or contrast, specify the exposure time, or trigger a flash. Some frame grabbers also offer special parameters for the use of line scan cameras (see also section 6.3 on page 34), or parameters controlling digital output and input lines.

Which special parameters are provided by a frame grabber interface is described in the already mentioned online documentation. You can also query this information by calling the operator info_framegrabber as shown below; figure 7 depicts the result of double-clicking ParametersValue in the Variable Window after executing the line:

info_framegrabber (FGName, 'parameters', ParametersInfo, ParametersValue)

To set a parameter, you call the operator set_framegrabber_param, specifying the name of the parameter to set in the parameter Param and the desired value in the parameter Value. For example, in section 3.2.4 on page 10 the following line was used to switch to port 0:

set_framegrabber_param (FGHandle, 'port', Port0)

You can also set multiple parameters at once by specifying tuples for Param and Value as in the following line:

set_framegrabber_param (FGHandle, ['image_width','image_height'], [256, 256])

Figure 7: Querying available special parameters via info_framegrabber.

For all parameters which can be set with set_framegrabber_param, you can query the current value using the operator get_framegrabber_param. Some interfaces also allow you to query additional information like minimum and maximum values for the parameters. For example, the HALCON Fire-i interface allows you to query the minimum and maximum values for the brightness:

get_framegrabber_param (FGHandle, 'brightness_min_value', MinBrightness)
get_framegrabber_param (FGHandle, 'brightness_max_value', MaxBrightness)

Thus, you can check a new brightness value against those boundaries before setting it:

get_framegrabber_param (FGHandle, 'brightness', CurrentBrightness)
NewBrightness := CurrentBrightness + 10
if (NewBrightness > MaxBrightness)
    NewBrightness := MaxBrightness
endif
set_framegrabber_param (FGHandle, 'brightness', NewBrightness)

4.3 Fixed vs. Dynamic Parameters

The distinction between fixed and dynamic parameters is made with regard to the lifetime of a frame grabber connection. Fixed parameters, e.g., the CameraType, are set once when opening the connection with open_framegrabber. In contrast, those parameters which can be modified via set_framegrabber_param during the use of the connection are called dynamic parameters.

As already noted in section 4.2 on page 14, some frame grabber interfaces allow you to modify general parameters like ImageWidth or ExternalTrigger dynamically via set_framegrabber_param, by providing a corresponding special parameter with the same name but written with small letters and underscores, e.g., 'image_width' or 'external_trigger'.

Independent of whether a general parameter can be modified dynamically, you can query its current value by calling the operator get_framegrabber_param with its "translated" name, i.e., capitals replaced by small letters and underscores as described above.
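For example, the general parameters ImageWidth and ExternalTrigger correspond to the names 'image_width' and 'external_trigger' listed in section 4.2; a short sketch (whether such a parameter can also be set dynamically depends on the interface):

* query general parameters via their "translated" names
get_framegrabber_param (FGHandle, 'image_width', CurrentWidth)
get_framegrabber_param (FGHandle, 'external_trigger', CurrentTrigger)
* only if the interface lists 'image_width' as a dynamic parameter:
* set_framegrabber_param (FGHandle, 'image_width', 256)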



5 The Various Modes of Grabbing Images

Section 2 on page 5 showed that grabbing images is very easy in HALCON – you just call grab_image! But of course there is more to image grabbing than just getting an image, e.g., how to assure an exact timing. This section therefore describes more complex grabbing modes.

5.1 Real-Time Image Acquisition

As a technical term, the attribute real-time means that a process guarantees that it meets given deadlines. Please keep in mind that none of the standard operating systems, i.e., neither Windows nor Linux, is a real-time operating system. This means that the operating system itself does not guarantee that your application will get the necessary processing time before its deadline expires. From the point of view of a machine vision application running under a non-real-time operating system, the most you can do is assure that real-time behavior is not already prevented by the application itself.

In a machine vision application, real-time behavior may be required at multiple points:

Image delay: The camera must "grab" the image, i.e., expose the chip, at the correct moment, i.e., while the part to be inspected is completely visible.

Frame rate: The most common real-time requirement for a machine vision application is to "reach frame rate", i.e., to acquire and process all images the camera produces.

Processing delay: The image processing itself must complete in time to allow a reaction to its results, e.g., to remove a faulty part from the conveyor belt. As this point relates only indirectly to the image acquisition, it is ignored in the following.

5.1.1 Non-Real-Time Grabbing Using grab_image

Figure 8 shows the timing diagram for the standard grabbing mode, i.e., if you use the operator grab_image from within your application. This operator call is "translated" by the HALCON frame grabber interface and the SDK into the corresponding signal to the frame grabber board (marked with 'Grab').

The frame grabber now waits for the next image. In the example, a free-running analog progressive-scan camera is used, which produces images continuously at a fixed frame rate; the start of a new image is indicated by a so-called vertical sync signal. The frame grabber then digitizes the incoming analog image signal and transforms it into an image matrix. If a digital camera is used, the camera itself performs the digitizing and transfers a digital signal, which is then transformed into an image matrix by the frame grabber. Please refer to appendix B.2 on page 41 for more information about interlaced grabbing.

The image is then transferred from the frame grabber into computer memory via the PCI bus using DMA (direct memory access). This transfer can either be incremental as depicted in figure 8, if the frame grabber has only a FIFO buffer, or in a single burst as depicted in figure 9 on page 19, if the frame grabber has a frame buffer on board. The advantage of the incremental transfer is that it is concluded earlier. In contrast, the burst mode is more efficient; furthermore, if the incremental transfer via the PCI bus cannot proceed for some reason, a FIFO overflow results, i.e., image data is lost.


Figure 8: Standard timing using grab_image (configuration: free-running progressive-scan camera, frame grabber with incremental image transfer).

Note that in both modes the transfer performance depends on whether the PCI bus is used by other devices as well!

When the image is completely stored in the computer memory, the HALCON frame grabber interface transforms it into a HALCON image and returns control to the application, which processes the image and then calls grab_image again. However, even if the processing time is short in relation to the frame rate, the camera has already begun to transfer the next image, which is therefore "lost"; the application can therefore only process every second image.

You can check this behavior using the HDevelop example program hdevelop\real_time_grabbing_ids.dev (preconfigured for the HALCON IDS interface; please adapt the parts which open the connection for your own frame grabber), which determines achievable frame rates for grabbing and processing (here: calculating a difference image) first separately and then together as follows:


grab_image (BackgroundImage, FGHandle)
count_seconds (Seconds1)
for i := 1 to 20 by 1
    grab_image (Image, FGHandle)
    sub_image (BackgroundImage, Image, DifferenceImage, 1, 128)
endfor
count_seconds (Seconds2)
TimeGrabImage := (Seconds2-Seconds1)/20
FrameRateGrabImage := 1 / TimeGrabImage

To see the non-deterministic image delay, execute the operator grab_image in step mode by pressing Step; the execution time displayed in HDevelop's status bar will range between once and twice the original frame period. Please note that on UNIX systems, the time measurements are performed with a lower resolution than on Windows systems.

5.1.2 Grabbing Without Delay Using Asynchronously Resettable Cameras

If you use a free-running camera, the camera itself determines the exact moment an image is acquired (exposed). This leads to a delay between the moment you call grab_image and the actual image acquisition (see figure 8 on page 17). The delay is not deterministic, but at least it is limited by the frame rate; for example, if you use an NTSC camera with a frame rate of 30 Hz, the maximum delay can be 33 milliseconds.

Of course, such a delay is not acceptable in an application that is to inspect parts at a high rate. The solution is to use cameras that allow a so-called asynchronous reset. This means that upon a signal from the frame grabber, the camera resets the image chip and (almost) immediately starts to expose it. Typically, such a camera does not grab images continuously but only on demand.

An example timing diagram is shown in figure 9. In contrast to figure 8, the image delay is (almost) zero. Furthermore, because the application now specifies when images are to be grabbed, all images can be processed successfully; however, the achieved frame rate still includes the processing time and therefore may be too low for some machine vision applications.

5.1.3 Volatile Grabbing

As shown in figure 8 on page 17, after the image has been transferred into the computer memory, the HALCON frame grabber interface needs some time to create a corresponding HALCON image, which is then returned in the output parameter Image of grab_image. Most of this time (about 3 milliseconds on a 500 MHz Athlon K6 processor for a gray value NTSC image) is needed to copy the image data from the buffer which is the destination of the DMA into a newly allocated area.

You can switch off the copying by using so-called volatile grabbing, which can be enabled via the operator set_framegrabber_param (parameter 'volatile'):

set_framegrabber_param (FGHandle, 'volatile', 'enable')

Then, the time needed by the frame grabber interface to create the HALCON image is significantly reduced, as visualized in figure 9. Note that volatile grabbing is usually only supported for gray value images!
images!


Figure 9: Using an asynchronously resettable camera together with grab_image (configuration: progressive-scan camera, frame grabber with burst transfer, volatile grabbing).

The drawback of volatile grabbing is that grabbed images are overwritten by subsequent grabs. To be more exact, the overwriting depends on the number of image buffers allocated by the frame grabber interface or SDK. Typically, at least two buffers exist; therefore, you can safely process an image even if the next image is already being grabbed, as in figure 11 on page 23. Some frame grabber interfaces allow you to use more than two buffers, and even to select their number dynamically via set_framegrabber_param (parameter 'num_buffers').
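For such an interface, the buffer handling might be configured as follows (a sketch; the parameter 'num_buffers' and the chosen value are only available and meaningful if the interface documents them):

* request more image buffers so that volatile images are overwritten later
set_framegrabber_param (FGHandle, 'num_buffers', 4)
* the current value can be queried like any other special parameter
get_framegrabber_param (FGHandle, 'num_buffers', NumBuffers)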

You can check this behavior using the HDevelop example program hdevelop\volatile_grabbing_ids.dev (preconfigured for the HALCON IDS interface; please adapt the parts which open the connection for your own frame grabber). After grabbing a first image and displaying it via

grab_image (FirstImage, FGHandle)
dev_open_window (0, 0, Width/2, Height/2, 'black', FirstWindow)
dev_display (FirstImage)

change the scene and grab a second image, which is displayed in an individual window:


grab_image (SecondImage, FGHandle)
dev_open_window (0, Width/2 + 8, Width/2, Height/2, 'black', SecondWindow)
dev_display (SecondImage)

Now, images are grabbed in a loop and displayed in a third window. The two other images are also displayed each time. If you change the scene before each grab, you can see how the first two images are overwritten in turn, depending on the number of buffers.

dev_open_window (Height/2 + 66, Width/4 + 4, Width/2, Height/2, 'black',
                 ThirdWindow)
for i := 1 to 10 by 1
    grab_image (CurrentImage, FGHandle)
    dev_set_window (ThirdWindow)
    dev_display (CurrentImage)
    dev_set_window (FirstWindow)
    dev_display (FirstImage)
    dev_set_window (SecondWindow)
    dev_display (SecondImage)
endfor

5.1.4 Real-Time Grabbing Using grab_image_async

The main problem with the timing using grab_image is that the two processes of image grabbing and image processing run sequentially, i.e., one after the other. This means that the time needed for processing the image is included in the resulting frame rate, with the effect that the frame rate provided by the camera cannot be reached by definition.

This problem can be solved by using the operator grab_image_async. Here, the two processes are decoupled and can run asynchronously, i.e., an image can be processed while the next image is already being grabbed. Figure 10 shows a corresponding timing diagram: The first call to grab_image_async is processed similarly to grab_image (compare figure 8 on page 17). The difference becomes apparent after the transfer of the image into computer memory: Almost immediately after receiving the image, the frame grabber interface automatically commands the frame grabber to acquire a new image. Thus, the next image is grabbed while the application processes the previous image. After the processing, the application calls grab_image_async again, which waits until the already running image acquisition is finished. Thus, the full frame rate is now reached. Note that some frame grabbers fail to reach the full frame rate even with grab_image_async; section 5.1.5 on page 22 shows how to solve this problem.

In the HDevelop example program hdevelop\real_time_grabbing_ids.dev, which was already described in section 5.1.1 on page 16, the reached frame rate for asynchronous processing is determined as follows:
as follows:


grab_image (BackgroundImage, FGHandle)
count_seconds (Seconds1)
for i := 1 to 20 by 1
    grab_image_async (Image, FGHandle, -1)
    sub_image (BackgroundImage, Image, DifferenceImage, 1, 128)
endfor
count_seconds (Seconds2)
TimeGrabImageAsync := (Seconds2-Seconds1)/20
FrameRateGrabImageAsync := 1 / TimeGrabImageAsync

Figure 10: Grabbing and processing in parallel using grab_image_async.

As can be seen in figure 10, the first call to grab_image_async has a slightly different effect than the following ones, as it also triggers the first grab command to the frame grabber. As an alternative, you can use the operator grab_image_start, which just triggers the grab command; then, the first call to grab_image_async behaves like the other ones. This is visualized, e.g., in figure 11; as you can see, the advantage of this method is that the application can perform some processing before calling grab_image_async.
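A sketch of this variant (the value -1 for MaxDelay means that no maximum delay is checked; the loop bound is arbitrary):

* just trigger the first grab command without waiting for an image
grab_image_start (FGHandle, -1)
* ... some preparation or processing can be done here ...
for i := 1 to 100 by 1
    * returns the image whose acquisition was already triggered and
    * immediately commands the frame grabber to grab the next one
    grab_image_async (Image, FGHandle, -1)
    * ... process the image ...
endfor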

In the example, the processing was assumed to be faster than the acquisition. If this is not the case, the image will already be ready when the next call to grab_image_async arrives. In this case, you can specify how "old" the image is allowed to be using the parameter MaxDelay. Please refer to section 5.1.7 on page 24 for details.


Please note that when using grab_image_async it is no longer obvious which image is returned by the operator call, because the call is decoupled from the command to the frame grabber! In contrast to grab_image, which always triggers the acquisition of a new image, grab_image_async typically returns an image which has been exposed before the operator was called, i.e., the image delay is negative (see figure 10)! Keep this effect in mind when changing parameters dynamically; contrary to intuition, the change will not affect the image returned by the next call of grab_image_async but only the following ones! Another problem appears when switching dynamically between cameras (see section 5.3.1 on page 28).

5.1.5 Continuous Grabbing

For some frame grabbers, grab_image_async fails to reach the frame rate because the grab command to the frame grabber comes too late, i.e., after the camera has already started to transfer the next image (see figure 11a).

As a solution to this problem, some frame grabber interfaces provide the so-called continuous grabbing mode, which can be enabled only via the operator set_framegrabber_param (parameter 'continuous_grabbing'):

set_framegrabber_param (FGHandle, 'continuous_grabbing', 'enable')

In this mode, the frame grabber reads images from a free-running camera continuously and transfers them into computer memory as depicted in figure 11b. Thus, the frame rate is reached. If your frame grabber supports continuous grabbing, you can test this effect in the example program hdevelop\real_time_grabbing_ids.dev, which was already described in the previous sections; the program measures the achievable frame rate for grab_image_async without and with continuous grabbing.

We recommend using continuous grabbing only if you want to process every image; otherwise, images are transmitted over the PCI bus unnecessarily, thereby perhaps blocking other PCI transfers.

Note that some frame grabber interfaces provide additional functionality in the continuous grabbing mode, e.g., the HALCON BitFlow interface. Please refer to the corresponding documentation for more information.

5.1.6 Using grab_image_async Together With Asynchronously Resettable Cameras

As described in section 5.1.2 on page 18, you can acquire images without delay by using an asynchronously resettable camera. Figure 12 shows the resulting timing when using such a camera together with grab_image_async. When comparing the diagram to the one in figure 9 on page 19, you can see that a higher frame rate can now be reached, because the processing time is not included anymore.


Figure 11: a) grab_image_async fails to reach the frame rate; b) problem solved using continuous grabbing.



Figure 12: Using an asynchronously resettable camera together with grab_image_async (configuration as in figure 9 on page 19).

5.1.7 Specifying a Maximum Delay<br />

In contrast to grab_image, the operator grab_image_async has an additional parameter MaxDelay,<br />

which lets you specify how “old” an already grabbed image may be in order to be accepted. Figure 13 visualizes<br />

the effect of this parameter. There are two cases to distinguish: If the call to grab_image_async<br />

arrives before the next image has been grabbed (first call in the example), the parameter has no effect.<br />

However, if an image has been grabbed already (second and third call in the example), the elapsed time<br />

since the last grab command to the frame grabber is compared to MaxDelay. If it is smaller (second call<br />

in the example), the image is accepted; otherwise (third call), a new image is grabbed.<br />

Please note that the delay is not measured starting from the moment the image is exposed, as you might<br />

perhaps expect! Currently, only a few frame grabber SDKs provide this information; therefore, the last<br />

grab command from the interface to the frame grabber is used as the starting point instead.<br />
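The following HDevelop fragment sketches the use of this parameter; FGHandle is assumed to refer to an already opened connection, and the value of 20 ms is purely illustrative:

* accept an already grabbed image only if the corresponding grab command
* was issued at most 20 ms ago; otherwise a new image is grabbed
grab_image_async (Image, FGHandle, 20)
* MaxDelay = -1 disables the check, i.e., any buffered image is accepted
grab_image_async (Image, FGHandle, -1)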

Figure 13: Specifying a maximum delay for grab_image_async (using continuous grabbing).

5.2 Using an External Trigger<br />

In the previous section, the software performing the machine vision task decided when to acquire an image (software trigger). In industrial applications, however, the moment for image acquisition is typically specified externally by the process itself, e.g., in the form of a hardware trigger signal indicating the presence of an object to be inspected. Most frame grabber boards are therefore equipped with at least one input line for such signals, which are called external triggers.

From HALCON’s point of view, external triggers are dealt with by the frame grabber board; the only thing to do is to tell the frame grabber to use the trigger. You can do this simply by setting the parameter ExternalTrigger of open_framegrabber to ’true’. Some frame grabber interfaces also allow you to enable or disable the trigger dynamically using the operator set_framegrabber_param (parameter ’external_trigger’).

Figure 14a shows the timing diagram when using an external trigger together with grab_image and a free-running camera. After the call to grab_image, the frame grabber board waits for the trigger signal. When it appears, the procedure described in the previous section follows: The frame grabber waits for the next image, digitizes it, and transfers it into computer memory; then, the HALCON frame grabber interface transforms it into a HALCON image and returns control to the application, which processes the image and then calls grab_image again, which causes the frame grabber board to wait for the next trigger signal.

Figure 14: Using an external trigger together with: a) free-running camera and grab_image; b) asynchronously resettable camera and grab_image_async.

The (bad) example in figure 14a was chosen on purpose to show an unsuitable configuration for using an external trigger: First of all, because of the free-running camera there is a non-deterministic delay between the arrival of the trigger signal and the exposure of the image, which may mean that the object to be inspected is not completely visible anymore. Secondly, because grab_image is used, trigger signals that arrive while the application is processing an image are lost.

Both problems can easily be solved by using an asynchronously resettable camera together with the<br />

operator grab_image_async as depicted in figure 14b.<br />

The C++ example program cpp\error_handling_timeout_picport.cpp (preconfigured for the<br />

<strong>HALCON</strong> Leutron interface) shows how simple it is to use an external trigger: The connection is<br />

opened with ExternalTrigger set to ’true’:<br />

HFramegrabber framegrabber;<br />

framegrabber.OpenFramegrabber(fgname, 1, 1, 0, 0, 0, 0, "default", -1,<br />

"gray", -1, "true", camtype, device,<br />

-1, -1);<br />

Then, images are grabbed:<br />

HImage image;<br />

do<br />

{<br />

image = framegrabber.GrabImageAsync(-1);<br />

} while (button == 0);<br />

The example contains a customized error handler which checks whether there is an external trigger; this<br />

part is described in detail in section 6.2.3 on page 32.<br />

5.2.1 Special Parameters for External Triggers<br />

Most frame grabber interfaces allow you to further configure the use of external triggering via the operator set_framegrabber_param. As mentioned in section 4.2 on page 14, some interfaces allow you to enable and disable the external trigger dynamically via the parameter ’external_trigger’. Another useful parameter is ’grab_timeout’, which sets a timeout for the acquisition process (some interfaces provide an additional parameter ’trigger_timeout’ just for triggered grabbing). Without such a timeout, the application would hang if for some reason no trigger signal arrives. In contrast, if a timeout is specified, the operators grab_image and grab_image_async only wait the specified time and then return an error code or raise an exception, depending on the programming language used. Section 6.2 on page 30 shows how to handle such errors.

Other parameters let you further specify the form of the trigger signal (’trigger_signal’), e.g., whether the falling or the rising edge is used as the trigger, select between multiple trigger input lines, or even filter trigger signals. Some frame grabber interfaces also allow you to influence the exposure via the trigger signal.
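A minimal sketch of such a configuration might look as follows; whether the parameters 'external_trigger' and 'grab_timeout' exist, and which values they accept, depends on the specific frame grabber interface:

* enable external triggering dynamically (if the interface supports it)
set_framegrabber_param (FGHandle, 'external_trigger', 'true')
* wait at most 5000 ms for a triggered image; afterwards an error is signaled
set_framegrabber_param (FGHandle, 'grab_timeout', 5000)
grab_image_async (Image, FGHandle, -1)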

5.3 Acquiring Images From Multiple Cameras<br />

The timing diagrams shown in the previous sections depicted the case of a single camera. Below we<br />

discuss some issues which arise when acquiring images from multiple cameras (see section 3.2 on page<br />

8 for possible configurations).<br />


5.3.1 Dynamic Port Switching and Asynchronous Grabbing<br />

If you switch dynamically between multiple cameras connected to a single board as described in section 3.2.4 on page 10, you must be careful when using grab_image_async: By default, the frame grabber interface commands the frame grabber board to grab the next image automatically after it has received the current image, but before the next call of grab_image_async! If you switched to another camera before this call, the frame grabber might already be busy grabbing an image from the first camera. Some frame grabber interfaces solve this problem by providing the parameter ’start_async_after_grab_async’ for the operator set_framegrabber_param, which allows you to disable the automatic grab command to the frame grabber board.
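The following fragment sketches this approach; the parameter used to switch between cameras (assumed here to be 'port'), the exact values accepted by 'start_async_after_grab_async', and the placeholders Port0 and Port1 all depend on your specific frame grabber interface:

* switch off the automatic grab command issued after each transferred image
set_framegrabber_param (FGHandle, 'start_async_after_grab_async', 'disable')
* grab from the first camera, then switch the port and grab from the second one
set_framegrabber_param (FGHandle, 'port', Port0)
grab_image_async (Image0, FGHandle, -1)
set_framegrabber_param (FGHandle, 'port', Port1)
grab_image_async (Image1, FGHandle, -1)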

5.3.2 Simultaneous Grabbing<br />

Some frame grabber interfaces provide special functionality to grab images simultaneously from multiple (synchronized) cameras. Typically, the cameras are connected to a single frame grabber board; the Leutron interface also allows you to grab simultaneously from cameras connected to multiple boards. As described in section 3.2.5 on page 11, the images are grabbed by a single call to grab_image or grab_image_async, which returns them in the form of a multi-channel image. Depending on the frame grabber interface, it may be necessary to switch on the simultaneous grabbing via the operator set_framegrabber_param.

Please keep in mind that even if a HALCON frame grabber interface supports simultaneous grabbing, this might not be true for every frame grabber board the interface supports! In order to grab multiple images simultaneously, a frame grabber board must be equipped with multiple “grabbing units”; for example, an analog frame grabber board must be equipped with multiple A/D converters. Please check this in the documentation of your frame grabber board.

Even if a HALCON frame grabber interface does not provide the special simultaneous grabbing mode, you can realize a similar behavior “manually”, e.g., by connecting each (asynchronously resettable) camera to its own frame grabber board and then using a common external trigger signal to synchronize the grabbing.
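Assuming two synchronized cameras and an interface whose simultaneous mode (if required) has already been activated via set_framegrabber_param, the grabbing itself might look like this sketch:

* a single call returns a multi-channel image containing both camera images
grab_image_async (MultiChannelImage, FGHandle, -1)
* split it into the two individual images
decompose2 (MultiChannelImage, ImageCamera0, ImageCamera1)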


6 Miscellaneous<br />

6.1 Acquiring Images From Unsupported Frame Grabbers<br />


If you want to use a frame grabber that is currently not supported by HALCON, i.e., for which no HALCON interface exists, there are two principal options: First, you can create your own HALCON frame grabber interface; how to do this is described in detail in the Frame Grabber Integration Programmer’s Manual.

As an alternative, you can pass externally created images, i.e., the raw image matrix, to HALCON using the operators gen_image1, gen_image3, or gen_image1_extern, which create a corresponding HALCON image. The main difference between the operators gen_image1 and gen_image1_extern is that the former copies the image matrix when creating the HALCON image, whereas the latter does not, which is useful if you want to realize volatile grabbing as described in section 5.1.3 on page 18.

The C example program c\use_extern_image.c shows how to use the operator gen_image1_extern<br />

to pass standard gray value images to <strong>HALCON</strong>. In this case, the image matrix consists of 8 bit pixels<br />

(bytes), which can be represented by the data type unsigned char. At the beginning, the program<br />

calls a procedure which allocates memory for the images to be “grabbed”; in a real application this<br />

corresponds to the image buffer(s) used by the frame grabber SDK.<br />

unsigned char *image_matrix_ptr;<br />

long width, height;<br />

InitializeBuffer(&image_matrix_ptr, &width, &height);<br />

The example program “simulates” the grabbing of images with a procedure which reads images from<br />

an image sequence and copies them into the image buffer. Then, the content of the image buffer is<br />

transformed into a <strong>HALCON</strong> image (type byte) via gen_image1_extern. The parameter ClearProc<br />

is set to 0 to signal that the program itself takes care of freeing the memory. The created <strong>HALCON</strong><br />

image is then displayed. The loop can be exited by clicking into the <strong>HALCON</strong> window with any mouse<br />

button.<br />

Hobject image;<br />

long window_id;<br />

open_window (0, 0, width, height, 0, "visible", "", &window_id);<br />

while (!ButtonPressed(window_id))<br />

{<br />

MyGrabImage((const unsigned char **) &image_matrix_ptr);<br />

gen_image1_extern(&image, "byte", width, height,<br />

(long) image_matrix_ptr, (long) 0);<br />

disp_obj(image, window_id);<br />

}<br />

If your frame grabber supplies images with more than 8 bit pixels, you must adapt both the data type for the image matrix and the type of the created HALCON image (parameter Type of gen_image1_extern). In the case of color images, HALCON expects the image data in the form of three separate image matrices. You can create a HALCON image either by calling the operator gen_image3 with the three pointers to the matrices, or by calling the operator gen_image1_extern three times and then using the operator channels_to_image to combine the three images into a multi-channel image. Please refer to appendix A on page 37 for more information about HALCON images in general.

Figure 15: Popup dialog in HDevelop signaling a timeout.

6.2 Error Handling<br />

Just as the <strong>HALCON</strong> frame grabber interfaces encapsulate the communication with a frame grabber<br />

board, they also encapsulate occurring errors within the <strong>HALCON</strong> error handling mechanism. How to<br />

catch and react to these errors is described below for HDevelop programs and also for programs using<br />

<strong>HALCON</strong>’s programming language interfaces.<br />

Some <strong>HALCON</strong> frame grabber interfaces provide special parameters for set_framegrabber_param<br />

which are related to error handling. The most commonly used one is the parameter ’grab_timeout’<br />

which specifies when the frame grabber should quit waiting for an image. The examples described in the<br />

following sections show how to handle the corresponding <strong>HALCON</strong> error.<br />

<strong>Note</strong> that all example programs enable the signaling of low level errors via the operator set_system,<br />

e.g., in HDevelop syntax via<br />

set_system (’do_low_error’, ’true’)<br />

In this mode, low level errors occurring in the frame grabber SDK (or in the <strong>HALCON</strong> interface) are<br />

signaled by a message box.<br />

6.2.1 Error Handling in HDevelop<br />

The HDevelop example hdevelop\error_handling_timeout_picport.dev shows how to handle HALCON errors in an HDevelop program. To “provoke” an error, open_framegrabber is called with

ExternalTrigger = ’true’. If there is no trigger, a call to grab_image results in a timeout; HDevelop<br />

reacts to this error with the popup dialog shown in figure 15 and stops the program.<br />

open_framegrabber (FGName, 1, 1, 0, 0, 0, 0, ’default’, -1, ’default’, -1,<br />

’true’, CameraType, Device, -1, -1, FGHandle)<br />

set_framegrabber_param (FGHandle, ’grab_timeout’, 2000)<br />

grab_image (Image, FGHandle)<br />

HALCON lets you modify the reaction to an error with the operator set_check (in HDevelop: dev_set_check). If you set it to ’~give_error’, the program does not stop in case of an error but only stores its cause in the form of an error code. To access this error code in HDevelop, you must define a corresponding variable using the operator dev_error_var. Note that this variable is updated after each operator call; to check the result of a single operator we therefore recommend switching back to the standard error handling mode directly after the operator call as in the following lines:

dev_error_var (ErrorNum, 1)<br />

dev_set_check (’~give_error’)<br />

grab_image (Image, FGHandle)<br />

dev_error_var (ErrorNum, 0)<br />

dev_set_check (’give_error’)<br />

To check whether a timeout occurred, you compare the error variable with the code signaling a timeout<br />

(5322); a list of error codes relating to image acquisition can be found in the Frame Grabber Integration<br />

Programmer’s Manual, appendix B on page 69. In the example, the timeout is handled by disabling the<br />

external trigger mode via the operator set_framegrabber_param (parameter ’external_trigger’).<br />

Then, the call to grab_image is tested again.<br />

if (ErrorNum = 5322)<br />

set_framegrabber_param (FGHandle, ’external_trigger’, ’false’)<br />

dev_error_var (ErrorNum, 1)<br />

dev_set_check (’~give_error’)<br />

grab_image (Image, FGHandle)<br />

dev_error_var (ErrorNum, 0)<br />

dev_set_check (’give_error’)<br />

endif<br />

Now, the error variable should contain the value 2 signaling that the operator call succeeded; for this<br />

value, HDevelop provides the constant H_MSG_TRUE. If you get another error code, the program accesses<br />

the corresponding error text using the operator get_error_text.<br />

if (ErrorNum # H_MSG_TRUE)<br />

get_error_text (ErrorNum, ErrorText)<br />

endif<br />

If your frame grabber interface does not provide the parameter ’external_trigger’, you can realize<br />

a similar behavior by closing the connection and then opening it again with ExternalTrigger set to<br />

’false’.<br />

6.2.2 Error Handling Using <strong>HALCON</strong>/C<br />

The mechanism for error handling in a program based on <strong>HALCON</strong>/C is similar to the one in HDevelop;<br />

in fact, it is even simpler, because each operator automatically returns its error code. However, if a<br />

<strong>HALCON</strong> error occurs in a C program, the default error handling mode causes the program to abort.<br />

The C example program c\error_handling_timeout_picport.c performs the same task as the<br />

HDevelop program in the previous section; if the call to grab_image succeeds, the program grabs and<br />

displays images in a loop, which can be exited by clicking into the window. The following lines show<br />

how to test whether a timeout occurred:<br />


set_check ("~give_error");<br />

error_num = grab_image (&image, fghandle);<br />

set_check ("give_error");<br />

switch (error_num)<br />

{<br />

case H_ERR_FGTIMEOUT:<br />

As you see, in a C program you can use predefined constants for the error codes (see the Frame Grabber<br />

Integration Programmer’s Manual, appendix B on page 69, for a list of image acquisition error codes and<br />

their corresponding constants).<br />

6.2.3 Error Handling Using <strong>HALCON</strong>/C++<br />

If your application is based on <strong>HALCON</strong>/C++, there are two methods for error handling: If you use<br />

operators in their C-like form, e.g., grab_image, you can apply the same procedure as described for<br />

<strong>HALCON</strong>/C in the previous section.<br />

In addition, <strong>HALCON</strong>/C++ provides an exception handling mechanism based on the class HException,<br />

which is described in the Programmer’s Guide, section 4.3 on page 26. Whenever a <strong>HALCON</strong> error<br />

occurs, an instance of this class is created. The main idea is that you can specify a procedure which<br />

is then called automatically with the created instance of HException as a parameter. How to use this<br />

mechanism is explained in the C++ example program cpp\error_handling_timeout_picport.cpp,<br />

which performs the same task as the examples in the previous sections.<br />

In the example program cpp\error_handling_timeout_picport.cpp (preconfigured for the HALCON Leutron interface), the procedure which is to be called upon error is very simple: It just raises a standard C++ exception with the instance of HException as a parameter.

void MyHalconExceptionHandler(const Halcon::HException& except)<br />

{<br />

throw except;<br />

}<br />

In the program, you “install” this procedure via a class method of HException:<br />

int main(int argc, char *argv[])<br />

{<br />

using namespace Halcon;<br />

HException::InstallHHandler(&MyHalconExceptionHandler);<br />

Now, you react to a timeout with the following lines:<br />

try<br />

{<br />

image = framegrabber.GrabImage();<br />

}<br />

catch (HException except)<br />

{<br />

if (except.err == H_ERR_FGTIMEOUT)<br />

{<br />

framegrabber.SetFramegrabberParam("external_trigger", "false");



As already noted, if your frame grabber interface does not provide the parameter ’external_trigger’, you can realize a similar behavior by closing the connection and then opening it again with ExternalTrigger set to ’false’:

if (except.err == H_ERR_FGTIMEOUT)<br />

{<br />

framegrabber.OpenFramegrabber(fgname, 1, 1, 0, 0, 0, 0, "default",<br />

-1, "gray", -1, "false", camtype,<br />

"default", -1, -1);<br />

<strong>Note</strong> that when calling OpenFramegrabber via the class HFramegrabber as above, the operator checks<br />

whether it is called with an already opened connection and automatically closes it before opening it with<br />

the new parameters.<br />

6.2.4 Error Handling Using <strong>HALCON</strong>/COM<br />

The <strong>HALCON</strong>/COM interface uses the standard COM error handling technique where every<br />

method call passes both a numerical and a textual representation of the error to the calling<br />

framework. How to use this mechanism is explained in the Visual Basic example program<br />

vb\error_handling_timeout_picport\error_handling_timeout_picport.vbp, which<br />

performs the same task as the examples in the previous sections.<br />

For each method, you can specify an error handler by inserting the following line at the beginning of the<br />

method:<br />

On Error GoTo ErrorHandler<br />

At the end of the method, you insert the code for the error handler. If a runtime error occurs, Visual<br />

Basic automatically jumps to this code, with the error being described in the variable Err. However, the<br />

returned error number does not correspond directly to the <strong>HALCON</strong> error as in the other programming<br />

languages, because low error numbers are reserved for COM. To solve this problem <strong>HALCON</strong>/COM<br />

uses an offset which must be subtracted to get the <strong>HALCON</strong> error code. This offset is accessible as a<br />

property of the class HSystemX:<br />

ErrorHandler:<br />

Dim sys As New HSystemX<br />

ErrorNum = Err.Number - sys.ErrorBaseHalcon<br />

The following code fragment checks whether the error is due to a timeout. If yes, the program disables<br />

the external trigger mode and tries again to grab an image. If the grab is successful the program continues<br />

at the point the error occurred; otherwise, the Visual Basic default error handler is invoked. <strong>Note</strong> that in<br />

contrast to the other programming languages <strong>HALCON</strong>/COM does not provide constants for the error<br />

codes.<br />

If (ErrorNum = 5322) Then<br />

Call FG.SetFramegrabberParam("external_trigger", "false")<br />

Set Image = FG.GrabImage<br />

Resume Next<br />

If the error is not caused by a timeout, the error handler raises it anew, whereupon the Visual Basic<br />

default error handler is invoked.<br />


Else<br />

Err.Raise (Err.Number)<br />

End If<br />

If your frame grabber interface does not provide the parameter ’external_trigger’, you can realize<br />

a similar behavior by closing the connection and then opening it again with ExternalTrigger set to<br />

’false’. <strong>Note</strong> that the class HFramegrabberX does not provide a method to close the connection;<br />

instead you must destroy the variable with the following line:<br />

Set FG = Nothing<br />

6.3 Line Scan Cameras<br />

From the point of view of <strong>HALCON</strong> there is no difference between area and line scan cameras: Both<br />

acquire images of a certain width and height; whether the height is 1, i.e., a single line, or larger does<br />

not matter. In fact, in many line scan applications the frame grabber combines multiple acquired lines to<br />

form a so-called page which further lessens the difference between the two camera types.<br />

The main problem is therefore whether your frame grabber supports line scan cameras. If it does, you can acquire images from it via HALCON exactly as from an area scan camera. With the parameter ImageHeight of the operator open_framegrabber you can sometimes specify the height of the page; typically, this information is set in the camera configuration file. Some HALCON frame grabber interfaces allow you to further configure the acquisition mode via the operator set_framegrabber_param.
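A hypothetical line scan connection might thus be opened as in the following sketch; 'MyInterface' and 'MyLineScanCamera' are placeholders, and the page size of 1024 × 512 pixels is only an example:

* placeholder interface and camera names; pages of 512 lines, 1024 pixels wide
open_framegrabber ('MyInterface', 1, 1, 1024, 512, 0, 0, 'default', -1, 'gray',
                   -1, 'false', 'MyLineScanCamera', 'default', -1, -1, FGHandle)
grab_image_async (Page, FGHandle, -1)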

The images acquired from a line scan camera can then be processed just like images from area scan cameras. However, line scan images often pose an additional problem: The objects to inspect may be spread over multiple images (pages). To solve this problem, HALCON provides special operators: tile_images allows you to merge images into a larger image, while merge_regions_line_scan and merge_cont_line_scan_xld allow you to merge the (intermediate) processing results of subsequent images.

How to use these operators is explained in the HDevelop example program hdevelop\line_scan.dev.<br />

The program is based on an image file sequence which is read using the <strong>HALCON</strong> virtual frame grabber<br />

interface File; the task is to extract paper clips and calculate their orientation. Furthermore, the gray<br />

values in a rectangle surrounding each clip are determined.<br />

An important parameter for the merging is the maximum number of images over which an object can be spread. In the example, a clip can be spread over 4 images:

MaxImagesRegions := 4<br />

The continuous processing is realized by a simple loop: At each iteration, a new image is grabbed, and<br />

the regions forming candidates for the clips are extracted using thresholding.<br />

while (1)<br />

grab_image (Image, FGHandle)<br />

threshold (Image, CurrRegions, 0, 80)<br />

The current regions are then merged with ones extracted in the previous image using the operator<br />

merge_regions_line_scan. As a result, two sets of regions are returned: The parameter CurrMergedRegions<br />

contains the current regions, possibly extended by fitting parts of the previously extracted<br />

regions, whereas the parameter PrevMergedRegions contains the rest of the previous regions.


Figure 16: Merging regions extracted from subsequent line scan images: state after a) 2, b) 3, c) 4 images (large coordinate system: tiled image; small coordinate systems: current image or most recent image).

merge_regions_line_scan (CurrRegions, PrevRegions, CurrMergedRegions,<br />

PrevMergedRegions, ImageHeight, ’top’,<br />

MaxImagesRegions)<br />

connection (PrevMergedRegions, ClipCandidates)<br />

select_shape (ClipCandidates, FinishedClips, ’area’, ’and’, 4500, 7000)<br />

The regions in PrevMergedRegions are “finished”; from them, the program selects the clips via their<br />

area and further processes them later, e.g., determines their position and orientation. The regions in<br />

CurrMergedRegions are renamed and now form the previous regions for the next iteration.<br />

copy_obj (CurrMergedRegions, PrevRegions, 1, -1)<br />

endwhile<br />

Note that the operator copy_obj does not copy the regions themselves but only the corresponding HALCON objects, which can be thought of as references to the actual region data.


Before we show how to merge the images let’s take a look at figure 16, which visualizes the whole<br />

process: After the first two images CurrMergedRegions contains three clip parts; for the first one a<br />

previously extracted region was merged. <strong>Note</strong> that the regions are described in the coordinate frame of<br />

the current image; this means that the merged part of clip no. 1 has negative coordinates.<br />

In the next iteration (figure 16b), further clip parts are merged, but no clip is finished yet. <strong>Note</strong> that the<br />

coordinate frame is again fixed to the current image; as a consequence the currently merged regions seem<br />

to move into negative coordinates.<br />

After the fourth image (figure 16c), clips no. 1 and 2 are completed; they are returned in the parameter<br />

PrevMergedRegions. <strong>Note</strong> that they are still described in the coordinate frame of the previous image<br />

(depicted with dashed arrow); to visualize them together with CurrMergedRegions they must be moved<br />

to the coordinate system of the current image using the operator move_region:<br />

move_region (FinishedClips, ClipsInCurrentImageCoordinates,<br />

-ImageHeight, 0)<br />

Let’s get back to the task of merging images: To access the gray values around a clip, one must merge<br />

those images over which the PrevMergedRegions can be spread. At the beginning, an empty image is<br />

created which can hold 4 images:<br />

gen_image_const (TiledImage, ’byte’, ImageWidth,<br />

ImageHeight * MaxImagesRegions)<br />

At the end of each iteration, the “oldest” image, i.e., the image at the top, is cut off the tiled image using<br />

crop_part, and the current image is merged at the bottom using tile_images_offset:<br />

crop_part (TiledImage, TiledImageMinusOldest, ImageHeight, 0,<br />

ImageWidth, (MaxImagesRegions - 1) * ImageHeight)<br />

ImagesToTile := [TiledImageMinusOldest,Image]<br />

tile_images_offset (ImagesToTile, TiledImage, [0,<br />

(MaxImagesRegions-1)*ImageHeight], [0, 0], [-1,<br />

-1], [-1, -1], [-1, -1], [-1, -1], ImageWidth,<br />

MaxImagesRegions * ImageHeight)<br />

As noted above, the regions returned in PrevMergedRegions are described in the coordinate frame of<br />

the most recent image (depicted with dashed arrows in figure 16c); to extract the corresponding gray values<br />

from the tiled image, they must first be moved to its coordinate system (depicted with longer arrows)<br />

using the operator move_region. Then, the surrounding rectangles are created using shape_trans, and<br />

finally the corresponding gray values are extracted using add_channels:<br />

move_region (FinishedClips, ClipsInTiledImageCoordinates,<br />

(MaxImagesRegions-1) * ImageHeight, 0)<br />

shape_trans (ClipsInTiledImageCoordinates, AroundClips, ’rectangle1’)<br />

add_channels (AroundClips, TiledImage, GrayValuesAroundClips)


Appendix<br />

A <strong>HALCON</strong> Images<br />


In the following, we take a closer look at the way HALCON represents and handles images. Of course, we won’t bother you with details about the low-level representation and the memory management; HALCON takes care of these in a way that guarantees optimal performance.

A.1 The Philosophy of <strong>HALCON</strong> Images<br />

There are three important concepts behind <strong>HALCON</strong>’s image objects:<br />

1. Multiple channels<br />

Typically, one thinks of an image as a matrix of pixels. In <strong>HALCON</strong>, this matrix is called a<br />

channel, and images may consist of one or more such channels. For example, gray value images<br />

consist of a single channel, color images of three channels.<br />

The advantage of this representation is that many HALCON operators automatically process all channels at once; for example, if you want to subtract gray level or color images from one another, you can apply sub_image without worrying about the image type. Whether an operator processes all channels at once can be seen in the parameter description in the reference manual: If an image parameter is described as (multichannel-)image or (multichannel-)image(-array) (e.g., the parameter ImageMinuend of sub_image), all channels are processed; if it is described as image or image(-array) (e.g., the parameter Image of threshold), only the first channel is processed.

For more information about channels please refer to appendix A.3.2.<br />

2. Various pixel types<br />

Besides the standard 8 bit (type byte) used to represent gray value images, HALCON allows

images to contain various other data, e.g. 16 bit integers (type int2 or uint2) or 32 bit floating<br />

point numbers (type real) to represent derivatives.<br />

Most of the time you need not worry about pixel types, because <strong>HALCON</strong> operators that output<br />

images automatically use a suitable pixel type. For example, the operator derivate_gauss<br />

creates a real image to store the result of the derivation. As another example, if you connect<br />

to a frame grabber selecting a value > 8 for the parameter BitsPerChannel, a subsequent<br />

grab_image returns a uint2 image.

3. Arbitrarily-shaped region of interest<br />

Besides the pixel information, each HALCON image also stores its so-called domain in the form of a

<strong>HALCON</strong> region. The domain can be interpreted as a region of interest, i.e., <strong>HALCON</strong> operators<br />

(with some exceptions) restrict their processing to this region.<br />

The image domain inherits the full flexibility of a <strong>HALCON</strong> region, i.e., it can be of arbitrary<br />

shape and size, can have holes, or even consist of unconnected points. For more information<br />

about domains please refer to appendix A.3.3 on page 39.<br />


The power of <strong>HALCON</strong>’s approach lies in the fact that it offers full flexibility but does not require you<br />

to worry about options you don’t need at the moment. For example, if all you do is grab and process<br />

standard 8 bit gray value images, you can ignore channels and pixel types. At the moment you decide<br />

to use color images instead, all you need to do is to add some lines to decompose the image into its<br />

channels. And if your camera / frame grabber provides images with more than 8 bit pixel information,<br />

<strong>HALCON</strong> is ready for this as well.<br />

A.2 Image Tuples (Arrays)<br />

Another powerful mechanism of <strong>HALCON</strong> is the so-called tuple processing: If you want to process<br />

multiple images in the same way, e.g., to smooth them, you can call the operator (e.g., mean_image) once<br />

passing all images as a tuple (array), instead of calling it multiple times. Furthermore, some operators<br />

always return image tuples, e.g., gen_gauss_pyramid or inspect_shape_model.<br />

Whether an operator supports tuple processing can be seen in the parameter description in the reference<br />

manual: If an input image parameter is described as image(-array) or (multichannel-)image(array)<br />

(e.g., the parameter Image of mean_image), it supports tuple processing; if it is described as<br />

image or (multichannel-)image (e.g., the parameter Image of find_1d_bar_code), only one image<br />

is processed.<br />

For information about creating or accessing image tuples please refer to appendix A.3.6.<br />

A.3 <strong>HALCON</strong> Operators for Handling Images<br />

Below you find a brief overview of operators that allow you to create HALCON images or to modify “technical aspects” like the image size or the number of channels.

A.3.1 Creation<br />

<strong>HALCON</strong> images are created automatically when you use operators like grab_image or read_image.<br />

You can also create images from scratch using the operators listed in the HDevelop menu Operators<br />

⊲ Image ⊲ Creation, e.g., gen_image_const or gen_image1_extern (see also section 6.1 on page<br />

29).<br />

A.3.2 Channels<br />

Operators for manipulating channels can be found in the HDevelop menu Operators ⊲ Image ⊲ Channel.<br />

You can query the number of channels of an image with the operator count_channels. Channels<br />

can be accessed using access_channel (which extracts a specified channel without copying), image_to_channels<br />

(which converts a multi-channel image into an image tuple), or decompose2 etc.<br />

(which converts a multi-channel image into 2 or more single-channel images). Vice versa, you can create<br />

a multi-channel image using channels_to_image or compose2 etc., and add channels to an image<br />

using append_channel.
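The following HDevelop sketch illustrates some of these operators for an RGB image; the file name 'my_color_image' is a placeholder for any three-channel image:

* 'my_color_image' is a placeholder for an RGB image file
read_image (Image, 'my_color_image')
count_channels (Image, NumChannels)
* split the three channels and recombine them into a multi-channel image
decompose3 (Image, Red, Green, Blue)
compose3 (Red, Green, Blue, ImageComposed)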


A.3.3 Domain<br />


Operators for manipulating the domain of an image can be found in the HDevelop menu Operators ⊲<br />

Image ⊲ Domain. Upon creation of an image, its domain is set to the full image size. You can set it to<br />

a specified region using change_domain. In contrast, the operator reduce_domain takes the original<br />

domain into account; the new domain is equal to the intersection of the original domain with the specified<br />

region. Please also take a look at the operator add_channels, which can be seen as complementary to<br />

reduce_domain.<br />
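A short HDevelop sketch of this mechanism; the coordinates and the threshold values are arbitrary examples:

* restrict processing to a rectangular region of interest
gen_rectangle1 (ROI, 100, 100, 300, 400)
reduce_domain (Image, ROI, ImageReduced)
* subsequent operators, e.g., threshold, only consider pixels inside the ROI
threshold (ImageReduced, Region, 128, 255)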

A.3.4 Access<br />

Operators for accessing information about a <strong>HALCON</strong> image can be found in the HDevelop menu Operators<br />

⊲ Image ⊲ Access. For example, get_image_pointer1 returns the size of an image and a<br />

pointer to the image matrix of its first channel.<br />

A.3.5 Manipulation<br />

You can change the size of an image using the operators change_format or crop_part, or other operators<br />

from the HDevelop menu Operators ⊲ Image ⊲ Format. The menu Operators ⊲ Image ⊲<br />

Type-Conversion lists operators which change the pixel type, e.g., convert_image_type. Operators<br />

to modify the pixel values, can be found in the menu Operators ⊲ Image ⊲ Manipulation, e.g.,<br />

paint_gray, which copies pixels from one image into another.<br />

A.3.6 Image Tuples<br />

Operators for creating and accessing image tuples can be found in the HDevelop menu Operators<br />

⊲ Object ⊲ Manipulation. Image tuples can be created using the operators gen_empty_obj and<br />

concat_obj, while the operator select_obj allows you to access an individual image that is part of a tuple.
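A small HDevelop sketch of this mechanism, assuming two already created images Image1 and Image2:

* build a tuple of two images and access the second one
gen_empty_obj (ImageTuple)
concat_obj (ImageTuple, Image1, ImageTuple)
concat_obj (ImageTuple, Image2, ImageTuple)
select_obj (ImageTuple, SecondImage, 2)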


B Parameters Describing the Image<br />

When opening a connection with open_framegrabber, you can specify the desired image format, e.g.,<br />

its size or the number of bits per pixel, using its nine parameters, which are described in the following.<br />

B.1 Image Size<br />

The following 6 parameters influence the size of the grabbed images: HorizontalResolution and<br />

VerticalResolution specify the spatial resolution of the image in relation to the original size. For<br />

example, if you choose VerticalResolution = 2, you get an image with half the height of the original<br />

as depicted in figure 17b. Another name for this process is (vertical and horizontal) subsampling.<br />

With the parameters ImageWidth, ImageHeight, StartRow, and StartColumn you can grab only a<br />

part of the (possibly subsampled) image; this is called image cropping. In figure 17, the image part to<br />

be grabbed is marked with a rectangle in the original (or subsampled) image; to the right, the resulting<br />

image is depicted. <strong>Note</strong> that the resulting <strong>HALCON</strong> image always starts with the coordinates (0,0),<br />

i.e., the information contained in the parameters StartRow and StartColumn cannot be recovered from<br />

the resulting image.<br />
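As an illustration, the following (hypothetical) call requests the subsampling of figure 17b together with the depicted image part; 'MyInterface' and 'MyCamera' are placeholders for your actual configuration:

* half vertical resolution, 200 x 100 image part starting at row 50, column 100
open_framegrabber ('MyInterface', 1, 2, 200, 100, 50, 100, 'default', -1,
                   'gray', -1, 'false', 'MyCamera', 'default', -1, -1, FGHandle)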

Depending on the involved components, both subsampling and image cropping may be executed at different points during the transfer of an image from the camera into HALCON: in the camera, in the frame grabber, or in the software. Please note that in most cases you get no direct effect on the performance in the form of a higher frame rate; exceptions are CMOS cameras, which adapt their frame rate to the requested image size. Subsampling or cropping on the software side has no effect on the frame rate; besides, you can achieve a similar result using reduce_domain. If the frame grabber executes the subsampling or cropping, you may get a positive effect if the PCI bus is the bottleneck of your application and prevents you from getting the full frame rate. Some frame grabber interfaces allow dynamic image cropping via the operator set_framegrabber_param.

Figure 17: The effect of image resolution (subsampling) and image cropping (ImageWidth = 200, ImageHeight = 100, StartRow = 50, StartColumn = 100): a) HorizontalResolution (HR) = VerticalResolution (VR) = 1; b) HR = 1, VR = 2; c) HR = 2, VR = 1; d) HR = VR = 2.

Note that HALCON itself does not differentiate between area and line scan cameras, as both produce images: the former in the form of frames, the latter in the form of so-called pages created from successive lines (the number of lines is specified in the parameter ImageHeight). Section 6.3 on page 34 contains additional information regarding the use of line scan cameras.

B.2 Frames vs. Fields<br />

The parameter Field is relevant only for analog cameras that produce signals following the video standards originally developed for TV, e.g., NTSC or PAL. In these standards, the camera transmits images (also called frames) in the form of two so-called fields, one containing all odd lines of a frame, the other all even lines of the next frame. On the frame grabber board, these two fields are then interlaced; the resulting frame is transferred via the PCI bus into the computer memory using DMA (direct memory access).

Figure 18 visualizes this process and demonstrates its major drawback: If a moving object is observed (in the example a dark square with the letter ’T’), the position of the object changes from field to field, and the resulting frame shows a distortion at the vertical object boundaries (also called the picket-fence effect). Such a distortion seriously impairs the accuracy of measurements; industrial vision systems therefore often use so-called progressive scan cameras, which transfer full frames (see figure 19). Some cameras also “mix” interlacing with progressive scan as depicted in figure 20.

Figure 18: Interlaced grabbing (Field = ’interlaced’).

Figure 19: Progressive scan grabbing (Field = ’progressive’).

Figure 20: Special form of interlaced grabbing supported by some cameras.

You can also acquire the individual fields by specifying VerticalResolution = 2. Via the parameter Field you can then select which fields are to be acquired (see also figure 21): If you select ’first’ or ’second’, you get all odd or all even fields, respectively; if you select ’next’, you get every field. The latter mode has the advantage of a higher field rate, at the cost, however, of the so-called vertical jitter: Objects may seem to move up and down (like the square in figure 21), while structures that are one pixel wide appear and disappear (like the upper part of the ’T’).

Figure 21: Three ways of field grabbing: a) ’first’; b) ’second’; c) ’next’.

By specifying Field = ’first’, ’second’, or ’next’ for a full resolution image (VerticalResolution<br />

= 1), you can select with which field the interlacing starts.<br />

Figure 22 shows a timing diagram for using grab_image together with an interlaced-scan camera. Here,<br />

you can in some cases increase the processing frame rate by specifying ’next’ for the parameter Field.<br />

The frame grabber then starts to digitize an image when the next field arrives; in the example therefore<br />

only one field is lost.<br />

Figure 22: Grabbing interlaced images starting with the ’next’ field.

B.3 Image Data

The parameters described in the previous sections concentrated on the size of the images. The image<br />

data, i.e., the data contained in a pixel, is described with the parameters BitsPerChannel and ColorSpace.<br />

To understand these parameters, a quick look at <strong>HALCON</strong>’s way to represent images is necessary:<br />

A <strong>HALCON</strong> image consists of one or more matrices of pixels, which are called channels. Gray<br />

value images are represented as single-channel images, while color images consist of three channels,<br />

e.g., for the red, green, and blue part of an RGB image. Each image matrix (channel) consists of pixels,<br />

which may be of different data types, e.g., standard 8 bit (type byte) or 16 bit integers (type int2 or<br />

uint2) or 32 bit floating point numbers (type real). For detailed information about <strong>HALCON</strong> images<br />

please refer to appendix A on page 37.<br />

The two parameters correspond to the two main aspects of HALCON images: With the parameter ColorSpace, you can select whether the resulting HALCON image is to be a (single-channel) gray value image (value ’gray’) or a (multi-channel) color image (e.g., value ’rgb’). The parameter BitsPerChannel specifies how many bits are transmitted per pixel per channel from the frame grabber to the computer; the pixel type of the HALCON image is then chosen to accommodate the transmitted number of bits.

For example, if a frame grabber is able to transmit 10 bit gray value images, select ColorSpace = ’gray’ and BitsPerChannel = 10 and you will get a single-channel HALCON image of the type ’uint2’, i.e., 16 bit per channel. Another example concerns RGB images: Some frame grabbers allow the values 8 and 5 for BitsPerChannel. In the first case, 3 × 8 = 24 bit are transmitted per pixel, while in the second case only 3 × 5 = 15 (padded to 16) bit are transmitted; in both cases, a three-channel ’byte’ image results.
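As an illustration, the following sketch requests 8 bit RGB data; 'MyInterface' and 'MyCamera' are again placeholders for your actual configuration:

* placeholder connection requesting 8 bit per channel RGB images
open_framegrabber ('MyInterface', 1, 1, 0, 0, 0, 0, 'default', 8, 'rgb', -1,
                   'false', 'MyCamera', 'default', -1, -1, FGHandle)
grab_image (Image, FGHandle)
* the result is a three-channel 'byte' image
count_channels (Image, NumChannels)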



<strong>HALCON</strong> <strong>Application</strong> <strong>Note</strong><br />

How to Use Shape-Based Matching<br />

to Find and Localize Objects<br />

Provided Functionality<br />

⊲ Finding objects based on a single model image

⊲ Localizing objects with subpixel accuracy<br />

Typical <strong>Application</strong>s<br />

⊲ Object recognition and localization<br />

⊲ Intermediate machine vision steps, e.g., alignment of ROIs<br />

⊲ Completeness check<br />

⊲ Parts inspection<br />

Involved Operators<br />

create_shape_model, create_scaled_shape_model<br />

inspect_shape_model, get_shape_model_params, determine_shape_model_params<br />

get_shape_model_contours, set_shape_model_origin, get_shape_model_origin<br />

find_shape_model, find_shape_models<br />

find_scaled_shape_model, find_scaled_shape_models<br />

write_shape_model, read_shape_model<br />

clear_shape_model, clear_all_shape_models<br />

Copyright c○ 2002-2008 by <strong>MVTec</strong> <strong>Software</strong> <strong>GmbH</strong>, München, Germany <strong>MVTec</strong> <strong>Software</strong> <strong>GmbH</strong>


Overview<br />

<strong>HALCON</strong>’s operators for shape-based matching enable you to find and localize objects based on a single<br />

model image, i.e., from a model. This method is robust to noise, clutter, occlusion, and arbitrary nonlinear<br />

illumination changes. Objects are localized with subpixel accuracy in 2D, i.e., found even if they<br />

are rotated or scaled.<br />

The process of shape-based matching (see section 1 on page 4 for a quick overview) is divided into two<br />

distinct phases: In a first phase, you specify and create the model. This model can be stored in a file to be<br />

reused in different applications. Detailed information about this phase can be found in section 2 on page<br />

6. In the second phase, the model is used to find and localize an object. Section 3 on page 20 describes<br />

how to optimize the outcome of this phase by restricting the search space.<br />

Shape-based matching is a powerful tool for various machine vision tasks, ranging from intermediate<br />

image processing, e.g., to place ROIs automatically or to align them to a moving part, to complex tasks,<br />

e.g., recognize and localize a part in a robot vision application. Examples can be found in section 4 on<br />

page 30.<br />

Unless specified otherwise, the example programs can be found in the subdirectory shape_matching<br />

of the directory HALCONROOT\examples\application_guide.

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in<br />

any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without prior written<br />

permission of the publisher.<br />

Edition 1 June 2002 (<strong>HALCON</strong> 6.1)<br />

Edition 1a May 2003 (<strong>HALCON</strong> 6.1.2)<br />

Edition 2 December 2003 (<strong>HALCON</strong> 7.0)<br />

Edition 2a April 2005 (<strong>HALCON</strong> 7.0.2)<br />

Edition 3 July 2005 (<strong>HALCON</strong> 7.1)<br />

Microsoft, Windows, Windows NT, Windows 2000, and Windows XP are either trademarks or registered trademarks<br />

of Microsoft Corporation.<br />

All other nationally and internationally recognized trademarks and tradenames are hereby recognized.<br />

More information about <strong>HALCON</strong> can be found at:<br />

http://www.halcon.com/


Contents<br />

1 A First Example   4<br />
2 Creating a Suitable Model   6<br />
2.1 A Closer Look at the Region of Interest   6<br />
2.2 Which Information is Stored in the Model?   12<br />
2.3 Synthetic Model Images   18<br />
3 Optimizing the Search Process   20<br />
3.1 Restricting the Search Space   20<br />
3.2 Searching for Multiple Instances of the Object   23<br />
3.3 Searching for Multiple Models Simultaneously   24<br />
3.4 A Closer Look at the Accuracy   26<br />
3.5 How to Optimize the Matching Speed   28<br />
4 Using the Results of Matching   30<br />
4.1 Introducing Affine Transformations   30<br />
4.2 Creating and Applying Affine Transformations With <strong>HALCON</strong>   30<br />
4.3 Using the Estimated Position and Orientation   32<br />
4.4 Using the Estimated Scale   44<br />
5 Miscellaneous   46<br />
5.1 Adapting to a Changed Camera Orientation   46<br />
5.2 Reusing Models   46<br />


1 A First Example<br />

In this section we give a quick overview of the matching process. To follow the example actively, start<br />

the HDevelop program hdevelop\first_example_shape_matching.dev, which locates the print on<br />

an IC; the steps described below start after the initialization of the application (press Run once to reach<br />

this point).<br />

Step 1: Select the object in the model image<br />

Row1 := 188<br />

Column1 := 182<br />

Row2 := 298<br />

Column2 := 412<br />

gen_rectangle1 (ROI, Row1, Column1, Row2, Column2)<br />

reduce_domain (ModelImage, ROI, ImageROI)<br />

After grabbing the so-called model image, i.e., a representative image of the object to find, the first<br />

task is to create a region containing the object. In the example program, a rectangular region is created<br />

using the operator gen_rectangle1; alternatively, you can draw the region interactively using, e.g.,<br />

draw_rectangle1 or use a region that results from a previous segmentation process. Then, an image<br />

containing just the selected region is created using the operator reduce_domain. The result is shown in<br />

figure 1.<br />

Step 2: Create the model<br />

inspect_shape_model (ImageROI, ShapeModelImages, ShapeModelRegions, 8, 30)<br />

create_shape_model (ImageROI, NumLevels, 0, rad(360), ’auto’, ’none’,<br />

’use_polarity’, 30, 10, ModelID)<br />

With the operator create_shape_model, the so-called model is created, i.e., the internal data<br />

structure describing the searched object. Before this, we recommend applying the operator inspect_shape_model,<br />
which helps you to find suitable parameters for the model creation.<br />


Figure 1: 1○ specifying the object; 2○ the internal model (4 pyramid levels).


Figure 2: Finding the object in other images.<br />


The operator inspect_shape_model shows the effect of two parameters: the number of pyramid levels on which the<br />

model is created, and the minimum contrast that object points must have to be included in the model. As<br />

a result, the operator inspect_shape_model returns the model points on the selected pyramid levels as<br />

shown in figure 1; thus, you can check whether the model contains the relevant information to describe<br />

the object of interest.<br />

When actually creating the model with the operator create_shape_model, you can specify additional<br />

parameters besides NumLevels and Contrast: First of all, you can restrict the range of angles the<br />

object can assume (parameters AngleStart and AngleExtent) and the angle steps at which the model<br />

is created (AngleStep). With the help of the parameter Optimization you can reduce the number<br />

of model points; this is useful in the case of very large models. Furthermore, you can switch on the<br />

pregeneration of the model for the allowed range of rotation. The parameter Metric lets you specify<br />

whether the polarity of the model points must be observed. Finally, you can specify the minimum<br />

contrast object points must have in the search images to be compared with the model (MinContrast).<br />

The creation of the model is described in detail in section 2.<br />

As a result, the operator create_shape_model returns a handle for the newly created model (ModelID),<br />

which can then be used to specify the model, e.g., in calls to the operator find_shape_model. <strong>Note</strong><br />

that if you use <strong>HALCON</strong>’s COM or C++ interface and call the operator via the classes HShapeModelX<br />

or HShapeModel, no handle is returned because the instance of the class itself acts as your handle.<br />

If not only the orientation but also the scale of the searched object is allowed to vary, you must use the<br />

operator create_scaled_shape_model to create the model; then, you can describe the allowed range<br />

of scaling with three parameters similar to the range of angles.<br />

Step 3: Find the object again<br />

for i := 1 to 20 by 1<br />

grab_image (SearchImage, FGHandle)<br />

find_shape_model (SearchImage, ModelID, 0, rad(360), 0.7, 1, 0.5,<br />

’least_squares’, 0, 0.9, RowCheck, ColumnCheck,<br />

AngleCheck, Score)<br />

endfor<br />

To find the object again in a search image, all you need to do is call the operator find_shape_model;<br />

figure 2 shows the result for one of the example images. Besides the already mentioned ModelID,<br />


find_shape_model provides further parameters to optimize the search process: The parameters AngleStart,<br />

AngleExtent, and NumLevels, which you already specified when creating the model, allow<br />

you to use more restrictive values in the search process; by using the value 0 for NumLevels, the value<br />

specified when creating the model is used. With the parameter MinScore you can specify how many of<br />

the model points must be found; a value of 0.5 means that half of the model must be found. Furthermore,<br />

you can specify how many instances of the object are expected in the image (NumMatches) and<br />

how much two instances of the object may overlap in the image (MaxOverlap). To compute the position<br />

of the found object with subpixel accuracy the parameter SubPixel should be set to a value different<br />

from ’none’. Finally, the parameter Greediness describes the used search heuristics, ranging from<br />

“safe but slow” (value 0) to “fast but unsafe” (value 1). How to optimize the search process is described<br />

in detail in section 3 on page 20.<br />

The operator find_shape_model returns the position and orientation of the found object instances in<br />

the parameters Row, Column, and Angle, and their corresponding Score, i.e., how much of the model<br />

was found.<br />

If you use the operator find_scaled_shape_model (after creating the model using create_scaled_shape_model),<br />

the scale of the found object is returned in the parameter Scale.<br />

2 Creating a Suitable Model<br />

A prerequisite for a successful matching process is, of course, a suitable model for the object you want to<br />

find. A model is suitable if it describes the significant parts of the object, i.e., those parts that characterize<br />

it and allow to discriminate it clearly from other objects or from the background. On the other hand, the<br />

model should not contain clutter, i.e., points not belonging to the object (see, e.g., figure 4).<br />

2.1 A Closer Look at the Region of Interest<br />

When creating the model, the first step is to select a region of interest (ROI), i.e., the part of the image<br />

which serves as the model. In <strong>HALCON</strong>, a region defines an area in an image or, more generally, a set<br />

of points. A region can have an arbitrary shape; its points do not even need to be connected. Thus, the<br />

region of the model can have an arbitrary shape as well.<br />

The sections below describe how to create simple and more complex regions. The following code fragment<br />

shows the typical next steps after creating an ROI:<br />

reduce_domain (ModelImage, ROI, ImageROI)<br />

create_shape_model (ImageROI, 0, 0, rad(360), 0, ’none’, ’use_polarity’,<br />

30, 10, ModelID)<br />

<strong>Note</strong> that the region of interest used when creating a shape model influences the matching results: Its<br />

center of gravity is used as the reference point of the model (see section 2.1.4 on page 12 for more<br />

information).


2.1.1 How to Create a Region<br />

Figure 3: Creating an ROI from two regions.<br />


<strong>HALCON</strong> offers multiple operators to create regions, ranging from standard shapes<br />

like rectangles (gen_rectangle2) or ellipses (gen_ellipse) to free-form shapes (e.g.,<br />

gen_region_polygon_filled). These operators can be found in the HDevelop menu Operators<br />

⊲ Regions ⊲ Creation.<br />

However, to use these operators you need the “parameters” of the shape you want to create, e.g., the<br />

position, size, and orientation of a rectangle or the position and radius of a circle. Therefore, they are<br />

typically combined with the operators in the HDevelop menu Operators ⊲ Graphics ⊲ Drawing, which<br />

let you draw a shape on the displayed image and then return the shape parameters:<br />

draw_rectangle1 (WindowHandle, ROIRow1, ROIColumn1, ROIRow2, ROIColumn2)<br />

gen_rectangle1 (ROI, ROIRow1, ROIColumn1, ROIRow2, ROIColumn2)<br />

2.1.2 How to Combine and Mask Regions<br />

You can create more complex regions by adding or subtracting standard regions using the operators<br />

union2 and difference. For example, to create an ROI containing the square and the cross in figure 3,<br />

the following code fragment was used:<br />

draw_rectangle1 (WindowHandle, ROI1Row1, ROI1Column1, ROI1Row2,<br />

ROI1Column2)<br />

gen_rectangle1 (ROI1, ROI1Row1, ROI1Column1, ROI1Row2, ROI1Column2)<br />

draw_rectangle1 (WindowHandle, ROI2Row1, ROI2Column1, ROI2Row2,<br />

ROI2Column2)<br />

gen_rectangle1 (ROI2, ROI2Row1, ROI2Column1, ROI2Row2, ROI2Column2)<br />

union2 (ROI1, ROI2, ROI)<br />

Similarly, you can subtract regions using the operator difference. This method is useful to “mask”<br />

those parts of a region containing clutter, i.e., high-contrast points that are not part of the object. In<br />

figure 4, e.g., the task is to find the three capacitors. When using a single circular ROI, the created model<br />


Figure 4: Masking the part of a region containing clutter (model for full-circle ROI vs. model for ring-shaped ROI).<br />

contains many clutter points, which are caused by reflections on the metallic surface. Thus, the other two<br />

capacitors are not found. The solution to this problem is to use a ring-shaped ROI, which can be created<br />

by the following lines of code:<br />

draw_circle (WindowHandle, ROI1Row, ROI1Column, ROI1Radius)<br />

gen_circle (ROI1, ROI1Row, ROI1Column, ROI1Radius)<br />

gen_circle (ROI2, ROI1Row, ROI1Column, ROI1Radius-8)<br />

difference (ROI1, ROI2, ROI)<br />

<strong>Note</strong> that the ROI should not be too “thin”, otherwise it vanishes at higher pyramid levels! As a rule<br />

of thumb, an ROI should be 2^(NumLevels−1) pixels wide; in the example, the width of 8 pixels therefore<br />
allows the use of 4 pyramid levels.<br />

For this task even better results can be obtained by using a synthetic model image. This is described in<br />

section 2.3 on page 18.



Figure 5: Using image processing to create an ROI: a) extract bright regions; b) select the card; c) the<br />

logo forms the ROI; d) result of the matching.<br />

2.1.3 Using Image Processing to Create and Modify Regions<br />

In the previous sections, regions were created explicitly by specifying their shape parameters. Especially<br />

for complex ROIs this method can be inconvenient and time-consuming. In the following, we therefore<br />

show you how to extract and modify regions using image processing operators.<br />

Example 1: Determining the ROI Using Blob Analysis<br />

To follow the example actively, start the HDevelop program hdevelop\create_roi_via_vision.dev,<br />

which locates the <strong>MVTec</strong> logo on a pendulum (see figure 5);<br />

we start after the initialization of the application (press Run once). The main idea is to “zoom in” on the<br />

desired region in multiple steps: First, find the bright region corresponding to the card, then extract the<br />

dark characters on it.<br />

Step 1: Extract the bright regions<br />

threshold (ModelImage, BrightRegions, 200, 255)<br />

connection (BrightRegions, ConnectedRegions)<br />

fill_up (ConnectedRegions, FilledRegions)<br />

First, all bright regions are extracted using a simple thresholding operation (threshold); the operator<br />


connection forms connected components. The extracted regions are then filled up via fill_up; thus,<br />

the region corresponding to the card also encompasses the dark characters (see figure 5a).<br />

Step 2: Select the region of the card<br />

select_shape (FilledRegions, Card, ’area’, ’and’, 1800, 1900)<br />

The region corresponding to the card can be selected from the list of regions with the operator select_shape.<br />

In HDevelop, you can determine suitable features and values using the dialog Visualization<br />

⊲ Region Features; just click into a region, and the dialog immediately displays its feature<br />

values. Figure 5b shows the result of the operator.<br />

Step 3: Use the card as an ROI for the next steps<br />

reduce_domain (ModelImage, Card, ImageCard)<br />

Now, we can restrict the next image processing steps to the region of the card using the operator reduce_domain.<br />

This iterative focusing has an important advantage: In the restricted region of the card,<br />

the logo characters are much easier to extract than in the full image.<br />

Step 4: Extract the logo<br />

threshold (ImageCard, DarkRegions, 0, 230)<br />

connection (DarkRegions, ConnectedRegions)<br />

select_shape (ConnectedRegions, Characters, ’area’, ’and’, 150, 450)<br />

union1 (Characters, CharacterRegion)<br />

The logo characters are extracted similarly to the card itself; as a last step, the separate character regions<br />

are combined using the operator union1.<br />

Step 5: Enlarge the region using morphology<br />

dilation_circle (CharacterRegion, ROI, 1.5)<br />

reduce_domain (ModelImage, ROI, ImageROI)<br />

create_shape_model (ImageROI, ’auto’, 0, rad(360), ’auto’, ’none’,<br />

’use_polarity’, 30, 10, ModelID)<br />

Finally, the region corresponding to the logo is enlarged slightly using the operator dilation_circle.<br />

Figure 5c shows the resulting ROI, which is then used to create the shape model.<br />

Example 2: Further Processing the Result of inspect_shape_model<br />

You can also combine the interactive ROI specification with image processing. A useful method in the<br />

presence of clutter in the model image is to create a first model region interactively and then process<br />

this region to obtain an improved ROI. Figure 6 shows an example; the task is to locate the arrows. To<br />

follow the example actively, start the HDevelop program hdevelop\process_shape_model.dev; we<br />

start after the initialization of the application (press Run once).<br />

Step 1: Select the arrow<br />

gen_rectangle1 (ROI, 361, 131, 406, 171)<br />

First, an initial ROI is created around the arrow, without trying to exclude clutter (see figure 6a).


Figure 6: Processing the result of inspect_shape_model: a) interactive ROI; b) models for different values<br />
of Contrast (30, 90, and 134); c) processed model region and corresponding ROI and model; d) result of the<br />
search.<br />

Step 2: Create a first model region<br />

reduce_domain (ModelImage, ROI, ImageROI)<br />

inspect_shape_model (ImageROI, ShapeModelImage, ShapeModelRegion, 1, 30)<br />

Figure 6b shows the shape model regions that would be created for different values of the parameter<br />

Contrast. As you can see, you cannot remove the clutter without losing characteristic points of the<br />

arrow itself.<br />

Step 3: Process the model region<br />

fill_up (ShapeModelRegion, FilledModelRegion)<br />

opening_circle (FilledModelRegion, ROI, 3.5)<br />

You can solve this problem by exploiting the fact that the operator inspect_shape_model returns the<br />

shape model region; thus, you can process it like any other region. The main idea to get rid of the clutter<br />

is to use the morphological operator opening_circle, which eliminates small regions. Before this, the<br />


operator fill_up must be called to fill the inner part of the arrow, because only the boundary points are<br />

part of the (original) model region. Figure 6c shows the resulting region.<br />

Step 4: Create the final model<br />

reduce_domain (ModelImage, ROI, ImageROI)<br />

create_shape_model (ImageROI, 3, 0, rad(360), ’auto’, ’none’,<br />

’use_polarity’, 30, 10, ModelID)<br />

The processed region is then used to create the model; figure 6c shows the corresponding ROI and the<br />

final model region. Now, all arrows are located successfully.<br />

2.1.4 How the ROI Influences the Search<br />

<strong>Note</strong> that the ROI used when creating the model also influences the results of the subsequent matching:<br />

By default, the center point of the ROI acts as the so-called point of reference of the model for the<br />

estimated position, rotation, and scale. After creating a model, you can change its point of reference<br />

with the operator set_shape_model_origin. <strong>Note</strong> that this operator expects not the absolute position<br />

of the new reference point as parameters, but its distance to the default reference point. Please note that<br />

by modifying the point of reference, the accuracy of the estimated position may decrease (see section 3.4<br />

on page 26). You can query the reference point using the operator get_shape_model_origin.<br />
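For illustration, a minimal sketch (using the rectangle coordinates Row1, Column1, Row2, Column2 of a model ROI created with gen_rectangle1; the variable names are only examples):<br />
* offsets of the new reference point relative to the default one (the center of gravity of the ROI)<br />
RowOffset := Row1 - (Row1 + Row2) / 2.0<br />
ColumnOffset := Column1 - (Column1 + Column2) / 2.0<br />
* move the point of reference to the upper left corner of the model ROI<br />
set_shape_model_origin (ModelID, RowOffset, ColumnOffset)<br />
* read back the currently set offsets<br />
get_shape_model_origin (ModelID, OriginRow, OriginColumn)<br />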

The point of reference also influences the search itself: An object is only found if the point of reference<br />

lies within the image, or more exactly, within the domain of the image (see also section 3.1.1 on page<br />

20). Please note that this test is always performed for the original point of reference, i.e., the center point<br />

of the ROI, even if you modified the reference point using set_shape_model_origin.<br />

2.2 Which Information is Stored in the Model?<br />

As the name shape-based pattern matching suggests, objects are represented and recognized by their<br />

shape. There exist multiple ways to determine or describe the shape of an object. Here, the shape is<br />

extracted by selecting all those points whose contrast exceeds a certain threshold; typically, the points<br />

correspond to the contours of the object (see, e.g., figure 1 on page 4). Section 2.2.1 takes a closer look<br />

at the corresponding parameters.<br />

To speed up the matching process, a so-called image pyramid is created, consisting of the original, full-sized<br />

image and a set of downsampled images. The model is then created and searched on the different<br />

pyramid levels (see section 2.2.2 on page 14 for details).<br />

In the following, all parameters belong to the operator create_shape_model if not stated otherwise.<br />

2.2.1 Which Pixels are Part of the Model?<br />

For the model those pixels are selected whose contrast, i.e., gray value difference to neighboring pixels,<br />

exceeds a threshold specified by the parameter Contrast when calling create_shape_model. In order<br />

to obtain a suitable model the contrast should be chosen in such a way that the significant pixels of the<br />

object are included, i.e., those pixels that characterize it and allow it to be discriminated clearly from other<br />



Figure 7: Selecting significant pixels via Contrast: a) complete object but with clutter; b) no clutter but<br />

incomplete object; c) hysteresis threshold; d) minimum contour size.<br />

objects or from the background. Obviously, the model should not contain clutter, i.e., pixels that do not<br />

belong to the object.<br />

In some cases it is impossible to find a single value for Contrast that removes the clutter without also removing<br />
parts of the object. Figure 7 shows an example; the task is to create a model for the outer rim of a drill-hole:<br />

If the complete rim is selected, the model also contains clutter (figure 7a); if the clutter is removed,<br />

parts of the rim are missing (figure 7b).<br />

To solve such problems, the parameter Contrast provides two additional methods: hysteresis thresholding<br />

and selection of contour parts based on their size. Both methods are used by specifying a tuple of<br />

values for Contrast instead of a single value.<br />

Hysteresis thresholding (see also the operator hysteresis_threshold) uses two thresholds, a lower<br />

and an upper threshold. For the model, first pixels that have a contrast higher than the upper threshold<br />

are selected; then, pixels that have a contrast higher than the lower threshold and that are connected to a<br />

high-contrast pixel, either directly or via another pixel with contrast above the lower threshold, are added.<br />

This method enables you to select contour parts whose contrast varies from pixel to pixel. Returning to<br />

the example of the drill-hole: As you can see in figure 7c, with a hysteresis threshold you can create a<br />

model for the complete rim without clutter. The following line of code shows how to specify the two<br />

thresholds in a tuple:<br />

inspect_shape_model (ImageROI, ModelImages, ModelRegions, 1, [26,52])<br />

The second method to remove clutter is to specify a minimum size, i.e., number of pixels, for the contour<br />

components. Figure 7d shows the result for the example task. The minimum size must be specified in<br />


the third element of the tuple; if you don’t want to use a hysteresis threshold, set the first two elements<br />

to the same value:<br />

inspect_shape_model (ImageROI, ModelImages, ModelRegions, 1, [26,26,12])<br />

Alternative methods to remove clutter are to modify the ROI as described in section 2.1 on page 6 or<br />

create a synthetic model (see section 2.3 on page 18).<br />

You can let <strong>HALCON</strong> select suitable values itself by specifying the value ’auto’ for Contrast.<br />

You can query the used values via the operator determine_shape_model_params. If you want to<br />

specify some of the three contrast parameters and let <strong>HALCON</strong> determine the rest, please refer to the<br />

Reference Manual for detailed information.<br />

2.2.2 How Subsampling is Used to Speed Up the Search<br />

To speed up the matching process, a so-called image pyramid is created, both for the model image and<br />

for the search images. The pyramid consists of the original, full-sized image and a set of downsampled<br />

images. For example, if the original image (first pyramid level) is of the size 600x400, the second level<br />

image is of the size 300x200, the third level 150x100, and so on. The object is then searched first on the<br />

highest pyramid level, i.e., in the smallest image. The results of this fast search are then used to limit the<br />

search in the next pyramid image, whose results are used on the next lower level until the lowest level is<br />

reached. Using this iterative method, the search is both fast and accurate. Figure 8 depicts 4 levels of an<br />

example image pyramid together with the corresponding model regions.<br />

You can specify how many pyramid levels are used via the parameter NumLevels. We recommend<br />

to choose the highest pyramid level at which the model contains at least 10-15 pixels and in which<br />

the shape of the model still resembles the shape of the object. You can inspect the model image<br />

pyramid using the operator inspect_shape_model, e.g., as shown in the HDevelop program hdevelop\first_example_shape_matching.dev:<br />

inspect_shape_model (ImageROI, ShapeModelImages, ShapeModelRegions, 8, 30)<br />

area_center (ShapeModelRegions, AreaModelRegions, RowModelRegions,<br />

ColumnModelRegions)<br />

HeightPyramid := |ShapeModelRegions|<br />

for i := 1 to HeightPyramid by 1<br />

if (AreaModelRegions[i-1] >= 15)<br />

NumLevels := i<br />

endif<br />

endfor<br />

create_shape_model (ImageROI, NumLevels, 0, rad(360), ’auto’, ’none’,<br />

’use_polarity’, 30, 10, ModelID)<br />

After the call to the operator, the model regions on the selected pyramid levels are displayed in HDevelop’s<br />

Graphics Window; you can have a closer look at them using the online zooming (menu entry<br />

Visualization ⊲ Online Zooming). The code lines following the operator call loop through the<br />

pyramid and determine the highest level on which the model contains at least 15 points. This value is<br />

then used in the call to the operator create_shape_model.<br />

A much easier method is to let <strong>HALCON</strong> select a suitable value itself by specifying the value ’auto’<br />

for NumLevels. You can then query the used value via the operator get_shape_model_params.
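A sketch of this variant (the names and the order of the output parameters of get_shape_model_params are assumed to follow the Reference Manual entry; please check them for your <strong>HALCON</strong> version):<br />
create_shape_model (ImageROI, ’auto’, 0, rad(360), ’auto’, ’none’,<br />
                    ’use_polarity’, 30, 10, ModelID)<br />
* query the automatically chosen number of pyramid levels (and the other model parameters)<br />
get_shape_model_params (ModelID, NumLevels, AngleStart, AngleExtent,<br />
                        AngleStep, ScaleMin, ScaleMax, ScaleStep, Metric,<br />
                        MinContrast)<br />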



Figure 8: The image and the model region at four pyramid levels (original size and zoomed to equal size).<br />

The operator inspect_shape_model returns the pyramid images in the form of an image tuple (array); the<br />
individual images can be accessed like the model regions with the operator select_obj. Please note<br />
that object tuples start with the index 1, whereas control parameter tuples start with the index 0!<br />

You can enforce a further reduction of model points via the parameter Optimization. This may be<br />

useful to speed up the matching in the case of particularly large models. We recommend to specify<br />

the value ’auto’ to let <strong>HALCON</strong> select a suitable value itself. Please note that regardless of your<br />

selection all points passing the contrast criterion are displayed, i.e., you cannot check which points are<br />

part of the model.<br />


With an optional second value, you can specify whether the model is pregenerated completely for the<br />

allowed range of rotation and scale (see the following sections) or not. By default, the model is not<br />

pregenerated. You can pregenerate the model and thereby speed up the matching process by passing<br />

’pregeneration’ as the second value of Optimization. Alternatively, you can set this parameter via<br />

the operator set_system. <strong>Note</strong>, however, that if you allow large ranges of rotation and/or scaling, the<br />

memory requirements rise. Another effect is that the process of creating the model takes significantly<br />

more time.<br />

2.2.3 Allowing a Range of Orientation<br />

If the object’s rotation may vary in the search images you can specify the allowed range in the parameter<br />

AngleExtent and the starting angle of this range in the parameter AngleStart (unit: rad). <strong>Note</strong> that<br />

the range of rotation is defined relative to the model image, i.e., a starting angle of 0 corresponds to the<br />

orientation the object has in the model image. Therefore, to allow rotations up to ±5°, e.g., you should<br />

set the starting angle to -rad(5) and the angle extent to rad(10).<br />
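A minimal sketch of such a restricted model creation (all parameter values except the angle range are taken over from the first example):<br />
* allow rotations of +/- 5 degrees around the orientation in the model image<br />
create_shape_model (ImageROI, ’auto’, -rad(5), rad(10), ’auto’, ’none’,<br />
                    ’use_polarity’, 30, 10, ModelID)<br />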

We recommend to limit the allowed range of rotation as much as possible in order to speed up the<br />

search process. If you pregenerate the model (see page 16), a large range of rotation also leads to high<br />

memory requirements. <strong>Note</strong> that you can further limit the allowed range when calling the operator<br />

find_shape_model (see section 3.1.2 on page 21). If you want to reuse a model for different tasks<br />

requiring a different range of angles, you can therefore use a large range when creating the model and a<br />

smaller range for the search.<br />

If the object is (almost) symmetric you should limit the allowed range. Otherwise, the search process<br />

will find multiple, almost equally good matches on the same object at different angles; which match (at<br />

which angle) is returned as the best can therefore “jump” from image to image. The suitable range of<br />

rotation depends on the symmetry: For a cross-shaped or square object the allowed extent must be less<br />

than 90°, for a rectangular object less than 180°, and for a circular object 0°.<br />

During the matching process, the model is searched for in different angles within the allowed range, at<br />

steps specified with the parameter AngleStep. If you select the value ’auto’, <strong>HALCON</strong> automatically<br />

chooses an optimal step size φopt to obtain the highest possible accuracy by determining the smallest<br />

rotation that is still discernible in the image. The underlying algorithm is explained in figure 9: The<br />

rotated version of the cross-shaped object is clearly discernible from the original if the point that lies<br />

farthest from the center of the object is moved by at least 2 pixels. Therefore, the corresponding angle<br />

Figure 9: Determining the minimum angle step size from the extent of the model.<br />



φopt is calculated as follows:<br />

$d^2 = l^2 + l^2 - 2 \cdot l \cdot l \cdot \cos\varphi \;\;\Rightarrow\;\; \varphi_{\mathrm{opt}} = \arccos\!\left(1 - \frac{d^2}{2\,l^2}\right) = \arccos\!\left(1 - \frac{2}{l^2}\right)$<br />
with l being the maximum distance between the center and the object boundary and d = 2 pixels.<br />
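As a quick sanity check, this value can be computed directly in HDevelop; the value of L below is a purely hypothetical example:<br />
* maximum distance between the center and the object boundary (in pixels)<br />
L := 100<br />
* smallest discernible rotation; approx. 0.02 rad (about 1.15 degrees) for L = 100<br />
PhiOpt := acos(1 - 2.0 / (L * L))<br />
PhiOptDeg := deg(PhiOpt)<br />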

The automatically determined angle step size φopt is suitable for most applications; therefore, we recommend<br />

to select the value ’auto’. You can query the used value after the creation via the operator<br />

get_shape_model_params. By selecting a higher value you can speed up the search process, however,<br />

at the cost of a decreased accuracy of the estimated orientation. <strong>Note</strong> that for very high values the<br />

matching may fail altogether!<br />

The value chosen for AngleStep should not deviate too much from the optimal value<br />
(1/3·φ_opt ≤ φ ≤ 3·φ_opt). <strong>Note</strong> that choosing a very small step size does not result in an increased angle accuracy!<br />

2.2.4 Allowing a Range of Scale<br />

Similarly to the range of orientation, you can specify an allowed range of scale with the parameters<br />

ScaleMin, ScaleMax, and ScaleStep of the operator create_scaled_shape_model.<br />

Again, we recommend to limit the allowed range of scale as much as possible in order to speed up<br />

the search process. If you pregenerate the model (see page 16), a large range of scale also leads to<br />

high memory requirements. <strong>Note</strong> that you can further limit the allowed range when calling the operator<br />

find_scaled_shape_model (see section 3.1.2 on page 21).<br />

<strong>Note</strong> that if you are searching for the object on a large range of scales you should create the model based<br />

on a large scale because <strong>HALCON</strong> cannot “guess” model points when precomputing model instances at<br />

scales larger than the original one. On the other hand, NumLevels should be chosen such that the highest<br />

level contains enough model points also for the smallest scale.<br />

If you select the value ’auto’ for the parameter ScaleStep, <strong>HALCON</strong> automatically chooses a suitable<br />

step size to obtain the highest possible accuracy by determining the smallest scale change that is still<br />

discernible in the image. Similarly to the angle step size (see figure 9 on page 16), a scaled object is<br />

clearly discernible from the original if the point that lies farthest from the center of the object is moved<br />

by at least 2 pixels. Therefore, the corresponding scale change ∆sopt is calculated as follows:<br />

$\Delta s = \frac{d}{l} \;\;\Rightarrow\;\; \Delta s_{\mathrm{opt}} = \frac{2}{l}$<br />

with l being the maximum distance between the center and the object boundary and d = 2 pixels.<br />
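Again as an illustration (L is a hypothetical example value):<br />
* maximum distance between the center and the object boundary (in pixels)<br />
L := 100<br />
* smallest discernible scale change: 2 / L = 0.02, i.e., 2 percent for L = 100<br />
DeltaSOpt := 2.0 / L<br />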

The automatically determined scale step size is suitable for most applications; therefore, we recommend<br />

to select the value ’auto’. You can query the used value after the creation via the operator<br />

get_shape_model_params. By selecting a higher value you can speed up the search process, however,<br />

at the cost of a decreased accuracy of the estimated scale. <strong>Note</strong> that for very high values the matching<br />

may fail altogether!<br />

The value chosen for ScaleStep should not deviate too much from the optimal value<br />
(1/3·∆s_opt ≤ ∆s ≤ 3·∆s_opt). <strong>Note</strong> that choosing a very small step size does not result in an increased scale accuracy!<br />


2.2.5 Which Pixels are Compared with the Model?<br />

For efficiency reasons the model contains information that influences the search process: With the parameter<br />

MinContrast you can specify which contrast a point in a search image must at least have in<br />

order to be compared with the model. The main use of this parameter is to exclude noise, i.e., gray value<br />

fluctuations, from the matching process. You can determine the noise by examining the gray values with<br />

the HDevelop dialog Visualization ⊲ Pixel Info; then, set the minimum contrast to a value larger<br />

than the noise. Alternatively, you can let <strong>HALCON</strong> select suitable values itself by specifying the value<br />

’auto’ for MinContrast.<br />

The parameter Metric lets you specify whether the polarity, i.e., the direction of the contrast, must be<br />

observed. If you choose the value ’use_polarity’ the polarity is observed, i.e., the points in the<br />

search image must show the same direction of the contrast as the corresponding points in the model. If,<br />

for example, the model is a bright object on a dark background, the object is found in the search images<br />

only if it is also brighter than the background.<br />

You can choose to ignore the polarity globally by selecting the value ’ignore_global_polarity’. In<br />

this mode, an object is recognized also if the direction of its contrast reverses, e.g., if your object can<br />

appear both as a dark shape on a light background and vice versa. This flexibility, however, is obtained<br />

at the cost of a slightly lower recognition speed.<br />

If you select the value ’ignore_local_polarity’, the object is found even if the contrast changes<br />

locally. This mode can be useful, e.g., if the object consists of a part with a medium gray value, within<br />

which either darker or brighter sub-objects lie. Please note, however, that the recognition speed may<br />

decrease dramatically in this mode, especially if you allowed a large range of rotation (see section 2.2.3<br />

on page 16).<br />
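As a hedged sketch, a model that tolerates a globally inverted contrast could be created as follows (all other parameter values are taken over from the first example):<br />
create_shape_model (ImageROI, ’auto’, 0, rad(360), ’auto’, ’none’,<br />
                    ’ignore_global_polarity’, 30, 10, ModelID)<br />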

2.3 Synthetic Model Images<br />

Depending on the application it may be difficult to create a suitable model because there is no “good”<br />

model image containing a perfect, easy to extract instance of the object. An example of such a case was<br />

already shown in section 2.1.2 on page 7: The task of locating the capacitors seems to be simple at first,<br />

as they are prominent bright circles on a dark background. But because of the clutter inside and outside<br />

the circle, even the model resulting from the ring-shaped ROI is faulty: Besides containing clutter points,<br />
parts of the circle are also missing.<br />

In such cases, it may be better to use a synthetic model image. How to create such an image to locate<br />

the capacitors is explained below. To follow the example actively, start the HDevelop program hdevelop\synthetic_circle.dev;<br />

we start after the initialization of the application (press Run once).<br />

Step 1: Create an XLD contour<br />

RadiusCircle := 43<br />

SizeSynthImage := 2*RadiusCircle + 10<br />

gen_ellipse_contour_xld (Circle, SizeSynthImage / 2, SizeSynthImage / 2, 0,<br />

RadiusCircle, RadiusCircle, 0, 6.28318,<br />

’positive’, 1.5)<br />

First, we create a circular XLD contour using the operator gen_ellipse_contour_xld (see figure 10a). You<br />

can determine a suitable radius by inspecting the image with the HDevelop dialog Visualization ⊲



Figure 10: Locating the capacitors using a synthetic model: a) paint region into synthetic image; b) corresponding<br />

model; c) result of the search.<br />

Online Zooming. <strong>Note</strong> that the synthetic image should be larger than the region because pixels around<br />

the region are used when creating the image pyramid.<br />

Step 2: Create an image and insert the XLD contour<br />

gen_image_const (EmptyImage, ’byte’, SizeSynthImage, SizeSynthImage)<br />

paint_xld (Circle, EmptyImage, SyntheticModelImage, 128)<br />

Then, we create an empty image using the operator gen_image_const and insert the XLD contour with<br />

the operator paint_xld. In figure 10a the resulting image is depicted.<br />

Step 3: Create the model<br />

create_scaled_shape_model (SyntheticModelImage, ’auto’, 0, 0, 0.01, 0.8,<br />

1.2, ’auto’, ’none’, ’use_polarity’, 30, 10, ModelID)<br />

Now, the model is created from the synthetic image. Figure 10b shows the corresponding model region,<br />
figure 10c the search results.<br />

<strong>Note</strong> how the image itself, i.e., its domain, acts as the ROI in this example.<br />


3 Optimizing the Search Process<br />

The actual matching is performed by the operators find_shape_model, find_scaled_shape_model,<br />

find_shape_models, or find_scaled_shape_models. In the following, we show how to select suitable<br />

parameters for these operators to adapt and optimize the search for your matching task.<br />

3.1 Restricting the Search Space<br />

An important concept in the context of finding objects is that of the so-called search space. Quite<br />

literally, this term specifies where to search for the object. However, this space encompasses not only the<br />

2 dimensions of the image, but also other parameters like the possible range of scales and orientations or<br />

the question of how much of the object must be visible. The more you can restrict the search space, the<br />

faster the search will be.<br />

3.1.1 Searching in a Region of Interest<br />

The obvious way to restrict the search space is to apply the operator find_shape_model to a region of<br />

interest only instead of the whole image as shown in figure 11. This can be realized in a few lines of<br />

code:<br />

Step 1: Create a region of interest<br />

Row1 := 141<br />

Column1 := 163<br />

Row2 := 360<br />

Column2 := 477<br />

gen_rectangle1 (SearchROI, Row1, Column1, Row2, Column2)<br />

First, you create a region, e.g., with the operator gen_rectangle1 (see section 2.1.1 on page 7 for more<br />

ways to create regions).<br />

Figure 11: Searching in a region of interest.


Step 2: Restrict the search to the region of interest<br />


for i := 1 to 20 by 1<br />

grab_image (SearchImage, FGHandle)<br />

reduce_domain (SearchImage, SearchROI, SearchImageROI)<br />

find_shape_model (SearchImageROI, ModelID, 0, rad(360), 0.8, 1, 0.5,<br />

’interpolation’, 0, 0.9, RowCheck, ColumnCheck,<br />

AngleCheck, Score)<br />

endfor<br />

The region of interest is then applied to each search image using the operator reduce_domain. In this<br />

example, the searching speed is almost doubled using this method.<br />

<strong>Note</strong> that by restricting the search to a region of interest you actually restrict the position of the point of<br />

reference of the model, i.e., the center of gravity of the model ROI (see section 2.1.4 on page 12). This<br />

means that the size of the search ROI corresponds to the extent of the allowed movement; for example,<br />

if your object can move ± 10 pixels vertically and ± 15 pixels horizontally you can restrict the search<br />

to an ROI of the size 20×30. In order to assure a correct boundary treatment on higher pyramid levels,<br />

we recommend to enlarge the ROI by 2^(NumLevels−1) pixels; to continue the example, if you specified<br />

NumLevels = 4, you can restrict the search to an ROI of the size 36×46.<br />
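The following sketch builds such an enlarged search ROI around a known reference position; RefRow and RefColumn are hypothetical variables, and NumLevels = 4 is assumed:<br />
* allowed movement: +/- 10 pixels vertically, +/- 15 pixels horizontally<br />
* border of 2^(NumLevels-1) = 8 pixels for NumLevels = 4<br />
Border := 8<br />
gen_rectangle1 (SearchROI, RefRow - 10 - Border, RefColumn - 15 - Border,<br />
                RefRow + 10 + Border, RefColumn + 15 + Border)<br />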

Please note that even if you modify the point of reference using set_shape_model_origin, the original<br />

one, i.e., the center point of the model ROI, is used during the search. Thus, you must always specify the<br />

search ROI relative to the original reference point.<br />

3.1.2 Restricting the Range of Orientation and Scale<br />

When creating the model with the operator create_shape_model (or create_scaled_shape_model),<br />

you already specified the allowed range of orientation and scale (see<br />

section 2.2.3 on page 16 and section 2.2.4 on page 17). When calling the operator find_shape_model<br />

(or find_scaled_shape_model) you can further limit these ranges with the parameters AngleStart,<br />

AngleExtent, ScaleMin, and ScaleMax. This is useful if you can restrict these ranges by other<br />

information, which can, e.g., be obtained by suitable image processing operations.<br />

Another reason for using a larger range when creating the model may be that you want to reuse the model<br />

for other matching tasks.<br />
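For example, if other processing steps tell you that in the current image the object can only be rotated by about +/- 10 degrees and scaled between 0.9 and 1.1, the search could be narrowed as in the following sketch (the model itself may have been created with larger ranges; the remaining parameter values are only examples):<br />
find_scaled_shape_model (SearchImage, ModelID, -rad(10), rad(20), 0.9, 1.1,<br />
                         0.7, 1, 0.5, ’least_squares’, 0, 0.9, Row, Column,<br />
                         Angle, Scale, Score)<br />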

3.1.3 Visibility<br />

With the parameter MinScore you can specify how much of the object — more precisely: of the model<br />

— must be visible. A typical use of this mechanism is to allow a certain degree of occlusion as demonstrated<br />

in figure 12: The security ring is found if MinScore is set to 0.7.<br />

Let’s take a closer look at the term “visibility”: When comparing a part of a search image with the model,<br />

the matching process calculates the so-called score, which is a measure of how many model points could<br />

be matched to points in the search image (ranging from 0 to 1). A model point may be “invisible” and<br />

thus not matched because of multiple reasons:<br />



Figure 12: Searching for partly occluded objects: a) model of the security ring; b) search result for<br />

MinScore = 0.8; c) search result for MinScore = 0.7.<br />

• Parts of the object’s contour are occluded, e.g., as in figure 12.<br />

Please note that an object must not be clipped at the image border; this case is not treated as an<br />

occlusion! More precisely, the smallest rectangle surrounding the model must not be clipped.<br />

• Parts of the contour have a contrast lower than specified in the parameter MinContrast when<br />

creating the model (see section 2.2.5 on page 18).<br />

• The polarity of the contrast changes globally or locally (see section 2.2.5 on page 18).<br />

• If the object is deformed, parts of the contour may be visible but appear at an incorrect position<br />

and therefore do not fit the model anymore. <strong>Note</strong> that this effect also occurs if the camera observes the<br />

scene under an oblique angle; section 5.1 on page 46 shows how to handle this case.<br />

Besides these obvious reasons, which have their root in the search image, there are some not so obvious<br />

reasons caused by the matching process itself:<br />

• As described in section 2.2.3 on page 16, <strong>HALCON</strong> precomputes the model for intermediate angles<br />

within the allowed range of orientation. During the search, a candidate match is then compared<br />

to all precomputed model instances. If you select a value for the parameter AngleStep that is<br />

significantly larger than the automatically selected minimum value, the effect depicted in figure 13<br />

can occur: If the object lies between two precomputed angles, points lying far from the center are<br />

not matched to a model point, and therefore the score decreases.<br />

Of course, the same line of reasoning applies to the parameter ScaleStep (see section 2.2.4 on<br />

page 17).<br />

• Another stumbling block lies in the use of an image pyramid which was introduced in section 2.2.2<br />

on page 14: When comparing a candidate match with the model, the specified minimum score<br />

must be reached on each pyramid level. However, on different levels the score may vary, with<br />

only the score on the lowest level being returned in the parameter Score; this sometimes leads to<br />

the apparently paradoxical situation that MinScore must be set significantly lower than the resulting<br />

Score.<br />

Recommendation: The higher MinScore, the faster the search!


3.1.4 Thoroughness vs. Speed<br />

Figure 13: The effect of a large AngleStep on the matching (AngleStep = 20 vs. AngleStep = 30).<br />

With the parameter Greediness you can influence the search algorithm itself and thereby trade thoroughness<br />

against speed. If you select the value 0, the search is thorough, i.e., if the object is present<br />

(and within the allowed search space and reaching the minimum score), it will be found. In this mode,<br />

however, even very unlikely match candidates are also examined thoroughly, thereby slowing down the<br />

matching process considerably.<br />

The main idea behind the “greedy” search algorithm is to break off the comparison of a candidate with<br />

the model when it seems unlikely that the minimum score will be reached. In other words, the goal is<br />

not to waste time on hopeless candidates. This greediness, however, can have unwelcome consequences:<br />

In some cases a perfectly visible object is not found because the comparison “starts out on a wrong foot”<br />

and is therefore classified as a hopeless candidate and broken off.<br />

You can adjust the Greediness of the search, i.e., how early the comparison is broken off, by selecting<br />

values between 0 (no break off: thorough but slow) and 1 (earliest break off: fast but unsafe). <strong>Note</strong><br />

that the parameters Greediness and MinScore interact, i.e., you may have to specify a lower minimum<br />

score in order to use a greedier search. Generally, you can reach a higher speed with a high greediness<br />

and a sufficiently lowered minimum score.<br />
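A hedged sketch of both extremes (all other parameter values are taken over from the earlier examples):<br />
* thorough but slow: every candidate is examined completely<br />
find_shape_model (SearchImage, ModelID, 0, rad(360), 0.7, 1, 0.5,<br />
                  ’least_squares’, 0, 0.0, Row, Column, Angle, Score)<br />
* fast but less safe: unpromising candidates are broken off early;<br />
* the minimum score is lowered slightly to compensate<br />
find_shape_model (SearchImage, ModelID, 0, rad(360), 0.6, 1, 0.5,<br />
                  ’least_squares’, 0, 0.9, Row, Column, Angle, Score)<br />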

3.2 Searching for Multiple Instances of the Object<br />

All you have to do to search for more than one instance of the object is to set the parameter<br />
NumMatches accordingly. The operator find_shape_model (or find_scaled_shape_model) then returns<br />

the matching results as tuples in the parameters Row, Column, Angle, Scale, and Score. If you<br />

select the value 0, all matches are returned.<br />

<strong>Note</strong> that a search for multiple objects is only slightly slower than a search for a single object.<br />

A second parameter, MaxOverlap, lets you specify how much two matches may overlap (as a fraction).<br />

In figure 14b, e.g., the two security rings overlap by a factor of approximately 0.2. In order to speed up<br />

the matching as far as possible, however, the overlap is calculated not for the models themselves but for<br />

their smallest surrounding rectangle. This must be kept in mind when specifying the maximum overlap;<br />

in most cases, therefore, a larger value is needed (e.g., compare figure 14b and figure 14d).<br />
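A minimal sketch that returns all instances (NumMatches = 0) and loops over the resulting tuples (the parameter values are only examples):<br />
find_shape_model (SearchImage, ModelID, 0, rad(360), 0.7, 0, 0.5,<br />
                  ’least_squares’, 0, 0.9, Row, Column, Angle, Score)<br />
* Row, Column, Angle, and Score are tuples with one entry per found instance<br />
for i := 0 to |Score| - 1 by 1<br />
    * access the values of match i<br />
    RowI := Row[i]<br />
    ColumnI := Column[i]<br />
    AngleI := Angle[i]<br />
endfor<br />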



Figure 14: A closer look at overlapping matches: a) model of the security ring; b) model overlap; c)<br />

smallest rectangle surrounding the model; d) rectangle overlap; e) pathological case.<br />

Figure 14e shows a “pathological” case: Even though the rings themselves do not overlap, their surrounding<br />

rectangles do to a large degree. Unfortunately, this effect cannot be prevented.<br />

3.3 Searching for Multiple Models Simultaneously<br />

If you are searching for instances of multiple models in a single image, you can of course call the operator<br />

find_shape_model (or find_scaled_shape_model) multiple times. A much faster alternative is<br />

to use the operators find_shape_models or find_scaled_shape_models instead. These operators<br />

expect similar parameters, with the following differences:<br />

• With the parameter ModelIDs you can specify a tuple of model IDs instead of a single one. As<br />

when searching for multiple instances (see section 3.2 on page 23), the matching result parameters<br />

Row etc. return tuples of values.<br />

• The output parameter Model shows to which model each found instance belongs. <strong>Note</strong> that the<br />

parameter does not return the model IDs themselves but the index of the model ID in the tuple<br />

ModelIDs (starting with 0).<br />

• The search is always performed in a single image. However, you can restrict the search to a certain<br />

region for each model individually by passing an image tuple (see below for an example).<br />

• You can either use the same search parameters for each model by specifying single values for<br />

AngleStart etc., or pass a tuple containing individual values for each model.<br />



Figure 15: Searching for multiple models: a) models of ring and nut; b) search ROIs for the two models.<br />

• You can also search for multiple instances of multiple models. If you search for a certain number<br />

of objects independent of their type (model ID), specify this (single) value in the parameter<br />
NumMatches. By passing a tuple of values, you can specify for each model individually how many<br />

instances are to be found. In this tuple, you can mix concrete values with the value 0; the tuple<br />

[3,0], e.g., specifies to return the best 3 instances of the first model and all instances of the second<br />

model.<br />

Similarly, if you specify a single value for MaxOverlap, the operators check whether a found<br />

instance is overlapped by any of the other instances independent of their type. By specifying a<br />

tuple of values, each instance is only checked against all other instances of the same type.<br />

The example HDevelop program hdevelop\multiple_models.dev uses the operator<br />

find_scaled_shape_models to search simultaneously for the rings and nuts depicted in figure 15.<br />

Step 1: Create the models<br />

create_scaled_shape_model (ImageROIRing, ’auto’, -rad(22.5), rad(45),<br />

’auto’, 0.8, 1.2, ’auto’, ’none’,<br />

’use_polarity’, 60, 10, ModelIDRing)<br />

create_scaled_shape_model (ImageROINut, ’auto’, -rad(30), rad(60), ’auto’,<br />

0.6, 1.4, ’auto’, ’none’, ’use_polarity’, 60,<br />

10, ModelIDNut)<br />

ModelIDs := [ModelIDRing, ModelIDNut]<br />

First, two models are created, one for the rings and one for the nuts. The two model IDs are then<br />

concatenated into a tuple using the operator assign.<br />


Step 2: Specify individual search ROIs<br />

gen_rectangle1 (SearchROIRing, 110, 10, 130, Width - 10)<br />

gen_rectangle1 (SearchROINut, 315, 10, 335, Width - 10)<br />

SearchROIs := [SearchROIRing,SearchROINut]<br />

add_channels (SearchROIs, SearchImage, SearchImageReduced)<br />

In the example, the rings and nuts appear in non-overlapping parts of the search image; therefore, it is<br />

possible to restrict the search space for each model individually. As explained in section 3.1.1 on page<br />

20, a search ROI corresponds to the extent of the allowed movement; thus, narrow horizontal ROIs can<br />

be used in the example (see figure 15b).<br />

The two ROIs are concatenated into a region array (tuple) using the operator concat_obj and then<br />

“added” to the search image using the operator add_channels. The result of this operator is an array<br />

of two images, both having the same image matrix; the domain of the first image is restricted to the first<br />

ROI, the domain of the second image to the second ROI.<br />

Step 3: Find all instances of the two models<br />

find_scaled_shape_models (SearchImageReduced, ModelIDs, [-rad(22.5),<br />

-rad(30)], [rad(45), rad(60)], [0.8, 0.6], [1.2,<br />

1.4], 0.7, 0, 0, ’least_squares’, 0, 0.8,<br />

RowCheck, ColumnCheck, AngleCheck, ScaleCheck,<br />

Score, ModelIndex)<br />

Now, the operator find_scaled_shape_models is applied to the created image array. Because the<br />

two models allow different ranges of rotation and scaling, tuples are specified for the corresponding<br />

parameters. In contrast, the other parameters are valid for both models. Section 4.3.3 on page 35<br />

shows how to access the matching results.<br />
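To see which model a found instance belongs to, the index returned in ModelIndex can be compared with the position of the corresponding ID in the tuple ModelIDs, e.g., as in the following sketch:<br />
NumRings := 0<br />
NumNuts := 0<br />
for i := 0 to |Score| - 1 by 1<br />
    if (ModelIndex[i] = 0)<br />
        * instance i belongs to the ring model (ModelIDs[0])<br />
        NumRings := NumRings + 1<br />
    else<br />
        * instance i belongs to the nut model (ModelIDs[1])<br />
        NumNuts := NumNuts + 1<br />
    endif<br />
endfor<br />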

3.4 A Closer Look at the Accuracy<br />

During the matching process, candidate matches are compared with instances of the model at different<br />

positions, angles, and scales; for each instance, the resulting matching score is calculated. If you set<br />

the parameter SubPixel to ’none’, the result parameters Row, Column, Angle, and Scale contain<br />

the corresponding values of the best match. In this case, the accuracy of the position is therefore 1<br />

pixel, while the accuracy of the orientation and scale is equal to the values selected for the parameters<br />

AngleStep and ScaleStep, respectively, when creating the model (see section 2.2.3 on page 16 and<br />

section 2.2.4 on page 17).<br />

If you set the parameter SubPixel to ’interpolation’, <strong>HALCON</strong> examines the matching scores at<br />

the neighboring positions, angles, and scales around the best match and determines the maximum by<br />

interpolation. Using this method, the position is therefore estimated with subpixel accuracy (≈ 1/20 pixel<br />

in typical applications). The accuracy of the estimated orientation and scale depends on the size of the<br />

object, like the optimal values for the parameters AngleStep and ScaleStep (see section 2.2.3 on page<br />

16 and section 2.2.4 on page 17): The larger the size, the more accurately the orientation and scale can<br />

be determined. For example, if the maximum distance between the center and the boundary is 100 pixel,<br />
the orientation is typically determined with an accuracy of ≈ 1/10°.<br />



Figure 16: Effect of inaccuracy of the estimated orientation on a moved point of reference.<br />

Recommendation: Because the interpolation is very fast, you can set SubPixel to ’interpolation’<br />

in most applications.<br />

When you choose the values ’least_squares’, ’least_squares_high’, or<br />

’least_squares_very_high’, a least-squares adjustment is used instead of an interpolation,<br />

resulting in a higher accuracy. However, this method requires additional computation time.<br />

Please note that the accuracy of the estimated position may decrease if you modify the point of<br />

reference using set_shape_model_origin! This effect is visualized in figure 16: As you can see<br />

in the right-most column, an inaccuracy in the estimated orientation “moves” the modified point of<br />

reference, while the original point of reference is not affected. The resulting positional error depends<br />

on multiple factors, e.g., the offset of the reference point and the orientation of the found object. The<br />

main point to keep in mind is that the error increases linearly with the distance of the modified point of<br />

reference from the original one (compare the two rows in figure 16).<br />
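As a rough, purely illustrative estimate of the magnitude of this effect: for small angles, the positional error e caused by an orientation error Δϕ (in radians) at a point of reference that was moved by a distance d from the original one is approximately e ≈ d · Δϕ. With d = 200 pixel and Δϕ = 0.1◦ ≈ 0.0017 rad, the modified point of reference is therefore displaced by roughly 0.35 pixel.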

An inaccuracy in the estimated scale also results in an error in the estimated position, which again<br />

increases linearly with the distance between the modified and the original reference point.<br />

For maximum accuracy in case the reference point is moved, the position should be determined using the<br />

least-squares adjustment. <strong>Note</strong> that the accuracy of the estimated orientation and scale is not influenced<br />

by modifying the reference point.<br />
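A minimal sketch of this combination (assuming, as described in section 4.3.1, that the offset passed to set_shape_model_origin is specified relative to the default reference point; the values are only illustrative):

* Move the point of reference 50 pixels upwards relative to the default reference point:
set_shape_model_origin (ModelID, -50, 0)
* Use the least-squares adjustment in the search to keep the positional error small:
find_shape_model (SearchImage, ModelID, 0, rad(360), 0.7, 1, 0.5, ’least_squares’, 0, 0.9, Row, Column, Angle, Score)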


3.5 How to Optimize the Matching Speed<br />

If high memory requirements are no problem for your application, a simple way to speed up the matching<br />

is to pregenerate the model for the allowed range of rotation and scale as described on page 16.<br />

In the following, we show how to optimize the matching process in two steps. Please note that in order<br />

to optimize the matching it is very important to have a set of representative test images from your<br />

application in which the object appears in all allowed variations regarding its position, orientation,<br />

occlusion, and illumination.<br />

Step 1: Assure that all objects are found<br />

Before tuning the parameters for speed, we recommend finding settings such that the matching succeeds<br />

in all test images, i.e., that all object instances are found. If this is not the case when using the default<br />

values, check whether one of the following situations applies:<br />

? Is the object clipped at the image border?<br />

Unfortunately, this failure cannot be prevented, i.e., you must assure that the object is not clipped<br />

(see section 3.1.3 on page 21).<br />

? Is the search algorithm “too greedy”?<br />

As described in section 3.1.4 on page 23, in some cases a perfectly visible object is not found if<br />

the Greediness is too high. Select the value 0 to force a thorough search.<br />

? Is the object partly occluded?<br />

If the object should be recognized in this state nevertheless, reduce the parameter MinScore.<br />

? Does the matching fail on the highest pyramid level?<br />

As described in section 3.1.3 on page 21, in some cases the minimum score is not reached on<br />

the highest pyramid level even though the score on the lowest level is much higher. Test this by<br />

reducing NumLevels in the call to find_shape_model. Alternatively, reduce the MinScore.<br />

? Does the object have a low contrast?<br />

If the object should be recognized in this state nevertheless, reduce the parameter MinContrast<br />

(operator create_shape_model!).<br />

? Is the polarity of the contrast inverted globally or locally?<br />

If the object should be recognized in this state nevertheless, use the appropriate value for the parameter<br />

Metric when creating the model (see section 2.2.5 on page 18). If only a small part of the<br />

object is affected, it may be better to reduce the MinScore instead.<br />

? Does the object overlap another instance of the object?<br />

If the object should be recognized in this state nevertheless, increase the parameter MaxOverlap<br />

(see section 3.2 on page 23).<br />

? Are multiple matches found on the same object?<br />

If the object is almost symmetric, restrict the allowed range of rotation as described in section 2.2.3<br />

on page 16 or decrease the parameter MaxOverlap (see section 3.2 on page 23).


Step 2: Tune the parameters regarding speed<br />


The speed of the matching process depends both on the model and on the search parameters. To make<br />

matters more difficult, the search parameters depend on the chosen model parameters. We recommend<br />

the following procedure:<br />

• Increase the MinScore as far as possible, i.e., as long as the matching succeeds.<br />

• Now, increase the Greediness until the matching fails. Try reducing the MinScore; if this does<br />

not help, restore the previous values.<br />

• If possible, use a larger value for NumLevels when creating the model.<br />

• Restrict the allowed range of rotation and scale as far as possible as described in section 2.2.3 on<br />

page 16 and section 2.2.4 on page 17. Alternatively, adjust the corresponding parameters when<br />

calling find_shape_model or find_scaled_shape_model.<br />

• Restrict the search to a region of interest as described in section 3.1.1 on page 20.<br />

The following methods are more “risky”, i.e., the matching may fail if you choose unsuitable parameter<br />

values.<br />

• Increase the MinContrast as long as the matching succeeds.<br />

• If you are searching for a particularly large object, it sometimes helps to select a higher point reduction<br />

with the parameter Optimization (see section 2.2.2 on page 14).<br />

• Increase the AngleStep (and the ScaleStep) as long as the matching succeeds.<br />
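The following sketch combines several of these recommendations in a single search call; SearchROI is a hypothetical region of interest, and all numerical values are purely illustrative and must be verified with your own test images:

* Restrict the search to a region of interest:
reduce_domain (SearchImage, SearchROI, SearchImageReduced)
* Restricted angle range, increased MinScore, and increased Greediness:
find_shape_model (SearchImageReduced, ModelID, -rad(10), rad(20), 0.8, 1, 0.5, ’least_squares’, 0, 0.9, Row, Column, Angle, Score)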


4 Using the Results of Matching<br />

As results, the operators find_shape_model, find_scaled_shape_model etc. return<br />

• the position of the match in the parameters Row and Column,<br />

• its orientation in the parameter Angle,<br />

• the scaling factor in the parameter Scale, and<br />

• the matching score in the parameter Score.<br />

The matching score, which is a measure of the similarity between the model and the matched object, can<br />

be used “as it is”, since it is an absolute value.<br />

In contrast, the results regarding the position, orientation, and scale are worth a closer look as they are<br />

determined relative to the created model. Before this, we introduce <strong>HALCON</strong>’s powerful operators for<br />

the so-called affine transformations, which, when used together with the shape-based matching, enable<br />

you to easily realize applications like image rectification or the alignment of ROIs with a few lines of<br />

code.<br />

4.1 Introducing Affine Transformations<br />

“Affine transformation” is a technical term in mathematics describing a certain group of transformations.<br />

Figure 17 shows the types that occur in the context of the shape-based matching: An object can be<br />

translated (moved) along the two axes, rotated, and scaled. In figure 17d, all three transformations were<br />

applied in a sequence.<br />

<strong>Note</strong> that for the rotation and the scaling there exists a special point, called fixed point or point of reference.<br />

The transformation is performed around this point. In figure 17b, e.g., the IC is rotated around<br />

its center, in figure 17e around its upper right corner. The point is called fixed point because it remains<br />

unchanged by the transformation.<br />

The transformation can be thought of as a mathematical instruction that defines how to calculate the<br />

coordinates of object points after the transformation. Fortunately, you need not worry about the mathematical<br />

part; <strong>HALCON</strong> provides a set of operators that let you specify and apply transformations in a<br />

simple way.<br />
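For readers interested in the mathematical part, such an instruction can be written as a homogeneous 3×3 matrix H that maps a point p = (x, y, 1)^T to p’ = H · p. As a generic illustration (the exact sign conventions depend on the chosen coordinate system and are not meant to reproduce HALCON’s internal representation exactly), a scaling by s and a rotation by ϕ around the origin, followed by a translation by (tx, ty), corresponds to

H = | s·cos ϕ   −s·sin ϕ   tx |
    | s·sin ϕ    s·cos ϕ   ty |
    |   0          0        1 |

The operators presented in the next section construct exactly such homogeneous transformation matrices step by step.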

4.2 Creating and Applying Affine Transformations With <strong>HALCON</strong><br />

<strong>HALCON</strong> allows you to transform not only regions, but also images and XLD contours by providing the<br />

operators affine_trans_region, affine_trans_image, and affine_trans_contour_xld. The<br />

transformation in figure 17d corresponds to the line<br />

affine_trans_region (IC, TransformedIC, ScalingRotationTranslation,<br />

’false’)<br />

The parameter ScalingRotationTranslation is a so-called homogeneous transformation matrix that<br />

describes the desired transformation. You can create this matrix by adding simple transformations step<br />

by step. First, an identity matrix is created:


Figure 17: Typical affine transformations (coordinate axes: row/x and column/y): a) translation along two axes; b) rotation around the IC center; c) scaling around the IC center; d) combining a, b, and c; e) rotation around the upper right corner; f) scaling around the right IC center.<br />

hom_mat2d_identity (EmptyTransformation)<br />

Then, the scaling around the center of the IC is added:<br />

hom_mat2d_scale (EmptyTransformation, 0.5, 0.5, RowCenterIC,<br />

ColumnCenterIC, Scaling)<br />

Similarly, the rotation and the translation are added:<br />

hom_mat2d_rotate (Scaling, rad(90), RowCenterIC, ColumnCenterIC,<br />

ScalingRotation)<br />

hom_mat2d_translate (ScalingRotation, 100, 200, ScalingRotationTranslation)<br />

Please note that in these operators the coordinate axes are labeled with x and y instead of Row and<br />

Column! Figure 17a clarifies the relation.<br />

Transformation matrices can also be constructed by a sort of “reverse engineering”. In other words, if the<br />

result of the transformation is known for some points of the object, you can determine the corresponding<br />


transformation matrix. If, e.g., the position of the IC center and its orientation after the transformation are<br />

known, you can get the corresponding matrix via the operator vector_angle_to_rigid.<br />

vector_angle_to_rigid (RowCenterIC, ColumnCenterIC, 0,<br />

TransformedRowCenterIC, TransformedColumnCenterIC,<br />

rad(90), RotationTranslation)<br />

and then use this matrix to compute the transformed region:<br />

affine_trans_region (IC, TransformedIC, RotationTranslation, ’false’)<br />

4.3 Using the Estimated Position and Orientation<br />

Even if the task is just to check whether an object is present in the image, you will typically use the returned<br />

position and orientation: to display the found instance. This basic use is described in section 4.3.1.<br />

More advanced applications are to align ROIs for other inspection tasks, e.g., measuring (section 4.3.4<br />

on page 36), or to transform the search image so that the object is positioned as in the model image<br />

(section 4.3.5 on page 40). Section 4.4 on page 44 shows how to locate grasping points on nuts, which<br />

could then be passed on to a robot.<br />

There are two things to keep in mind about the position and orientation returned in the parameters Row,<br />

Column, and Angle: Most importantly, and contrary to expectation, the estimated position is not exactly<br />

the position of the point of reference but only close to it. Instead, it is optimized for creating the<br />

transformation matrix with which the applications described above can be realized.<br />

Secondly, in the model image the object is taken as not rotated, i.e., its angle is 0, even if it seems to be<br />

rotated, e.g., as in figure 18b.<br />

4.3.1 Displaying the Matches<br />

Especially during the development of a matching application it is useful to display the matching results<br />

overlaid on the search image. This can be realized in a few steps (see, e.g., the HDevelop program<br />

hdevelop\first_example_shape_matching.dev):<br />

Step 1: Access the XLD contour containing the model<br />

create_shape_model (ImageROI, NumLevels, 0, rad(360), ’auto’, ’none’,<br />

’use_polarity’, 30, 10, ModelID)<br />

get_shape_model_contours (ShapeModel, ModelID, 1)<br />

Below, we want to display the model at the extracted position and orientation. As shown in section 1<br />

on page 4, the corresponding region can be accessed via the operator inspect_shape_model. This is<br />

useful to display the model in the model image. However, for the search images we recommend using<br />

the XLD version of the model, because XLD contours can be transformed more precisely and quickly.<br />

You can access the XLD model by calling the operator get_shape_model_contours after creating the<br />

model. <strong>Note</strong> that the XLD model is located at the origin of the image, not at the position of the model<br />

in the model image.


Figure 18: The position and orientation of a match: a) The center of the ROI acts as the default point of reference; b) In the model image, the orientation is always 0.<br />

Step 2: Determine the affine transformation<br />

find_shape_model (SearchImage, ModelID, 0, rad(360), 0.7, 1, 0.5,<br />

’least_squares’, 0, 0.9, RowCheck, ColumnCheck,<br />

AngleCheck, Score)<br />

if (|Score| = 1)<br />

vector_angle_to_rigid (0, 0, 0, RowCheck, ColumnCheck, AngleCheck,<br />

MovementOfObject)<br />

After the call of the operator find_shape_model, the results are checked; if the matching failed, empty<br />

tuples are returned in the parameters Score etc. For a successful match, the corresponding affine transformation<br />

can be constructed with the operator vector_angle_to_rigid from the position and orientation<br />

of the match (see section 4.2 on page 30). In the first 2 parameters, you pass the “relative” position<br />

of the reference point, i.e., its distance to the default reference point (the center of gravity of the ROI, see<br />

section 2.1.4 on page 12). By default, this relative position is (0,0); section 4.3.4 on page 36 shows the<br />

values in the case of a modified reference point.<br />

Step 3: Transform the XLD<br />

affine_trans_contour_xld (ShapeModel, ModelAtNewPosition,<br />

MovementOfObject)<br />

dev_display (ModelAtNewPosition)<br />

Now, you can apply the transformation to the XLD version of the model using the operator<br />

affine_trans_contour_xld and display it; figure 2 on page 5 shows the result.<br />

Figure 19: Displaying multiple matches; the used model is depicted in figure 12a on page 22.<br />

4.3.2 Dealing with Multiple Matches<br />

If multiple instances of the object are searched and found, the parameters Row, Column, Angle, and<br />

Score contain tuples. The HDevelop program hdevelop\multiple_objects.dev shows how to access<br />

these results in a loop:<br />

Step 1: Determine the affine transformation<br />

find_shape_model (SearchImage, ModelID, 0, rad(360), 0.6, 0, 0.55,<br />

’least_squares’, 0, 0.8, RowCheck, ColumnCheck,<br />

AngleCheck, Score)<br />

for j := 0 to |Score| - 1 by 1<br />

vector_angle_to_rigid (0, 0, 0, RowCheck[j], ColumnCheck[j],<br />

AngleCheck[j], MovementOfObject)<br />

affine_trans_contour_xld (ShapeModel, ModelAtNewPosition, MovementOfObject)<br />

The transformation corresponding to the movement of the match is determined as in the previous section;<br />

the only difference is that the position of the match is extracted from the tuple via the loop variable.<br />

Step 2: Use the transformation<br />

affine_trans_pixel (MovementOfObject, -120, 0, RowArrowHead,<br />

ColumnArrowHead)<br />

disp_arrow (WindowHandle, RowCheck[j], ColumnCheck[j],<br />

RowArrowHead, ColumnArrowHead, 2)<br />

In this example, the transformation is also used to display an arrow that visualizes the orientation (see<br />

figure 19). For this, the position of the arrow head is transformed using affine_trans_pixel with the<br />

same transformation matrix as the XLD model.<br />

<strong>Note</strong> that you must use the operator affine_trans_pixel and not affine_trans_point_2d,<br />

because the latter uses a different image coordinate system than affine_trans_pixel,<br />

affine_trans_contour_xld, affine_trans_region, and affine_trans_image.


4.3.3 Dealing with Multiple Models<br />


When searching for multiple models simultaneously as described in section 3.3 on page 24, it is useful<br />

to store the information about the models, i.e., the XLD models, in tuples. The following example code<br />

stems from the already partly described HDevelop program hdevelop\multiple_models.dev, which<br />

uses the operator find_scaled_shape_models to search simultaneously for the rings and nuts depicted<br />

in figure 15 on page 25.<br />

Step 1: Access the XLD models<br />

create_scaled_shape_model (ImageROIRing, ’auto’, -rad(22.5), rad(45),<br />

’auto’, 0.8, 1.2, ’auto’, ’none’,<br />

’use_polarity’, 60, 10, ModelIDRing)<br />

get_shape_model_contours (ShapeModelRing, ModelIDRing, 1)<br />

create_scaled_shape_model (ImageROINut, ’auto’, -rad(30), rad(60), ’auto’,<br />

0.6, 1.4, ’auto’, ’none’, ’use_polarity’, 60, 10, ModelIDNut)<br />
inspect_shape_model (ImageROINut, PyramidImage, ModelRegionNut, 1, 30)<br />

As in the previous sections, the XLD contours corresponding to the two models are accessed with the<br />

operator get_shape_model_contours.<br />

Step 2: Save the information about the models in tuples<br />

get_shape_model_contours (ShapeModelNut, ModelIDNut, 1)<br />
count_obj (ShapeModelRing, NumContoursRing)<br />
count_obj (ShapeModelNut, NumContoursNut)<br />
ModelIDs := [ModelIDRing, ModelIDNut]<br />
concat_obj (ShapeModelRing, ShapeModelNut, ShapeModels)<br />
NumContoursInTuple := [NumContoursRing, NumContoursNut]<br />
StartContoursInTuple := [1, NumContoursRing+1]<br />

To facilitate the access to the shape models later, the XLD contours are saved in tuples in analogy to the<br />

model IDs (see section 3.3 on page 24). However, when concatenating XLD contours with the operator<br />

concat_obj, one must keep in mind that XLD objects are already tuples as they may consist of multiple<br />

contours! To access the contours belonging to a certain model, you therefore need the number of contours<br />

of a model and the starting index in the concatenated tuple. The former is determined using the operator<br />

count_obj; the contours of the ring start with the index 1, the contours of the nut with the index 1 plus<br />

the number of contours of the ring.<br />


Step 3: Access the found instances<br />

find_scaled_shape_models (SearchImageReduced, ModelIDs, [-rad(22.5),<br />

-rad(30)], [rad(45), rad(60)], [0.8, 0.6], [1.2,<br />

1.4], 0.7, 0, 0, ’least_squares’, 0, 0.8,<br />

RowCheck, ColumnCheck, AngleCheck, ScaleCheck,<br />

Score, ModelIndex)<br />

for i := 0 to |Score| - 1 by 1<br />

Model := ModelIndex[i]<br />

vector_angle_to_rigid (0, 0, 0, RowCheck[i], ColumnCheck[i],<br />

AngleCheck[i], MovementOfObject)<br />

hom_mat2d_scale (MovementOfObject, ScaleCheck[i], ScaleCheck[i],<br />

RowCheck[i], ColumnCheck[i], MoveAndScalingOfObject)<br />

copy_obj (ShapeModels, ShapeModel, StartContoursInTuple[Model],<br />

NumContoursInTuple[Model])<br />

affine_trans_contour_xld (ShapeModel, ModelAtNewPosition,<br />

MoveAndScalingOfObject)<br />

dev_display (ModelAtNewPosition)<br />

endfor<br />

As described in section 4.3.2 on page 34, in case of multiple matches the output parameters Row etc.<br />

contain tuples of values, which are typically accessed in a loop, using the loop variable as the index into<br />

the tuples. When searching for multiple models, a second index is involved: The output parameter Model<br />

indicates to which model a match belongs by storing the index of the corresponding model ID in the<br />

tuple of IDs specified in the parameter ModelIDs. This may sound confusing, but can be realized in an<br />

elegant way in the code: For each found instance, the model ID index is used to select the corresponding<br />

information from the tuples created above.<br />

As already noted, the XLD representing the model can consist of multiple contours; therefore, you cannot<br />

access them directly using the operator select_obj. Instead, the contours belonging to the model are<br />

selected via the operator copy_obj, specifying the start index of the model in the concatenated tuple<br />

and the number of contours as parameters. <strong>Note</strong> that copy_obj does not copy the contours, but only the<br />

corresponding <strong>HALCON</strong> objects, which can be thought of as references to the contours.<br />

4.3.4 Aligning Other ROIs<br />

The results of the matching can be used to align ROIs for other image processing steps, i.e., to position<br />

them relative to the image part acting as the model. This method is very useful, e.g., if the object to be<br />

inspected is allowed to move or if multiple instances of the object are to be inspected at once as in the<br />

example application described below.<br />

In the example application hdevelop\align_measurements.dev, the task is to inspect razor blades<br />

by measuring the width and the distance of their “teeth”. Figure 20a shows the model ROI, figure 20b<br />

the corresponding model region.


Figure 20: Aligning ROIs for inspecting parts of a razor: a) ROIs for the model; b) the model; c) measuring ROIs; d) inspection results with zoomed faults.<br />

The inspection task is realized with the following steps:<br />

Step 1: Position the measurement ROIs for the model blade<br />

Rect1Row := 244<br />

Rect1Col := 73<br />

DistColRect1Rect2 := 17<br />

Rect2Row := Rect1Row<br />

Rect2Col := Rect1Col + DistColRect1Rect2<br />

RectPhi := rad(90)<br />

RectLength1 := 122<br />

RectLength2 := 2<br />


First, two rectangular measurement ROIs are placed over the teeth of the razor blade acting as the model<br />

as shown in figure 20c. To be able to transform them later along with the XLD model, they are moved<br />

to lie on the XLD model, whose reference point is the origin of the image. <strong>Note</strong> that before moving the<br />

regions the clipping must be switched off.<br />

area_center (ModelROI, Area, CenterROIRow, CenterROIColumn)<br />

get_system (’clip_region’, OriginalClipRegion)<br />

set_system (’clip_region’, ’false’)<br />

move_region (MeasureROI1, MeasureROI1Ref, - CenterROIRow,<br />

- CenterROIColumn)<br />

move_region (MeasureROI2, MeasureROI2Ref, - CenterROIRow,<br />

- CenterROIColumn)<br />

set_system (’clip_region’, OriginalClipRegion)<br />

DistRect1CenterRow := Rect1Row - CenterROIRow<br />

DistRect1CenterCol := Rect1Col - CenterROIColumn<br />

DistRect2CenterRow := Rect2Row - CenterROIRow<br />

DistRect2CenterCol := Rect2Col - CenterROIColumn<br />

Step 2: Find all razor blades<br />

find_shape_model (SearchImage, ModelID, 0, 0, 0.8, 0, 0.5, ’least_squares’,<br />

0, 0.7, RowCheck, ColumnCheck, AngleCheck, Score)<br />

Then, all instances of the model object are searched for in the image.<br />

Step 3: Determine the affine transformation<br />

for i := 0 to |Score|-1 by 1<br />

vector_angle_to_rigid (0, 0, 0, RowCheck[i], ColumnCheck[i],<br />

AngleCheck[i], MovementOfObject)<br />

affine_trans_contour_xld (ShapeModel, ModelAtNewPosition,<br />

MovementOfObject)<br />

For each razor blade, the transformation representing its position and orientation is calculated.<br />

Step 4: Create measurement objects at the corresponding positions<br />

affine_trans_pixel (MovementOfObject, DistRect1CenterRow,<br />

DistRect1CenterCol, Rect1RowCheck,<br />

Rect1ColCheck)<br />

affine_trans_pixel (MovementOfObject, DistRect2CenterRow,<br />

DistRect2CenterCol, Rect2RowCheck,<br />

Rect2ColCheck)<br />

Now, the new positions of the measure ROIs are calculated using the operator affine_trans_pixel<br />

with the moved ROI coordinates. As remarked in section 4.3.2 on page 34, you must use<br />

affine_trans_pixel and not affine_trans_point_2d. Then, the new measure objects are created.


RectPhiCheck := RectPhi + AngleCheck[i]<br />

gen_measure_rectangle2 (Rect1RowCheck, Rect1ColCheck,<br />

RectPhiCheck, RectLength1, RectLength2,<br />

Width, Height, ’bilinear’,<br />

MeasureHandle1)<br />

gen_measure_rectangle2 (Rect2RowCheck, Rect2ColCheck,<br />

RectPhiCheck, RectLength1, RectLength2,<br />

Width, Height, ’bilinear’,<br />

MeasureHandle2)<br />


In the example application, the individual razor blades are only translated but not rotated relative to the<br />

model position. Instead of applying the full affine transformation to the measure ROIs and then creating<br />

new measure objects, one can therefore use the operator translate_measure to translate the measure<br />

objects themselves. The example program contains the corresponding code; you can switch between the<br />

two methods by modifying a variable at the top of the program.<br />
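A rough sketch of this translation-only variant (not the exact code of the example program; it assumes that the measure objects were created once at the reference position of the model blade and that translate_measure expects the new position of the measure object):

* Translate the existing measure objects to the position of the found blade instead of creating new ones:
translate_measure (MeasureHandle1, Rect1RowCheck, Rect1ColCheck, MeasureHandle1Translated)
translate_measure (MeasureHandle2, Rect2RowCheck, Rect2ColCheck, MeasureHandle2Translated)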

Step 5: Measure the width and the distance of the “teeth”<br />

measure_pairs (SearchImage, MeasureHandle1, 2, 25, ’negative’,<br />

’all’, RowEdge11, ColEdge11, Amp11, RowEdge21,<br />

ColEdge21, Amp21, Width1, Distance1)<br />

measure_pairs (SearchImage, MeasureHandle2, 2, 25, ’negative’,<br />

’all’, RowEdge12, ColEdge12, Amp12, RowEdge22,<br />

ColEdge22, Amp22, Width2, Distance2)<br />

Now, the actual measurements are performed using the operator measure_pairs.<br />

Step 6: Inspect the measurements<br />

NumberTeeth1 := |Width1|<br />

if (NumberTeeth1 < 37)<br />

for j := 0 to NumberTeeth1 - 2 by 1<br />

if (Distance1[j] > 4.0)<br />

RowFault := round(0.5*(RowEdge11[j+1] + RowEdge21[j]))<br />

ColFault := round(0.5*(ColEdge11[j+1] + ColEdge21[j]))<br />

disp_rectangle2 (WindowHandle, RowFault, ColFault, 0,<br />

4, 4)<br />

Finally, the measurements are inspected. If a “tooth” is too short or missing completely, no edges are<br />

extracted at this point resulting in an incorrect number of extracted edge pairs. In this case, the faulty<br />

position can be determined by checking the distance of the teeth. Figure 20d shows the inspection results<br />

for the example.<br />

Please note that the example program is not able to display the fault if it occurs at the first or the last<br />

tooth.<br />


4.3.5 Rectifying the Search Results<br />

In the previous section, the matching results were used to determine the so-called forward transformation,<br />

i.e., how objects are transformed from the model into the search image. Using this transformation,<br />

ROIs specified in the model image can be positioned correctly in the search image.<br />

You can also determine the inverse transformation which transforms objects from the search image back<br />

into the model image. With this transformation, you can rectify the search image (or parts of it), i.e.,<br />

transform it such that the matched object is positioned as it was in the model image. This method is<br />

useful if the following image processing step is not invariant against rotation, e.g., OCR or the variation<br />

model. <strong>Note</strong> that image rectification can also be useful before applying shape-based matching, e.g., if<br />

the camera observes the scene under an oblique angle; see section 5.1 on page 46 for more information.<br />

The inverse transformation can be determined and applied in a few steps, which are described below; in<br />

the corresponding example application of the HDevelop program hdevelop\rectify_results.dev<br />

the task is to extract the serial number on CD covers (see figure 21).<br />

Step 1: Calculate the inverse transformation<br />

vector_angle_to_rigid (CenterROIRow, CenterROIColumn, 0, RowCheck,<br />

ColumnCheck, AngleCheck, MovementOfObject)<br />

hom_mat2d_invert (MovementOfObject, InverseMovementOfObject)<br />

You can invert a transformation easily using the operator hom_mat2d_invert. <strong>Note</strong> that in contrast to<br />

the previous sections, the transformation is calculated based on the absolute coordinates of the reference<br />

point, because here we want to transform the results such that they appear as in the model image.<br />

Step 2: Rectify the search image<br />

affine_trans_image (SearchImage, RectifiedSearchImage,<br />

InverseMovementOfObject, ’constant’, ’false’)<br />

Now, you can apply the inverse transformation to the search image using the operator<br />

affine_trans_image. Figure 21d shows the resulting rectified image of a different CD; undefined<br />

pixels are marked in grey.<br />

Step 3: Extract the numbers<br />

reduce_domain (RectifiedSearchImage, NumberROI,<br />

RectifiedNumberROIImage)<br />

threshold (RectifiedNumberROIImage, Numbers, 0, 128)<br />

connection (Numbers, IndividualNumbers)<br />

Now, the serial number is positioned correctly within the original ROI and can be extracted without<br />

problems. Figure 21e shows the result, which could then, e.g., be used as the input for OCR.


Figure 21: Rectifying the search results: a) ROIs for the model and for the number extraction; b) the model; c) number ROI at matched position; d) rectified search image (only relevant part shown); e) extracted numbers.<br />


Figure 22: Rectifying only part of the search image: a) smallest image part containing the ROI; b) cropped search image; c) result of the rectification; d) rectified image reduced to the original number ROI.<br />

Unfortunately, the operator affine_trans_image transforms the full image even if you restrict its<br />

domain with the operator reduce_domain. In a time-critical application it may therefore be necessary<br />

to crop the search image before transforming it. The corresponding steps are visualized in figure 22.<br />

Step 1: Crop the search image<br />

affine_trans_region (NumberROI, NumberROIAtNewPosition,<br />

MovementOfObject, ’false’)<br />

smallest_rectangle1 (NumberROIAtNewPosition, Row1, Column1, Row2,<br />

Column2)<br />

crop_rectangle1 (SearchImage, CroppedNumberROIImage, Row1, Column1,<br />

Row2, Column2)<br />

First, the smallest axis-parallel rectangle surrounding the transformed number ROI is computed using<br />

the operator smallest_rectangle1, and the search image is cropped to this part. Figure 22b shows<br />

the resulting image overlaid on a grey rectangle to facilitate the comparison with the subsequent images.


Step 2: Create an extended affine transformation<br />

hom_mat2d_translate (MovementOfObject, - Row1, - Column1,<br />

MoveAndCrop)<br />

hom_mat2d_invert (MoveAndCrop, InverseMoveAndCrop)<br />


In fact, the cropping can be interpreted as an additional affine transformation: a translation by the<br />

negated coordinates of the upper left corner of the cropping rectangle (see figure 22a). We therefore<br />

“add” this transformation to the transformation describing the movement of the object using<br />

the operator hom_mat2d_translate, and then invert this extended transformation with the operator<br />

hom_mat2d_invert.<br />

Step 3: Transform the cropped image<br />

affine_trans_image (CroppedNumberROIImage, RectifiedROIImage,<br />

InverseMoveAndCrop, ’constant’, ’true’)<br />

reduce_domain (RectifiedROIImage, NumberROI,<br />

RectifiedNumberROIImage)<br />

Using the inverted extended transformation, the cropped image can easily be rectified with the operator<br />

affine_trans_image (figure 22c) and then be reduced to the original number ROI (figure 22d) in order<br />

to extract the numbers.<br />


Figure 23: The center of the ROI acts as the point of reference for the scaling (model image with Scale = 1, search image with Scale = 0.5).<br />
Figure 24: Determining grasping points on nuts of varying sizes: a) ring-shaped ROI; b) model; c) grasping points defined on the model nut; d) results of the matching.<br />

4.4 Using the Estimated Scale<br />

Similarly to the rotation (compare section 4.3 on page 32), the scaling is performed around the center<br />

of the ROI – if you didn’t use set_shape_model_origin, that is. This is depicted in figure 23a using the<br />

example of an ROI whose center does not coincide with the center of the IC.<br />

The estimated scale, which is returned in the parameter Scale, can be used similarly to the position<br />

and orientation. However, there is no convenience operator like vector_angle_to_rigid<br />

that creates an affine transformation including the scale; therefore, the scaling must be added separately.<br />

How to achieve this is explained below; in the corresponding example HDevelop program hdevelop\multiple_scales.dev,<br />

the task is to find nuts of varying sizes and to determine suitable points<br />

for grasping them (see figure 24).


Step 1: Specify grasping points<br />

RowUpperPoint := 284<br />

ColUpperPoint := 278<br />

RowLowerPoint := 362<br />

ColLowerPoint := 278<br />


In the example program, the grasping points are specified directly in the model image; they are marked<br />

with arrows in figure 24c. To be able to transform them together with the XLD model, their coordinates<br />

must be moved so that they lie on the XLD model:<br />

area_center (ModelROI, Area, CenterROIRow, CenterROIColumn)<br />

RowUpperPointRef := RowUpperPoint - CenterROIRow<br />

ColUpperPointRef := ColUpperPoint - CenterROIColumn<br />

RowLowerPointRef := RowLowerPoint - CenterROIRow<br />

ColLowerPointRef := ColLowerPoint - CenterROIColumn<br />

Step 2: Determine the complete transformation<br />

find_scaled_shape_model (SearchImage, ModelID, -rad(30), rad(60), 0.6, 1.4,<br />

0.9, 0, 0, ’least_squares’, 0, 0.8, RowCheck,<br />

ColumnCheck, AngleCheck, ScaleCheck, Score)<br />

for i := 0 to |Score| - 1 by 1<br />

vector_angle_to_rigid (0, 0, 0, RowCheck[i], ColumnCheck[i],<br />

AngleCheck[i], MovementOfObject)<br />

hom_mat2d_scale (MovementOfObject, ScaleCheck[i], ScaleCheck[i],<br />

RowCheck[i], ColumnCheck[i], MoveAndScalingOfObject)<br />

affine_trans_contour_xld (ShapeModel, ModelAtNewPosition,<br />

MoveAndScalingOfObject)<br />

After the matching, first the translational and rotational part of the transformation is determined with the<br />

operator vector_angle_to_rigid as in the previous sections. Then, the scaling is added using the<br />

operator hom_mat2d_scale. <strong>Note</strong> that the position of the match is used as the point of reference; this<br />

becomes necessary because the scaling is performed “after” the translation and rotation. The resulting,<br />

complete transformation can be used as before to display the model at the position of the matches.<br />

Step 3: Calculate the transformed grasping points<br />

affine_trans_pixel (MoveAndScalingOfObject, RowUpperPointRef,<br />

ColUpperPointRef, RowUpperPointCheck,<br />

ColUpperPointCheck)<br />

affine_trans_pixel (MoveAndScalingOfObject, RowLowerPointRef,<br />

ColLowerPointRef, RowLowerPointCheck,<br />

ColLowerPointCheck)<br />

Of course, the affine transformation can also be applied to other points in the model image with the<br />

operator affine_trans_pixel. In the example, this is used to calculate the position of the grasping<br />

points for all nuts; they are marked with arrows in figure 24d.<br />

As noted in section 4.3.2 on page 34, you must use affine_trans_pixel and not<br />

affine_trans_point_2d.<br />


5 Miscellaneous<br />

5.1 Adapting to a Changed Camera Orientation<br />

As shown in the sections above, <strong>HALCON</strong>’s shape-based matching allows you to localize objects even if their<br />

position and orientation in the image or their scale changes. However, the shape-based matching fails<br />

if the camera observes the scene under an oblique angle, i.e., if it is not pointed perpendicularly at the<br />

plane in which the objects move, because an object then appears distorted due to perspective projection;<br />

even worse, the distortion changes with the position and orientation of the object.<br />

In such a case we recommend rectifying the images before applying the matching. This is a three-step<br />

process: First, you must calibrate the camera, i.e., determine its position and orientation and other<br />

parameters, using the operator camera_calibration. Secondly, the calibration data is used to create a<br />

mapping function via the operator gen_image_to_world_plane_map, which is then applied to images<br />

with the operator map_image. For detailed information please refer to the <strong>Application</strong> <strong>Note</strong> on 3D<br />

Machine Vision, section 3.3 on page 49.<br />
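A rough sketch of the last two steps is shown below; CamParam and WorldPose are assumed to stem from the preceding camera calibration, the scale and interpolation values are only placeholders, and the exact parameters are described in the <strong>Application</strong> <strong>Note</strong> on 3D Machine Vision:

* Create the rectification map once, then apply it to every image:
get_image_size (Image, Width, Height)
gen_image_to_world_plane_map (RectificationMap, CamParam, WorldPose, Width, Height, Width, Height, 0.0001, ’bilinear’)
map_image (Image, RectificationMap, RectifiedImage)
* The rectified image can then be used both for creating the model and for the search.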

5.2 Reusing Models<br />

If you want to reuse created models in other <strong>HALCON</strong> applications, all you need to do is to store the<br />

relevant information in files and then read it again. The following example code stems from the HDevelop<br />

program hdevelop\reuse_model.dev. First, a model is created:<br />

create_scaled_shape_model (ImageROI, ’auto’, -rad(30), rad(60), ’auto’,<br />

0.6, 1.4, ’auto’, ’none’, ’use_polarity’, 60,<br />

10, ModelID)<br />

Then, the model is stored in a file using the operator write_shape_model. With the model, <strong>HALCON</strong><br />

automatically saves the XLD contour, the reference point, and the parameters that were used in the call<br />

to create_shape_model.<br />

write_shape_model (ModelID, ModelFile)<br />

In the example program, all shape models are cleared to represent the start of another application.<br />

The model, the XLD contour, and the reference point are now read from the files using the<br />

operator read_shape_model. Then, the XLD contours and the reference point are accessed using<br />
get_shape_model_contours and get_shape_model_origin, respectively. Furthermore, the parameters<br />

used to create the model are accessed with the operator get_shape_model_params:<br />

read_shape_model (ModelFile, ReusedModelID)<br />

get_shape_model_contours (ReusedShapeModel, ReusedModelID, 1)<br />

get_shape_model_origin (ReusedModelID, ReusedRefPointRow,<br />

ReusedRefPointCol)<br />

get_shape_model_params (ReusedModelID, NumLevels, AngleStart, AngleExtent,<br />

AngleStep, ScaleMin, ScaleMax, ScaleStep, Metric,<br />

MinContrast)<br />

Now, the model can be used as if it had been created in the application itself:


find_scaled_shape_model (SearchImage, ReusedModelID, AngleStart,<br />

AngleExtent, ScaleMin, ScaleMax, 0.9, 0, 0,<br />

’least_squares’, 0, 0.8, RowCheck, ColumnCheck,<br />

AngleCheck, ScaleCheck, Score)<br />

for i := 0 to |Score| - 1 by 1<br />

vector_angle_to_rigid (ReusedRefPointRow, ReusedRefPointCol, 0,<br />

RowCheck[i], ColumnCheck[i], AngleCheck[i],<br />

MovementOfObject)<br />

hom_mat2d_scale (MovementOfObject, ScaleCheck[i], ScaleCheck[i],<br />

RowCheck[i], ColumnCheck[i], MoveAndScalingOfObject)<br />

affine_trans_contour_xld (ReusedShapeModel, ModelAtNewPosition,<br />

MoveAndScalingOfObject)<br />

dev_display (ModelAtNewPosition)<br />

endfor<br />



Provided Functionality<br />

<strong>HALCON</strong> <strong>Application</strong> <strong>Note</strong><br />

Finding and Decoding 2D Data Codes<br />

⊲ Finding and decoding 2D data codes of type Data Matrix ECC 200, QR Code, or PDF417 with one<br />

operator call<br />

Typical <strong>Application</strong>s<br />

⊲ Identification of individuals (Drivers’ licenses, ID cards, etc.)<br />

⊲ Identification and labeling of objects (Medicine, Production, etc.)<br />

⊲ Electronic Data Interchange (EDI)<br />

⊲ Transport and logistics<br />

Involved Operators<br />

create_data_code_2d_model<br />

query_data_code_2d_params<br />

get_data_code_2d_param<br />

set_data_code_2d_param<br />

find_data_code_2d<br />

get_data_code_2d_objects, get_data_code_2d_results<br />
read_data_code_2d_model, write_data_code_2d_model<br />
clear_all_data_code_2d_models, clear_data_code_2d_model<br />

Copyright © 2006-2008 by <strong>MVTec</strong> <strong>Software</strong> <strong>GmbH</strong>, München, Germany


Overview<br />

2D data codes are used in various application areas and are becoming more and more important. This <strong>Application</strong><br />
<strong>Note</strong> guides you through the handling of 2D data codes using the operators of <strong>HALCON</strong>.<br />

In section 1 on page 4 we introduce you to 2D data codes in general, including a description of the<br />

different symbol types supported by <strong>HALCON</strong>, namely PDF417, Data Matrix ECC 200, and QR Code.<br />

A first example in section 2 on page 5 shows the main steps needed to read a standard 2D data code. To<br />

read non-standard 2D data codes as well or to enhance the run time, section 3 on page 7 describes the<br />

different ways to change the 2D data code model, which is used to guide the search process of the 2D<br />

data code reader.<br />

Although the 2D data code reader of <strong>HALCON</strong> is rather powerful, there are some symbol representations<br />

that cannot be decoded for various reasons. Some problems can be solved by using image preprocessing<br />

methods. Section 4 on page 20 shows a selection of these problems and describes the corresponding<br />

preprocessing steps. A deeper insight into the handling of problems is given in section 5 on page 24.<br />

There, an approach for debugging the search process is provided. This can be used on the one hand to<br />

locate specific defects of symbols that are not decoded, and on the other hand to get information about<br />

successfully decoded symbols, so that the run time can be reduced by a better model adaptation. Some<br />

problems strictly have to be avoided already during the image acquisition. Besides introducing these<br />
problems, section 5 also summarizes the requirements and limitations concerning the appearance of the<br />
symbols for the individual symbol types.<br />

Unless specified otherwise, the HDevelop example programs that are presented in this <strong>Application</strong><br />

<strong>Note</strong> can be found in the subdirectory 2d_data_codes of the directory <strong>HALCON</strong>ROOT<br />

\examples\application_guide.<br />

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in<br />

any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without prior written<br />

permission of the publisher.<br />

Edition 1 April 2006 (<strong>HALCON</strong> 7.1.1)<br />

Edition 1a December 2006 (<strong>HALCON</strong> 7.1.2)<br />

Microsoft, Windows, Windows NT, Windows 2000, and Windows XP are either trademarks or registered trademarks<br />

of Microsoft Corporation.<br />

All other nationally and internationally recognized trademarks and tradenames are hereby recognized.<br />

More information about <strong>HALCON</strong> can be found at:<br />

http://www.halcon.com/


Contents<br />

1 Introduction to 2D Data Codes 4<br />
2 A First Example 5<br />
3 Model Adaptation 7<br />
3.1 Global Parameter Settings 8<br />
3.2 Training 9<br />
3.3 Specific Parameter Settings 11<br />
3.4 Miscellaneous 18<br />
4 Preprocessing Difficult Images 20<br />
4.1 Slanted Symbol 21<br />
4.2 Small Module Size 21<br />
4.3 Large Module Gaps 23<br />
4.4 Noise 23<br />
5 Problem Handling 24<br />
5.1 Data Access for Debugging 26<br />
5.2 Selected Problems and Tips to Avoid Them 37<br />
5.3 Requirements and Limitations 39<br />



1 Introduction to 2D Data Codes<br />

<strong>HALCON</strong> provides means to read 2D data codes of type Portable Data Format 417 (PDF417), Data<br />

Matrix ECC 200, and QR Code. 2D data codes, which are also called 2D bar codes or 2D symbologies,<br />

are used in various areas. Similar to <strong>1D</strong> bar codes, they encode characters and numbers in graphical<br />

symbols that are constructed by dark and light bars or dots that are called modules. <strong>1D</strong> bar codes use<br />

black bars and the spaces in between as modules. As the individual bars or spaces have a constant width<br />

along their height, you can read a <strong>1D</strong> bar code in a single scanning line along the symbol’s width. In<br />

contrast to <strong>1D</strong> bar codes, for the symbols of 2D data codes changes occur along both directions. Thus,<br />

the same information can be encoded in smaller symbols. The actual size of a symbol, i.e., the number<br />

of modules in both directions, mainly depends on the length of the encoded message and the level of the<br />

applied error correction. The latter is needed in order to completely decode a symbol even when it has<br />

small defects, e.g., if some of the modules are not visible. There are different types of 2D data codes.<br />

Two common types are the so-called stacked codes and matrix codes.<br />

Of the stacked codes, <strong>HALCON</strong> supports the PDF417 (see figure 1). Its symbol is built up by several <strong>1D</strong><br />

bar codes, which are arranged in rows and columns. Each <strong>1D</strong> bar code encodes an individual ’codeword’.<br />

According to the name of the symbol type, each codeword consists of four dark as well as four light bars<br />

(spaces) and is built up by 17 modules. It always starts with a dark and ends with a light bar. The number<br />

of rows and columns of a PDF417 symbol is variable in a range of 3 to 90 rows and 1 to 30 columns.<br />

Start and stop patterns frame the symbol on the left and right border. The first and the last columns of<br />

codewords are called left and right row indicators. These codewords provide important information for<br />

the decoding, like the number of rows and columns, the error correction level etc. Around the symbol’s<br />

border, a homogeneous frame is placed, which is called quiet zone. For small symbols, a variant of<br />

PDF417 exists, where no right row indicator exists and the stop pattern is reduced to a one module wide<br />

bar. It is called compact or truncated PDF417. <strong>HALCON</strong> operators can read both conventional and<br />

truncated PDF417.<br />

Figure 1: Stacked code of type PDF417 (start and stop patterns, left and right row indicators, and the data encoded in 2 columns and 8 rows); each codeword is composed of 4 dark and 4 light bars, built up by 17 modules.<br />

Matrix codes use graphical patterns. They consist of three components: a so-called finder element or<br />

finder pattern, which is needed to find the symbol and its orientation in an image, the data patterns,<br />

which consist of binary modules grouped relative to the finder pattern, and a quiet zone, similar to the<br />

one needed for PDF417 symbols. Matrix codes supported by <strong>HALCON</strong> are Data Matrix ECC 200 and<br />

QR Code (see figure 2). For both matrix code types, the foreground and background modules typically



are square or rectangular, but also circular foreground modules occur. The modules are ordered in rows<br />

and columns. The number of rows and columns define the size of a symbol, which for a QR Code is<br />

directly linked to its version number (the higher the version number, the bigger the symbol). The finder<br />

element of a Data Matrix ECC 200 consists of an L-shape, and alternating dark and light modules on the<br />

opposite borders. The finder element of a QR Code consists of three squares, which are called ’position<br />

detection patterns’.<br />

Figure 2: Types of matrix codes (finder patterns are marked in gray): (left) Data Matrix ECC 200 and<br />

(right) QR Code.<br />

Whereas stacked codes can also be read row by row by a <strong>1D</strong> bar code reader, matrix codes can only<br />

be decoded by inspecting images, i.e., a camera is needed. The <strong>HALCON</strong> operators for reading 2D<br />

data codes assume images as the source for matrix codes as well as for stacked codes. A basic goal<br />

of <strong>HALCON</strong> is that all 2D data code operators can be applied as easily as possible. Every 2D data code<br />

operator can be applied to all supported symbol types; only the assigned parameters vary. Additionally,<br />

finding, reading, and decoding of a symbol can be done with a single operator call. The following<br />

chapters show when and how to apply the different 2D data code operators and give suggestions on how to<br />

handle common problems, e.g., when facing irregular symbols or images of bad quality.<br />

2 A First Example<br />

This section shows basic steps for the 2D data code reading. To follow the example actively, start the<br />

HDevelop program hdevelop\2d_data_codes_first_example.dev, which reads symbols of type<br />

’Data Matrix ECC 200’ in different images; the steps described below start after the initialization of<br />

the application (press Run once to reach this point).<br />

Step 1: Specify the 2D data code model<br />

create_data_code_2d_model (’Data Matrix ECC 200’, [], [], DataCodeHandle)<br />

First, the operator create_data_code_2d_model specifies the SymbolType that is to be read. Supported<br />

types are ’PDF417’, ’Data Matrix ECC 200’, and ’QR Code’. Here, ’Data Matrix ECC<br />

200’ is chosen. As a result, the operator returns a handle to access the created 2D data code model.<br />
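The models for the other supported symbol types are created analogously, as in the following sketch (the handle names are arbitrary and not part of the example program):

create_data_code_2d_model (’QR Code’, [], [], DataCodeHandleQR)
create_data_code_2d_model (’PDF417’, [], [], DataCodeHandlePDF417)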


Step 2: Find, read, and decode the symbol<br />

find_data_code_2d (Image, SymbolXLDs, DataCodeHandle, [], [],<br />

ResultHandles, DecodedDataStrings)<br />

Now, the images are read and for each image the operator find_data_code_2d searches for symbols<br />

of the specified type and, if found, reads and decodes them. The input to the operator is the image<br />

and the data code handle. If you work with images that contain more than one symbol per image, you<br />

additionally must add the general parameter ’stop_after_result_num’ with the number of expected<br />

symbols. In this example, the number of symbols per image is 1, which is the default value. Because no<br />

additional parameters are set in this example, two empty tuples [] are passed in the operator call. The<br />

operator find_data_code_2d returns the XLD contour representing the border of the found symbol,<br />

a result handle for further investigations concerning the search process, as well as a string containing<br />

the encoded message(s). In the program, the XLD contour and the string are visualized mainly by the<br />

operators set_tposition, write_string, and dev_display, respectively (see figure 3).<br />

dev_display (Image)<br />

dev_set_color (’white’)<br />

gen_rectangle1 (Rectangle, 10, 10, 40, Width-70)<br />

dev_set_color (’black’)<br />

set_tposition (WindowHandle, 17, 15)<br />

write_string (WindowHandle, DecodedDataStrings)<br />

dev_set_color (’yellow’)<br />

dev_display (SymbolXLDs)<br />

Figure 3: Visualization of the result: surrounding XLD contour of the symbol and decoded string.<br />

If the string exceeds 1024 characters, only 1020 characters followed by ’...’ are displayed. In this case,<br />

the whole encoded data can be accessed as ASCII code, i.e., as a tuple of numbers representing the<br />

individual characters, by the operator get_data_code_2d_results, which will be discussed in detail<br />

in section 5.1.2 on page 29.
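A sketch of such an access might look as follows; note that the result name ’decoded_data’ is an assumption here, and the result names that are actually available are listed in section 5.1.2:

* Access the complete decoded message of the first result as a tuple of ASCII codes:
get_data_code_2d_results (DataCodeHandle, ResultHandles[0], ’decoded_data’, DecodedDataASCII)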


Step 3: Clear the 2D data code model<br />

clear_data_code_2d_model (DataCodeHandle)<br />


The operators create_data_code_2d_model and find_data_code_2d allocate memory. To reset the<br />

2D data code model and explicitly free the model memory, the operator clear_data_code_2d_model<br />

is called.<br />

When running the program, the applied operators assume default values for the parameters used by the<br />

specified 2D data code model. As these default values are chosen according to a certain standard, not<br />

all symbols can be found and decoded. The next chapter describes how the parameters can be adapted to<br />

better suit a specific application.<br />

3 Model Adaptation<br />

To be able to find a symbol in an image, the operator find_data_code_2d needs a set of parameters.<br />

In our first HDevelop program hdevelop\2d_data_codes_first_example.dev we did not explicitly<br />

set any parameters besides the SymbolType we were looking for. Thus, for all parameters needed for the<br />

chosen type, default values were used. These work for symbols fulfilling the following requirements:<br />

• The code must be printed dark on light,<br />

• the contrast value must be bigger than 30,<br />

• the sizes of symbol and modules are in a certain range (which depends on the selected symbol<br />

type),<br />

• there is no or only a small gap between neighboring modules of matrix codes (for PDF417 no gap<br />

is allowed),<br />

• for QR Codes, additionally all three position detection patterns must be visible.<br />

In many cases, you face symbols that do not fulfill all the requirements and therefore are not found.<br />

In this case, and also if you want to reduce the run time of your application, you have to modify the<br />

parameters in order to adapt the 2D data code model to your specific set of images. <strong>HALCON</strong> provides<br />

three different methods to modify the parameters. They are described in detail in the following sections.<br />

Additionally, tips concerning the run time and the storing of the 2D data code model are given. In<br />

particular, the next sections show how to<br />

• adjust the model to read a wider range of symbols (see section 3.1),<br />

• train the model automatically with a set of representative images (see section 3.2),<br />

• optimize the model by setting specific parameters manually (see section 3.3),<br />

• reduce the run time and save the modified model into a file (see section 3.4).<br />


3.1 Global Parameter Settings<br />

There are two predefined sets of parameters for the 2D data code model. The first one is used in the<br />

already mentioned standard mode, which is chosen by default when no other parameter settings are<br />

applied. It uses a restricted range of values for each parameter, is rather fast, and works fine for many,<br />

but not all, 2D data codes. The second one is used in the enhanced mode. There, a large range of<br />
common values is checked for each parameter. Therefore, it is not as fast as the standard mode, but now almost<br />

all readable symbols can be read. In particular, using this mode<br />

• the symbols may also appear light on dark,<br />

• the contrast can be lower (≥10),<br />

• the size of the modules can be smaller,<br />

• a bigger gap between neighboring modules is allowed for matrix codes,<br />

• for Data Matrix ECC 200 the symbol may be slanted up to 0.5235 (30 degrees)<br />

• and the size of the individual modules may vary in a specific range,<br />

• for a QR Code only two position detection patterns must be visible.<br />

Both modes can be set using the operator set_data_code_2d_param, although for the standard mode<br />

this is only necessary to reset a 2D data code model after using another 2D data code model. The HDevelop<br />

program hdevelop\2d_data_codes_global_settings.dev is similar to the first example, but<br />

now you can switch between standard and enhanced mode by (un)commenting the corresponding lines.<br />

enhanced := 1
* enhanced := 0
if (enhanced=1)
    set_data_code_2d_param (DataCodeHandle, 'default_parameters', 'enhanced_recognition')
else
    set_data_code_2d_param (DataCodeHandle, 'default_parameters', 'standard_recognition')
endif

Now, when running the program in enhanced mode, the symbols that were not found in standard mode are correctly decoded. Switching from standard mode to enhanced mode is the easiest way to find symbols that do not fulfill the requirements of the standard mode. However, because more alternatives have to be checked for each parameter, the time needed for the search process increases, especially when no code is found at all. The following code lines are used to measure the time needed to run the operator find_data_code_2d. Thus, you can compare the run time needed for reading with both global parameter sets. The result is displayed by a procedure called write_message. For example, in figure 4 the run time needed in enhanced mode is approximately twice the time needed in standard mode.


dev_update_var ('off')
count_seconds (T1)
find_data_code_2d (Image, SymbolXLDs, DataCodeHandle, [], [], ResultHandles, DecodedDataStrings)
count_seconds (T2)
dev_update_var ('on')
write_message (WindowHandle, 30, -1, 'Time = ' + (1000 * (T2-T1))$'.1f' + 'ms', true)

Figure 4: Run time required on an Intel Pentium 4 platform (2.4 GHz CPU): 13.6 ms in standard mode and 26.5 ms in enhanced mode.

To decrease the run time you have to adapt the parameters to your specific images. This can be done either by an automatic training (see next section) or by setting specific parameters manually (see section 3.3). Note that these methods can mainly be used to reduce the run time. If your symbols cannot be decoded in enhanced mode, apart from a few exceptions a further adaptation of the model will not work either. In such a case, you should enhance the quality of your images. This can be done either during the image acquisition (pay attention, e.g., to the lighting conditions, see section 5.2 on page 37), which is recommended, or by a preprocessing (see section 4 on page 20).

3.2 Training

If you have a set of symbols you want to find, read, and decode, in most cases the individual symbols have similar attributes. This similarity can be used to train your 2D data code model, i.e., you use a subset of your symbols to automatically obtain an individual set of parameters suited best for your specific application. After the training, you use this parameter set to find the symbols in your remaining images. By this means, you can find the symbols even if they do not fulfill the requirements of the standard mode. But in contrast to the enhanced mode, in most cases more restricted parameter values are checked and therefore the search process becomes faster.


3.2.1 Train the Model

In the HDevelop program hdevelop\2d_data_codes_training.dev, the 2D data code model created by create_data_code_2d_model is trained with two different images. For training, the operator find_data_code_2d, i.e., the same operator as for reading a 2D data code, is applied. This time, the additional parameter 'train' is assigned. Its value determines the group of parameters that will be affected by the training. Here, we choose 'all', i.e., all available model parameters are trained. If you want to train only specific groups like 'symbol_size', 'module_size', or 'contrast', you can combine them using tuples. Inside a tuple, you can also choose the value 'all' and exclude specific groups from the training by using their name with a preceding '~'. A complete list of valid values for the parameter 'train' is provided in the description of the operator find_data_code_2d in the Reference Manual. If you work with images that contain more than one symbol, you can use all symbols for training by setting the additional parameter 'stop_after_result_num' to the correct number of expected symbols (see the sketch following this paragraph). If you want to use only specific symbols, you can reduce the domain to a region of interest (ROI) containing only the specific symbols. For a short introduction to ROIs see section 3.4.1. Extensive descriptions can be found in the Quick Guide, section 3.2 on page 27.
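The following lines are a sketch (not part of the example program) of how such a training call could look for images that are assumed to contain two symbols each:

* Train with all symbols of the image; the count 2 is an assumed number of symbols (sketch)
find_data_code_2d (Image, SymbolXLDs, DataCodeHandle, ['train', 'stop_after_result_num'], ['all', 2], ResultHandles, DecodedDataStrings)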

The operator find_data_code_2d is applied to each training image individually. The training of the first image restricts the model to the parameter values needed to find, read, and decode the symbol of that specific image. All following training images are used to extend the restricted model again, so that the resulting model suits the whole set of images. If all symbols have the same size, no gaps between the modules, foreground and background modules of the same size, no distinct texture in the background, and a similar contrast, one image is sufficient for the training. But for most applications, several training images are recommended. For a satisfying training result, you should use the images that differ most. Here, we choose two images with different contrast values (see figure 5).

create_data_code_2d_model ('ECC200', [], [], DataCodeHandle)
* -> dark image
read_image (Image, 'datacode/ecc200/ecc200_cpu_007')
find_data_code_2d (Image, SymbolXLDs, DataCodeHandle, 'train', 'all', ResultHandles, DecodedDataStrings)
* -> light image
read_image (Image, 'datacode/ecc200/ecc200_cpu_008')
find_data_code_2d (Image, SymbolXLDs, DataCodeHandle, 'train', 'all', ResultHandles, DecodedDataStrings)

The training leads to a new 2D data code model, which is now used to find, read, and decode the symbols in the remaining images. To do this, the operator find_data_code_2d is applied again, this time without the parameter 'train'. At the end of the program the operator clear_data_code_2d_model is called to reset the model and free the allocated memory.

for i := 7 to 16 by 1
    read_image (Image, 'datacode/ecc200/ecc200_cpu_0' + (round(i)$'.2'))
    find_data_code_2d (Image, SymbolXLDs, DataCodeHandle, [], [], ResultHandles, DecodedDataStrings)
endfor
clear_data_code_2d_model (DataCodeHandle)


Figure 5: Images with different contrast values for training the 2D data code model.

3.2.2 Inspect the Changes

In the program hdevelop\2d_data_codes_training.dev, the training is framed by additional code lines, which are not necessary for the training or the following reading of the symbols, but help to understand how the training changes the 2D data code model. Before the training, the operator query_data_code_2d_params gets a list of parameters valid for symbols of the type specified in the 2D data code handle. In this case all model parameters that can be set for symbols of type 'Data Matrix ECC 200' are queried. The current values of these parameters are obtained by the operator get_data_code_2d_param.

query_data_code_2d_params (DataCodeHandle, 'get_model_params', GenParamNames)
get_data_code_2d_param (DataCodeHandle, GenParamNames, ModelBeforeTraining)

After the training, the parameter values are checked again and the changes between the untrained and the trained model, stored in the variable 'ModelAdaptation', are displayed (see figure 6).

get_data_code_2d_param (DataCodeHandle, GenParamNames, ModelAfterTraining)
ModelAdaptation := GenParamNames + ': ' + ModelBeforeTraining + ' -> ' + ModelAfterTraining
dev_inspect_ctrl (ModelAdaptation)

Figure 6: Displaying changes of parameter values.

In some cases, e.g., when you want to read additional symbols that are not similar to the symbols used for the training, you have to adapt selected parameters manually. The next section shows how to apply individual modifications and goes deeper into specific groups of parameters available for the 2D data code operators provided by HALCON.

3.3 Specific Parameter Settings

The third and most complex way to modify a 2D data code model is to change specific parameters manually. The HDevelop program hdevelop\2d_data_codes_manual_settings.dev shows how to modify parameters and introduces several operators that provide a deeper insight into a specific 2D data code model. The operator query_data_code_2d_params, as already mentioned in section 3.2.2, queries lists of parameters valid for a specific symbol type. Here, again the list of the available model parameters GenParamNames is queried (for further lists see section 5.1.1 on page 27). Their current values, in this case the default values of the standard mode, are obtained using the operator get_data_code_2d_param.

query_data_code_2d_params (DataCodeHandle, 'get_model_params', GenParamNames)
get_data_code_2d_param (DataCodeHandle, GenParamNames, GenParamValues)

The modification of a model can be done either within the operator create_data_code_2d_model or by the operator set_data_code_2d_param, as already introduced in section 3.1. The latter operator can be called several times and therefore is used if several tuples of parameters that cannot be combined in the same operator call have to be modified (a sketch of both variants follows below). Instead of setting all parameters at once by specifying a global parameter set, we now modify the individual parameters separately. The following sections go deeper into the specific groups of model parameters, particularly concerning

• the shape and size of the symbols (see section 3.3.1),
• the appearance of the modules, e.g., their contrast or size (see section 3.3.2),
• the general behavior of the 2D data code model, i.e., the strictness of the symbol search and the storing of intermediate results (see section 3.3.3).

For the first two groups a concise overview of the value ranges specific to each individual symbol type is provided in section 5.3.2 on page 40. The complete list of parameter names is provided in the description of the operator set_data_code_2d_param in the Reference Manual.
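As a sketch of the two alternatives (the parameter values are examples taken from the following sections, not from the example program):

* Variant 1: set parameters already when creating the model (sketch)
create_data_code_2d_model ('Data Matrix ECC 200', ['polarity', 'contrast_min'], ['light_on_dark', 10], DataCodeHandle)
* Variant 2: modify an existing model with separate calls (sketch)
set_data_code_2d_param (DataCodeHandle, 'polarity', 'light_on_dark')
set_data_code_2d_param (DataCodeHandle, 'contrast_min', 10)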


3.3.1 Shape and Size of the Symbols

One group of parameters used for a 2D data code model is related to the size, e.g., the number of rows and columns, but also to the shape or type of a symbol.

Symbol Shape (only Data Matrix ECC 200)

For Data Matrix ECC 200, the parameter 'symbol_shape' specifies the shape of the symbol. If it is set to 'rectangle', the number of rows and columns differs. If it is set to 'square', the number of rows and columns is equal. If the value 'any' is passed, both shapes are searched for. Since HALCON 7.1.1, the parameter 'symbol_shape' can still be queried, but the search algorithm is now the same for both shapes. Thus, this parameter no longer influences the symbol search.

Symbol Size

Parameters related to the symbol size set the minimum and maximum number of rows and columns allowed for a searched symbol. For matrix codes (Data Matrix ECC 200 and QR Code), rows and columns correspond to modules, whereas for the stacked code (PDF417), they correspond to codewords (excluding the codewords of the start and stop patterns as well as those of the left and right row indicators). QR Codes are always square and therefore have the same number of rows and columns. For them, instead of setting the number explicitly, it can also be set implicitly by specifying the version number. For Data Matrix ECC 200 and PDF417 the number of rows and columns may differ.

In the example program, the shape and size of the symbol are the first attributes to be modified. The variable window shows the default values of the parameters queried by get_data_code_2d_param before applying the modification. By default, the Data Matrix ECC 200 symbol may have any shape and its size can lie between 8 and 144 rows and between 10 and 144 columns (see figure 7). The parameter 'symbol_size' assumes a square symbol and therefore expects a single value for the number of rows and columns. So, by setting it to 18, we speed up the search process by restricting the model to square shaped symbols with 18 rows and 18 columns.

set_data_code_2d_param (DataCodeHandle, 'symbol_size', 18)

Figure 7: Examples for symbol sizes of Data Matrix ECC 200 (12x12, 18x18, 24x24). The available range is 10x10 to 144x144 (square) and 8x18 to 16x48 (rectangular).

Model Type (only QR Code)

Two types of QR Code are differentiated: the old Model 1 and the new Model 2 (see figure 8). For both model types the smallest symbol consists of 21 rows and columns and is specified as Version 1. The largest symbol consists of 73 rows and columns (Version 14) for Model 1 and 177 rows and columns (Version 40) for Model 2. In addition to the position detection patterns, symbols of Version 2 or larger contain extension patterns for Model 1 and alignment patterns for Model 2. When working with QR Codes, the model should be restricted to the correct model by setting 'model_type' to 1 or 2. If the parameter is set to 'any', both types are searched for.
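For example, if all of your symbols are known to be of Model 2, the search could be restricted as sketched below (assuming a 2D data code model created for the symbol type 'QR Code'):

* Restrict the QR Code search to Model 2 (sketch; requires a QR Code model)
set_data_code_2d_param (DataCodeHandle, 'model_type', 2)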

Figure 8: QR Codes: (left) Model 1, 21x21 to 73x73 (Version 1-14), with extension patterns (except Version 1); (right) Model 2, 21x21 to 177x177 (Version 1-40), with alignment patterns (except Version 1).

3.3.2 Appearance of the Modules

The modules of 2D data code symbols can differ significantly in their general appearance. In the program, the settings for some of the main attributes are modified.

Polarity

The parameter 'polarity' determines if the foreground modules are darker or brighter than the background (see figure 9) or if both polarities are checked. By default, the value 'dark_on_light' is set. If it is set to 'any', both polarities are checked. In the program we change it to 'light_on_dark' to adapt it to our specific symbols.

set_data_code_2d_param (DataCodeHandle, 'polarity', 'light_on_dark')

Mirrored

Especially when reading symbols on transparent surfaces, it may occur that the symbol's representation is mirrored (see figure 10). By default, the parameter 'mirrored' is set to 'any', i.e., mirrored and non-mirrored symbols are searched for. If you have only mirrored symbols you can restrict the search by setting the parameter to 'yes'. Here, none of our symbols are mirrored, so we restrict the search by setting the parameter to 'no'.

set_data_code_2d_param (DataCodeHandle, 'mirrored', 'no')


Figure 9: Symbols of different polarity: (left) dark on light and (right) light on dark.

Figure 10: Alignment of rows and columns: (left) non-mirrored and (right) mirrored symbol.

Minimum Contrast

The contrast corresponds to the difference between the gray values of the foreground and the background of a symbol, but also depends on the gradient of the edges. For blurred images, the contrast must be lower than the gray value difference. In standard mode, the minimum contrast is set to 30. Here, the contrast in some of our images is rather low. Therefore, we set the parameter 'contrast_min' to the default value of the enhanced mode, i.e., a value of 10. See figure 11 for symbols of different contrast.

set_data_code_2d_param (DataCodeHandle, 'contrast_min', 10)

Module Size

Parameters related to the module size restrict the size of a module in pixels to speed up the search process. This is useful if the modules of all symbols are of similar size. For both matrix codes, the size is described by the width and height of a module, whereas for PDF417 codes, it is described by the module width and the module aspect ratio, which is the module height divided by the module width (see figure 12). The default values for the different ranges depend on the chosen symbol type. For symbols of type 'Data Matrix ECC 200', the default range in standard mode is 6 to 20 pixels. Because we have rather small modules in our images, we restrict the range to 4 to 7 pixels.

set_data_code_2d_param (DataCodeHandle, ['module_size_min', 'module_size_max'], [4,7])

Figure 11: Symbols of different contrast: (left) 10 and (right) 34.

Figure 12: Module aspect ratio for PDF417 (module aspect = module height / module width).

Module Gap (only Matrix Codes)

In contrast to PDF417 codes, the modules of a matrix code are not necessarily connected. If a gap between the modules exists, you can specify the range of its size for both the x- and y-direction ('small' or 'big'). In standard mode no or only a small gap (≤ 10% of the module size) is allowed. In the example program, we have no gaps in any direction of our symbols, so we restrict the parameter 'module_gap' to 'no'. Figure 13 illustrates the different gap sizes.

set_data_code_2d_param (DataCodeHandle, 'module_gap', 'no')

Figure 13: Size of module gaps: a) no gaps, b) small gaps, c) big gaps.


Maximum Slant (only Data Matrix ECC 200)

For Data Matrix ECC 200, the L-shaped finder pattern is assumed to be right-angled, but a certain slant of the symbol can be coped with (see figure 14). In standard mode the maximum slant angle is 0.1745 rad (10 degrees) and in enhanced mode the angle can be up to 0.5235 rad (30 degrees). Here, we keep the default value of the standard mode.
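If your symbols were more strongly slanted, the allowed slant could be extended as sketched below; the value 0.3 rad (approximately 17 degrees) is an arbitrary example:

* Allow a larger maximum slant of the L-shaped finder pattern (sketch; value chosen for illustration)
set_data_code_2d_param (DataCodeHandle, 'slant_max', 0.3)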

Figure 14: Slant angle (parameter 'slant_max').

Module Grid (only Data Matrix ECC 200)

The parameter 'module_grid' determines which algorithm is used for the calculation of the module positions of a Data Matrix ECC 200 symbol. If it is set to 'fixed', the modules of the symbol have to be arranged in a regular grid with similar distances between the modules. If it is set to 'variable', the grid is aligned to the alternating side of the finder pattern and the size of the modules may vary in a specific range; in particular, they may now deviate up to a module's size from the regular grid. With 'any', both approaches are tested one after the other.
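A sketch for symbols whose modules deviate from a regular grid:

* Align the module grid to the alternating side of the finder pattern (sketch)
set_data_code_2d_param (DataCodeHandle, 'module_grid', 'variable')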

Number of Position Detection Patterns (only QR Code)

When working with QR Codes, you can set the parameter 'position_pattern_min', i.e., the number of position detection patterns that are at least required to be present, to 3 or 2. 3 is the default value of the standard mode and means that all position detection patterns have to be visible. 2 is used in enhanced mode. There, one of the patterns may be missing.
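For example, to tolerate one occluded position detection pattern, the parameter could be set as sketched below (assuming a 2D data code model of type 'QR Code'):

* Accept QR Code candidates with only two visible position detection patterns (sketch)
set_data_code_2d_param (DataCodeHandle, 'position_pattern_min', 2)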

3.3.3 Model Control Parameters

Besides the shape, size, and appearance of the symbol, you can specify the parameters 'persistence' and 'strict_model', which control the general behavior of a 2D data code model.

Persistence

The parameter 'persistence' determines how intermediate results, obtained while searching for symbols with the operator find_data_code_2d, are stored in the 2D data code model. By default, they are stored only temporarily ('persistence' set to 0) in order to reduce the allocated memory. Setting the parameter 'persistence' to 1, the results are stored persistently. Due to the high memory requirements, we recommend doing this only if some of the intermediate results are to be visualized or used for debugging purposes, e.g., when a symbol cannot be read. Further information about debugging is given in section 5.1 on page 26.
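The corresponding call, which is also used in the debugging example of section 5.1.1, looks as follows:

* Keep the intermediate results of find_data_code_2d in the model
set_data_code_2d_param (DataCodeHandle, 'persistence', 1)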

Strictness

Sometimes, symbols can be read but nevertheless do not fit the model restrictions on the size of the symbols. The parameter 'strict_model' controls whether the operator find_data_code_2d rejects such symbols or returns them as a result independent of their size and the size specified in the model. For the first case, which is the default, the parameter is set to 'yes'. This is reasonable if only symbols of a certain size are to be found and other symbols, which may be contained in the image as well, should be ignored. If you set the parameter 'strict_model' to 'no', symbols that do not strictly fulfill the restrictions may be found as well.
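A sketch that relaxes the size restrictions:

* Return symbols even if they do not fit the size specified in the model (sketch)
set_data_code_2d_param (DataCodeHandle, 'strict_model', 'no')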

3.4 Miscellaneous

3.4.1 Speeding up find_data_code_2d

You can speed up the run time of the operator find_data_code_2d in two ways. One way is to restrict the search space, the other way is to restrict the value ranges of the parameters used by the 2D data code model. In the following, both approaches are described in more detail.

Region of Interest

The standard HALCON approach to speed up the processing is to restrict the search space. For this, you define a region of interest (ROI) in which the symbol is searched. If, e.g., all of your images have the symbol placed in the upper left corner, you define a region that covers the upper left corner in a way that it contains the symbols in all images. An ROI can be created, e.g., by generating a rectangle with the operator gen_rectangle1. After reading an image, you reduce the image domain to this specific rectangular region using the operator reduce_domain. Instead of the original image you then use the reduced one in the operator find_data_code_2d. Further information about ROIs can be found in the Quick Guide, section 3.2 on page 27.
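The following sketch illustrates this approach; the rectangle coordinates are arbitrary example values that have to be adapted to your images:

* Restrict the symbol search to a rectangular ROI in the upper left corner (sketch)
read_image (Image, 'datacode/ecc200/ecc200_cpu_007')
gen_rectangle1 (ROI, 0, 0, 250, 300)
reduce_domain (Image, ROI, ImageReduced)
find_data_code_2d (ImageReduced, SymbolXLDs, DataCodeHandle, [], [], ResultHandles, DecodedDataStrings)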

Restricted Model

In section 3.1 on page 8 it was already mentioned that adjusting the model parameters significantly affects the run time of the operator find_data_code_2d. So, the second way to speed it up is to restrict all parameters to the minimum range of values needed for your specific symbol representations. To understand the importance of restricting the ranges of values for specific parameters, we have a closer look at the functionality of the symbol search.

The search takes place in several passes, starting at the highest pyramid level, i.e., the level where symbols with the maximum module size are still visible. The minimum module size determines the lowest pyramid level to investigate (see figure 15). By reducing the range of values for the module size, we reduce the number of passes needed for the symbol search and thus enhance the run time.

In each pyramid level specific parameters are checked. If, e.g., the 'polarity' is set to 'any', in a first pass 'dark_on_light' symbols are searched for. If none are found, a second pass searches for 'light_on_dark' symbols. Therefore, restricting the polarity significantly increases the speed.


Figure 15: Pyramid levels.

Each pass consists of two phases, the search phase and the evaluation phase. The search phase is used to look for finder patterns and generate symbol candidates for every detected finder pattern. The evaluation phase is used to investigate the candidates in a lower pyramid level and, if possible, to read them. The operator find_data_code_2d terminates when the required number of symbols was successfully decoded, or when the last pass was performed. This explains why the symbol search is rather fast when the right parameter values are checked first and takes much longer when a wide range of parameter values has to be checked, but the requested number of symbols is not found.

In summary, if run time matters to your application, you should pay special attention to the model parameters for the following attributes:

• polarity,
• minimum module size,
• number of symbol rows (for PDF417, especially for strongly cluttered or textured images),
• module gaps (for matrix codes with very small modules),
• the minimum number of position detection patterns (for QR Codes).

If these parameters are not set correctly or with a range that is unnecessarily wide, the search process slows down, especially when the requested number of symbols cannot be found. The actual values of the found symbols can be queried with the operator get_data_code_2d_results as described in section 5.1.2 on page 29.
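A sketch that restricts several of these attributes in a single call (the values are examples that have to be adapted to your application):

* Restrict polarity, minimum module size, and module gap at once (sketch; example values)
set_data_code_2d_param (DataCodeHandle, ['polarity', 'module_size_min', 'module_gap'], ['dark_on_light', 6, 'no'])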

3.4.2 Store the 2D Data Code Model

After changing the parameter setting by an automatic training or by a manual parameter setting, the new model can be stored in a file using the operator write_data_code_2d_model. The HDevelop program hdevelop\write_2d_data_code_model.dev shows the main steps for storing a trained 2D data code model. First, like in the previous example programs, a 2D data code handle for symbols of the type 'Data Matrix ECC 200' is created. Then, the model is trained by applying the operator find_data_code_2d to selected images using the parameter 'train'. This step changes the 2D data code model, which is then stored into the default file '2d_data_code_model.dcm' using the operator write_data_code_2d_model. Finally, the model is deleted to free the allocated memory.

create_data_code_2d_model ('ECC200', [], [], DataCodeHandle)
find_data_code_2d (Image, SymbolXLDs, DataCodeHandle, 'train', 'all', ResultHandles, DecodedDataStrings)
write_data_code_2d_model (DataCodeHandle, '2d_data_code_model.dcm')
clear_data_code_2d_model (DataCodeHandle)

Using the saved file, you can restore the saved model in a later session. In the HDevelop program hdevelop\read_2d_data_code_model.dev, instead of calling the operator create_data_code_2d_model at the beginning of the program, the operator read_data_code_2d_model is used to create a 2D data code handle, which loads the parameter settings described in the file '2d_data_code_model.dcm'.

read_data_code_2d_model ('2d_data_code_model.dcm', DataCodeHandle)
for i := 7 to 16 by 1
    read_image (Image, 'datacode/ecc200/ecc200_cpu_0' + (round(i)$'.2'))
    find_data_code_2d (Image, SymbolXLDs, DataCodeHandle, [], [], ResultHandles, DecodedDataStrings)
endfor
clear_data_code_2d_model (DataCodeHandle)

4 Preprocessing Difficult Images

The different methods for changing the model parameters follow two complementary goals. The global parameter settings of the enhanced mode extend the ranges for the parameter values of the 2D data code model so that almost any symbol can be read. An automatic training or a manual parameter setting, on the other hand, restricts the ranges to enhance the run time for the images of a specific application. Therefore, if a symbol cannot be read in enhanced mode it is most likely not readable with other parameter settings either (the few exceptional parameters that may be out of the range specified for the enhanced mode are listed in section 5.3.2 on page 41). Reasons for a failure of the 2D data code reader comprise various irreparable distortions of the symbol, which will be introduced in section 5.2 on page 37, and several problems that occur because of the bad quality of the symbol's appearance. Both problems should be avoided already when acquiring the image. But if you nevertheless have to work with images of bad quality, at least the following problems have a chance to be solved by preprocessing the image before applying the operator find_data_code_2d:

• The symbol cannot be read because it is slanted beyond the allowed slant angle (for troubleshooting see section 4.1),
• the size of the modules is smaller than the minimum size allowed for the specific symbol type (for troubleshooting see section 4.2),
• the gaps between the modules of a matrix code are larger than the biggest allowed module gap (for troubleshooting see section 4.3),
• there is too much noise in the image so that foreground and background of the symbol cannot be clearly distinguished anymore (for troubleshooting see section 4.4).

4.1 Slanted Symbol

In section 3.3 on page 16 it was mentioned that small slant angles are allowed for the L-shaped finder element of a Data Matrix ECC 200 code. Large slant angles may lead to problems for all symbols, independent of the symbol type. If a symbol is strongly skewed because of radial or perspective distortions, a PDF417 or a QR Code cannot be decoded and a Data Matrix ECC 200 is not even detected as a valid symbol. To nevertheless decode those symbols, you have to rectify the image before applying the operator find_data_code_2d. If you have a set of images with slanted symbols we recommend a calibrated rectification. Comprehensive information can be found in the Application Note on 3D Machine Vision, section 3.3 on page 49. For single symbols or a set of symbols lying in the same plane, the HDevelop program hdevelop\2d_data_codes_rectify_symbol.dev provides a fast solution for an uncalibrated rectification of a perspective distortion, which requires a little manual effort.

After reading an image with a slanted symbol, we use the online zooming tool of HDevelop to obtain the coordinates of all four corners of the symbol. Then, we define the coordinates of the four corners of the rectified symbol we want to obtain as result of the rectification. The shape of the rectified symbol depends on the ratio of rows and columns for the specified symbol type. As the symbol in our image is a Data Matrix ECC 200 with an equal number of rows and columns, we choose a square shape and define its corner coordinates. The operator hom_vector_to_proj_hom_mat2d uses the coordinates of the slanted symbol and the coordinates of the rectified symbol to compute a transformation matrix, which is then used by the operator projective_trans_image to transform the slanted symbol into a square symbol with the defined extent. If you have a set of symbols lying in the same plane, the transformation matrix only has to be computed once since it can be used to rectify the other symbols as well.

hom_vector_to_proj_hom_mat2d ([130, 225, 290, 63], [101, 96, 289, 269], [1, 1, 1, 1], [70, 270, 270, 70], [100, 100, 300, 300], [1, 1, 1, 1], 'normalized_dlt', HomMat2D)
projective_trans_image (Image_slanted, Image_rectified, HomMat2D, 'bilinear', 'false', 'false')

The rectified symbol can be found, read, and decoded as described in section 2 on page 5 and section 3 on page 7. Figure 16 shows the slanted symbol before and after the rectification and decoding.

4.2 Small Module Size

2D data codes can only be read robustly if the minimum size of a module does not fall below four pixels for matrix codes and three pixels for PDF417 codes. If you work with images of lower resolution you can nevertheless read them after zooming them. The HDevelop program hdevelop\2d_data_codes_enlarge_modules.dev reads an image of low resolution. To increase the resolution of the symbol, the image is zoomed by a factor of 2 using the operator zoom_image_factor.

Figure 16: Uncalibrated rectification of an individual image with small manual effort: (left) symbol with perspective distortion, (right) decoded symbol after rectification.

zoom_image_factor (Image, ImageZoomed, 2, 2, 'constant')

To study the effect of the zooming, you can use the zooming tool of HDevelop. In contrast to the zooming operation, the zooming tool has no effect on the resolution of the image, but on the display size. Figure 17 shows a symbol and the modules of its upper right corner before and after the zooming operation.

Figure 17: Symbol corner enlarged by the zooming tool of HDevelop: (a) before and (b) after increasing the resolution of the image.

Now the number of pixels for each individual module is sufficient and the symbol can be read as described in section 2 on page 5 and section 3 on page 7. Figure 18 shows the successfully decoded symbol.

Figure 18: After zooming the image, the symbol is found, read, and decoded.

4.3 Large Module Gaps

For matrix codes, gaps between modules are allowed in a certain range. The HDevelop program hdevelop\2d_data_codes_minimize_module_gaps.dev shows how to read a Data Matrix ECC 200 with very large gaps. First, we try to adapt the parameters for the 2D data code model as described in section 3 on page 15, i.e., we set the parameter 'module_gap_max' to the biggest allowed module gap 'big' before trying to read the code. The adaptation can be done either with the operator set_data_code_2d_param or, as shown in the following code lines, within the operator create_data_code_2d_model.

create_data_code_2d_model ('Data Matrix ECC 200', ['module_gap_min', 'module_gap_max'], ['no', 'big'], DataCodeHandle)

Because the gaps in our image are larger than the biggest allowed module gap, i.e., bigger than 50% of the module size, the reading of the symbol fails. Getting no result by adapting the parameters of the model, we have to adapt the image to suit the model. Here, we use gray value morphology, in particular a gray value erosion with a rectangular structuring element, to enlarge the foreground modules and thus minimize the size of the gaps. Figure 19 shows the symbol before and after the gray value morphology. After the preprocessing, the symbol is found, read, and decoded.

gray_erosion_shape (Image, ImageMin, 15, 15, 'rectangle')

Note that this procedure mainly works for matrix codes, i.e., Data Matrix ECC 200 and QR Code. For them, the distance between the modules only has to become smaller, whereas for PDF417 no gaps are allowed at all and the exact closing of the gaps is rather challenging, especially with gaps of slightly different size.

Figure 19: Preprocessing symbols with large module gaps: (left) original symbol with large gaps between the modules and (right) decoded symbol after gray value morphology.

4.4 Noise

For the detection of 2D data code symbols, background and foreground should be clearly distinguishable, i.e., the modules should consist of homogeneous or at least low-textured regions. If you cannot read a symbol because there is too much texture or noise in the image you can try to preprocess the image with gray value morphology, a median filter, or a combination of both. The HDevelop program hdevelop\2d_data_codes_minimize_noise.dev reads a symbol with several parts distorted by noise. Especially the quiet zone in the lower part of the symbol is distorted by a noisy streak. We apply both preprocessing steps separately. For the gray value morphology, the operator gray_opening_shape with a rectangular structuring element is used. It partly closes the gaps and, what is more important here, reduces noise (see figure 20).

gray_opening_shape (Image, ImageOpening, 7, 7, 'rectangle')

The median filter smoothes edges and also reduces noise (see figure 21).

median_image (Image, ImageMedian, 'circle', 3, 'continued')

For both procedures, because of the noise reduction, the difference between the symbol and the disturbing streak in the quiet zone becomes more obvious and the reading is successful. So, in this case both preprocessing steps lead to a successful 2D data code reading.

Figure 20: Preprocessing a noisy image: (left) noise at the symbol's lower border, (right) decoded after gray value morphology.

Figure 21: Preprocessing a noisy image: (left) noise at the symbol's lower border, (right) decoded after median filtering.

5 Problem Handling

The previous section proposed a selection of preprocessing steps for common problems when working with 2D data codes. Now, we will go deeper into the handling of problems.

• In section 5.1 we introduce you to the debugging of the operator find_data_code_2d, which on the one hand helps to enhance the run time of successfully decoded symbols and on the other hand is used for locating problems with symbols that are not decoded. Identifying a problem often leads to ideas on how an undecoded symbol can be preprocessed.
• Some situations exist where preprocessing yields no success. These situations and tips on how to avoid them are presented in section 5.2.
• In section 5.3, the requirements and limitations for the 2D data code reader are summarized concisely.

5.1 Data Access for Debugging

During the search process of the operator find_data_code_2d, various results besides the decoded data string and the XLD contour of the successfully decoded symbol are available. These results provide hints how the search process can be enhanced with respect to run time, or why a symbol is not found or decoded. The HDevelop program hdevelop\2d_data_codes_data_access.dev shows how to access results for various reasons. The following sections describe the single steps of the program in detail. In particular, they introduce you to

• data access in general, as well as a selection of results useful for all debugging purposes (see section 5.1.1),
• a selection of results that are useful when facing symbols that are decoded successfully but slowly (see section 5.1.2),
• a selection of results that may give you a hint why a symbol is not decoded (see section 5.1.3).

The program concentrates on symbols of type Data Matrix ECC 200. Some characteristics of PDF417 and QR Codes that deviate from the presented results are introduced in section 5.1.3 on page 35.

5.1.1 General Information About Data Access

The results obtained from the operator find_data_code_2d are divided into iconic and alphanumeric results. Iconic results, i.e., objects like images or regions, can be queried with the operator get_data_code_2d_objects. The input parameters for the operator are the DataCodeHandle, a CandidateHandle, and the ObjectName. Alphanumeric results are obtained by the operator get_data_code_2d_results. Here, the input parameters are the DataCodeHandle, the CandidateHandle, and the ResultNames. The candidate handle needed for both iconic and alphanumeric results specifies either an individual candidate for a symbol or a group of several candidates, for which the results or objects are queried. In a single operator call, you can combine a group of candidates with an individual result or an individual candidate with a tuple of results. A list of all predefined groups of candidates as well as all available object and result names for each SymbolType can be found in the Reference Manual in the descriptions of the individual operators.

In the HDevelop program hdevelop\2d_data_codes_data_access.dev, we first specify the general settings, i.e., we create a 2D data code model for symbols of type 'Data Matrix ECC 200' and set the default parameters to the enhanced mode so that all undamaged symbols can be decoded. By default, some of the results are stored only temporarily in the 2D data code model. Therefore, we set the model parameter 'persistence' to 1 to keep all intermediate results in memory (see section 3.3.3 on page 17).


create_data_code_2d_model ('Data Matrix ECC 200', 'default_parameters', 'enhanced_recognition', DataCodeHandle)
set_data_code_2d_param (DataCodeHandle, 'persistence', 1)

The lists of the available alphanumeric result names and iconic object names are obtained by the operator query_data_code_2d_params with the parameters 'get_result_params' and 'get_result_objects' (see also section 3.2.2 on page 11).

query_data_code_2d_params (DataCodeHandle, 'get_result_params', GenParamNames)
dev_inspect_ctrl (GenParamNames)
query_data_code_2d_params (DataCodeHandle, 'get_result_objects', GenObjectNames)
dev_inspect_ctrl (GenObjectNames)

After reading an image, we apply the operator find_data_code_2d, which now stores all of its intermediate results, so that we can investigate them further.

For the group of alphanumeric results, we differentiate between general results and results associated with a specific symbol candidate or a group of candidates. General results are used, e.g., to get information about the number of candidates related to each individual group of candidates. In the program, we set the CandidateHandle to 'general' and pass a tuple containing all available general ResultNames. In detail, we want to access the number of all successfully decoded symbols ('result_num'), the number of all investigated candidates ('candidate_num'), the number of candidates that were identified as symbols but could not be read ('undecoded_num'), the number of candidates that could not be identified as valid candidates ('aborted_num'), the lowest and highest pyramid level that is searched for symbols ('min_search_level' and 'max_search_level'), and the number of passes that were completed ('pass_num'). The last three values provide us with information about the performance of the search process. All result values that are stored in the variable 'GenResult' are shown in the variable window of HDevelop. For better inspection, we display them in a new window (see figure 22).

GenResultNames := ['result_num', 'candidate_num', 'undecoded_num', 'aborted_num', 'min_search_level', 'max_search_level', 'pass_num']
GenResultValues := []
get_data_code_2d_results (DataCodeHandle, 'general', GenResultNames, GenResultValues)
GenResult := GenResultNames + ': ' + GenResultValues
dev_inspect_ctrl (GenResult)
candidatenum := GenResultValues[1]
undecodednum := GenResultValues[2]
abortednum := GenResultValues[3]

If candidates are found, the program queries and displays two iconic results of the search process, the search image and the process image. The search image is the pyramid image in which a candidate is found, whereas the process image is the pyramid image in which it is investigated more closely. A visual inspection of search image and process image often leads to ideas why a symbol is not found or decoded, or why the decoding process takes too much time.

Figure 22: Display of general results.

get_data_code_2d_objects (SearchImage, DataCodeHandle, 0, 'search_image')
dev_display (SearchImage)
write_message (WindowHandle, -1, -1, 'Search image', true)
get_data_code_2d_objects (ProcessImage, DataCodeHandle, 0, 'process_image')
dev_display (ProcessImage)
write_message (WindowHandle, -1, -1, 'Process image', true)

Figure 23 shows the search image and the process image of the image we already tried to read in section 4.4 on page 23. The reason why the noisy streak in the quiet zone cannot be distinguished from the symbol becomes more obvious in the low-resolution search image than in the high-resolution original image.

Figure 23: Pyramid images: (left) search image and (right) process image.

Using the information on the number of candidates in each candidate group, we now discuss two different situations. On the one hand, we investigate successfully decoded symbols to get information about their current parameter values so we can enhance the run time (see section 5.1.2). On the other hand, we investigate symbols that are not decoded to find out why they are not decoded and whether specific preprocessing steps are suitable (see section 5.1.3).

5.1.2 Parameters to Access for Successfully Decoded Symbols

The main reason for the debugging of successfully decoded symbols is to enhance the run time of the search process, i.e., to reduce the number of needed passes. For that, you have to restrict the ranges for the parameter values of the 2D data code model to a minimum (see section 3.2 on page 9 and section 3.3 on page 11). To determine a reasonable range for a specific application it is helpful to query the current parameter values of the symbols using the operator get_data_code_2d_results. The success of the model adaptation can be controlled by comparing the number of completed passes queried in section 5.1.1 on page 27 before and after the adaptation.

In the HDevelop program hdevelop\2d_data_codes_data_access.dev, we query a tuple of results which provide us with information about the symbol's appearance and additional information concerning the successful search process (see figure 24). The latter comprises the actual pass in which the symbol was generated and processed ('pass'), a status message ('status'), and the decoded data string ('decoded_string'). Status messages provide you with the information whether a symbol was decoded successfully or why and at which point of the evaluation process the search was aborted for a specific candidate. Here, we query the results for successfully decoded symbols, so the status message in the tuple 'VariousResults' is always 'successfully decoded'.

ResultVariousNames := ['polarity', 'module_height', 'module_width', 'module_gap', 'mirrored', 'contrast', 'slant', 'pass', 'status', 'decoded_string']
ResultVariousValues := []
get_data_code_2d_results (DataCodeHandle, ResultHandles[j], ResultVariousNames, ResultVariousValues)
VariousResults := ResultVariousNames + ': ' + ResultVariousValues
dev_inspect_ctrl (VariousResults)

The 'DecodedDataStrings' returned by the operator find_data_code_2d and the 'decoded_string' contained in the tuple 'VariousResults' are restricted to 1024 characters. When working with rather large codes, you can query an unrestricted tuple containing all individual numbers and characters of the decoded data as ASCII code by passing the result name 'decoded_data' to the operator get_data_code_2d_results. In the program, the resulting tuple 'ResultASCIICode' is displayed in a window (see figure 25).

get_data_code_2d_results (DataCodeHandle, ResultHandles[j], 'decoded_data', ResultASCIICode)
dev_inspect_ctrl (ResultASCIICode)

Figure 24: Display of various individual results.

Figure 25: Display of the decoded data ('MVTec') as ASCII code.

To control the classification of the modules that are determined in the search process, the program queries two different arrays of regions, in particular the iconic representations of the foreground and background modules ('module_1_rois' and 'module_0_rois'). The returned tuples of regions ('Foreground' and 'Background') are displayed in figure 26.

dev_set_color ('red')
get_data_code_2d_objects (Foreground, DataCodeHandle, ResultHandles[j], 'module_1_rois')
dev_display (Foreground)
dev_set_color ('yellow')
get_data_code_2d_objects (Background, DataCodeHandle, ResultHandles[j], 'module_0_rois')
dev_display (Background)

5.1.3 Parameters to Access for Symbols that are not Decoded

For symbols that are not decoded, we differentiate further between symbols that are found but not decoded and symbols for which no candidate is classified as a valid symbol. If the minimum module size is not set, the modules themselves are often detected as candidates. To reduce the number of candidates and thus make the investigations more concise, we adapt the model to the minimum module size before applying the 2D data code reader to the problematic images (of index 'i').


Figure 26: Visualization of the individual modules of a successfully decoded symbol.

if (i=3)
    set_data_code_2d_param (DataCodeHandle, 'module_size_min', 24)
endif
if (i=4)
    set_data_code_2d_param (DataCodeHandle, 'module_size_min', 10)
endif
if (i=5)
    set_data_code_2d_param (DataCodeHandle, 'module_size_min', 2)
endif

Another method to reduce the number of candidates, especially if the image contains other objects with right angles, is to reduce the domain of the image to an ROI containing the complete symbol (including its quiet zone). For the creation of ROIs see section 3.4.1 on page 18.

To access all individual symbol candidates that are found but not decoded, we create the handle 'HandlesUndecoded' by passing the candidate handle 'all_undecoded' and the result name 'handle' to the operator get_data_code_2d_results.

get_data_code_2d_results (DataCodeHandle, 'all_undecoded', 'handle', HandlesUndecoded)

Then, we query the XLD contour and the corresponding status message for the undecoded symbol candidates. These provide important information about the reason why a candidate was not found or decoded. For successfully decoded symbols we received the XLD contour explicitly by the operator find_data_code_2d. For symbols that are not decoded, we have to query the XLD contours using the operator get_data_code_2d_objects with the result name 'candidate_xld'.

for j := 0 to |HandlesUndecoded|-1 by 1
    dev_display (Image)
    get_data_code_2d_results (DataCodeHandle, HandlesUndecoded[j], 'status', StatusValue)
    write_message (WindowHandle, -1, -1, StatusValue, true)
    get_data_code_2d_objects (DataCodeObject, DataCodeHandle, HandlesUndecoded[j], 'candidate_xld')
    dev_display (DataCodeObject)

Additionally, we visualize the regions of the modules, this time to check, e.g., whether a large number of modules is missing or whether the modules deviate from the regular grid. Perhaps the modules are too small or the quiet zone is distorted by streaks. In these cases, the obtained grid may be translated relative to the symbol. These and other problems can often be solved by preprocessing the image as described in section 4 on page 20.

    dev_set_color ('red')
    get_data_code_2d_objects (Foreground, DataCodeHandle, HandlesUndecoded[j], 'module_1_rois')
    dev_display (Foreground)
    dev_set_color ('yellow')
    get_data_code_2d_objects (Background, DataCodeHandle, HandlesUndecoded[j], 'module_0_rois')
    dev_display (Background)

Besides the iconic results, alphanumeric information about the modules can be obtained. During the search process, the modules are read row by row. A value of 0 stands for a definite background module and a value of 100 for a definite foreground module. Often, modules have a value somewhere in between. An automatically chosen threshold divides the foreground from the background modules. By inspecting the values, or more precisely their deviations from 0 or 100, you can evaluate the quality of the module classification.

    get_data_code_2d_results (DataCodeHandle, HandlesUndecoded[j], 'bin_module_data', ResultBinModules)
    dev_inspect_ctrl (ResultBinModules)
endfor

Figure 27 shows some of the modules of the symbol that we tried to read in section 4.2 on page 21. Because the modules are smaller than the specified minimum module size, the computed grid cannot be fitted correctly to the symbol and each computed region contains parts of several modules, i.e., the individual regions contain dark and light parts at the same time.

The binary values of all regions that are stored in the tuple 'ResultBinModules' (see figure 28) confirm the visual impression, since for most of the regions the values strongly deviate from 0 or 100. Hence, the decision whether a region belongs to the foreground or background is not reliable. As described in section 4.2, the problems caused by a small module size can be solved by increasing the size of the image.


Figure 27: Regions of the approximated grid contain parts of multiple modules, which leads to an unreliable classification.

Figure 28: Display of the binary values for each module.

For candidates that are not detected as valid symbols, no module regions can be obtained. Therefore, after creating a handle for 'all_aborted' candidates called 'HandlesAborted', the debug information is reduced to the status message and the XLD contour for each candidate.


get_data_code_2d_results (DataCodeHandle, 'all_aborted', 'handle', HandlesAborted)
for j := 0 to |HandlesAborted|-1 by 1
    dev_display (Image)
    get_data_code_2d_objects (DataCodeObject, DataCodeHandle, HandlesAborted[j], 'candidate_xld')
    dev_set_color ('yellow')
    dev_display (DataCodeObject)
    get_data_code_2d_results (DataCodeHandle, HandlesAborted[j], 'status', StatusValue)
    write_message (WindowHandle, -1, -1, StatusValue, true)
endfor

For example, in figure 29 the status message tells us that the X-border of the finder element cannot be adjusted. By looking at the corresponding XLD contour, it becomes obvious that it cannot be adjusted because of the disturbing streak in the quiet zone near the X-border. To differentiate clearly between streak and symbol, we have to remove the noise as described in section 4.4 on page 23.

Figure 29: A border of the finder pattern cannot be adjusted because of the streak in the quiet zone.

Sometimes no candidate can be found at all. If this happens, check visually whether some of the problems introduced in section 4 on page 20 occur. Perhaps the gaps between the modules are so large that no connection between the individual modules is recognizable for the 2D data code reader, or the symbol is so noisy that the modules cannot be separated from each other. In these cases, you should try to solve the problems by applying the proposed preprocessing steps. Sometimes the symbols are damaged and cannot be read at all. Some of these situations, and tips on how to avoid them already at image acquisition, are introduced in section 5.2 on page 37.

In the program, we concentrated on a selection of important results common for Data Matrix ECC 200. If you work with PDF417 or QR Codes, some results will differ. An example of rather different debugging results concerns slanted symbols. For the Data Matrix ECC 200 symbol that we already presented in section 4.1 on page 21, the 2D data code reader searches for a rectangular finder pattern. Because the symbol is slanted, the right angles are lost, and the status messages report problems related to the border of the symbol or the finder pattern (see figure 30). Since no valid finder pattern is identified, all symbol candidates are aborted.

Figure 30: The alternating border of a rectangular finder pattern cannot be reconstructed.

For the PDF417 code in figure 31, no rectangular finder pattern is searched for. Therefore, several symbol candidates are found and adjusted in relation to the start or stop pattern of the symbol. Because these are degenerate and do not correspond to the outline of the actual symbol, no data modules are found and the symbol is not decoded.

The results for the QR Code in figure 32 are similar. Again, several symbol candidates are found, because the finder pattern, i.e., at least two of the three position detection patterns, is found. However, as the candidates are adjusted in relation to those two position detection patterns, the candidate does not fit the slanted symbol. Here, at least some of the modules are detected; they visualize the square of the incorrect symbol candidate. Since the modules inside the candidate are in the wrong place, the error correction fails.


Figure 31: The symbol candidate is related to the stop pattern and does not fit the actual symbol outline.

Figure 32: The symbol candidate is related to two position detection patterns and does not fit the actual symbol outline.

Additionally, further results are available, e.g., the actual error correction level for PDF417 and QR Codes, which can be queried by get_data_code_2d_results with the parameter 'error_correction_level'. For PDF417, a value between 0 and 8 is obtained: 0 means that errors are only detected but not corrected, while the values 1 to 8 describe an increasing error correction capacity. For QR Codes, the increasing levels 'L', 'M', 'Q', and 'H' are available. As mentioned before, the complete list of results specific to each SymbolType can be found in the Reference Manual in the description of the operators get_data_code_2d_results and get_data_code_2d_objects.
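Such a query might look like the following minimal sketch; it assumes that ResultHandles holds the handles of successfully decoded symbols returned by a previous call to find_data_code_2d (this variable is not part of the program shown above):

* Sketch: query the error correction level of the first decoded symbol.
get_data_code_2d_results (DataCodeHandle, ResultHandles[0], 'error_correction_level', ECLevel)
* For PDF417, ECLevel is a number between 0 and 8; for QR Codes it is 'L', 'M', 'Q', or 'H'.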


5.2 Selected Problems and Tips to Avoid Them

The 2D data code reader of HALCON is a rather powerful tool, which can also be used to read partly distorted symbols. But sometimes a symbol is distorted so much that it cannot be decoded even after preprocessing the image. Some distortions are caused by damaged symbols, e.g., symbols that are not printed correctly or for which a large number of modules is missing for various reasons. Other distortions occur during the image acquisition. These can be avoided by the right acquisition conditions. In the following, we introduce examples of distortions that result either from

• a bad geometry, i.e., the modules of the symbol are not placed on a regular grid (see section 5.2.1), or

• a bad radiometry, i.e., due to bad lighting conditions the individual modules cannot be classified correctly (see section 5.2.2).

We recommend avoiding both situations by flattening the symbols to a level surface and using diffuse light during image acquisition.

5.2.1 Geometric Distortions

The modules of an ideal symbol are placed on a regular grid. For matrix codes, the grid must be regular across the whole symbol, whereas for PDF417 a regular grid is necessary only within the individual columns. While searching for a symbol, the operator find_data_code_2d searches for the finder pattern (or the start and stop pattern for PDF417) of the specified symbol. If it is found, a grid for the modules is approximated, which is adjusted in relation to the finder pattern. Some errors, i.e., small deviations of the modules from the grid (up to a displacement of half a module's size, for Data Matrix ECC 200 up to a whole module's size when 'module_grid' is set to 'variable'), can be coped with. For stronger deviations, the 2D data code reading fails. The HDevelop program hdevelop\2d_data_codes_arbitrary_distortions.dev reads symbols with various distortions. In figure 33, e.g., six symbols are printed on paper. The paper is crumpled, so that arbitrary distortions occur to the symbols; for four of them, the modules deviate so much from the regular grid that they cannot be decoded.
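For moderate grid deviations, the tolerance can be increased via the 'module_grid' parameter mentioned above. The following is only a minimal sketch under these assumptions; the model creation with default parameters is illustrative and not taken from the example program:

* Sketch: create an ECC 200 model and allow larger deviations of the modules from the grid.
create_data_code_2d_model ('Data Matrix ECC 200', [], [], DataCodeHandle)
set_data_code_2d_param (DataCodeHandle, 'module_grid', 'variable')
find_data_code_2d (Image, SymbolXLDs, DataCodeHandle, [], [], ResultHandles, DecodedDataStrings)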

In contrast to the regular distortion in section 4.1 on page 21, which can be removed by a rectification, here the distortions are arbitrary and irregular. Therefore, no preprocessing step can be proposed. You have to prevent such a situation already when acquiring the images by ensuring that the symbols are as flat as possible. If your symbols are printed on a flexible surface like paper, you can, e.g., flatten them with a glass plate. But then, you have to be very careful and consider the right lighting conditions. Otherwise, you may get problems because of reflections, as described in the next section.

5.2.2 Radiometric Distortions

Besides the right geometry of the symbol's grid, the contrast and a uniform appearance of the symbol's modules, especially with regard to their polarity, are essential for the decoding of 2D data codes. The HDevelop program hdevelop\2d_data_codes_arbitrary_distortions.dev searches for the symbols of two more images, which contain symbols printed on a reflecting surface. Figure 34 shows an image where the reflections are so strong that for some parts of the symbols the contrast approaches 0, i.e., the information about the modules in these parts is not available anymore. Thus, the symbols cannot be reconstructed by preprocessing and the 2D data code reading fails.

Figure 33: Four symbols cannot be read because of arbitrary geometric distortions.

Figure 34: Two symbols cannot be read because parts of the symbols are not visible because of reflections.

Figure 35 illustrates another problem that occurs because of bad lighting conditions. Here, because of reflections, the polarity changes within two of the symbols. This leads to problems, since in most cases a change of the appearance of the modules inside a symbol cannot be coped with.

In both cases, no preprocessing can reasonably improve the image in a way that makes the symbols readable again. Therefore, it is especially important to avoid strong reflections by using diffuse light at the image acquisition.

Figure 35: Two symbols cannot be read because of a changed module appearance (polarity).

5.3 Requirements and Limitations

For a successful 2D data code reading, a symbol's representation must fulfill certain requirements. Some of them were already stated in the preceding sections. Generally, the value ranges specified for the individual parameters should not be exceeded. Most of the limits are soft, i.e., sometimes symbols can be read although their parameter values do not completely lie in the specified ranges. But since this cannot be ensured, you should try to adhere to the rules summarized in section 5.3.1. A concise list of the value ranges allowed for the parameter groups related to the size and appearance of the symbols is given in section 5.3.2.

5.3.1 Main Rules to Follow

All symbol types:

• The symbol (including the quiet zone) must be contained completely in the image.

• The modules must fit a regular grid and therefore should have approximately the same size. For matrix codes, the grid has to be regular across the whole symbol, whereas for PDF417 codes it has to be regular only within the individual columns. Errors up to a displacement of half a module's size (for Data Matrix ECC 200 up to a whole module's size when 'module_grid' is set to 'variable') can be coped with; errors at the finder pattern are worse than errors at the data modules.

• The appearance of the modules of a symbol should be uniform. In particular, the polarity must not change within a symbol.

• Although a minimum contrast of 1 is allowed, for a stable result the symbols should have a minimum contrast of 10 between foreground and background.

PDF417:

• For a stable result, the minimum width of the modules should not fall below 3 pixels.

• At the border of the symbol, there should be a quiet zone of at least 2 modules' width in each direction.

• Gaps between the modules are not allowed.

Data Matrix ECC 200:

• For a stable result, the minimum size of the modules should not fall below 4 pixels.

• At the border of the symbol, there should be a quiet zone of at least 1 module's width in each direction.

• Gaps between the modules are allowed, but should not exceed 50% of the module's size.

• Although larger values are theoretically allowed, a slant angle of 0.5235 rad (30 degrees) should not be exceeded.

QR Code:

• For a stable result, the minimum size of the modules should not fall below 4 pixels.

• At the border of the symbol, there should be a quiet zone of at least 4 modules' width in each direction.

• Gaps between the modules are allowed, but should not exceed 50% of the module's size.

• At least two of the three position detection patterns have to be visible.

5.3.2 Valid Parameter Ranges

The following list concisely summarizes the value ranges valid for the parameter groups related to the size and appearance of the individual symbol types. A complete list of the parameter names that belong to each group can be found in the description of the operator set_data_code_2d_param in the Reference Manual.



Symbol Type                  PDF417                         Data Matrix ECC 200            QR Code
Symbol Size
  - Columns                  1 - 30 codewords               10 - 144 modules, even         21 - 177 modules
  - Rows                     3 - 90 modules                 8 - 144 modules, even          21 - 177 modules
Symbol Shape                 —                              rectangle, square, any         —
Model Type                   —                              —                              1, 2, any
Version                      —                              —                              1 - 40
Polarity                     dark_on_light,                 dark_on_light,                 dark_on_light,
                             light_on_dark, any             light_on_dark, any             light_on_dark, any
Mirrored                     yes, no, any                   yes, no, any                   yes, no, any
Minimum Contrast             1 - 100                        1 - 100                        1 - 100
Module Size                  —                              2 - 200                        2 - 200
Module Aspect Ratio          0.5 - 20                       —                              —
Module Gaps                  no gaps                        no, small, big                 no, small, big
Maximum Slant                —                              0 - 0.7                        —
Module Grid                  —                              fixed, variable, any           —
Minimum Position Patterns    —                              —                              2, 3

Note that for most parameters, the range of values valid for a symbol corresponds to the range of values checked by the 2D data code reader when using the global parameter settings in enhanced mode (see section 3.1 on page 8). Exceptions to that rule are

• the minimum contrast for all symbol types (allowed range: 1 - 100, default in enhanced mode: 10),

• the maximum module size for matrix codes (allowed range: 2 - 200, default in enhanced mode: 100),

• the module aspect ratio for PDF417 (allowed range: 0.5 - 20, default in enhanced mode for the minimum module aspect: 1.0, default in enhanced mode for the maximum module aspect: 10),

• the maximum slant for Data Matrix ECC 200 (allowed range: 0 - 0.7, default in enhanced mode: 0.5235).

For a stable result, it is recommended to comply with the value ranges specified for the enhanced parameter set.

In some cases, a measure of the quality of the symbols is asked for. There are various standards available that are related to the quality verification of 2D data codes, but their focus differs depending, e.g., on the way the symbol is applied to a surface. A printed symbol with square modules and no gaps between the modules demands a different verification standard than a symbol produced by dot peening. An overarching quality standard for all kinds of 2D data codes, independent of their appearance, material, production process, etc., is not yet available. Therefore, HALCON does not provide a tool to measure a symbol's quality. Nevertheless, if necessary, you can use HALCON operators to implement tools following the standards best suited to your application.



HALCON Application Note
1D Metrology

Provided Functionality

⊲ Detecting edges and measuring their position and distance along lines and arcs

Typical Applications

⊲ Inspect object dimensions

Copyright © 2005-2008 by MVTec Software GmbH, München, Germany


Overview

With HALCON, you can perform highly accurate measurements in 1D (i.e., measure distances and angles along lines and arcs), 2D (i.e., measure features like the size and orientation of objects), and 3D (i.e., measure in 3D world coordinates). This Application Note describes 1D metrology, a fast and easy-to-use tool that enables you to measure the dimensions of objects with high accuracy.

A first example and a short characterization of the various methods is given in section 1 on page 4. A description of how to create a measure object as well as the basic concepts of 1D metrology can be found in section 2 on page 5. Then, the methods to perform 1D metrology are described in detail.

For more complex measurement tasks, it might be necessary to control the selection of the edges. To do this, the measure object can be extended to a fuzzy measure object (section 4 on page 25), where the selection of the edges is controlled by fuzzy membership functions.

Unless specified otherwise, the HDevelop example programs can be found in the subdirectory 1d_metrology of the directory HALCONROOT\examples\application_guide.

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without prior written permission of the publisher.

Edition 1 July 2005 (HALCON 7.1)
Edition 1a December 2006 (HALCON 7.1.2)

Microsoft, Windows, Windows NT, Windows 2000, and Windows XP are either trademarks or registered trademarks of Microsoft Corporation. All other nationally and internationally recognized trademarks and tradenames are hereby recognized.

More information about HALCON can be found at: http://www.halcon.com/


Contents

1   First Example and Overview                                         4
2   The Basics of Measure Objects                                      5
    2.1  The Process of 1D Edge Extraction                             6
    2.2  Creating a Measure Object                                     9
3   Using the Measure Object                                          10
    3.1  The Position of Edges                                        11
    3.2  The Position of Edge Pairs                                   13
    3.3  The Position of Points With a Particular Gray Value          15
    3.4  The Gray Projection                                          16
    3.5  Measuring in World Coordinates                               20
    3.6  Compensate for Radial Image Distortions                      20
    3.7  Changing the Position and Orientation of the Measure Object  21
    3.8  Tips and Tricks                                              21
    3.9  Destroying the Measure Object                                25
4   Advanced Measuring: The Fuzzy Measure Object                      25
    4.1  A First Example                                              25
    4.2  Controlling the Selection of Edges and Edge Pairs            27
    4.3  Creating the Fuzzy Measure Object                            32
    4.4  Using the Standard Measure Object Functionality              36
A   Slanted Edges                                                     37
B   Examples for Fuzzy Membership Functions                           42
    B.1  Set Type contrast                                            43
    B.2  Set Type position                                            44
    B.3  Set Type position_pair                                       49
    B.4  Set Type size                                                56
    B.5  Set Type gray                                                60
    B.6  Set Type position_pair combined with size                    61
C   Definition of the Geometric Mean                                  62



1 First Example and Overview

The example program hdevelop\measure_switch.dev is a simple application where the width of and the distance between the pins of a switch (see figure 1a) must be measured. This task can easily be solved by 1D metrology because positions and distances are measured along a line.

Figure 1: Image of a switch: (a) the width of the pins and their distance from each other must be measured; (b) the rectangular ROI within which the edges are determined.

First, a measure object is created by specifying a rectangle that lies over the pins, as depicted in figure 1b. The operator gen_measure_rectangle2 returns a handle to the created measure object.

gen_measure_rectangle2 (Row, Column, Phi, Length1, Length2, Width, Height, Interpolation, MeasureHandle)

To perform the measurement, you pass this handle to the measuring operators. To inspect the switch, we use measure_pairs, which extracts the edge pairs corresponding to the pins and returns their width (distance between the edges of a pair, IntraDistance) and their distance (distance between edges of subsequent pairs, InterDistance).

measure_pairs (Image, MeasureHandle, Sigma, Threshold, Transition, Select, RowEdgeFirst, ColumnEdgeFirst, AmplitudeFirst, RowEdgeSecond, ColumnEdgeSecond, AmplitudeSecond, IntraDistance, InterDistance)

The resulting edges are displayed in figure 2 together with the measured width of the pins and the distance between them.

Figure 2: The result of measuring the width of and the distance between the pins of a switch.

The following sections take a closer look at these steps and explain how to solve more complex tasks: Section 2 describes the basics of measuring, i.e., how the edges are determined (section 2.1 on page 6) and how to create measure objects for rectangular and circular ROIs (section 2.2 on page 9). Section 3 on page 10 explains how to use the measure object to extract edges (section 3.1 on page 11), edge pairs (section 3.2 on page 13), points that have a particular gray value (section 3.3 on page 15), and the gray value profile (section 3.4 on page 16).

Often, it is necessary to determine the dimensions of an object in world coordinates, e.g., in µm. The transformation of the results of a measure object into world coordinates is described in section 3.5 on page 20.

Section 3.7 on page 21 describes how to change the position and orientation of measure objects. Some tips and tricks for the use of measure objects are given in section 3.8 on page 21.

Section 4 on page 25 shows how to solve more complex measurement tasks, e.g., if the object shows different gray levels or if parts of the image are disturbed by reflections or by shadows that are cast on the object. For this purpose, it is possible to introduce special weighting functions to suppress unwanted edges. This extension is called the fuzzy measure object.

2 The Basics of Measure Objects

The main idea behind the measure object is to extract edges that run approximately perpendicular to the line or arc of measurement. This process is described in section 2.1. It is important to understand this process because you influence it with parameters both when creating the measure object and when applying it. Section 2.2 on page 9 shows how to specify the shape of the measuring ROI. Please note that in contrast to other HALCON methods, measuring ROIs are not created using reduce_domain, but by specifying their dimensions when creating a measure object. A measure object can be used to determine the position of straight edges that run approximately perpendicular to the given ROI.


2.1 The Process of 1D Edge Extraction

HALCON determines the position of 1D edges by the following approach: First, equidistant lines of projection are constructed perpendicular to the line or arc of measurement (also called the profile line), with a length equal to the width of the ROI (figure 3).

Figure 3: Lines of projection for a (a) rectangular ROI and (b) circular ROI.

Then, the mean gray value along each line of projection is calculated. The sequence of these mean values is called the profile. If the lines of projection are not oriented horizontally or vertically, the pixel values along them must be interpolated. You can select the interpolation mode with the parameter Interpolation of the operators for the creation of measure objects (e.g., gen_measure_rectangle2, see section 1 on page 4). If Interpolation = 'nearest_neighbor', the gray values in the measurement are obtained from the gray values of the closest pixel, i.e., by constant interpolation. This is the fastest method; however, the geometric accuracy is slightly lower in this mode. For Interpolation = 'bilinear', bilinear interpolation is used, while for Interpolation = 'bicubic', bicubic interpolation is used. Bicubic interpolation yields the most accurate results, but it is the slowest method.
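For example, a measure object that favors speed over accuracy could be created as in the following sketch; apart from the interpolation mode, the parameter values are placeholders, and the handle name FastMeasureHandle is only chosen for illustration:

* Sketch: constant interpolation is the fastest but geometrically least accurate mode.
gen_measure_rectangle2 (Row, Column, Phi, Length1, Length2, Width, Height, 'nearest_neighbor', FastMeasureHandle)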

The length of the lines of projection, i.e., the width of the ROI, defines the degree of averaging in the direction perpendicular to the profile line. The thick lines in figure 4c and d show the profiles for the two ROIs displayed in figure 4a and b. It can be seen that the profile that corresponds to the wider ROI is less noisy. Therefore, as long as the edges run approximately perpendicular to the profile line, the ROI should be chosen as wide as possible. If the edges do not run perpendicular to the profile line, the width of the ROI must be chosen smaller to allow the detection of edges. Note that in this case, the profile will contain more noise. Consequently, the edges will be determined less accurately (see appendix A on page 37).

Figure 4: Profiles (thick lines) and derivatives (thin lines) of the profiles for two ROIs with different width, with and without smoothing of the profiles (the orientation of the ROIs is from top left down to the right). Panels: (a) ROI of Width=2; (b) ROI of Width=20; (c) Width=2, no smoothing; (d) Width=20, no smoothing; (e) Width=2, Sigma=0.9; (f) Width=20, Sigma=0.9.

The profile is then smoothed with a Gaussian smoothing filter (see the thick lines in figure 4e and f), whose standard deviation is specified with the parameter Sigma of the measuring operators (e.g., measure_pairs). Now, the first derivative of the smoothed profile is calculated (thin lines in figure 4e and f). Note that the amplitude of the derivative is scaled with a factor √2π · Sigma. The subpixel precise positions of all local extrema of the first derivative are edge candidates. In figure 5a on page 8, these edge candidates are indicated by vectors pointing from the respective position of the derivative (thin line) to the smoothed gray value profile (thick line). Only those edge candidates that have an absolute value of the first derivative larger than a given Threshold (another parameter of the measuring operators) are considered as edges in the image (see figure 5b on page 8).

For each edge, the position of its intersection with the profile line is returned. Figure 6 shows the detected edges, displayed by straight lines with a length similar to the width of the ROI.


Figure 5: Positions of (a) edge candidates and (b) edges along the profile. The profile was derived from the 20 pixel wide ROI and smoothed with Sigma = 0.9.

Figure 6: Edges detected from the 20 pixel wide ROI. The profile was smoothed with Sigma = 0.9 and thresholded with a Threshold of 12.

Note that the approach described above for the extraction of 1D edges from a profile differs significantly from the 2D edge extraction performed, e.g., with the operator edges_sub_pix (compare figure 7 on page 9). In the case of the 1D edge extraction, only the positions of the intersections of the edges with the profile line are determined as described above, whereas the 2D edge extraction returns the complete edges as contours. If the edges are straight and run approximately perpendicular to the profile line, the results of the 1D edge extraction are similar to the intersection of the 2D edges with the profile line. The more the edges are curved or deviate from being perpendicular to the profile line, the more the results of the two approaches will differ. An example where the results of the two approaches are completely different is given in section 3.1 on page 11.
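For comparison, a 2D subpixel edge extraction like the one shown in figure 7 could be obtained with a call such as the following sketch; the filter name and threshold values are example settings and are not taken from the program:

* Sketch: extract 2D edges as XLD contours with the 'canny' filter and example thresholds.
edges_sub_pix (Image, Edges, 'canny', 1.0, 20, 40)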

Keep in mind that measure objects are designed to measure the position of straight edges that run approximately perpendicular to the profile line. This means that measure objects are not suited to measure the position of, e.g., curved edges. In those cases, the measured position will typically not coincide with the intersection of the gray value edge with the profile line. An example of such a case is given in figure 8. Here, the edges are curved, which leads to wrong measurement results. This effect will be reduced if the width of the ROI is chosen smaller.

Figure 7: Edges detected in 2D with the operator edges_sub_pix.

Figure 8: An example of an incorrect use of the measure object. The ROI is oriented from left to right.

2.2 Creating a Measure Object

Measure objects can be created with two different shapes: rotated rectangles and circular arcs. They are created with the operators gen_measure_rectangle2 or gen_measure_arc.

To create a rectangular measure object, you use the operator gen_measure_rectangle2:

gen_measure_rectangle2 (Row, Column, Phi, Length1, Length2, Width, Height, Interpolation, MeasureHandle)

A rotated rectangle is defined by its center (Row, Column), its orientation Phi, and its half edge lengths Length1 and Length2 (see figure 9). The parameter Phi is given in radians in the mathematically positive sense. All the other parameters are given in pixels.

Figure 9: Rectangular region of interest (ROI).

To create a circular measure object, you use the operator gen_measure_arc:

gen_measure_arc (CenterRow, CenterCol, Radius, AngleStart, AngleExtent, AnnulusRadius, Width, Height, Interpolation, MeasureHandle)

The circular arc is defined by the following parameters (see figure 10 on page 11): The position and size are defined by the center (CenterRow, CenterColumn) and the radius (Radius). The segment of the circle is defined by its start angle (AngleStart) and the angular extent (AngleExtent). The width of the circular ROI is defined by the annulus radius (AnnulusRadius). Again, all parameters are given in pixels, except for the angles (AngleStart and AngleExtent), which are given in radians in the mathematically positive sense.

The first parameters of both operators for the creation of measure objects define the position, size, and orientation of the respective ROI. The parameters Width and Height are the size of the images on which the measure object will be applied. The parameter Interpolation sets the interpolation mode (see section 2.1 on page 6).

As a result, the operators return a handle for the newly created measure object (MeasureHandle), which can then be used to specify the measure object, e.g., in calls to the operator measure_pos. Note that if you use HALCON's COM or C++ interface and call the operator via the classes HMeasureX or HMeasure, no handle is returned because the instance of the class itself acts as your handle.

3 Using the Measure Object

Depending on your application, you can measure the position of individual edges and the distance between them (see section 3.1). If the object to inspect has two edges (like the edge pairs in section 1 on page 4), you can let HALCON group the edges to edge pairs (see section 3.2 on page 13).


Figure 10: Circular region of interest (ROI).

Furthermore, the position of points on the profile that have a particular gray value can be determined (section 3.3 on page 15). It is also possible to access the gray value profile itself (section 3.4 on page 16).

Often, it is necessary to determine the dimensions of an object in world coordinates, e.g., in µm. The transformation of the results of a measure object into world coordinates is described in section 3.5 on page 20.

Section 3.7 on page 21 describes how to change the position and orientation of measure objects. Some tips and tricks for the use of measure objects are given in section 3.8 on page 21.

3.1 The Position of Edges

You measure the position of individual edges with the operator measure_pos. The edges are determined with the approach described in section 2.1 on page 6. With this approach, it is also possible to extract edges that appear only if the image is smoothed in a direction perpendicular to the ROI.

In the example program hdevelop\measure_ic_leads.dev, we use this operator to measure the length of the leads of the IC that is displayed in figure 11a. At first glance, it is not clear how to solve this task because the individual leads are not enclosed by two edges. However, we can exploit the averaging perpendicular to the profile line: If the ROI encloses multiple leads, as depicted in figure 11a, the averaged gray value of the leads is darker than the background, but lighter than the body of the IC (see figure 11b and figure 11c).

Let's take a look at the corresponding code. First, a measure object is created for the ROI at the top of the image. The ROI must be wide enough to contain several leads.


Figure 11: (a) Image of an IC with the ROIs that are used for the determination of the lead length; (b) and (c) show the profile (thick line) and its derivative (thin line) for (b) the ROI at the top and (c) the ROI at the bottom of the image in (a). The ROIs are oriented from the top to the bottom of the image.

gen_measure_rectangle2 (Row, Column, Phi, Length1, Length2, Width, Height, Interpolation, MeasureHandle)

Now, all edges that are perpendicular to the ROI are extracted:

measure_pos (Image, MeasureHandle, Sigma, Threshold, Transition, Select, RowEdge, ColumnEdge, Amplitude, Distance)

The resulting edges enclose the leads (see figure 12). Their distance is returned in the parameter Distance. In the example, only two edges are returned. Therefore, the parameter Distance contains only one element, which is the length of the leads at the top of the image. To determine the length of the leads at the bottom of the image, the same measurement is carried out within the second ROI.

The parameter Sigma determines the degree of smoothing of the profile along the profile line. The parameter Threshold defines the threshold that is applied to the derivative of the profile for the selection of edges. Both parameters are described in section 2.1 on page 6.

The parameter Transition can be used to select edges with a particular transition. If Transition is set to 'negative', only edges at which the gray values change from light to dark will be returned. In the current example, the two edges in the ROI at the top of the image have a negative transition. If Transition is set to 'positive', only edges at which the gray values change from dark to light (e.g., the two edges in the ROI at the bottom of the image) will be returned. To determine all edges, Transition must be set to 'all' in this application.

The parameter Select can be used to restrict the result to the 'first' or 'last' edge. In the current example, Select must be set to 'all'.

Figure 12: The result of the determination of the lead length.

Besides the distance of the edges, measure_pos also returns their position in the output parameters RowEdge and ColumnEdge as well as their amplitude (Amplitude). In the example, the position is used to visualize the edges:

disp_line (WindowHandle, RowEdge, ColumnEdge-Length2, RowEdge, ColumnEdge+Length2)

3.2 The Position of Edge Pairs

As already noted in section 1 on page 4, whenever the object to measure is enclosed by two edges, you can use the operator measure_pairs, which groups edges to edge pairs. In the following example, we use this operator to determine the width of the leads from the image displayed in figure 11a on page 12.

First, a suitable ROI must be defined. Because we want to determine the edges of the individual leads, the ROI must be oriented perpendicular to the leads (see figure 13a). Here, it is oriented from left to right. With this ROI, a measure object is created as described in section 2 on page 5.

Now, all edge pairs that enclose dark image regions are extracted (hdevelop\measure_ic_leads.dev):


measure_pairs (Image, MeasureHandle, Sigma, Threshold, Transition, Select, RowEdgeFirst, ColumnEdgeFirst, AmplitudeFirst, RowEdgeSecond, ColumnEdgeSecond, AmplitudeSecond, IntraDistance, InterDistance)

The resulting edges are displayed in figure 13b, which shows only a part of the image.

Figure 13: (a) The ROI for and (b) a part of the result of the determination of the lead width.

The parameters of measure_pairs are partly identical to those of measure_pos (see section 3.1 on page 11): Sigma determines the degree of smoothing of the profile. The parameter Threshold defines the threshold that is applied to the derivative of the profile for the selection of edges. Both parameters are described in section 2.1 on page 6.

The parameter Transition can be used to select edge pairs with a particular transition. If Transition is set to 'negative', as in the example, only edge pairs where the first edge has a negative transition and the second edge has a positive transition, i.e., edge pairs that enclose a dark image region, will be returned. If Transition is set to 'positive', the behavior is exactly opposite, i.e., edge pairs that enclose a bright image region will be returned. If Transition is set to 'all', the transition is determined by the first edge, i.e., depending on the positioning of the measure object, edge pairs with a light-dark-light transition or edge pairs with a dark-light-dark transition are returned. This is suited, e.g., to measure objects with different brightness relative to the background.

If more than one consecutive edge with the same transition is found, only the first one will be used as a pair element. This behavior may cause problems in applications in which the Threshold cannot be selected high enough to suppress consecutive edges of the same transition. This problem can be demonstrated, e.g., by defining the ROI in the example program hdevelop\measure_switch.dev (see section 1 on page 4) the other way round, i.e., from bottom right to top left. Then, the first edge with a negative transition corresponds to the shadow of the pin, and the edge that corresponds to the pin itself is not used as a pair element (see figure 14a).


For these applications, an advanced pairing mode exists that selects only the strongest edges of a sequence of consecutive edges of the same transition. This mode is selected by appending '_strongest' to any of the above modes for Transition, e.g., 'negative_strongest'. Figure 14b shows the result of setting Transition to 'negative_strongest'.

Figure 14: The results of measuring the width of and the distance between the pins of a switch if the ROI is defined from bottom right to top left: (a) Transition = 'negative', (b) Transition = 'negative_strongest'.

The parameter Select can be used to restrict the result to the 'first' or 'last' edge pair. To determine all edge pairs, Select must be set to 'all'.

The operator measure_pairs returns the coordinates of the edge positions separately for the first and the second edge of each edge pair. The output parameters RowEdgeFirst and ColumnEdgeFirst contain the coordinates of the first edges of the edge pairs, and the parameters RowEdgeSecond and ColumnEdgeSecond contain the coordinates of the second edges. With this, the center of the pairs can easily be determined by calculating the mean of the respective first and second edge positions:

RowPairCenter := (RowEdgeFirst+RowEdgeSecond)/2.0
ColumnPairCenter := (ColumnEdgeFirst+ColumnEdgeSecond)/2.0

The output parameters IntraDistance and InterDistance contain the distance between the two edges of each edge pair and the distance between subsequent edge pairs.

3.3 The Position of Points With a Particular Gray Value

With the operator measure_thresh, it is also possible to determine the subpixel precise positions where the gray value profile has a given value. This 1D thresholding works like the 2D subpixel precise thresholding operator threshold_sub_pix.

This operator can be useful to extract particular edges in applications where the lighting conditions can be controlled very accurately. Note that the measured position depends heavily on the brightness of the image, i.e., the measured position will vary if the brightness level of the image changes, e.g., due to illumination changes.
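A call might look like the following sketch; the threshold value of 128 and the output variable names are assumptions chosen for illustration only:

* Sketch: return all subpixel positions where the smoothed profile crosses the gray value 128.
measure_thresh (Image, MeasureHandle, Sigma, 128, 'all', RowThresh, ColumnThresh, Distance)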


3.4 The Gray Projection

For applications that require the determination of special 1D features, it is possible to access the gray value profile directly via the operator measure_projection. It returns the original, i.e., unsmoothed, gray value profile as a tuple in the output parameter GrayValues. On the profile, points that satisfy a particular condition can be determined. Then, the pixel coordinates of these points can be calculated.

In the example program hdevelop\measure_caliper.dev, the distances between the pitch lines of a caliper are determined. The image used (figure 15a) shows the pitch lines as approximately one pixel wide lines (see, e.g., figure 15b). To determine the line positions in the gray value profile, a 1D line extraction is implemented based on the gray value profile.

Figure 15: (a) The image of a caliper; (b) detail of (a).

First, the measure object is created. It is positioned such that only the longest pitch lines, i.e., those which indicate full centimeters, run through the measuring ROI.

gen_measure_rectangle2 (Row, Column, Phi, Length1, Length2, Width, Height, 'bilinear', MeasureHandle)

Then, the profile is determined with the operator measure_projection, which returns the profile in the output parameter GrayValues:

measure_projection (Image, MeasureHandle, GrayValues)

To reduce the noise, the profile must be smoothed because the operator measure_projection returns the original, i.e., unsmoothed, profile.


Sigma := 0.3
create_funct_1d_array (GrayValues, Function)
smooth_funct_1d_gauss (Function, Sigma, SmoothedFunction)

The smoothed profile is displayed in figure 16.

Figure 16: The smoothed profile.

To detect line points on the profile, it is sufficient to determine the points where the first derivative of the smoothed profile vanishes. The first derivative of the smoothed profile is determined with the operator derivate_funct_1d with the control parameter Mode set to 'first'; the points where the first derivative vanishes are determined with the operator zero_crossings_funct_1d:

derivate_funct_1d (SmoothedFunction, 'first', FirstDerivative)
zero_crossings_funct_1d (FirstDerivative, ZeroCrossings)

The first derivative of the smoothed profile is displayed in figure 17a.

Figure 17: The (a) first and (b) second derivative of the smoothed profile.

Because of noise, the first derivative shows many more zero crossings than just those in the middle of each line, i.e., between the large negative and positive spikes in figure 17a. We can select the desired zero crossings by looking at the magnitude of the second derivative: Bright lines on a dark background will have a negative second derivative, while dark lines on a bright background will have a positive second derivative. The second derivative of the smoothed profile is determined with the operator derivate_funct_1d with the control parameter Mode set to 'second':

derivate_funct_1d (SmoothedFunction, 'second', SecondDerivative)

The second derivative of the smoothed profile is displayed in figure 17b on page 17.

In our case, the pitch lines appear as dark lines on a bright background. Therefore, we have to select those line positions where the second derivative has a large positive value: For each line position, i.e., zero crossing of the first derivative, the value of the second derivative is determined with the operator get_y_value_funct_1d. If this value is larger than a user-defined threshold, the line position is stored in a tuple that holds the positions of all salient lines.

MinimumMagnitudeOfSecondDerivative := 8
PositionOfSalientLine := []
for i := 0 to |ZeroCrossings|-1 by 1
    get_y_value_funct_1d (SecondDerivative, ZeroCrossings[i], 'constant', Y)
    if (Y > MinimumMagnitudeOfSecondDerivative)
        PositionOfSalientLine := [PositionOfSalientLine,ZeroCrossings[i]]
    endif
endfor

Finally, the positions of the lines on the profile must be transformed into pixel coordinates because measure_projection returns just the gray value profile along the profile line. Note that internally, the coordinates of the center of the measuring ROI are rounded and the length of the measuring ROI is set to the largest integer value that is smaller than or equal to the given value.

The pixel coordinates of the start point of the profile line can be determined as follows (see figure 18):

RowStart = ⌊Row + 0.5⌋ + ⌊Length1⌋ · sin(Phi)
ColStart = ⌊Column + 0.5⌋ − ⌊Length1⌋ · cos(Phi)

This can be realized by the following lines of code:

RowStart := floor(Row+0.5)+floor(Length1)*sin(Phi)
ColStart := floor(Column+0.5)-floor(Length1)*cos(Phi)

Now, the pixel coordinates of the positions of the salient lines on the profile can be determined as follows:

RowLine = RowStart − PositionOfSalientLine · sin(Phi)
ColLine = ColStart + PositionOfSalientLine · cos(Phi)

In HDevelop, this can be expressed by the following lines of code:

RowLine := RowStart-PositionOfSalientLine*sin(Phi)
ColLine := ColStart+PositionOfSalientLine*cos(Phi)

Figure 19 shows the positions of the determined lines.


Figure 18: The measuring ROI with the position of a point on the profile.

Figure 19: (a) The determined lines; (b) detail of (a).

In the following section, the pixel coordinates of the pitch lines and the distances between them are transformed into world coordinates.

If you are using a circular measure object, which has been created with the operator gen_measure_arc, the following formulae must be used to determine the pixel coordinates RowPoint and ColPoint of a point on the profile:

RowPoint = ⌊CenterRow + 0.5⌋ − Radius · sin(AngleStart + Position)
ColPoint = ⌊CenterColumn + 0.5⌋ + Radius · cos(AngleStart + Position)

Here, Position is the position of the point on the profile, and CenterRow, CenterColumn, Radius, and AngleStart are the parameters used for the creation of the measure object with the operator gen_measure_arc.
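In HDevelop, these formulae could be expressed analogously to the code above, e.g., as in the following sketch; Position is assumed to hold the profile positions of interest, and the variable names mirror the gen_measure_arc call used earlier:

* Sketch: transcription of the two formulae for a circular measure object.
RowPoint := floor(CenterRow+0.5)-Radius*sin(AngleStart+Position)
ColPoint := floor(CenterCol+0.5)+Radius*cos(AngleStart+Position)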

3.5 Measuring in World Coordinates

In many applications, the dimensions of the measured object must be determined in world coordinates, e.g., in µm. This can be achieved either by transforming the original measurement results into world coordinates (see Application Note on 3D Machine Vision, section 3.2 on page 44) or by first rectifying the image and applying the measurement in the transformed image (see Application Note on 3D Machine Vision, section 3.3 on page 49). As a prerequisite, the camera must be calibrated (Application Note on 3D Machine Vision, section 3.1 on page 29).

In the example program hdevelop\measure_caliper.dev, first the positions of the pitch lines of a caliper are determined (see section 3.4 on page 16). These positions are now transformed into world coordinates with the operator image_points_to_world_plane:

image_points_to_world_plane (CamParam, WorldPose, RowLine, ColLine, 'mm', X, Y)

The parameter CamParam contains the camera parameters, and the parameter WorldPose the position and orientation of the world plane in which the measurement takes place. Both parameters can be determined during the camera calibration (Application Note on 3D Machine Vision, section 3.1 on page 29). In the current example program, they are assumed to be known.

Now, the distance between the pitch lines can be calculated with the operator distance_pp:

Num := |X|
distance_pp (X[0:Num-2], Y[0:Num-2], X[1:Num-1], Y[1:Num-1], Distance)

The resulting distances are displayed in figure 20.

3.6 Compensate for Radial Image Distortions

If the image suffers from heavy radial distortions, a correction of the image coordinates or a rectification of the image will be necessary.

A detailed description of how to correct the image coordinates can be found in the Application Note on 3D Machine Vision, section 3.2.6 on page 48. The description of how to rectify images can be found in the Application Note on 3D Machine Vision, section 3.3.2 on page 55.


Figure 20: The distances between the pitch lines in world coordinates.

3.7 Changing the Position and Orientation of the Measure Object<br />

Often, the position of the objects to be measured varies from image to image. In this case, the ROI of the<br />

measure must be adapted accordingly. This process is also called alignment.<br />

The position and orientation of the object to be measured can be determined, e.g., with shape-based<br />

matching. For a detailed description of shape-based matching, please refer to the <strong>Application</strong> <strong>Note</strong><br />

on Shape-Based Matching. This manual also contains an example for aligning a measure ROI (see<br />

<strong>Application</strong> <strong>Note</strong> on Shape-Based Matching, section 4.3.4 on page 36).<br />

If only the position of the measure ROI must be changed, the operator translate_measure can be<br />

used. If also the orientation must be adapted as in the above mentioned example in the <strong>Application</strong> <strong>Note</strong><br />

on Shape-Based Matching, it is necessary to transform the parameters that define the measure ROI and<br />

to create a new measure object with these parameters.<br />
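Both cases can be sketched as follows (RowNew, ColumnNew, and PhiNew are placeholders for the position and orientation obtained, e.g., from shape-based matching; the remaining parameters are those used when the measure object was created):

* position only: move the measure ROI to the new reference point
translate_measure (MeasureHandle, RowNew, ColumnNew)
* position and orientation: create a new measure object with the transformed parameters
close_measure (MeasureHandle)
gen_measure_rectangle2 (RowNew, ColumnNew, PhiNew, Length1, Length2, Width, Height, 'nearest_neighbor', MeasureHandle)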

3.8 Tips and Tricks<br />

3.8.1 Distances for Circular ROIs<br />

There are multiple representations for the distance between two edges that are determined with a measure<br />

object that is based on a circular ROI (see figure 21):<br />




• arc length,<br />

• linear distance, and<br />

• angular distance.<br />

(Sketch: circular ROI with Start, End, Center (CenterRow, CenterColumn), Radius, the first and second edge, and the arc length, linear distance, and angular distance between them.)

Figure 21: Distances for circular ROIs.<br />

Measure objects based on a circular ROI return the distances as arc lengths.<br />

The example program hdevelop\measure_ring.dev shows that it is very easy to transform the measurements<br />

into another representation. In this example, the size of the cogs of a cogwheel must be<br />

measured (see figure 22).<br />

First, a measure object is created based on a circular ROI:<br />

gen_measure_arc (CenterRow, CenterCol, Radius, AngleStart,<br />

AngleExtent, AnnulusRadius, Width, Height,<br />

Interpolation, MeasureHandle)<br />

Then, the edges of the cogs are determined with the operator measure_pairs:<br />

measure_pairs (Image, MeasureHandle, Sigma, Threshold, Transition,<br />

Select, RowEdgeFirst, ColumnEdgeFirst,<br />

AmplitudeFirst, RowEdgeSecond, ColumnEdgeSecond,<br />

AmplitudeSecond, IntraDistance, InterDistance)<br />

The size of the cogs, i.e., the distance between the edges of the edge pairs, is returned in the output

parameter ’IntraDistance’ as arc length.<br />

The linear distance can be calculated from the coordinates of the edges with the operator distance_pp:



Figure 22: The size of the cogs of the ring in the upper left corner of the image given in different representations.<br />

distance_pp (RowEdgeFirst, ColumnEdgeFirst, RowEdgeSecond,<br />

ColumnEdgeSecond, LinearDistance)<br />

The angular distance can be derived from the arc lengths as follows:<br />

AngularDistance = Arc length / Radius

which is implemented in the HDevelop example program as follows:<br />

AngularDistance := deg(IntraDistance/Radius)<br />

<strong>Note</strong> that here the angular distance is determined in degrees.<br />

3.8.2 ROIs that lie partially outside the image<br />

Gray value information exists only for positions on the profile line where at least a part of the respective<br />

line of projection (see section 2.1 on page 6) lies inside the image. If parts of an ROI lie outside the<br />

image, the edges inside the image and the distances between them will still be determined correctly.<br />

The example presented in figure 23 demonstrates such a case. Here, the edges of a Siemens star must be<br />

determined with a measure object centered at the center of the Siemens star. In figure 23a, the image as<br />

well as the outline of the ROI are shown. The upper part of the ROI lies outside the image. The distances<br />

between the edges as returned by the measure object are displayed in figure 23b.<br />





Figure 23: Image of a Siemens star: (a) The circular ROI lies partially outside the image; (b) the measured<br />

distances between the edges.<br />

<strong>Note</strong> that at positions where no gray value information is available, the profile contains zeros. Figure 24<br />

shows the gray value profile determined with the operator measure_projection from the measure<br />

object that was created with the above described circular ROI (see figure 23a).<br />

(Plot: gray value over the position on the profile line.)

Figure 24: The gray value profile from the circular ROI that lies partially outside the image.


3.8.3 Accuracy vs. Speed<br />


By default, the averaging of gray values (see section 2.1 on page 6) is performed using fixed point<br />

arithmetic, to achieve maximum speed. You can trade some of the speed against accuracy by setting the<br />

system parameter ’int_zooming’ to ’false’ with the operator set_system. Then the internal calculations<br />

are performed using floating point arithmetic. <strong>Note</strong> that the gain in accuracy is minimal.<br />
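For example (a minimal sketch):

* switch the internal gray value averaging to floating point arithmetic
set_system ('int_zooming', 'false')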

See also the description of the parameter Interpolation in section 2.1 on page 6.

3.8.4 Image Type<br />

Measure objects can only be applied to images of type ’byte’ and ’uint2’.<br />

<strong>Note</strong> that if a multi-channel image is passed to the measuring operators (e.g., measure_pos), only the<br />

first channel of the image will be used.<br />
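If a different channel is to be measured, it can be selected explicitly before calling the measuring operator, e.g., as in the following sketch (the choice of channel 2 is arbitrary):

* measure on the second channel instead of the default first channel
access_channel (MultiChannelImage, Channel2, 2)
measure_pos (Channel2, MeasureHandle, Sigma, Threshold, 'all', 'all', RowEdge, ColumnEdge, Amplitude, Distance)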

3.9 Destroying the Measure Object<br />

If the measure object is no longer needed in the program, it should be destroyed with the operator

close_measure:<br />

close_measure (MeasureHandle)<br />

This deletes the measure object and frees all of the associated memory.<br />

4 Advanced Measuring: The Fuzzy Measure Object<br />

Some applications require that the selection of the edges or edge pairs can be controlled in more detail.<br />

With the standard measure object (see above), edges or edge pairs can be selected only based on their<br />

contrast and transition. With a fuzzy measure object it is possible to select edges also based on their<br />

contrast and position. The selection of edge pairs can be additionally controlled based on their position<br />

and size, as well as based on the gray value between the two edges of the pair. The name fuzzy measure<br />

object does not mean that the measurements are “fuzzy”; it comes from the so-called “fuzzy membership<br />

functions” that control the selection of edges (see section 4.2.2 on page 30).<br />

In the following section, a first example demonstrates how to create and use a fuzzy measure object. The<br />

features that can be used to control the selection of edges and edge pairs are described in section 4.2 on<br />

page 27 as well as the fuzzy membership functions, which are used to formalize the desired constraints.<br />

Finally, section 4.3 on page 32 shows how to create a fuzzy measure object from a standard measure<br />

object and one or more fuzzy membership functions.<br />

4.1 A First Example<br />

The fuzzy measure object is an extension of the standard measure object. It allows you to control the selection

of edges by specifying weighting functions.<br />




The example program hdevelop\fuzzy_measure_switch.dev shows a possible usage of the fuzzy<br />

measure object. Again, the width of and the distance between the pins of a switch must be measured. In<br />

this case, there are reflections on the middle pin (see figure 25), which will lead to wrong results if the<br />

standard measure object is used (figure 26a).<br />

Figure 25: Image of a switch with reflections on the middle pin.<br />


Figure 26: The results of measuring the width of and the distance between the pins of the switch with (a)<br />

a standard measure object and (b) a fuzzy measure object .<br />

The selection of edges can be improved if additional information is available. In the example, the pins<br />

are approximately 9 pixels wide. This information can be translated into a fuzzy membership function.<br />

Figure 27 shows this fuzzy membership function. It returns 1.0 for the desired pair size and 0.0 if the<br />

size of the pairs deviates by more than two pixels from the desired pair size. The intermediate values are

interpolated linearly (see figure 27). In <strong>HALCON</strong>, this fuzzy membership function is created with the<br />

operator create_funct_1d_pairs:<br />

create_funct_1d_pairs ([7,9,11], [0.0,1.0,0.0], SizeFunction)


(Plot: membership value over the feature value, i.e., the pair width in pixels.)

Figure 27: The fuzzy membership function that is used in the current example to describe the constraint<br />

on the pair width.<br />

The standard measure object is transformed into a fuzzy measure object with the operator<br />

set_fuzzy_measure. As parameters, you pass the handle of the standard measure object, the fuzzy<br />

membership function, and the name of the feature to which the fuzzy membership function applies.<br />

SetType := ’size’<br />

set_fuzzy_measure (MeasureHandle, SetType, SizeFunction)<br />

The resulting fuzzy measure object can be used to extract edge pairs that are approximately 9 pixels wide,<br />

with the term “approximately” being defined by the fuzzy membership function.<br />

Then, the fuzzy measure object is applied to extract the desired edge pairs:<br />

Sigma := 0.9<br />

AmpThresh := 12<br />

FuzzyThresh := 0.5<br />

Transition := ’negative’<br />

Select := ’all’<br />

fuzzy_measure_pairs (Image, MeasureHandle, Sigma, AmpThresh, FuzzyThresh,<br />

Transition, RowEdgeFirst, ColumnEdgeFirst,<br />

AmplitudeFirst, RowEdgeSecond, ColumnEdgeSecond,<br />

AmplitudeSecond, RowEdgeCenter, ColumnEdgeCenter,<br />

FuzzyScore, IntraDistance, InterDistance)<br />

Internally, the specified fuzzy membership function is used to weight all the possible edge pairs. The<br />

width of each possible edge pair is determined and transformed into a fuzzy value. If this fuzzy value<br />

lies above the user defined threshold FuzzyThresh, the edge pair is accepted, otherwise it is rejected.<br />

The result is displayed in figure 26b on page 26. With this, the width of all pins is measured correctly.<br />

4.2 Controlling the Selection of Edges and Edge Pairs<br />

The selection of edges and edge pairs can be controlled based on various features. Section 4.2.1 gives
an overview of the possible features. The constraints on the feature values are defined with fuzzy
membership functions. A short introduction to the concept of fuzzy logic and how fuzzy membership
functions are created in HALCON is given in section 4.2.2 on page 30.




4.2.1 Features that Can Be Used to Control the Selection of Edges and Edge Pairs<br />

In the example program hdevelop\fuzzy_measure_switch.dev described above, the selection of<br />

edge pairs has been controlled by the approximately known pair size. There are further features according<br />

to which the edges or edge pairs can be selected. These features are grouped into so-called set types<br />

(parameter SetType of the operators set_fuzzy_measure and set_fuzzy_measure_norm_pair).

The set types that describe positions or sizes can be specified either with fuzzy membership functions<br />

that directly describe the constraint on the feature value like the one displayed in figure 27 on page 27<br />

(see section 4.2.2 on page 30) or with normalized fuzzy membership functions, i.e., fuzzy membership<br />

functions that are defined relative to the desired pair width (section 4.3.2 on page 32). The first method<br />

uses the operator set_fuzzy_measure to specify the fuzzy membership function. The second method<br />

uses the operator set_fuzzy_measure_norm_pair, which enables a generalized usage of the fuzzy<br />

membership function because the pair size is set independently of the fuzzy membership function.<br />

The set types that refer to features of edge pairs will only take effect if edge pairs are determined, i.e., if<br />

the operators fuzzy_measure_pairs or fuzzy_measure_pairing are used.<br />

In the following, a list of the possible set types is given. It is possible to use multiple set types. Details<br />

can be found in section 4.3.3 on page 34. <strong>Note</strong> that the subtypes of each set are mutually exclusive, i.e.,<br />

only one subtype of each set can be specified for a fuzzy measure object.<br />

For each subtype, it is indicated whether it can be specified with a direct and/or a normalized fuzzy membership function.

Set Type contrast:
• ’contrast’ (direct only): Evaluates the amplitude of the edges.

Set Type position:
• ’position’ (direct and normalized): Evaluates the signed distance of the edges to the reference point of the fuzzy measure object. The reference point is located at the start of the profile line (see figure 9 on page 10 and figure 10 on page 11).
• ’position_center’ (direct and normalized): Like ’position’ with the reference point located at the center of the profile line.
• ’position_end’ (direct and normalized): Like ’position’ with the reference point located at the end of the profile line.
• ’position_first_edge’ (direct and normalized): Like ’position’ with the reference point located at the first edge.
• ’position_last_edge’ (direct and normalized): Like ’position’ with the reference point located at the last edge.

Set Type position_pair (for edge pairs only):
• ’position_pair’ (direct and normalized): Evaluates the signed distance of the edge pairs to the reference point of the fuzzy measure object. The position of an edge pair is defined as the center of the two edges. The reference point is located at the start of the profile line (see figure 9 on page 10 and figure 10 on page 11).
• ’position_pair_center’ (direct and normalized): Like ’position_pair’ with the reference point located at the center of the profile line.
• ’position_pair_end’ (direct and normalized): Like ’position_pair’ with the reference point located at the end of the profile line.
• ’position_first_pair’ (direct and normalized): Like ’position_pair’ with the reference point located at the position of the first edge pair.
• ’position_last_pair’ (direct and normalized): Like ’position_pair’ with the reference point located at the position of the last edge pair.

Set Type size (for edge pairs only):
• ’size’ (direct and normalized): Evaluates the size of the edge pairs, i.e., the distance between the two edges.
• ’size_diff’ (normalized only): Evaluates the signed difference between the desired PairSize and the actual size of the edge pairs.
• ’size_abs_diff’ (normalized only): Evaluates the absolute difference between the desired PairSize and the actual size of the edge pairs.

Set Type gray (for edge pairs only):
• ’gray’ (direct only): Evaluates the mean gray value between the two edges of edge pairs.

<strong>Note</strong> that for the extraction of edge pairs all possible pairs of edges are internally analyzed according<br />

to the specified fuzzy membership functions. If only partial knowledge about the edge pairs to be determined<br />

is specified with the fuzzy membership functions, this will possibly lead to unintuitive results.<br />

For example, if the position of the edge pairs is specified but not their size, some small edge pairs will<br />

possibly be omitted because of a large edge pair that consists of the first edge of the first small edge pair<br />

and the second edge of the last small edge pair and that has a membership value equal to or higher than<br />

that of the small edge pairs (see appendix B.3.1 on page 50). Therefore, all the available knowledge<br />


about the edge pairs to be extracted should be specified with fuzzy membership functions. In the above<br />

example, the definition of the desired edge pairs can be made more clear by also specifying the size of<br />

the edge pairs (see appendix B.6 on page 61).<br />

4.2.2 Defining the Constraints on the Feature Values with Fuzzy Membership Functions<br />

The constraints for the selection of edges and edge pairs are defined with fuzzy membership functions.<br />

These functions transform the feature values into membership values. The decision whether an edge or<br />

an edge pair is accepted is made based on these membership values. For this so-called defuzzification,

the membership values are compared with a given threshold.<br />

This section describes the concept of fuzzy membership functions and how they are created in <strong>HALCON</strong><br />

as well as the defuzzification.

Fuzzy Membership Functions<br />

Fuzzy logic handles the concept of partial truth, i.e., truth values between “completely true” and “completely<br />

false”. A fuzzy set is a set whose elements have degrees of membership. The degree of membership<br />

is defined by the feature value of the element and by the fuzzy membership function, which<br />

transforms the feature value into a membership value, i.e., the degree of membership.<br />

With this, fuzzy logic provides a framework for the representation of knowledge that is afflicted with<br />

uncertainties or can only be described vaguely. This applies often to knowledge that is expressed in<br />

natural language, e.g., “the distance between the two edges of a pair is approximately 9 pixel,” or “the<br />

edges are strong and they are located at the beginning of the profile line.”<br />

For example, a fuzzy membership function for the requirement that the edges must be strong could look<br />

like the one displayed in figure 28a. Here, any edge that has an amplitude of less than 50 will not be part<br />

of the fuzzy set of strong edges. Edges with an amplitude higher than 100 are full members of the fuzzy<br />

set, i.e., they have a membership value of 1.0. Edges with amplitudes between 50 and 100 are partial

members of the fuzzy set, e.g., an edge with an amplitude of 75 will have a membership value of 0.5.<br />

In general, a fuzzy membership function is a mathematical function that defines the degree of an element’s<br />

membership in a fuzzy set.<br />

In <strong>HALCON</strong>, fuzzy membership functions are defined as piecewise linear functions by at least two<br />

pairs of values, sorted in an ascending order by their x value. They are created with the operator create_funct_1d_pairs:<br />

create_funct_1d_pairs (XValues, YValues, FuzzyMembershipFunction)<br />

The x values represent the feature of the edges or edge pairs and must lie within the parameter space of<br />

the set type, i.e., in case of ’contrast’ and ’gray’ features and, e.g., byte images within the range 0.0 ≤ x<br />

≤ 255.0. For ’size’, x has to satisfy 0.0 ≤ x, whereas for ’position’ and ’position_pair’, x can be any real<br />

number. The y values of the fuzzy membership function represent the resulting membership value for the<br />

corresponding feature value and have to satisfy 0.0 ≤ y ≤ 1.0. Outside of the function’s interval, defined<br />

by the smallest and the largest specified x value, the y values of the interval borders are continued with<br />

constant values equal to the closest defined y value.<br />

Therefore, the fuzzy membership function displayed in figure 28a is defined by the following code,<br />

which specifies the coordinates of the two control points of the fuzzy membership function.


(Plots: membership value over the feature value for ’contrast’ (a) and for ’position’ (b).)

Figure 28: Examples for fuzzy membership functions. They can, e.g., be used to select (a) strong edges<br />

or (b) edges at the beginning of the profile line. The dots indicate the points to pass to<br />

create_funct_1d_pairs .<br />

create_funct_1d_pairs ([50,100], [0,1], FuzzyMembershipFunction1)<br />

A fuzzy membership function describing the condition that the edge must be at the beginning of the<br />

profile line could look like the one displayed in figure 28b (assuming that the profile line has a length of<br />

80 pixels). It can be defined as follows:<br />

Length := 80<br />

create_funct_1d_pairs ([Length/8.0, Length/2.0], [1,0],<br />

FuzzyMembershipFunction2)<br />

Further examples for fuzzy membership functions can be found in appendix B on page 42.

Defuzzification

Finally, to decide whether an edge or an edge pair meets the constraint, the membership value must be
“defuzzified”. This is done by comparing it with a user-defined fuzzy threshold. If the membership

value lies below this threshold, the edge (pair) will be rejected. Otherwise, it will be accepted.<br />

If, e.g., a fuzzy threshold of 0.5 is assumed as in figure 29, all edges that have a membership value of at<br />

least 0.5 will be accepted. For the fuzzy membership function displayed in figure 29a, these are all edges<br />

with a feature value of at least 75; for the fuzzy membership function of figure 29b, edges will only be<br />

accepted if their feature value is at most 25.<br />

The specification of the fuzzy membership function and the definition of the fuzzy threshold are separated<br />

for fuzzy measure objects. The fuzzy membership function is specified with the operator<br />

set_fuzzy_measure or set_fuzzy_measure_norm_pair, respectively, i.e., during the creation of<br />

the fuzzy measure object (see section 4.3). In contrast, the fuzzy threshold is defined when the fuzzy<br />

measure object is used with one of the operators fuzzy_measure_pos, fuzzy_measure_pairs, or<br />

fuzzy_measure_pairing. This separation between the specification of the fuzzy membership function<br />

and the definition of the fuzzy threshold allows a flexible use of the fuzzy measure object: Often, it<br />

will be sufficient to adjust only the fuzzy threshold to varying imaging conditions.<br />
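For example, only the value of FuzzyThresh changes between the following two calls (a sketch; the values for Sigma, AmpThresh, and the thresholds are purely illustrative):

* strict selection: only edges with a high membership value are returned
fuzzy_measure_pos (Image, MeasureHandle, 1.0, 10, 0.7, 'all', RowEdge, ColumnEdge, Amplitude, FuzzyScore, Distance)
* more tolerant selection with the same fuzzy membership functions
fuzzy_measure_pos (Image, MeasureHandle, 1.0, 10, 0.3, 'all', RowEdge, ColumnEdge, Amplitude, FuzzyScore, Distance)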


(Plots: membership value over the feature value for ’contrast’ (a) and for ’position’ (b).)

Figure 29: Examples for fuzzy membership functions (thick lines) together with a fuzzy threshold at 0.5<br />

(thin lines).<br />

4.3 Creating the Fuzzy Measure Object<br />

This section describes how to create a fuzzy measure object from an existing measure object. Fuzzy<br />

measure objects can be created by specifying a fuzzy membership function as described in section 4.3.1.<br />

Fuzzy measure objects can also be created with normalized fuzzy membership functions, i.e., fuzzy<br />

membership functions that are defined relative to the desired pair width (see section 4.3.2). Section 4.3.3<br />

on page 34 shows that it is even possible to specify multiple fuzzy membership functions for different<br />

set types of one fuzzy measure object.<br />

4.3.1 Specifying a Fuzzy Membership Function<br />

To transform a standard measure object into a fuzzy measure object, you specify a fuzzy membership<br />

function with the operator set_fuzzy_measure.<br />

For example, you can create a fuzzy measure object for the extraction of edge pairs that have a size of<br />

approximately 10 pixels with the following lines of code:<br />

create_funct_1d_pairs ([5,10,15], [0,1,0],<br />

FuzzyMembershipFunctionPairSize10)<br />

set_fuzzy_measure (MeasureHandle, ’size’,<br />

FuzzyMembershipFunctionPairSize10)<br />

The fuzzy membership function used in this example is displayed in figure 30.<br />

4.3.2 Specifying a Normalized Fuzzy Membership Function<br />

Fuzzy measure objects can also be created with normalized fuzzy membership functions, i.e., fuzzy<br />

membership functions that are defined relative to the desired pair width. For this, the operator<br />

set_fuzzy_measure_norm_pair must be used.


(Plot: membership value over the feature value.)

Figure 30: Fuzzy membership function, which can be used to extract edge pairs that are approximately<br />

10 pixels wide.<br />

To create a fuzzy measure object that is identical with the one created in the above example (see section<br />

4.3.1), the following code can be used:<br />

create_funct_1d_pairs ([0.5,1,1.5], [0,1,0],<br />

FuzzyMembershipFunctionPairSizeNormalized)<br />

PairSize := 10<br />

set_fuzzy_measure_norm_pair (MeasureHandle, PairSize, ’size’,<br />

FuzzyMembershipFunctionPairSizeNormalized)<br />

Here, the normalized fuzzy membership function is used, which is displayed in figure 31. Internally, the<br />

operator set_fuzzy_measure_norm_pair multiplies the x values of the normalized fuzzy membership<br />

function with the given pair size.<br />

(Plot: membership value over the normalized feature value.)

Figure 31: Normalized fuzzy membership function, which can be used for the extraction of edge pairs of<br />

various sizes, according to the defined pair size.<br />

One major advantage of using the normalized form of fuzzy membership functions is that the application<br />

can very easily be adapted to varying sizes of the object to be measured. If, e.g., larger objects must be<br />

measured or if a camera with a higher resolution is used, it will suffice to adapt the pair size.<br />


If, e.g., the above example for the extraction of 10 pixel wide edge pairs must be adapted to the extraction<br />

of 20 pixel wide edge pairs, simply the definition of the pair size must be changed:<br />

PairSize := 20<br />

set_fuzzy_measure_norm_pair (MeasureHandle, PairSize, ’size’,<br />

FuzzyMembershipFunctionPairSizeNormalized)<br />

This creates a fuzzy measure object that internally uses the fuzzy membership function displayed in figure 32.

(Plot: membership value over the feature value.)

Figure 32: Fuzzy membership function, which can be used for the extraction of approximately 20 pixel<br />

wide edge pairs.<br />

To achieve such a fuzzy measure object without normalized fuzzy membership functions, the definition<br />

of the fuzzy membership function itself must be adapted:<br />

create_funct_1d_pairs ([10,20,30], [0,1,0],<br />

FuzzyMembershipFunctionPairSize20)<br />

set_fuzzy_measure (MeasureHandle, ’size’,<br />

FuzzyMembershipFunctionPairSize20)<br />

4.3.3 Specifying Multiple Fuzzy Membership Functions<br />

You can also specify fuzzy membership functions for different set types by repeatedly calling the operator<br />

set_fuzzy_measure or set_fuzzy_measure_norm_pair, respectively:<br />

set_fuzzy_measure (MeasureHandle, ’contrast’, FuzzyMembershipFunction1)<br />

set_fuzzy_measure (MeasureHandle, ’position’, FuzzyMembershipFunction2)<br />

If more than one set is defined in this way, the individual membership values will be aggregated with a<br />

fuzzy “and” operator: The overall membership value is derived as the geometric mean (see appendix C on

page 62) of the individual membership values of each set.<br />

For example, the two set types ’contrast’ and ’position’ are used to achieve the selection of strong<br />

edges at the beginning of the profile line. For this, the two fuzzy membership functions as displayed in<br />

figure 28a and b on page 31 are used.



Assuming an edge at the position 20 along the profile line that has an amplitude of 80, the membership<br />

value is determined as follows. First, the membership values of the individual fuzzy sets are determined:<br />

membership value position=20 = 0.67<br />

membership value contrast=80 = 0.60<br />

Then, the geometric mean is calculated:<br />

membership value 20/80 = √(membership value position=20 · membership value contrast=80) = 0.63

This value is compared to the fuzzy threshold. If it lies above this threshold, the respective edge will be<br />

accepted, otherwise it will be rejected.<br />
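The same calculation written as a minimal HDevelop sketch (using the membership values from above):

MembershipPosition := 0.67
MembershipContrast := 0.60
* fuzzy "and": geometric mean of the individual membership values
MembershipCombined := sqrt(MembershipPosition * MembershipContrast)
* MembershipCombined is approximately 0.63 and thus lies above a fuzzy threshold of 0.5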

Figure 33a shows a plot of the membership value over the two feature values for ’contrast’ and ’position’.<br />

Assuming a fuzzy threshold of 0.5, the thick line in figure 33b marks the boundary of the domain of<br />

the feature values that lead to an acceptance of the respective edge. All edges with position/amplitude<br />

combinations that lie in this domain in the upper left corner will be accepted. The thin line marks the<br />

domain that results if both constraints must be satisfied individually. As can be seen, the domain that<br />

results from the combination of several set types is a bit larger. In particular, if one feature value meets<br />

the requirement very well, the other value may be slightly worse than allowed by the respective individual<br />

constraint.<br />

(Plots: (a) the membership value over edge position and edge amplitude; (b) edge amplitude over edge position.)

Figure 33: Membership value as a function of two feature values: (a) The membership value; (b) the<br />

boundary of the domain where edges will be accepted (thick line) compared to the boundary<br />

of the domain if the two constraints must be satisfied individually.<br />

To illustrate this, an edge at the position 10 having an amplitude of 65 is assumed. The individual<br />

membership values are:<br />

membership value position=10 = 1.00<br />

membership value contrast=65 = 0.30<br />


With this, the membership value can be calculated:<br />

membership value 10/65 = √(membership value position=10 · membership value contrast=65) = 0.54

The overall membership value lies above 0.5 although the individual membership value for the contrast<br />

of the edge lies well below this value.<br />

<strong>Note</strong> that it is not possible to specify multiple fuzzy membership functions for one and the same set<br />

type. Setting a further fuzzy membership function to a set discards the previously defined function and<br />

replaces it by the new one.<br />

4.3.4 Changing and Resetting a Fuzzy Measure Object<br />

Fuzzy membership functions that have been specified for a particular set type can be changed and reset.<br />

To change a fuzzy membership function for a set type, simply specify the new fuzzy membership function<br />

for this set type with one of the operators set_fuzzy_measure and set_fuzzy_measure_norm_pair,<br />

as described in section 4.3.1 on page 32.<br />

A fuzzy measure object can be reset with the operator reset_fuzzy_measure. This discards the fuzzy<br />

membership function of the specified fuzzy set SetType. For example, the fuzzy membership function<br />

of the set type ’contrast’ is discarded with the following lines of code:<br />

SetType := ’contrast’<br />

reset_fuzzy_measure (MeasureHandle, SetType)<br />

4.4 Using the Standard Measure Object Functionality<br />

All operators that provide functionality for the standard measure object can also be applied to fuzzy measure<br />

objects. In particular, this includes the operators measure_projection, measure_thresh, translate_measure,

and close_measure. The operators for the standard measure object ignore the specified<br />

fuzzy membership functions and return the same results as if they were applied to a standard measure<br />

object.
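For example (a sketch; RowNew and ColumnNew are placeholders), the gray value profile of a fuzzy measure object can be inspected, and the object can be moved and destroyed exactly like a standard measure object:

* operators of the standard measure object applied to a fuzzy measure handle
measure_projection (Image, MeasureHandle, GrayValues)
translate_measure (MeasureHandle, RowNew, ColumnNew)
close_measure (MeasureHandle)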


Appendix<br />

A Slanted Edges<br />


This section shows the effect of the width of the ROI and the angle between the profile line and the edges.<br />

In the figures on the next pages, the images on the left-hand side show the size and orientation of the ROI and
the diagrams on the right-hand side show the resulting profile (thick line) and its derivative (thin line, Sigma

= 0.9).<br />

As long as the profile line runs approximately perpendicular to the edges, the ROI should be chosen as<br />

wide as possible because then the profile is less noisy (compare, e.g., figure 35a and figure 35c).<br />

If the profile line does not run perpendicular to the edges (see figure 36 on page 39 and figure 37 on page<br />

40), the width of the ROI must be chosen smaller to allow the detection of edges. <strong>Note</strong> that in this case,<br />

the profile will contain more noise. Consequently, the edges will be determined less accurately. What is<br />

more, the distance between the determined edges does not represent the perpendicular, i.e., the shortest<br />

distance between the respective image edges (see figure 34). If the angle δ between the profile line and<br />

the perpendicular of the edges is known, the perpendicular distance can be determined from the distance<br />

between the determined edges (Distance) as follows:<br />

perpendicular distance = cos(δ) · Distance<br />
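In HDevelop, this correction can be written, e.g., as follows (a sketch; Delta is assumed to hold the angle in degrees):

* compensate for the slanted profile line
PerpendicularDistance := cos(rad(Delta)) * Distance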

(Sketch: ROI and profile line, the measured Distance, the angle δ, and the perpendicular distance between the edges.)

Figure 34: Relationship between the determined Distance and the perpendicular distance between<br />

edges.<br />

If the profile line is heavily slanted with respect to the edges, it may become impossible to determine the<br />

edges reliably, even with a very narrow ROI (see, e.g., figure 38a on page 41).<br />




Figure 35: Angle between the profile line and the perpendicular of the edges: 0°. (a) Width of the ROI: 6 pixels; (b) width of the ROI: 30 pixels; (c) width of the ROI: 60 pixels. (Plots: gray value over the position on the profile line.)


Figure 36: Angle between the profile line and the perpendicular of the edges: 15°. (a) Width of the ROI: 6 pixels; (b) width of the ROI: 30 pixels; (c) width of the ROI: 60 pixels. (Plots: gray value over the position on the profile line.)




Figure 37: Angle between the profile line and the perpendicular of the edges: 30°. (a) Width of the ROI: 6 pixels; (b) width of the ROI: 30 pixels; (c) width of the ROI: 60 pixels. (Plots: gray value over the position on the profile line.)


Figure 38: Angle between the profile line and the perpendicular of the edges: 45°. (a) Width of the ROI: 6 pixels; (b) width of the ROI: 30 pixels; (c) width of the ROI: 60 pixels. (Plots: gray value over the position on the profile line.)




B Examples for Fuzzy Membership Functions<br />

In this section, examples for fuzzy membership functions are given. For each set type (see section 4.2.1<br />

on page 28), the typical range of the x values of the fuzzy membership functions as well as one fuzzy<br />

membership function is given. If the set type allows the creation of the fuzzy measure object with fuzzy<br />

membership functions and normalized fuzzy membership functions, both fuzzy membership functions<br />

are displayed. In this case, the normalized fuzzy membership function is defined such that, together with<br />

the stated PairSize, it creates the same fuzzy measure object as the direct fuzzy membership function.<br />

In most cases, a PairSize of 10.0 is used to make it easy to compare the two fuzzy membership functions.

The effect of the specification of the fuzzy membership function is shown by means of the results of the<br />

fuzzy measure object applied to the test image shown in figure 39.<br />

Figure 39: The test image that is used to show the effect of the different fuzzy membership functions .<br />

The results of the standard measure object applied to this test image are shown in figure 40.<br />


Figure 40: Result of the standard measure object applied to the test image: (a) Edges determined with<br />

the operator measure_pos, (b) edge pairs determined with the operator measure_pairs .<br />

Figure 40a shows the edges determined with the operator measure_pos and figure 40b shows the edge<br />

pairs determined with the operator measure_pairs. Both operators were applied with the following<br />

parameter settings:<br />

Parameter Value<br />

Sigma 1.0<br />

Threshold 10.0<br />

Transition ’all’<br />

Select ’all’<br />

FuzzyThresh 0.5<br />

The following examples shall give an impression of the effect of different fuzzy membership functions<br />

specified for different set types. They are in no way complete. <strong>Note</strong> that all the edge pairs are extracted<br />

with the operator fuzzy_measure_pairs, not with the operator fuzzy_measure_pairing.


B.1 Set Type contrast<br />

B.1.1 Subtype ’contrast’<br />

Evaluates the amplitude of the edges.<br />


The x values of fuzzy membership functions for the sub set type ’contrast’ must lie within the range of<br />

gray values, i.e.,<br />

0.0 ≤ x ≤ 255.0 for ’byte’ images and<br />

0.0 ≤ x ≤ 65535.0 for ’uint2’ images.<br />

Figure 41 shows a fuzzy membership function that can be used to extract strong edges in ’byte’ images.<br />

(Plot (a): membership value over the feature value. (b): The sub set type ’contrast’ cannot be specified with a normalized fuzzy membership function.)

Figure 41: A fuzzy membership function that can be used to extract strong edges in ’byte’ images (SetType<br />

= ’contrast’).<br />

Figure 42 shows the positions of the edges that will be extracted if the fuzzy measure object is created

with the fuzzy membership function displayed in figure 41. With this fuzzy measure object, only those<br />

edges are extracted that have a contrast between 75 and 125.<br />

Figure 42: The extracted edges.<br />

The fuzzy membership function displayed in figure 41a can be created with the following HDevelop<br />

code:<br />

SetType := ’contrast’<br />

create_funct_1d_pairs ([50,100,150], [0,1,0], FuzzyMembershipFunction)<br />

set_fuzzy_measure (MeasureHandle, SetType, FuzzyMembershipFunction)<br />

Appendix


44 <strong>Application</strong> <strong>Note</strong> on <strong>1D</strong> <strong>Metrology</strong><br />

B.2 Set Type position<br />

B.2.1 Subtype ’position’<br />

Evaluates the signed distance of the edges to the reference point of the fuzzy measure object. The<br />

reference point is located at the start of the profile line (see figure 9 on page 10 and figure 10 on page<br />

11).<br />

The x values of fuzzy membership functions for the sub set type ’position’ must lie within the range<br />

0.0 ≤ x ≤ length of the ROI.<br />

Figure 43 shows fuzzy membership functions that can be used to extract edges at the beginning of the<br />

profile line.<br />

(Plots: membership value over the feature value (a) and over the normalized feature value (b).)

Figure 43: Fuzzy membership functions that can be used to extract edges at the beginning of the profile<br />

line (SetType = ’position’, PairSize = 10).<br />

Figure 44 shows the positions of the edges that will be extracted if the fuzzy measure object is created

with the fuzzy membership function displayed in figure 43. Only the first four edges are returned.<br />

Figure 44: The extracted edges.<br />

The fuzzy membership function displayed in figure 43a can be created with the following HDevelop<br />

code:<br />

SetType := ’position’<br />

create_funct_1d_pairs ([50,130], [1,0], FuzzyMembershipFunction)<br />

set_fuzzy_measure (MeasureHandle, SetType, FuzzyMembershipFunction)<br />

The equivalent normalized fuzzy membership function, which is displayed in figure 43b can be created<br />

with the following HDevelop code:


SetType := ’position’<br />

PairSize := 10<br />

create_funct_1d_pairs ([5,13], [1,0], NormalizedFuzzyMembershipFunction)<br />

set_fuzzy_measure_norm_pair (MeasureHandle, PairSize, SetType,<br />

NormalizedFuzzyMembershipFunction)<br />

B.2.2 Subtype ’position_center’<br />


’position_center’ behaves like ’position’ with the reference point located at the center of the profile line.<br />

The x values of fuzzy membership functions for the sub set type ’position_center’ must lie within the<br />

range<br />

−(length of the ROI)/2 ≤ x ≤ (length of the ROI)/2.

Figure 45 shows fuzzy membership functions that can be used to extract edges in the center of the profile<br />

line.<br />

(Plots: membership value over the feature value (a) and over the normalized feature value (b).)

Figure 45: Fuzzy membership functions that can be used to extract edges in the center of the profile line<br />

(SetType = ’position_center’, PairSize = 10).<br />

Figure 46 shows the positions of the edges that will be extracted if the fuzzy measure object is created

with the fuzzy membership function displayed in figure 45. Only the edges in the center of the profile<br />

line are returned.<br />

Figure 46: The extracted edges.<br />

The fuzzy membership function displayed in figure 45a can be created with the following HDevelop<br />

code:<br />


SetType := ’position_center’<br />

create_funct_1d_pairs ([-70,-20,20,70], [0,1,1,0], FuzzyMembershipFunction)<br />

set_fuzzy_measure (MeasureHandle, SetType, FuzzyMembershipFunction)<br />

The equivalent normalized fuzzy membership function, which is displayed in figure 45b can be created<br />

with the following HDevelop code:<br />

SetType := ’position_center’<br />

PairSize := 10<br />

create_funct_1d_pairs ([-7,-2,2,7], [0,1,1,0],<br />

NormalizedFuzzyMembershipFunction)<br />

set_fuzzy_measure_norm_pair (MeasureHandle, PairSize, SetType,<br />

NormalizedFuzzyMembershipFunction)<br />

B.2.3 Subtype ’position_end’<br />

’position_end’ behaves like ’position’ with the reference point located at the end of the profile line.<br />

The x values of fuzzy membership functions for the sub set type ’position_end’ must lie within the range<br />

−length of the ROI ≤ x ≤ 0.0.<br />

Figure 47 shows fuzzy membership functions that can be used to extract edges at the end of the profile<br />

line.<br />

(Plots: membership value over the feature value (a) and over the normalized feature value (b).)

Figure 47: Fuzzy membership functions that can be used to extract edges at the end of the profile line<br />

(SetType = ’position_end’, PairSize = 10).<br />

Figure 48 shows the positions of the edges that will be extracted if the fuzzy measure object is created

with the fuzzy membership function displayed in figure 47. Only the edges at the end of the profile line<br />

are returned.<br />

The fuzzy membership function displayed in figure 47a can be created with the following HDevelop<br />

code:


Figure 48: The extracted edges.<br />

SetType := ’position_end’<br />

create_funct_1d_pairs ([-250,0], [0,1], FuzzyMembershipFunction)<br />

set_fuzzy_measure (MeasureHandle, SetType, FuzzyMembershipFunction)<br />


The equivalent normalized fuzzy membership function, which is displayed in figure 47b can be created<br />

with the following HDevelop code:<br />

SetType := ’position_end’<br />

PairSize := 10<br />

create_funct_1d_pairs ([-25,0], [0,1], NormalizedFuzzyMembershipFunction)<br />

set_fuzzy_measure_norm_pair (MeasureHandle, PairSize, SetType,<br />

NormalizedFuzzyMembershipFunction)<br />

B.2.4 Subtype ’position_first_edge’<br />

’position_first_edge’ behaves like ’position’ with the reference point located at the first edge.<br />

The x values of fuzzy membership functions for the sub set type ’position_first_edge’ must lie within the<br />

range<br />

0.0 ≤ x ≤ length of the ROI.<br />

Figure 49 shows fuzzy membership functions that can be used to extract edges close to the first edge.<br />

(Plots: membership value over the feature value (a) and over the normalized feature value (b).)

Figure 49: Fuzzy membership functions that can be used to extract edges close to the first edge (SetType<br />

= ’position_first_edge’, PairSize = 10).<br />


Figure 50 shows the positions of the edges that will be extracted if the fuzzy measure object is

created with the fuzzy membership function displayed in figure 49. Only the edges at the beginning of<br />

the profile line are returned. <strong>Note</strong> that even though the same fuzzy membership function has been used<br />

as in figure 43 on page 44, one more edge has been extracted. This is because the reference point is<br />

located at the first edge instead of the start of the profile line.<br />

Figure 50: The extracted edges.<br />

The fuzzy membership function displayed in figure 49a on page 47 can be created with the following<br />

HDevelop code:<br />

SetType := ’position_first_edge’<br />

create_funct_1d_pairs ([50,130], [1,0], FuzzyMembershipFunction)<br />

set_fuzzy_measure (MeasureHandle, SetType, FuzzyMembershipFunction)<br />

The equivalent normalized fuzzy membership function, which is displayed in figure 49b on page 47 can<br />

be created with the following HDevelop code:<br />

SetType := ’position_first_edge’<br />

PairSize := 10<br />

create_funct_1d_pairs ([5,13], [1,0], NormalizedFuzzyMembershipFunction)<br />

set_fuzzy_measure_norm_pair (MeasureHandle, PairSize, SetType,<br />

NormalizedFuzzyMembershipFunction)<br />

B.2.5 Subtype ’position_last_edge’<br />

’position_last_edge’ behaves like ’position’ with the reference point located at the last edge.<br />

The x values of fuzzy membership functions for the sub set type ’position_last_edge’ must lie within the<br />

range<br />

−length of the ROI ≤ x ≤ 0.0.<br />

Figure 51 shows fuzzy membership functions that can be used to extract edges close to the last edge.<br />

Figure 52 shows the positions of the edges that will be extracted if the fuzzy measure object is created

with the fuzzy membership function displayed in figure 51. Only the edges at the end of the profile line<br />

are returned. <strong>Note</strong> that even though the same fuzzy membership function has been used as in figure 47<br />

on page 46, more edges have been extracted. This is because the reference point is located at the last<br />

edge instead of the end of the profile line.<br />

The fuzzy membership function displayed in figure 51a can be created with the following HDevelop<br />

code:


(Plots: membership value over the feature value (a) and over the normalized feature value (b).)

Figure 51: Fuzzy membership functions that can be used to extract edges close to the last edge (SetType<br />

= ’position_last_edge’, PairSize = 10).<br />

Figure 52: The extracted edges.<br />

SetType := ’position_last_edge’<br />

create_funct_1d_pairs ([-250,0], [0,1], FuzzyMembershipFunction)<br />

set_fuzzy_measure (MeasureHandle, SetType, FuzzyMembershipFunction)<br />

The equivalent normalized fuzzy membership function, which is displayed in figure 51b can be created<br />

with the following HDevelop code:<br />

SetType := ’position_last_edge’<br />

PairSize := 10<br />

create_funct_1d_pairs ([-25,0], [0,1], NormalizedFuzzyMembershipFunction)<br />

set_fuzzy_measure_norm_pair (MeasureHandle, PairSize, SetType,<br />

NormalizedFuzzyMembershipFunction)<br />

B.3 Set Type position_pair<br />

<strong>Note</strong> that for the extraction of edge pairs all possible pairs of edges are internally analyzed according<br />

to the specified fuzzy membership functions. If only partial knowledge about the edge pairs to be determined<br />

is specified with the fuzzy membership functions, this will possibly lead to unintuitive results.<br />

For example, if the position of the edge pairs is specified but not their size, some small edge pairs will<br />

possibly be omitted because of a large edge pair that consists of the first edge of the first small edge pair<br />

and the second edge of the last small edge pair and that has a membership value equal to or higher than<br />

that of the small edge pairs (see appendix B.3.1). Therefore, all the available knowledge about the edge<br />


pairs to be extracted should be specified with fuzzy membership functions. In the above example, the<br />

definition of the desired edge pairs can be made more clear by also specifying the size of the edge pairs<br />

(see appendix B.6 on page 61).<br />

B.3.1 Subtype ’position_pair’<br />

Evaluates the signed distance of the edge pairs to the reference point of the fuzzy measure object. The<br />

position of an edge pair is defined as the center of the two edges. The reference point is located at the<br />

start of the profile line (see figure 9 on page 10 and figure 10 on page 11).<br />

The x values of fuzzy membership functions for the sub set type ’position_pair’ must lie within the range<br />

0.0 ≤ x ≤ length of the ROI.<br />

Figure 53 shows fuzzy membership functions that can be used to extract edge pairs in approximately the<br />

first half of the profile line.<br />

(Plots: membership value over the feature value (a) and over the normalized feature value (b).)

Figure 53: Fuzzy membership functions that can be used to extract edge pairs in approximately the first<br />

half of the profile line (SetType = ’position_pair’, PairSize = 10).<br />

Figure 54 shows the positions of the edge pairs that will be extracted if the fuzzy measure object is

created with the fuzzy membership function displayed in figure 53. Only edge pairs in approximately<br />

the first half of the profile line are returned. <strong>Note</strong> that one large edge pair is extracted instead of multiple<br />

small ones. This happens because internally, all possible pairs of edges are analyzed according to the<br />

specified fuzzy membership functions and no condition for the size of the edge pairs was given. See<br />

appendix B.6 on page 61 for an example where in addition a fuzzy membership function for the size of<br />

the edge pairs is specified.<br />

The fuzzy membership function displayed in figure 53a can be created with the following HDevelop<br />

code:<br />

SetType := ’position_pair’<br />

create_funct_1d_pairs ([0,50,250,300], [0,1,1,0], FuzzyMembershipFunction)<br />

set_fuzzy_measure (MeasureHandle, SetType, FuzzyMembershipFunction)<br />

The equivalent normalized fuzzy membership function, which is displayed in figure 53b can be created


with the following HDevelop code:<br />

Figure 54: The extracted edge pairs.<br />

SetType := ’position_pair’<br />

PairSize := 10<br />

create_funct_1d_pairs ([0,5,25,30], [0,1,1,0],<br />

NormalizedFuzzyMembershipFunction)<br />

set_fuzzy_measure_norm_pair (MeasureHandle, PairSize, SetType,<br />

NormalizedFuzzyMembershipFunction)<br />

B.3.2 Subtype ’position_pair_center’<br />


’position_pair_center’ behaves like ’position_pair’ with the reference point located at the center of the<br />

profile line.<br />

The x values of fuzzy membership functions for the sub set type ’position_pair_center’ must lie within<br />

the range<br />

−(length of the ROI)/2 ≤ x ≤ (length of the ROI)/2.

Figure 55 shows fuzzy membership functions that can be used to extract edge pairs in the middle of the<br />

profile line.<br />

Figure 55: Fuzzy membership functions that can be used to extract edge pairs in the middle of the profile line (SetType = 'position_pair_center', PairSize = 10). Panel (a) plots the membership value over the feature value (−200 to 200); panel (b) over the normalized feature value (−25 to 25).

Figure 56 shows the positions of the edge pairs that will be extracted if the fuzzy measure object is created with the fuzzy membership function displayed in figure 55. Only one edge pair in the middle of the profile line is returned. Note that only the position of the edge pair, i.e., the position of the center between the two edges of the edge pair, is used to restrict the extraction of edge pairs. The position of the individual edges of the edge pair is not restricted. In this case, one large edge pair with its center in the middle of the profile line is returned.

Figure 56: The extracted edge pairs.

The fuzzy membership function displayed in figure 55a on page 51 can be created with the following HDevelop code:

SetType := 'position_pair_center'
create_funct_1d_pairs ([-25,0,25], [0,1,0], FuzzyMembershipFunction)
set_fuzzy_measure (MeasureHandle, SetType, FuzzyMembershipFunction)

The equivalent normalized fuzzy membership function, which is displayed in figure 55b on page 51, can be created with the following HDevelop code:

SetType := 'position_pair_center'
PairSize := 10
create_funct_1d_pairs ([-2.5,0,2.5], [0,1,0], NormalizedFuzzyMembershipFunction)
set_fuzzy_measure_norm_pair (MeasureHandle, PairSize, SetType, NormalizedFuzzyMembershipFunction)

B.3.3 Subtype 'position_pair_end'

'position_pair_end' behaves like 'position_pair' with the reference point located at the end of the profile line.

The x values of fuzzy membership functions for the subtype 'position_pair_end' must lie within the range

−length of the ROI ≤ x ≤ 0.0.

Figure 57 shows fuzzy membership functions that can be used to extract edge pairs at the end of the profile line.

Figure 57: Fuzzy membership functions that can be used to extract edge pairs at the end of the profile line (SetType = 'position_pair_end', PairSize = 10). Panel (a) plots the membership value over the feature value (−500 to 0); panel (b) over the normalized feature value (−50 to 0).

Figure 58 shows the positions of the edge pairs that will be extracted if the fuzzy measure object is created with the fuzzy membership function displayed in figure 57. Only edge pairs at the end of the profile line are returned.

Figure 58: The extracted edge pairs.

The fuzzy membership function displayed in figure 57a can be created with the following HDevelop code:

SetType := 'position_pair_end'
create_funct_1d_pairs ([-300,-200,0], [0,0.9,1], FuzzyMembershipFunction)
set_fuzzy_measure (MeasureHandle, SetType, FuzzyMembershipFunction)

The equivalent normalized fuzzy membership function, which is displayed in figure 57b, can be created with the following HDevelop code:

SetType := 'position_pair_end'
PairSize := 10
create_funct_1d_pairs ([-30,-20,0], [0,0.9,1], NormalizedFuzzyMembershipFunction)
set_fuzzy_measure_norm_pair (MeasureHandle, PairSize, SetType, NormalizedFuzzyMembershipFunction)

B.3.4 Subtype 'position_first_pair'

'position_first_pair' behaves like 'position_pair' with the reference point located at the position of the first edge pair.

The x values of fuzzy membership functions for the subtype 'position_first_pair' must lie within the range

0.0 ≤ x ≤ length of the ROI.

Figure 59 shows fuzzy membership functions that can be used to extract edge pairs in approximately the first half of the profile line.

Figure 59: Fuzzy membership functions that can be used to extract edge pairs in approximately the first half of the profile line (SetType = 'position_first_pair', PairSize = 10). Panel (a) plots the membership value over the feature value (0 to 500); panel (b) over the normalized feature value (0 to 50).

Figure 60 shows the positions of the edge pairs that will be extracted if the fuzzy measure object is created with the fuzzy membership function displayed in figure 59. Only one edge pair in the first half of the profile line is returned. Note that a large edge pair is extracted instead of multiple small ones. This happens because internally, all possible pairs of edges are analyzed according to the specified fuzzy membership functions and no condition for the size of the edge pairs was given. See appendix B.6 on page 61 for an example where, in addition to a fuzzy membership function for the position of the edge pairs, a fuzzy membership function for their size is specified. Note also that even though the same fuzzy membership function has been used as in appendix B.3.1 on page 50, one more edge pair is extracted because the reference point is located at the first edge pair instead of the start of the profile line.

Figure 60: The extracted edge pairs.

The fuzzy membership function displayed in figure 59a can be created with the following HDevelop code:

SetType := 'position_first_pair'
create_funct_1d_pairs ([0,50,250,300], [0,1,1,0], FuzzyMembershipFunction)
set_fuzzy_measure (MeasureHandle, SetType, FuzzyMembershipFunction)

The equivalent normalized fuzzy membership function, which is displayed in figure 59b, can be created with the following HDevelop code:

SetType := 'position_first_pair'
PairSize := 10
create_funct_1d_pairs ([0,5,25,30], [0,1,1,0], NormalizedFuzzyMembershipFunction)
set_fuzzy_measure_norm_pair (MeasureHandle, PairSize, SetType, NormalizedFuzzyMembershipFunction)

B.3.5 Subtype 'position_last_pair'

'position_last_pair' behaves like 'position_pair' with the reference point located at the position of the last edge pair.

The x values of fuzzy membership functions for the subtype 'position_last_pair' must lie within the range

−length of the ROI ≤ x ≤ 0.0.

Figure 61 shows fuzzy membership functions that can be used to extract edge pairs close to the last edge pair.

Figure 61: Fuzzy membership functions that can be used to extract edge pairs close to the last edge pair (SetType = 'position_last_pair', PairSize = 10). Panel (a) plots the membership value over the feature value (−500 to 0); panel (b) over the normalized feature value (−50 to 0).

Figure 62 shows the positions of the edge pairs that will be extracted if the fuzzy measure object is created with the fuzzy membership function displayed in figure 61. Only edge pairs in the second half of the profile line are returned. Note that even though the same fuzzy membership function has been used as in appendix B.3.3 on page 52, more edge pairs are extracted. This is because the reference point is located at the last edge pair instead of the end of the profile line.

Figure 62: The extracted edge pairs.

The fuzzy membership function displayed in figure 61a can be created with the following HDevelop code:

SetType := 'position_last_pair'
create_funct_1d_pairs ([-300,-200,0], [0,0.9,1], FuzzyMembershipFunction)
set_fuzzy_measure (MeasureHandle, SetType, FuzzyMembershipFunction)

The equivalent normalized fuzzy membership function, which is displayed in figure 61b, can be created with the following HDevelop code:

SetType := 'position_last_pair'
PairSize := 10
create_funct_1d_pairs ([-30,-20,0], [0,0.9,1], NormalizedFuzzyMembershipFunction)
set_fuzzy_measure_norm_pair (MeasureHandle, PairSize, SetType, NormalizedFuzzyMembershipFunction)

B.4 Set Type size

B.4.1 Subtype 'size'

Evaluates the size of the edge pairs, i.e., the distance between the two edges.

The x values of fuzzy membership functions for the subtype 'size' must lie within the range

x ≥ 0.0.

Figure 63 shows fuzzy membership functions that can be used to extract edge pairs that are approximately 50 pixels wide.

Figure 63: Fuzzy membership functions that can be used to extract edge pairs that are approximately 50 pixels wide (SetType = 'size', PairSize = 50). Panel (a) plots the membership value over the feature value (0 to 500); panel (b) over the normalized feature value (0 to 2).

Figure 64 shows the positions of the edge pairs that will be extracted if the fuzzy measure object is created with the fuzzy membership function displayed in figure 63. Only edge pairs that are approximately 50 pixels wide are returned.

Figure 64: The extracted edge pairs.

The fuzzy membership function displayed in figure 63a can be created with the following HDevelop code:

SetType := 'size'
create_funct_1d_pairs ([20,50,80], [0,1,0], FuzzyMembershipFunction)
set_fuzzy_measure (MeasureHandle, SetType, FuzzyMembershipFunction)

The equivalent normalized fuzzy membership function, which is displayed in figure 63b, can be created with the following HDevelop code:

SetType := 'size'
PairSize := 50
create_funct_1d_pairs ([0.4,1,1.6], [0,1,0], NormalizedFuzzyMembershipFunction)
set_fuzzy_measure_norm_pair (MeasureHandle, PairSize, SetType, NormalizedFuzzyMembershipFunction)

B.4.2 Subtype 'size_diff'

Evaluates the signed difference between the desired PairSize and the actual size of the edge pairs. The signed difference is defined by:

x = (PairSize − actual size of a pair) / PairSize

For example, with PairSize = 15, an edge pair that is 12 pixels wide yields x = (15 − 12)/15 = 0.2.

The x values of fuzzy membership functions for the subtype 'size_diff' must lie within the range

x ≤ 1.0.

Figure 65 shows a fuzzy membership function that can be used to extract edge pairs that have the specified PairSize or are a little bit smaller.

Figure 66 shows the positions of the edge pairs that will be extracted if the fuzzy measure object is created with the fuzzy membership function displayed in figure 65. Only edge pairs that are between 11 pixels and 15 pixels wide are returned.

The subtype 'size_diff' can only be specified with a normalized fuzzy membership function.

Figure 65: A fuzzy membership function that can be used to extract edge pairs that have the specified PairSize or are a little bit smaller (SetType = 'size_diff', PairSize = 15). The plot shows the membership value over the normalized feature value (−1 to 1).

Figure 66: The extracted edge pairs.

The normalized fuzzy membership function displayed in figure 65b can be created with the following HDevelop code:

SetType := 'size_diff'
PairSize := 15
create_funct_1d_pairs ([-0.01,0,0.5], [0,1,0], NormalizedFuzzyMembershipFunction)
set_fuzzy_measure_norm_pair (MeasureHandle, PairSize, SetType, NormalizedFuzzyMembershipFunction)

Note that with the subtype 'size_diff' asymmetrical fuzzy membership functions can be specified for the difference between the PairSize and the actual size of an edge pair. This is not possible with the subtype 'size_abs_diff' (see appendix B.4.3).

B.4.3 Subtype 'size_abs_diff'

Evaluates the absolute difference between the desired PairSize and the actual size of the edge pairs. The absolute difference is defined by:

x = |PairSize − actual size of a pair| / PairSize

The x values of fuzzy membership functions for the subtype 'size_abs_diff' must lie within the range

x ≥ 0.0.

Figure 67 shows a fuzzy membership function that can be used to extract edge pairs that are approximately as wide as the specified PairSize.

The subtype 'size_abs_diff' can only be specified with a normalized fuzzy membership function.

Figure 67: A fuzzy membership function that can be used to extract edge pairs that are approximately as wide as the specified PairSize (SetType = 'size_abs_diff', PairSize = 15). The plot shows the membership value over the normalized feature value (0 to 1).

Figure 68 shows the positions of the edge pairs that will be extracted if the fuzzy measure object is created with the fuzzy membership function displayed in figure 67. Only edge pairs that are between 11 pixels and 19 pixels wide are returned. Note that in addition to the edge pairs extracted in appendix B.4.2 on page 57, larger edge pairs are also returned.

Figure 68: The extracted edge pairs.

The normalized fuzzy membership function displayed in figure 67b can be created with the following HDevelop code:

SetType := 'size_abs_diff'
PairSize := 15
create_funct_1d_pairs ([0,0.5], [1,0], NormalizedFuzzyMembershipFunction)
set_fuzzy_measure_norm_pair (MeasureHandle, PairSize, SetType, NormalizedFuzzyMembershipFunction)

Note that with the subtype 'size_abs_diff' only symmetrical fuzzy membership functions can be specified for the difference between the PairSize and the actual size of an edge pair. To specify asymmetrical fuzzy membership functions, use the subtype 'size_diff' (see appendix B.4.2 on page 57).


B.5 Set Type gray

B.5.1 Subtype 'gray'

Evaluates the mean gray value between the two edges of edge pairs.

The x values of fuzzy membership functions for the subtype 'gray' must lie within the range of gray values, i.e.,

0.0 ≤ x ≤ 255.0 for 'byte' images and
0.0 ≤ x ≤ 65535.0 for 'uint2' images.

Figure 69 shows a fuzzy membership function that can be used to extract edge pairs that enclose areas with a gray value of approximately 50 or 150.

The subtype 'gray' cannot be specified with a normalized fuzzy membership function.

Figure 69: A fuzzy membership function that can be used to extract edge pairs that enclose areas with a gray value of approximately 50 or 150 (SetType = 'gray'). The plot shows the membership value over the feature value (0 to 250).

Figure 70 shows the positions of the edges that will be extracted if the fuzzy measure object is created with the fuzzy membership function displayed in figure 69. With this fuzzy measure object, only those edge pairs are returned that enclose areas of the specified gray values.

Figure 70: The extracted edges.

The fuzzy membership function displayed in figure 69a can be created with the following HDevelop code:

SetType := 'gray'
create_funct_1d_pairs ([20,40,60,80,120,140,160,180], [0,1,1,0,0,1,1,0], FuzzyMembershipFunction)
set_fuzzy_measure (MeasureHandle, SetType, FuzzyMembershipFunction)


B.6 Set Type position_pair combined with size

Figure 71 shows two fuzzy membership functions that can be used to determine edge pairs in approximately the first half of the profile line. In addition to specifying only the position of the edge pairs (see appendix B.3.1 on page 50), a fuzzy membership function for the size of the edge pairs is also specified (see appendix B.4.1 on page 56). With this, it is possible to define precisely which kind of edge pairs must be returned.

Figure 71: Fuzzy membership functions that can be used to extract edge pairs in approximately the first half of the profile line: (a) the restriction for the position of the pairs (SetType = 'position_pair') and (b) a fuzzy membership function that favors small edge pairs (SetType = 'size'). Both panels plot the membership value over the feature value (0 to 500).

Figure 72 shows the positions of the edge pairs that will be extracted if the fuzzy measure object is created with the fuzzy membership functions displayed in figure 71. Only edge pairs in approximately the first half of the profile line are returned. Note that smaller edge pairs than in the example in appendix B.3.1 on page 50 are extracted. This is because of the constraint on the size of the edge pairs, which is introduced by the fuzzy membership function for the size of the edge pairs (figure 71b).

Figure 72: The extracted edge pairs.

The fuzzy membership function displayed in figure 71a can be created with the following HDevelop code:

create_funct_1d_pairs ([0,50,250,300], [0,1,1,0], FuzzyMembershipFunctionPositionPair)

The fuzzy membership function displayed in figure 71b can be created with the following HDevelop code:

create_funct_1d_pairs ([0,50], [1,0.5], FuzzyMembershipFunctionSize)

Then, the fuzzy measure object can be created:

set_fuzzy_measure (MeasureHandle, 'position_pair', FuzzyMembershipFunctionPositionPair)
set_fuzzy_measure (MeasureHandle, 'size', FuzzyMembershipFunctionSize)

C Definition of the Geometric Mean

The geometric mean of a sequence {a_i}, i = 1, ..., n, is defined by

G(a_1, ..., a_n) ≡ (a_1 · a_2 · ... · a_n)^(1/n)

Thus,

G(a_1, a_2) = √(a_1 a_2)
G(a_1, a_2, a_3) = (a_1 a_2 a_3)^(1/3)

and so on.
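As a minimal HDevelop sketch (not part of the original example programs; Values is just an illustrative tuple of positive scores), the geometric mean can be computed with elementwise tuple operations:

* Geometric mean of a tuple of positive values via the logarithm:
* G = exp((log(a_1) + ... + log(a_n)) / n).
Values := [0.8, 0.9, 0.75]
GeometricMean := exp(sum(log(Values)) / |Values|)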


HALCON Application Note

Machine Vision in World Coordinates

Provided Functionality

⊲ Calibration of camera systems (single camera, multiple camera setup, binocular stereo system)
⊲ Transformation of image coordinates into 3D world coordinates and vice versa
⊲ Rectification of images to compensate for distortion or camera orientation
⊲ Determination of orientation and position of known 3D objects
⊲ 3D reconstruction of unknown 3D objects using binocular stereo vision
⊲ Combination of multiple images into a larger mosaic image (image stitching)
⊲ Calibration of hand-eye (robot-camera) systems and transformation of image processing results from camera into robot coordinates

Typical Applications

⊲ Inspection of dimensional accuracy in world coordinates
⊲ Generation of overview images of large objects
⊲ 3D reconstruction
⊲ Robot vision

Copyright © 2003-2008 by MVTec Software GmbH, München, Germany


Overview

Measurements in 3D are becoming more and more important. HALCON provides many methods to perform 3D measurements. This application note gives you an overview of these methods and assists you with the selection and the correct application of the appropriate method.

A short characterisation of the various methods is given in section 1 on page 5. Principles of 3D transformations and poses as well as the description of the camera model can be found in section 2 on page 8. Afterwards, the methods to perform 3D measurements are described in detail.

Unless specified otherwise, the HDevelop example programs can be found in the subdirectory 3d_machine_vision of the directory HALCONROOT\examples\application_guide.

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without prior written permission of the publisher.

Edition 1 December 2003 (HALCON 7.0)
Edition 1a July 2004 (HALCON 7.0.1)
Edition 1b April 2005 (HALCON 7.0.2)
Edition 2 July 2005 (HALCON 7.1)
Edition 2a December 2006 (HALCON 7.1.2)

Microsoft, Windows, Windows NT, Windows 2000, and Windows XP are either trademarks or registered trademarks of Microsoft Corporation.

All other nationally and internationally recognized trademarks and tradenames are hereby recognized.

More information about HALCON can be found at: http://www.halcon.com/


Contents

1 Can You Really Perform 3D Machine Vision with HALCON?  5
2 Basics  8
2.1 3D Transformations and Poses  8
2.2 Camera Model and Parameters  19
3 3D Machine Vision in a Specified Plane With a Single Camera  27
3.1 Calibrating the Camera  29
3.2 Transforming Image into World Coordinates and Vice Versa  44
3.3 Rectifying Images  49
3.4 Inspection of Non-Planar Objects  56
4 Calibrated Mosaicking  59
4.1 Setup  60
4.2 Calibration  61
4.3 Merging the Individual Images into One Larger Image  63
5 Uncalibrated Mosaicking  69
5.1 Rules for Taking Images for a Mosaic Image  72
5.2 Definition of Overlapping Image Pairs  75
5.3 Detection of Characteristic Points  78
5.4 Matching of Characteristic Points  79
5.5 Generation of the Mosaic Image  81
5.6 Bundle Adjusted Mosaicking  82
5.7 Spherical Mosaicking  83
6 Pose Estimation of Known 3D Objects With a Single Camera  84
6.1 Pose Estimation for General 3D Objects  84
6.2 Pose Estimation for 3D Circles  88
7 3D Machine Vision With a Binocular Stereo System  88
7.1 The Principle of Stereo Vision  89
7.2 The Setup of a Stereo Camera System  92
7.3 Calibrating the Stereo Camera System  93
7.4 Obtaining 3D Information from Images  98
7.5 Uncalibrated Stereo Vision  108
8 Robot Vision  109
8.1 The Principle of Hand-Eye Calibration  110
8.2 Determining Suitable Input Parameters  112
8.3 Performing the Calibration  118
8.4 Using the Calibration Data  119
9 Rectification of Arbitrary Distortions  123
9.1 Basic Principle  125
9.2 Rules for Taking Images of the Rectification Grid  125
9.3 Machine Vision on Ruled Surfaces  127
9.4 Using Self-Defined Rectification Grids  130
A The HALCON Calibration Plate  136
B HDevelop Procedures Used in this Application Note  137
B.1 gen_hom_mat3d_from_three_points  137
B.2 parameters_image_to_world_plane_centered  138
B.3 parameters_image_to_world_plane_entire  139
B.4 tilt_correction  140
B.5 visualize_results_of_find_marks_and_pose  140
B.6 display_calplate_coordinate_system  141
B.7 display_3d_coordinate_system  141
B.8 select_values_for_ith_image  142
B.9 calc_base_start_pose_movingcam  142
B.10 calc_cam_start_pose_movingcam  142
B.11 calc_calplate_pose_movingcam  143
B.12 calc_base_start_pose_stationarycam  143
B.13 calc_cam_start_pose_stationarycam  143
B.14 calc_calplate_pose_stationarycam  144
B.15 define_reference_coord_system  144


1 Can You Really Perform 3D Machine Vision with HALCON?

Do you have an application at hand where it is necessary to perform 3D measurements from images? Then, HALCON is exactly the right software solution for you. This application note will introduce you to the world of 3D machine vision with HALCON.

What makes HALCON very powerful in the area of 3D measurements is its ability to model the whole imaging process in 3D with the camera calibration. Among other things, this allows you to transform the image processing results into arbitrary 3D coordinate systems and thus to derive metrical information from the images, regardless of the position and orientation of the camera with respect to the object.

In general, no 3D measurements are possible if only one image of the object is available. Nevertheless, by using HALCON's camera calibration you can perform inspection tasks in 3D coordinates in specified object planes (section 3 on page 27). These planes can be oriented arbitrarily with respect to the camera. This is useful, e.g., if the camera cannot be mounted such that it looks perpendicular to the object surface. In any case, you can perform the image processing in the original images. Afterwards, the results can be transformed into 3D coordinates.

It is also possible to rectify the images such that they appear as if they were acquired from a camera that has no lens distortions and that looks exactly perpendicular onto the object surface (section 3.3 on page 49). This is useful for tasks like OCR or the recognition and localization of objects, which rely on images that are not distorted too much with respect to the training images.

If the object is too large to be covered by one image with the desired resolution, multiple images, each covering only a part of the object, can be combined into one larger mosaic image. This can be done either based on a calibrated camera setup with very high precision (section 4 on page 59) or highly automated for arbitrary and even varying image configurations (section 5 on page 69).

The position and orientation of 3D objects with respect to a given 3D coordinate system can be determined with the HALCON camera calibration (section 6 on page 84). This is necessary, e.g., for pick-and-place applications.

If you need to determine the 3D shape of arbitrary objects, you can use HALCON's binocular stereo vision functionality (section 7 on page 88). The 3D coordinates of any point on the object surface can be determined based on two images that are acquired suitably from different points of view. Thus, 3D inspection becomes possible.

A typical application area for 3D machine vision is robot vision, i.e., using the results of machine vision to command a robot. In such applications you must perform an additional calibration: the so-called hand-eye calibration, which determines the relation between camera and robot coordinates. Again, this calibration must be performed only once (offline); its results allow you to quickly transform machine vision results from camera into robot coordinates.

Thus, we can answer the question that was posed at the beginning: Yes, you can perform 3D machine vision with HALCON, but you have to calibrate your camera first. Don't be afraid of the calibration process: In HALCON, this can be done with just a few lines of code.

What is more, if you want to achieve accurate results, it is essential to calibrate the camera. It is of no use to extract edges with an accuracy of 1/40 pixel if the lens distortion of the uncalibrated camera accounts for a couple of pixels. This also applies if you use cameras with telecentric lenses.

We propose to read section 2.2 on page 19 first, as the camera model is described there. Then, depending on the task at hand, you can step into the appropriate section. To find this section, you can use the overview given in figure 1 and figure 2. For details on 3D transformations and poses, please refer to section 2.1 on page 8.

High-precision 3D measurements from images:

• Object with planar surfaces, object fits into one image → section 3 on page 27
• Object with planar surfaces, object is too large to be covered by one image → section 4 on page 59
• Object with arbitrary surface → section 7 on page 88
• Determine the position and orientation of known 3D objects → section 6 on page 84
• Robot vision → section 8 on page 109

Figure 1: How to find the appropriate section of this application note (part 1).


Rectification of distorted images:

• Correction for lens distortion only → section 3.3.2 on page 55
• Images of objects with planar surfaces, object fits into one image → section 3.3.1 on page 49
• Images of objects with planar surfaces, object is too large to be covered by one image → section 4 on page 59
• Images of objects with arbitrary surfaces → section 9 on page 123

Transformation of 3D coordinates into images for ROI definition or visualization → section 3.2.5 on page 47

Combination of images into a larger mosaic image:

• For high-precision measurement tasks → section 4 on page 59
• Flexible and highly automated → section 5 on page 69

Figure 2: How to find the appropriate section of this application note (part 2).


2 Basics

2.1 3D Transformations and Poses

Before we start explaining how to perform machine vision in world coordinates with HALCON, we take a closer look at some basic questions regarding the use of 3D coordinates:

• How to describe the transformation (translation and rotation) of points and coordinate systems,
• how to describe the position and orientation of one coordinate system relative to another, and
• how to determine the coordinates of a point in different coordinate systems, i.e., how to transform coordinates between coordinate systems.

In fact, all these tasks can be solved using one and the same means: homogeneous transformation matrices and their more compact equivalent, 3D poses.

2.1.1 3D Coordinates

The position of a 3D point P is described by its three coordinates (x_p, y_p, z_p). The coordinates can also be interpreted as a 3D vector (indicated by a bold-face lower-case letter). The coordinate system in which the point coordinates are given is indicated to the upper right of a vector or coordinate. For example, the coordinates of the point P in the camera coordinate system (denoted by the letter c) and in the world coordinate system (denoted by the letter w) would be written as:

p^c = (x_p^c, y_p^c, z_p^c)^T        p^w = (x_p^w, y_p^w, z_p^w)^T

Figure 3 depicts an example point lying in a plane where measurements are to be performed and its coordinates in the camera and world coordinate system, respectively.

Figure 3: Coordinates of a point in two different coordinate systems. The point lies in the measurement plane; in the example, its coordinates are p^c = (0, 2, 4)^T in the camera coordinate system (x^c, y^c, z^c) and p^w = (4, 3.3, 0)^T in the world coordinate system (x^w, y^w, z^w).


2.1.2 Translation

Translation of Points

In figure 4, our example point has been translated along the x-axis of the camera coordinate system.

Figure 4: Translating a point. The point p1 = (0, 2, 4)^T is translated by t = (4, 0, 0)^T, resulting in p2 = (4, 2, 4)^T.

The coordinates of the resulting point P2 can be calculated by adding two vectors, the coordinate vector p1 of the point and the translation vector t:

p2 = p1 + t = (x_p1 + x_t, y_p1 + y_t, z_p1 + z_t)^T    (1)

Multiple translations are described by adding the translation vectors. This operation is commutative, i.e., the sequence of the translations has no influence on the result.
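The corresponding operation can also be carried out with the HALCON operators that are introduced at the end of section 2.1.4. The following HDevelop lines are a minimal sketch that reproduces figure 4 (the variable names are only illustrative):

* Build a homogeneous transformation matrix for the translation t = (4, 0, 0).
hom_mat3d_identity (HomMat3DIdentity)
hom_mat3d_translate (HomMat3DIdentity, 4, 0, 0, HomMat3DTranslate)
* Apply it to p1 = (0, 2, 4); (Qx, Qy, Qz) is (4, 2, 4), i.e., the point p2.
affine_trans_point_3d (HomMat3DTranslate, 0, 2, 4, Qx, Qy, Qz)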

Translation of Coordinate Systems

Coordinate systems can be translated just like points. In the example in figure 5, the coordinate system c1 is translated to form a second coordinate system, c2. Then, the position of c2 in c1, i.e., the coordinate vector of its origin relative to c1 (o_c2^c1), is identical to the translation vector:

t^c1 = o_c2^c1    (2)

Figure 5: Translating a coordinate system (and point). In the example, t = (2, 0, 2)^T, q_1^c1 = q_2^c2 = (0, 0, 4)^T, and q_2^c1 = (2, 0, 6)^T.

Coordinate Transformations

Let's turn to the question how to transform point coordinates between (translated) coordinate systems. In fact, the translation of a point can also be thought of as translating it together with its local coordinate system. This is depicted in figure 5: The coordinate system c1, together with the point Q1, is translated by the vector t, resulting in the coordinate system c2 and the point Q2. The points Q1 and Q2 then have the same coordinates relative to their local coordinate system, i.e., q_1^c1 = q_2^c2.

If coordinate systems are only translated relative to each other, coordinates can be transformed very easily between them by adding the translation vector:

q_2^c1 = q_2^c2 + t^c1 = q_2^c2 + o_c2^c1    (3)

In fact, figure 5 visualizes this equation: q_2^c1, i.e., the coordinate vector of Q2 in the coordinate system c1, is composed by adding the translation vector t and the coordinate vector of Q2 in the coordinate system c2 (q_2^c2).

The downside of this graphical notation is that, at first glance, the direction of the translation vector appears to be contrary to the direction of the coordinate transformation: The vector points from the coordinate system c1 to c2, but transforms coordinates from the coordinate system c2 to c1. According to this, the coordinates of Q1 in the coordinate system c2, i.e., the inverse transformation, can be obtained by subtracting the translation vector from the coordinates of Q1 in the coordinate system c1:

q_1^c2 = q_1^c1 − t^c1 = q_1^c1 − o_c2^c1    (4)

Summary

• Points are translated by adding the translation vector to their coordinate vector. Analogously, coordinate systems are translated by adding the translation vector to the position (coordinate vector) of their origin.
• To transform point coordinates from a translated coordinate system c2 into the original coordinate system c1, you apply the same transformation to the points that was applied to the coordinate system, i.e., you add the translation vector used to translate the coordinate system c1 into c2.
• Multiple translations are described by adding all translation vectors; the sequence of the translations does not affect the result.

2.1.3 Rotation

Rotation of Points

In figure 6a, the point p1 is rotated by −90° around the z-axis of the camera coordinate system.

Figure 6: Rotate a point: (a) first around the z-axis; (b) then around the y-axis. In the example, p1 = (0, 2, 4)^T is rotated by Rz(−90°) to p3 = (2, 0, 4)^T and then by Ry(90°) to p4 = (4, 0, −2)^T.

Rotating a point is expressed by multiplying its coordinate vector with a 3 × 3 rotation matrix R. A rotation around the z-axis looks as follows:

                  ( cos γ   −sin γ   0 )   ( x_p1 )   ( cos γ · x_p1 − sin γ · y_p1 )
p3 = Rz(γ) · p1 = ( sin γ    cos γ   0 ) · ( y_p1 ) = ( sin γ · x_p1 + cos γ · y_p1 )    (5)
                  (   0        0     1 )   ( z_p1 )   (            z_p1             )

Rotations around the x- and y-axis correspond to the following rotation matrices:

        (  cos β   0   sin β )            ( 1     0        0    )
Ry(β) = (    0     1     0   )    Rx(α) = ( 0   cos α   −sin α  )    (6)
        ( −sin β   0   cos β )            ( 0   sin α    cos α  )

Chain of Rotations

In figure 6b, the rotated point is further rotated around the y-axis. Such a chain of rotations can be expressed very elegantly by a chain of rotation matrices:

p4 = Ry(β) · p3 = Ry(β) · Rz(γ) · p1    (7)

Note that in contrast to a multiplication of scalars, the multiplication of matrices is not commutative, i.e., if you change the sequence of the rotation matrices, you get a different result.

Rotation of Coordinate Systems

In contrast to points, coordinate systems have an orientation relative to other coordinate systems. This orientation changes when the coordinate system is rotated. For example, in figure 7a the coordinate system c3 has been rotated around the y-axis of the coordinate system c1, resulting in a different orientation of the camera. Note that in order to rotate a coordinate system in your mind's eye, it may help to imagine the points of the axis vectors being rotated.

Figure 7: Rotate a coordinate system together with a point: (a) first around the y-axis (Ry(90°)); (b) then around the z-axis (Rz(−90°)).

Just like the position of a coordinate system can be expressed directly by the translation vector (see equation 2 on page 9), the orientation is contained in the rotation matrix: The columns of the rotation matrix correspond to the axis vectors of the rotated coordinate system in coordinates of the original one:

R = ( x_c3^c1   y_c3^c1   z_c3^c1 )    (8)

For example, the axis vectors of the coordinate system c3 in figure 7a can be determined from the corresponding rotation matrix Ry(90°) as shown in the following equation; you can easily check the result in the figure.

           (  cos(90°)   0   sin(90°) )   (  0   0   1 )
Ry(90°)  = (     0       1      0     ) = (  0   1   0 )
           ( −sin(90°)   0   cos(90°) )   ( −1   0   0 )

⇒  x_c3^c1 = (0, 0, −1)^T,   y_c3^c1 = (0, 1, 0)^T,   z_c3^c1 = (1, 0, 0)^T

Coordinate Transformations

Like in the case of translation, to transform point coordinates from a rotated coordinate system c3 into the original coordinate system c1, you apply the same transformation to the points that was applied to the coordinate system c3, i.e., you multiply the point coordinates with the rotation matrix used to rotate the coordinate system c1 into c3:

q_3^c1 = ^c1R_c3 · q_3^c3    (9)

This is depicted in figure 7 also for a chain of rotations, which corresponds to the following equation:

q_4^c1 = ^c1R_c3 · ^c3R_c4 · q_4^c4 = Ry(β) · Rz(γ) · q_4^c4 = ^c1R_c4 · q_4^c4    (10)

In Which Sequence and Around Which Axes are Rotations Performed?

If you compare the chains of rotations in figure 6 and figure 7 and the corresponding equations 7 and 10, you will note that two different sequences of rotations are described by the same chain of rotation matrices: In figure 6, the point was rotated first around the z-axis and then around the y-axis, whereas in figure 7 the coordinate system is rotated first around the y-axis and then around the z-axis. Yet, both are described by the chain Ry(β) · Rz(γ)!

The solution to this seemingly paradoxical situation is that in the two examples the chain of rotation matrices can be “read” in different directions: In figure 6 it is read from right to left, and in figure 7 from left to right.

However, there still must be a difference between the two sequences because, as we already mentioned, the multiplication of rotation matrices is not commutative. This difference lies in the second question in the title, i.e., around which axes the rotations are performed.

Figure 8: Performing a chain of rotations Ry(90°) · Rz(−90°): (a) reading from left to right, i.e., rotating around the “new” axes, or (b) reading from right to left, i.e., rotating around the “old” axes.

Let's start with the second rotation of the coordinate system in figure 7b. Here, there are two possible sets of axes to rotate around: those of the “old” coordinate system c1 and those of the already rotated, “new” coordinate system c3. In the example, the second rotation is performed around the “new” z-axis. In contrast, when rotating points as in figure 6, there is only one set of axes around which to rotate: those of the “old” coordinate system.

From this, we derive the following rules:

• When reading a chain from left to right, rotations are performed around the “new” axes.
• When reading a chain from right to left, rotations are performed around the “old” axes.

As already remarked, point rotation chains are always read from right to left. In the case of coordinate systems, you can choose how to read a rotation chain. In most cases, however, it is more intuitive to read them from left to right.

Figure 8 shows that the two reading directions really yield the same result.
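This can also be verified with HALCON's matrix operators (introduced at the end of section 2.1.4). The following HDevelop lines are a minimal sketch for the chain Ry(90°) · Rz(−90°) of figure 8; the variable names are only illustrative:

* Reading from right to left: rotate around the "old" axes,
* first around the z-axis, then around the y-axis.
hom_mat3d_identity (HomMat3DIdentity)
hom_mat3d_rotate (HomMat3DIdentity, rad(-90), 'z', 0, 0, 0, HomMat3DTmp1)
hom_mat3d_rotate (HomMat3DTmp1, rad(90), 'y', 0, 0, 0, HomMat3DOld)
* Reading from left to right: rotate around the "new" axes,
* first around the y-axis, then around the z-axis.
hom_mat3d_rotate_local (HomMat3DIdentity, rad(90), 'y', HomMat3DTmp2)
hom_mat3d_rotate_local (HomMat3DTmp2, rad(-90), 'z', HomMat3DNew)
* Both chains yield the same matrix Ry(90°) * Rz(-90°).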

Summary

• Points are rotated by multiplying their coordinate vector with a rotation matrix.
• If you rotate a coordinate system, the rotation matrix describes its resulting orientation: The column vectors of the matrix correspond to the axis vectors of the rotated coordinate system in coordinates of the original one.
• To transform point coordinates from a rotated coordinate system c3 into the original coordinate system c1, you apply the same transformation to the points that was applied to the coordinate system, i.e., you multiply them with the rotation matrix that was used to rotate the coordinate system c1 into c3.
• Multiple rotations are described by a chain of rotation matrices, which can be read in two directions. When read from left to right, rotations are performed around the “new” axes; when read from right to left, the rotations are performed around the “old” axes.

2.1.4 Rigid Transformations and Homogeneous Transformation Matrices

Rigid Transformation of Points

If you combine translation and rotation, you get a so-called rigid transformation. For example, in figure 9, the translation and rotation of the point from figures 4 and 6 are combined. Such a transformation is described as follows:

p5 = R · p1 + t    (11)

For multiple transformations, such equations quickly become confusing, as the following example with two transformations shows:

p6 = Ra · (Rb · p1 + tb) + ta = Ra · Rb · p1 + Ra · tb + ta    (12)

Figure 9: Combining the translation from figure 4 on page 9 and the rotation of figure 6 on page 11 to form a rigid transformation. The point p1 = (0, 2, 4)^T is rotated by Rz(−90°) and Ry(90°) and then translated by t = (4, 0, 0)^T, resulting in p5 = (8, 0, −2)^T.

An elegant alternative is to use so-called homogeneous transformation matrices and the corresponding homogeneous vectors. A homogeneous transformation matrix H contains both the rotation matrix and the translation vector. For example, the rigid transformation from equation 11 can be rewritten as follows:

( p5 )   (    R       t )   ( p1 )   ( R · p1 + t )
(    ) = (              ) · (    ) = (            ) = H · p1    (13)
(  1 )   ( 0  0  0    1 )   (  1 )   (      1     )

The usefulness of this notation becomes apparent when dealing with sequences of rigid transformations, which can be expressed as chains of homogeneous transformation matrices, similarly to the rotation chains:

          (   Ra     ta )   (   Rb     tb )   ( Ra · Rb    Ra · tb + ta )
H1 · H2 = (             ) · (             ) = (                         )    (14)
          ( 0 0 0     1 )   ( 0 0 0     1 )   ( 0  0  0         1       )

As explained for chains of rotations, chains of rigid transformations can be read in two directions. When read from left to right, the transformations are performed around the “new” axes, when read from right to left around the “old” axes.

In fact, a rigid transformation is already a chain, since it consists of a translation and a rotation (I denotes the 3 × 3 identity matrix):

    (   R      t )   (   I      t )   (   R      0 )
H = (            ) = (            ) · (            ) = H(t) · H(R)    (15)
    ( 0 0 0    1 )   ( 0 0 0    1 )   ( 0 0 0    1 )

If the rotation is composed of multiple rotations around axes as in figure 9, the individual rotations can also be written as homogeneous transformation matrices:

    ( Ry(β) · Rz(γ)    t )   (   I      t )   ( Ry(β)    0 )   ( Rz(γ)    0 )
H = (                    ) = (            ) · (            ) · (            )    (16)
    (    0  0  0       1 )   ( 0 0 0    1 )   ( 0 0 0    1 )   ( 0 0 0    1 )

Reading this chain from right to left, you can follow the transformation of the point in figure 9: First, it is rotated around the z-axis, then around the (“old”) y-axis, and finally it is translated.

Rigid Transformation of Coordinate Systems

Rigid transformations of coordinate systems work along the same lines as described for a separate translation and rotation. This means that the homogeneous transformation matrix ^c1H_c5 describes the transformation of the coordinate system c1 into the coordinate system c5. At the same time, it describes the position and orientation of coordinate system c5 relative to coordinate system c1: Its column vectors contain the coordinates of the axis vectors and the origin.

          ( x_c5^c1   y_c5^c1   z_c5^c1   o_c5^c1 )
^c1H_c5 = (                                       )    (17)
          (    0         0         0         1    )

As already noted for rotations, chains of rigid transformations of coordinate systems are typically read from left to right. Thus, the chain above can be read as first translating the coordinate system, then rotating it around its “new” y-axis, and finally rotating it around its “newest” z-axis.

Coordinate Transformations

As described for the separate translation and the rotation, to transform point coordinates from a rigidly transformed coordinate system c5 into the original coordinate system c1, you apply the same transformation to the points that was applied to the coordinate system c5, i.e., you multiply the point coordinates with the homogeneous transformation matrix:

( p_5^c1 )             ( p_5^c5 )
(        ) = ^c1H_c5 · (        )    (18)
(    1   )             (    1   )

Typically, you leave out the homogeneous vectors if there is no danger of confusion and simply write:

p_5^c1 = ^c1H_c5 · p_5^c5

Summary

• Rigid transformations consist of a rotation and a translation. They are described very elegantly by<br />

homogeneous transformation matrices, which contain both the rotation matrix and the translation<br />

vector.<br />

• Points are transformed by multiplying their coordinate vector with the homogeneous transformation<br />

matrix.<br />


• If you transform a coordinate system, the homogeneous transformation matrix describes the coordinate<br />

system’s resulting position and orientation: The column vectors of the matrix correspond to<br />

the axis vectors and the origin of the coordinate system in coordinates of the original one. Thus,<br />

you could say that a homogeneous transformation matrix “is” the position and orientation of a<br />

coordinate system.<br />

• To transform point coordinates from a rigidly transformed coordinate system c5 into the original<br />

coordinate system c1, you apply the same transformation to the points that was applied to the<br />

coordinate system, i.e., you multiply them with the homogeneous transformation matrix that was<br />

used to transform the coordinate system c1 into c5.<br />

• Multiple rigid transformations are described by a chain of transformation matrices, which can be read in two directions. When read from left to right, rotations are performed around the “new” axes; when read from right to left, the transformations are performed around the “old” axes.

<strong>HALCON</strong> Operators<br />

As we already anticipated at the beginning of section 2.1 on page 8, homogeneous transformation matrices<br />

are the answer to all our questions regarding the use of 3D coordinates. Because of this, they form the<br />

basis for <strong>HALCON</strong>’s operators for 3D transformations. Below, you find a brief overview of the relevant<br />

operators. For more details follow the links into the Reference Manual.<br />

• hom_mat3d_identity creates the identity transformation
• hom_mat3d_translate translates along the “old” axes: H2 = H(t) · H1
• hom_mat3d_translate_local translates along the “new” axes: H2 = H1 · H(t)
• hom_mat3d_rotate rotates around the “old” axes: H2 = H(R) · H1
• hom_mat3d_rotate_local rotates around the “new” axes: H2 = H1 · H(R)
• hom_mat3d_compose multiplies two transformation matrices: H3 = H1 · H2
• hom_mat3d_invert inverts a transformation matrix: H2 = H1^(-1)
• affine_trans_point_3d transforms a point using a transformation matrix: p2 = H · p1
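As an illustration, the following HDevelop lines chain these operators to build a rigid transformation of the form H = H(t) · H(Ry(β)) · H(Rz(γ)) and apply it to a point. This is only a minimal sketch; the numeric values are arbitrary example values, not taken from the text.

* Start with the identity transformation.
hom_mat3d_identity (HomMat3DIdentity)
* Translate by t = (4, -1.3, 4) (arbitrary example values).
hom_mat3d_translate_local (HomMat3DIdentity, 4, -1.3, 4, HomMat3DT)
* Rotate around the "new" y-axis, then around the "newest" z-axis.
hom_mat3d_rotate_local (HomMat3DT, rad(90), 'y', HomMat3DTRy)
hom_mat3d_rotate_local (HomMat3DTRy, rad(90), 'z', HomMat3D)
* Transform the point p1 = (1, 0, 0) with the resulting matrix: p2 = H * p1.
affine_trans_point_3d (HomMat3D, 1, 0, 0, X2, Y2, Z2)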

2.1.5 3D Poses<br />

Homogeneous transformation matrices are a very elegant means of describing transformations, but their content, i.e., the elements of the matrix, is often difficult to read, especially the rotation part. This problem is alleviated by using so-called 3D poses.

A 3D pose is nothing more than an easier-to-understand representation of a rigid transformation:<br />

Instead of the 12 elements of the homogeneous transformation matrix, a pose describes<br />

the rigid transformation with 6 parameters, 3 for the rotation and 3 for the translation:<br />

(TransX, TransY, TransZ, RotX, RotY, RotZ). The main principle behind poses is that even a rotation<br />

around an arbitrary axis can always be represented by a sequence of three rotations around the axes<br />

of a coordinate system.<br />

In <strong>HALCON</strong>, you create 3D poses with create_pose; to transform between poses and homogeneous<br />

matrices you can use hom_mat3d_to_pose and pose_to_hom_mat3d.<br />


Sequence of Rotations<br />

However, there is more than one way to represent an arbitrary rotation by three parameters. This is<br />

reflected by the <strong>HALCON</strong> operator create_pose, which lets you choose between different pose types<br />

with the parameter OrderOfRotation. If you pass the value ’gba’, the rotation is described by the<br />

following chain of rotations:<br />

Rgba = Rx(RotX) · Ry(RotY) · Rz(RotZ) (19)<br />

You may also choose the inverse order by passing the value ’abg’:<br />

Rabg = Rz(RotZ) · Ry(RotY) · Rx(RotX) (20)<br />

For example, the transformation discussed in the previous sections can be represented by the homogeneous transformation matrix

H = \begin{pmatrix} R_y(\beta) \cdot R_z(\gamma) & t \\ 0\;0\;0 & 1 \end{pmatrix} = \begin{pmatrix} \cos\beta\cos\gamma & -\cos\beta\sin\gamma & \sin\beta & x_t \\ \sin\gamma & \cos\gamma & 0 & y_t \\ -\sin\beta\cos\gamma & \sin\beta\sin\gamma & \cos\beta & z_t \\ 0 & 0 & 0 & 1 \end{pmatrix}

The corresponding pose with the rotation order ’gba’ is much easier to read:

(TransX = x_t, TransY = y_t, TransZ = z_t, RotX = 0, RotY = 90°, RotZ = 90°)

If you look closely at figure 7 on page 12, you can see that the rotation can also be described by the sequence R_z(90°) · R_x(90°). Thus, the transformation can also be described by the following pose with the rotation order ’abg’:

(TransX = x_t, TransY = y_t, TransZ = z_t, RotX = 90°, RotY = 0, RotZ = 90°)
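The following HDevelop fragment is a minimal sketch that creates both poses with create_pose and converts them into homogeneous transformation matrices; since they describe the same rigid transformation, both conversions yield the same matrix. The values passed for OrderOfTransform and ViewOfTransform (’Rp+T’ and ’point’) are assumed standard settings, and x_t, y_t, z_t are replaced by arbitrary numbers.

* Arbitrary example values standing in for xt, yt, zt.
Xt := 0.1
Yt := 0.2
Zt := 0.3
* Pose with rotation order 'gba': R = Rx(RotX) * Ry(RotY) * Rz(RotZ).
create_pose (Xt, Yt, Zt, 0, 90, 90, 'Rp+T', 'gba', 'point', PoseGba)
* Pose with rotation order 'abg': R = Rz(RotZ) * Ry(RotY) * Rx(RotX).
create_pose (Xt, Yt, Zt, 90, 0, 90, 'Rp+T', 'abg', 'point', PoseAbg)
* Both poses correspond to the same homogeneous transformation matrix.
pose_to_hom_mat3d (PoseGba, HomMat3DGba)
pose_to_hom_mat3d (PoseAbg, HomMat3DAbg)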

<strong>HALCON</strong> Operators<br />

Below, the relevant <strong>HALCON</strong> operators for dealing with 3D poses are briefly described. For more details<br />

follow the links into the Reference Manual.<br />

• create_pose creates a pose<br />

• hom_mat3d_to_pose converts a homogeneous transformation matrix into a pose<br />

• pose_to_hom_mat3d converts a pose into a homogeneous transformation matrix<br />

• convert_pose_type changes the pose type<br />

• write_pose writes a pose into a file<br />

• read_pose reads a pose from a file<br />

• set_origin_pose translates a pose along its “new” axes
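A typical round trip with these operators might look like the following minimal sketch; the file name and all numeric values are placeholders chosen for illustration only.

* Create a pose (placeholder values), write it to a file, and read it back.
create_pose (0.1, 0.2, 0.3, 0, 90, 90, 'Rp+T', 'gba', 'point', Pose)
write_pose (Pose, 'pose_measurement_plane.dat')
read_pose ('pose_measurement_plane.dat', PoseRead)
* Shift the pose along its "new" z-axis, e.g., to compensate a known offset.
set_origin_pose (PoseRead, 0, 0, 0.001, PoseShifted)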


How to Determine the Pose of a Coordinate System<br />


The previous sections showed how to describe known transformations using translation vectors, rotation matrices, homogeneous transformation matrices, or poses. Sometimes, however, there is another task:

How to describe the position and orientation of a coordinate system with a pose. This is necessary, e.g.,<br />

when you want to use your own calibration object and need to determine starting values for the exterior<br />

camera parameters as described in section 3.1.3 on page 32.<br />

Figure 10 shows how to proceed for a rather simple example. The task is to determine the pose of the<br />

world coordinate system from figure 3 on page 8 relative to the camera coordinate system.<br />

Figure 10: Determining the pose of the world coordinate system in camera coordinates: the camera coordinate system (x^c, y^c, z^c) is first translated by t = (4, −1.3, 4)^T into an intermediate coordinate system and then rotated by R_y(180°) into the world coordinate system (x^w, y^w, z^w), which corresponds to the pose (4, −1.3, 4, 0, 180°, 0).

In such a case, we recommend building up the rigid transformation from individual translations and rotations from left to right. Thus, in figure 10 the camera coordinate system is first translated such that its origin coincides with that of the world coordinate system. Now, the y-axes of the two coordinate systems coincide; after rotating the (translated) camera coordinate system around its (new) y-axis by 180°, it has the correct orientation.
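In HDevelop, this chain can be built with the operators introduced in the previous sections; the following lines are a minimal sketch of the procedure illustrated in figure 10, using the translation and rotation values shown in the figure.

* Start with the camera coordinate system, i.e., the identity transformation.
hom_mat3d_identity (HomMat3DIdentity)
* Translate the camera coordinate system so that its origin coincides
* with the origin of the world coordinate system: t = (4, -1.3, 4).
hom_mat3d_translate_local (HomMat3DIdentity, 4, -1.3, 4, HomMat3DTranslate)
* Rotate the translated coordinate system around its (new) y-axis by 180 degrees.
hom_mat3d_rotate_local (HomMat3DTranslate, rad(180), 'y', HomMat3DWorld)
* Convert the result into the pose (4, -1.3, 4, 0, 180, 0).
hom_mat3d_to_pose (HomMat3DWorld, PoseWorldInCam)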

2.2 Camera Model and Parameters<br />

If you want to derive accurate world coordinates from your imagery, you first have to calibrate your<br />

camera. To calibrate a camera, a model for the mapping of the 3D points of the world to the 2D image<br />

generated by the camera, lens, and frame grabber is necessary.<br />

HALCON supports the calibration of two different kinds of cameras: area scan cameras and line scan cameras. While area scan cameras acquire the image in one step, line scan cameras generate the image line by line (see Application Note on Image Acquisition, section 6.3 on page 34). Therefore, the line scan camera must move relative to the object during the acquisition process.

Two different types of lenses are relevant for machine vision tasks. The first type of lens effects a<br />

perspective projection of the world coordinates into the image, just like the human eye does. With this<br />

type of lens, objects become smaller in the image the farther they are away from the camera. This<br />

combination of camera and lens is called a pinhole camera model because the perspective projection can<br />


also be achieved if a small hole is drilled in a thin planar object and this plane is held parallel in front of<br />

another plane (the image plane).<br />

The second type of lens that is relevant for machine vision is called a telecentric lens. Its major difference<br />

is that it effects a parallel projection of the world coordinates onto the image plane (for a certain range<br />

of distances of the object from the camera). This means that objects have the same size in the image<br />

independent of their distance to the camera. This combination of camera and lens is called a telecentric<br />

camera model.<br />

In the following, first the camera model for area scan cameras is described in detail, then, the camera<br />

model for line scan cameras is explained.<br />

2.2.1 Area scan cameras<br />

Figure 11 displays the perspective projection effected by a pinhole camera graphically. The world point<br />

P is projected through the optical center of the lens to the point P ′ in the image plane, which is located<br />

at a distance of f (the focal length) behind the optical center. Actually, the term focal length is not quite correct; it would be appropriate only for an infinite object distance. To simplify matters, the term focal length is nevertheless used in the following even if the image distance is meant. Note that the focal length and thus the focus must not be changed after applying the camera calibration.

Although the image plane in reality lies behind the optical center of the lens, it is easier to pretend that<br />

it lies at a distance of f in front of the optical center, as shown in figure 12. This causes the image<br />

coordinate system to be aligned with the pixel coordinate system (row coordinates increase downward<br />

and column coordinates to the right) and simplifies most calculations.<br />

With this, we are now ready to describe the projection of objects in 3D world coordinates to the 2D<br />

image plane and the corresponding camera parameters. First, we should note that the points P are given<br />

in a world coordinate system (WCS). To make the projection into the image plane possible, they need<br />

to be transformed into the camera coordinate system (CCS). The CCS is defined so that its x and y axes<br />

are parallel to the column and row axes of the image, respectively, and the z axis is perpendicular to the<br />

image plane.<br />

The transformation from the WCS to the CCS is a rigid transformation, which can be expressed by a pose<br />

or, equivalently, by the homogeneous transformation matrix {}^{c}H_{w}. Therefore, the camera coordinates p^c = (x^c, y^c, z^c)^T of point P can be calculated from its world coordinates p^w = (x^w, y^w, z^w)^T simply by

p^c = {}^{c}H_{w} \cdot p^w \qquad (21)

The six parameters of this transformation (the three translations t_x, t_y, and t_z and the three rotations α, β, and γ) are called the exterior camera parameters because they determine the position of the camera with respect to the world. In HALCON, they are stored as a pose, i.e., together with a code that describes the order of translation and rotations.
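As a minimal sketch (with placeholder pose values, not taken from the text), the transformation of a single world point into camera coordinates can be written in HDevelop as follows:

* Exterior camera parameters given as a pose (placeholder values).
create_pose (0.1, -0.05, 0.8, 10, 20, 30, 'Rp+T', 'gba', 'point', PoseExterior)
* Convert the pose into the homogeneous transformation matrix cHw ...
pose_to_hom_mat3d (PoseExterior, HomMat3DCamWorld)
* ... and transform the world point pw = (0, 0, 0) into camera coordinates.
affine_trans_point_3d (HomMat3DCamWorld, 0.0, 0.0, 0.0, Xc, Yc, Zc)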

The next step is the projection of the 3D point given in the CCS into the image plane coordinate system (IPCS). For the pinhole camera model, the projection is a perspective projection, which is given by

\begin{pmatrix} u \\ v \end{pmatrix} = \frac{f}{z^c} \cdot \begin{pmatrix} x^c \\ y^c \end{pmatrix} \qquad (22)


Figure 11: Perspective projection by a pinhole camera: the world point P, given in the world coordinate system (x^w, y^w, z^w), is projected through the optical center onto the point P′ on the CCD chip; shown are the camera coordinate system (x^c, y^c, z^c), the image plane coordinate system (u, v), the image coordinate system (r, c), the focal length f, the principal point (C_x, C_y), and the pixel sizes S_x and S_y.


Figure 12: Image plane and virtual image plane: the virtual image plane lies at the distance f in front of the optical center; the coordinate systems correspond to those of figure 11.

For the telecentric camera model, the projection is a parallel projection, which is given by

\begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} x^c \\ y^c \end{pmatrix} \qquad (23)

As can be seen, there is no focal length f for telecentric cameras. Furthermore, the distance z of the object to the camera has no influence on the image coordinates.

After the projection into the image plane, the lens distortion causes the coordinates (u, v)^T to be modified. If no lens distortions were present, the projected point P′ would lie on a straight line from P through the optical center, indicated by the dotted line in figure 13. Lens distortions cause the point P′ to lie at a different position.

Figure 13: Schematic illustration of the effect of the lens distortion: the distorted point P′ on the CCD chip deviates from the straight line from P through the optical center.

The lens distortion is a transformation that can be modeled in the image plane alone, i.e., 3D information is unnecessary. For most lenses, the distortion can be approximated sufficiently well by a radial distortion, which is given by

\begin{pmatrix} \tilde{u} \\ \tilde{v} \end{pmatrix} = \frac{2}{1 + \sqrt{1 - 4\kappa(u^2 + v^2)}} \begin{pmatrix} u \\ v \end{pmatrix} \qquad (24)

The parameter κ models the magnitude of the radial distortions. If κ is negative, the distortion is barrel-shaped, while for positive κ it is pincushion-shaped (see figure 14). This model for the lens distortions has the great advantage that the distortion correction can be calculated analytically by

\begin{pmatrix} u \\ v \end{pmatrix} = \frac{1}{1 + \kappa(\tilde{u}^2 + \tilde{v}^2)} \begin{pmatrix} \tilde{u} \\ \tilde{v} \end{pmatrix} \qquad (25)

Finally, the point (ũ, ṽ)^T is transformed from the image plane coordinate system into the image coordinate system (the pixel coordinate system):

\begin{pmatrix} r \\ c \end{pmatrix} = \begin{pmatrix} \tilde{v}/S_y + C_y \\ \tilde{u}/S_x + C_x \end{pmatrix} \qquad (26)

Here, S_x and S_y are scaling factors. For pinhole cameras, they represent the horizontal and vertical distance of the sensor elements on the CCD chip of the camera. For cameras with telecentric lenses, they represent the size of a pixel in world coordinates (not taking into account the radial distortions).

Figure 14: Effect of radial distortions for κ > 0 (left), κ = 0 (middle), and κ < 0 (right).

The point (C_x, C_y)^T is the principal point of the image. For pinhole cameras, this is the perpendicular

projection of the optical center onto the image plane, i.e., the point in the image from which a ray through<br />

the optical center is perpendicular to the image plane. It also defines the center of the radial distortions.<br />

For telecentric cameras, no optical center exists. Therefore, the principal point is solely defined by the<br />

radial distortions.<br />

The six parameters (f, κ, Sx, Sy, Cx, Cy) of the pinhole camera and the five parameters<br />

(κ, Sx, Sy, Cx, Cy) of the telecentric camera are called the interior camera parameters because they<br />

determine the projection from 3D to 2D performed by the camera.<br />

In HALCON, the differentiation between the two camera models (pinhole and telecentric) is done based on the value of the focal length. If it has a positive value, a pinhole camera with the given focal length is assumed. If the focal length is set to zero, the telecentric camera model is used.

With this, we can see that camera calibration is the process of determining the interior camera parameters<br />

(f, κ, Sx, Sy, Cx, Cy) and the exterior camera parameters (tx, ty, tz, α, β, γ).<br />
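The interior camera parameters are passed to the calibration operators as a tuple of the form [f, κ, Sx, Sy, Cx, Cy, NumColumns, NumRows]; section 3.1.3 on page 32 describes this format in detail. As a sketch (reusing the example values given in section 3.1.3), a pinhole and a telecentric parameterization differ only in the first element:

* Interior camera parameters [f, kappa, Sx, Sy, Cx, Cy, NumColumns, NumRows].
* Pinhole camera with a 16 mm lens (example values from section 3.1.3):
CamParPinhole := [0.016,0,0.0000074,0.0000074,326,247,652,494]
* Telecentric camera model: the focal length is set to zero
* (the remaining values are placeholders for illustration only).
CamParTelecentric := [0.0,0,0.0000074,0.0000074,326,247,652,494]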

2.2.2 Line scan cameras<br />

A line scan camera has only a one-dimensional line of sensor elements, i.e., to acquire an image, the<br />

camera must move relative to the object (see figure 15). This means that the camera moves over a fixed<br />

object, the object travels in front of a fixed camera, or camera and object are both moving.<br />

The relative motion between the camera and the object is modeled in <strong>HALCON</strong> as part of the interior<br />

camera parameters. In <strong>HALCON</strong>, the following assumptions for this motion are made:<br />

1. the camera moves — relative to the object — with constant velocity along a straight line<br />

2. the orientation of the camera is constant with respect to the object<br />

3. the motion is equal for all images<br />

The motion is described by the motion vector V = (Vx, Vy, Vz) T , which must be given in [meters/scanline]<br />

in the camera coordinate system. The motion vector describes the motion of the camera,<br />

i.e., it assumes a fixed object. In fact, this is equivalent to the assumption of a fixed camera with the<br />

object traveling along −V .


Figure 15: Principle of line scan image acquisition: the CCD sensor line moves with the motion vector V = (V_x, V_y, V_z)^T relative to the object; the projection is performed through the optical center.

The camera coordinate system of line scan cameras is defined as follows (see figure 16 on page 26): The<br />

origin of the coordinate system is the center of projection. The z-axis is identical to the optical axis and<br />

it is directed so that the visible points have positive z coordinates. The y-axis is perpendicular to the<br />

sensor line and to the z-axis. It is directed so that the motion vector has a positive y-component, i.e., if a<br />

fixed object is assumed, the y-axis points in the direction in which the camera is moving. The x-axis is<br />

perpendicular to the y- and z-axis, so that the x-, y-, and z-axis form a right-handed coordinate system.<br />

Similarly to area scan cameras, the projection of a point given in world coordinates into the image is<br />

modeled in two steps: First, the point is transformed into the camera coordinate system. Then, it is<br />

projected into the image.<br />

As the camera moves over the object during the image acquisition, also the camera coordinate system<br />

moves relative to the object, i.e., each image line has been imaged from a different position. This means<br />

that there would be an individual pose for each image line. To make things easier, in <strong>HALCON</strong> all<br />

transformations from world coordinates into camera coordinates and vice versa are based on the pose of<br />

the first image line only. The motion V is taken into account during the projection of the point p c into<br />

the image.<br />

The transformation from the WCS to the CCS of the first image line is a rigid transformation, which can be expressed by a pose or, equivalently, by the homogeneous transformation matrix {}^{c}H_{w}. Therefore, the camera coordinates p^c = (x^c, y^c, z^c)^T of point P can be calculated from its world coordinates p^w = (x^w, y^w, z^w)^T simply by

p^c = {}^{c}H_{w} \cdot p^w \qquad (27)

The six parameters of this transformation (the three translations t_x, t_y, and t_z and the three rotations α, β, and γ) are called the exterior camera parameters because they determine the position of the camera with respect to the world. In HALCON, they are stored as a pose, i.e., together with a code that describes the order of translation and rotations.

Figure 16: Coordinate systems in regard to a line scan camera: the camera coordinate system (x^c, y^c, z^c), the sensor line coordinate system (r^s, c^s), the image plane coordinate system (u, v), and the world coordinate system (x^w, y^w, z^w).

For line scan cameras, the projection of the point p^c that is given in the camera coordinate system of the first image line into a (sub-)pixel [r,c] in the image is defined as follows:

Assuming p^c = (x, y, z)^T, the following set of equations must be solved for m, ũ, and t:

m \cdot D \cdot \tilde{u} = x - t \cdot V_x
-m \cdot D \cdot p_v = y - t \cdot V_y
m \cdot f = z - t \cdot V_z

with

D = \frac{1}{1 + \kappa(\tilde{u}^2 + p_v^2)} \qquad \text{and} \qquad p_v = S_y \cdot C_y

This already includes the compensation for radial distortions.

Finally, the point is transformed into the image coordinate system, i.e., the pixel coordinate system:

c = \tilde{u}/S_x + C_x \qquad \text{and} \qquad r = t

Sx and Sy are scaling factors. Sx represents the distance of the sensor elements on the CCD line, Sy is<br />

the extent of the sensor elements in y-direction. The point (Cx, Cy) T is the principal point. <strong>Note</strong> that<br />

in contrast to area scan images, (Cx, Cy) T does not define the position of the principal point in image<br />

coordinates. It rather describes the relative position of the principal point with respect to the sensor line.<br />

The nine parameters (f, κ, Sx, Sy, Cx, Cy, Vx, Vy, Vz) of the pinhole line scan camera are called the<br />

interior camera parameters because they determine the projection from 3D to 2D performed by the<br />

camera.<br />

As for area scan cameras, the calibration of a line scan camera is the process of determining<br />

the interior camera parameters (f, κ, Sx, Sy, Cx, Cy, Vx, Vy, Vz) and the exterior camera parameters<br />

(tx, ty, tz, α, β, γ) of the first image line.<br />

3 3D Machine Vision in a Specified Plane With a Single Camera<br />

In <strong>HALCON</strong> it is easy to obtain undistorted measurements in world coordinates from images. In general,<br />

this can only be done if two or more images of the same object are taken at the same time with cameras<br />

at different spatial positions. This is the so-called stereo approach; see section 7 on page 88.<br />

In industrial inspection, we often have only one camera available and time constraints do not allow us<br />

to use the expensive process of finding corresponding points in the stereo images (the so-called stereo<br />

matching process).<br />

Nevertheless, it is possible to obtain measurements in world coordinates for objects acquired through telecentric lenses and, for pinhole cameras, for objects that lie in a known plane, e.g., on an assembly line.

Both of these tasks can be solved by intersecting an optical ray (also called line of sight) with a plane.<br />

With this, it is possible to measure objects that lie in a plane, even when the plane is tilted with respect to<br />

the optical axis. The only prerequisite is that the camera has been calibrated. In <strong>HALCON</strong>, the calibration<br />

process is very easy as can be seen in the following first example, which introduces the operators that are<br />

necessary for the calibration process.<br />

The easiest way to perform the calibration is to use the <strong>HALCON</strong> standard calibration plates. You just<br />

need to take a few images of the calibration plate (see figure 17 for an example), where in one image the<br />

calibration plate has been placed directly on the measurement plane.<br />

<strong>Note</strong> that the calibration plate has an asymmetrical pattern such that the coordinate system can be<br />

uniquely determined. Older calibration plates do not have this pattern but you can easily add it by<br />

yourself (see appendix A on page 136).<br />


Figure 17: The <strong>HALCON</strong> calibration plate.<br />

After reading in the calibration images, the operators find_caltab and find_marks_and_pose can be<br />

used to detect the calibration plate and to determine the exact positions of the (dark) calibration targets on<br />

it. Additionally, some approximate values are determined, which are necessary for the further processing.<br />

find_caltab (Image, Caltab, CaltabName, SizeGauss, MarkThresh, MinDiamMarks)
find_marks_and_pose (Image, Caltab, CaltabName, StartCamPar, StartThresh, DeltaThresh, MinThresh, Alpha, MinContLength, MaxDiamMarks, RCoord, CCoord, StartPose)

After collecting the positions of the calibration targets and the approximate values for all the calibration<br />

images, the operator camera_calibration can be called. It determines the interior camera parameters<br />

as well as the pose of the calibration plate in each of the calibration images.<br />

camera_calibration (X, Y, Z, NRow, NCol, StartCamPar, NStartPose, 'all', CamParam, NFinalPose, Errors)

Now, you can pick the pose of the calibration plate from the image, where the calibration plate has been<br />

placed on the measurement plane.<br />

Based on this pose, it is easy to transform image coordinates into world coordinates. For example, to<br />

transform point coordinates, the operator image_points_to_world_plane can be used.<br />

image_points_to_world_plane (CamParam, Pose, Row, Col, ’mm’, X1, Y1)<br />

Alternatively, the image can be transformed into the world coordinate system by using the operator<br />

image_to_world_plane (see section 3.3.1 on page 49).<br />

image_to_world_plane (Image, ImageMapped, CamParam, PoseForCenteredImage, WidthMappedImage, HeightMappedImage, ScaleForCenteredImage, 'bilinear')

In the following sections, we will describe the calibration process as well as the transformation between<br />

the image and the world coordinates in detail.


3.1 Calibrating the Camera<br />


In <strong>HALCON</strong>, area scan cameras as well as line scan cameras can be calibrated. In both cases, the same<br />

operators are used. The differentiation among area scan and line scan cameras is done based on the<br />

number of interior camera parameters. If the interior camera parameters contain the motion vector, a line<br />

scan camera is assumed. Otherwise, the camera model of an area scan camera is used. See section 2.2<br />

on page 19 for the description of the underlying camera models.<br />

As you have seen above, in <strong>HALCON</strong> the calibration is determined simply by using the operator camera_calibration.<br />

Its input can be grouped into two categories:<br />

1. Corresponding points, given in world coordinates as well as in image coordinates<br />

2. Initial values for the camera parameters.<br />

The first category of input parameters requires the location of a sufficiently large number of 3D points in<br />

world coordinates and the correspondence between the world points and their projections in the image.<br />

To define the 3D points in world coordinates, usually objects or marks that are easy to extract, e.g.,<br />

circles or linear grids, are placed into known locations. If the location of a camera must be known with<br />

respect to a given coordinate system, e.g., with respect to the building plan of, say, a factory building,<br />

then each mark location must be measured very carefully within this coordinate system. Fortunately,<br />

in most cases it is sufficient to know the position of a reference object with respect to the camera to<br />

be able to measure the object precisely, since the absolute position of the object in world coordinates is<br />

unimportant. Therefore, we propose to use a <strong>HALCON</strong> calibration plate (figure 18). See section 3.1.6 on<br />

page 43 on how to obtain this calibration plate. You can place it almost anywhere in front of the camera<br />

to determine the camera parameters.<br />

Figure 18: Examples of calibration plates used by HALCON.

The determination of the correspondence of the known world points and their projections in the image is<br />

in general a hard problem. The <strong>HALCON</strong> calibration plate is constructed such that this correspondence<br />

can be determined automatically.<br />

Also the second category of input parameters, the starting values, can be determined automatically if the<br />

<strong>HALCON</strong> calibration plate is used.<br />


The results of the operator camera_calibration are the interior camera parameters and the pose of<br />

the calibration plate in each of the images from which the corresponding points were determined. If the<br />

calibration plate was placed directly on the measurement plane its pose can be used to easily derive the<br />

exterior camera parameters, which are the pose of the measurement plane.<br />

<strong>Note</strong> that the determination of the interior and of the exterior camera parameters can be separated. For<br />

this, the operator camera_calibration must be called twice. First, for the determination of the interior<br />

camera parameters only. Then, for the determination of the exterior camera parameters with the interior<br />

camera parameters remaining unchanged. This may be useful in cases where the measurements should<br />

be carried out in several planes when using a single camera.<br />

In the following, the calibration process is described in detail, especially the determination of the necessary<br />

input values. Additionally, some hints are given on how to obtain precise results.<br />

3.1.1 Camera Calibration Input I: Corresponding Points<br />

The first category of input parameters for the operator camera_calibration comprises corresponding<br />

points, i.e., points, for which the world coordinates as well as the image coordinates of their projections<br />

into the image are given.<br />

If the <strong>HALCON</strong> calibration plate is used, the world coordinates of the calibration marks can be read<br />

from the calibration plate description file using the operator caltab_points. It returns the coordinates<br />

stored in the tuples X, Y, and Z. The length of these tuples depends on the number of calibration marks.<br />

Assume we have a calibration plate with m calibration marks. Then, X, Y, and Z are of length m.<br />

caltab_points (CaltabName, X, Y, Z)<br />

As mentioned above, it is necessary to extract the marks of the calibration plate and to know the correspondence<br />

between the marks extracted from the image and the respective 3D points. If the <strong>HALCON</strong><br />

calibration plate is used, this can be achieved by using the operator find_caltab to find the inner part<br />

of the calibration plate and find_marks_and_pose to locate the centers of the circles and to determine<br />

the correspondence.<br />

for I := 1 to NumImages by 1
    read_image (Image, ImgPath+'calib_'+I$'02d')
    find_caltab (Image, Caltab, CaltabName, SizeGauss, MarkThresh, MinDiamMarks)
    find_marks_and_pose (Image, Caltab, CaltabName, StartCamPar, StartThresh, DeltaThresh, MinThresh, Alpha, MinContLength, MaxDiamMarks, RCoord, CCoord, StartPose)
    NStartPose := [NStartPose,StartPose]
    NRow := [NRow,RCoord]
    NCol := [NCol,CCoord]
endfor

find_caltab searches for the calibration plate based on the knowledge that it appears bright with dark<br />

calibration marks on it. SizeGauss determines the size of the Gauss filter that is used to smooth the<br />

input image. A larger value leads to a stronger smoothing, which might be necessary if the image is<br />

very noisy. After smoothing the image, a thresholding operator with minimum gray value MarkThresh



and maximum gray value 255 is applied with the intention to find the calibration plate. Therefore,<br />

MarkThresh should be set to a gray value that is lower than that of the white parts of the calibration<br />

plate, but preferably higher than that of any other large bright region in the image. Among the regions<br />

resulting from the threshold operation, the most convex region with an almost correct number of holes<br />

(corresponding to the dark marks of the calibration plate) is selected. Holes with a diameter smaller than<br />

MinDiamMarks are eliminated to reduce the impact of noise. The number of marks is read from the<br />

calibration plate description file CalTabDescrFile.<br />

find_marks_and_pose extracts the calibration marks and precisely determines their image coordinates. For this, an edge detector is applied in the input image Image to the region CalTabRegion, which can be found by the operator find_caltab. The edge detector can be controlled via the parameter

Alpha. Larger values for Alpha lead to a higher sensitivity of the edge detector with respect to small<br />

details, but also to less robustness to noise.<br />

In the edge image, closed contours are extracted. For the detection of the contours a threshold operator<br />

is applied to the amplitude of the edges. All points with a high amplitude (i.e., borders of marks) are<br />

selected. First, the threshold value is set to StartThresh. If the search for the closed contours or<br />

the successive pose estimate (see section 3.1.3) fails, this threshold value is successively decreased by<br />

DeltaThresh down to a minimum value of MinThresh.<br />

The number of closed contours must correspond to the number of calibration marks as described in<br />

the calibration plate description file CalTabDescrFile and the contours must have an elliptical shape.<br />

Contours shorter than MinContLength are discarded, just as contours enclosing regions with a diameter<br />

larger than MaxDiamMarks (e.g., the border of the calibration plate).<br />

The image coordinates of the calibration marks are determined by applying find_marks_and_pose for<br />

each image separately. They must be concatenated such that all row coordinates are together in one tuple<br />

and all column coordinates are in a second tuple.<br />

The length of these tuples depends on the number of calibration marks and on the number of calibration<br />

images. Assume we have a calibration plate with m calibration marks and l calibration images. Then,<br />

the tuples containing all the image coordinates of the calibration marks have a length of m · l, because<br />

they contain the coordinates of the m calibration marks extracted from each of the l images. The order of<br />

the values is “image by image”, i.e., the first m values correspond to the coordinates of the m calibration<br />

marks extracted from the first image, namely in the order in which they appear in the parameters X, Y,<br />

and Z, which are returned by the operator caltab_points. The next m values correspond to the marks<br />

extracted from the second image, etc.<br />

<strong>Note</strong> that the order of all the parameter values must be followed strictly. Therefore, it is very important<br />

that each calibration mark is extracted in each image.<br />

3.1.2 Rules for Taking Calibration Images<br />

If you want to achieve accurate results, please follow the rules given in this section:<br />

• Use a clean calibration plate.<br />

• Cover the whole field of view with multiple images, i.e., place the calibration plate in all areas of the field of view at least once.


• Vary the orientations of the calibration plate. This includes rotations around the x- and y-axis<br />

of the calibration plate, such that the perspective distortions of the calibration pattern are clearly<br />

visible.<br />

• Use at least 10 – 15 images.<br />

• Use an illumination where the background is darker than the calibration plate.<br />

• The bright parts of the calibration plate should have a gray value of at least 100.<br />

• The contrast between the bright and the dark parts of the calibration plate should be more than 100<br />

gray values.<br />

• Use an illumination under which the calibration plate appears homogeneously bright.

• The images should not be overexposed.<br />

• The diameter of a circle should be at least 10 pixels.<br />

• The calibration plate should be completely inside the image.<br />

• The images should contain as little noise as possible.<br />

If you take into account these few rules for the acquisition of the calibration images, you can expect all<br />

<strong>HALCON</strong> operators used for the calibration process to work properly.<br />

If only one image is used for the calibration process or if the orientations of the calibration plate do not<br />

vary over the different calibration images it is not possible to determine both the focal length and the pose<br />

of the camera correctly; only the ratio between the focal length and the distance between calibration plate<br />

and camera can be determined in this case. Nevertheless, it is possible to measure world coordinates in<br />

the plane of the calibration plate but it is not possible to adapt the camera parameters in order to measure<br />

in another plane, e.g., the plane onto which the calibration plate was placed.<br />

The accuracy of the resulting world coordinates depends, apart from the measurement accuracy in the image, very much on the number of images used for the calibration process. The more images are used, the more accurate the results will be.

3.1.3 Camera Calibration Input II: Initial Values<br />

The second category of input parameters of the operator camera_calibration comprises initial values<br />

for the camera parameters.<br />

As the camera calibration is a difficult non-linear optimization problem, good initial values are required<br />

for the parameters.<br />

The initial values for the interior camera parameters can be determined from the specifications of the<br />

CCD sensor and the lens. They must be given as a tuple of the form [f, κ, Sx, Sy, Cx, Cy, NumColumns,<br />

NumRows] for area scan cameras and [f, κ, Sx, Sy, Cx, Cy, NumColumns, NumRows, Vx, Vy, Vz] for<br />

line scan cameras, respectively, i.e., in addition to the interior camera parameters, the width (number of<br />

columns) and height (number of rows) of the image must be given. See section 2.2 on page 19 for a<br />

description of the interior camera parameters.


StartCamPar := [0.016,0,0.0000074,0.0000074,326,247,652,494]<br />


In the following, some hints for the determination of the initial values for the interior camera parameters<br />

of an area scan camera are given:<br />

Focus f: The initial value is the nominal focal length of the used lens, e.g., 0.016 m.<br />

κ: Use 0.0 as initial value.<br />

Sx: For pinhole cameras, the initial value for the horizontal distance between two neighboring<br />

CCD cells depends on the dimension of the used CCD chip of the camera<br />

(see technical specifications of the camera). Generally, common CCD chips are<br />

either 1/3”-Chips (e.g., SONY XC-73, SONY XC-777), 1/2”-Chips (e.g., SONY<br />

XC-999, Panasonic WV-CD50), or 2/3”-Chips (e.g., SONY DXC-151, SONY XC-<br />

77). Notice: The value of Sx increases if the image is sub-sampled! Appropriate<br />

initial values are:<br />

Full image (640*480) Subsampling (320*240)<br />

1/3"-Chip 0.0000055 m 0.0000110 m<br />

1/2"-Chip 0.0000086 m 0.0000172 m<br />

2/3"-Chip 0.0000110 m 0.0000220 m<br />

The value for Sx is calibrated, since the video signal of a CCD camera normally is<br />

not sampled pixel-synchronously.<br />

Sy: Since most off-the-shelf cameras have square pixels, the same values for Sy are<br />

valid as for Sx. In contrast to Sx the value for Sy will not be calibrated for pinhole<br />

cameras because the video signal of a CCD camera normally is sampled line-synchronously.

Thus, the initial value is equal to the final value. Appropriate initial<br />

values are:<br />

Full image (640*480) Subsampling (320*240)<br />

1/3"-Chip 0.0000055 m 0.0000110 m<br />

1/2"-Chip 0.0000086 m 0.0000172 m<br />

2/3"-Chip 0.0000110 m 0.0000220 m<br />

Cx and Cy: Initial values for the coordinates of the principal point are the coordinates of the image center, i.e., half the image width and half the image height. Notice: The values of Cx and Cy decrease if the image is subsampled! Appropriate initial values are, for example:

            Full image (640*480)   Subsampling (320*240)
Cx          320.0                  160.0
Cy          240.0                  120.0

ImageWidth and ImageHeight: These two parameters are set by the used frame grabber and therefore are not calibrated. Appropriate initial values are, for example:

            Full image (640*480)   Subsampling (320*240)
ImageWidth  640                    320
ImageHeight 480                    240


In the following, some hints for the determination of the initial values for the interior camera parameters<br />

of a line scan camera are given:<br />

Focus f: The initial value is the nominal focal length of the used lens, e.g., 0.035 m.

κ: Use 0.0 as initial value.<br />

Sx: The initial value for the horizontal distance between two neighboring sensor elements<br />

can be taken from the technical specifications of the camera. Typical initial<br />

values are 7·10 −6 m, 10·10 −6 m, and 14·10 −6 m. Notice: The value of Sx increases<br />

if the image is subsampled!<br />

Sy: The initial value for the size of a cell in the direction perpendicular to the sensor<br />

line can also be taken from the technical specifications of the camera. Typical<br />

initial values are 7·10 −6 m, 10·10 −6 m, and 14·10 −6 m. Notice: The value of Sy<br />

increases if the image is subsampled! In contrast to Sx, the value for Sy will NOT<br />

be calibrated for line scan cameras because it cannot be determined separately from<br />

the parameter Cy.<br />

Cx: The initial value for the x-coordinate of the principal point is half the image width.<br />

Notice: The value of Cx decreases if the image is subsampled! Appropriate initial

values are:<br />

Image width: 1024 2048 4096 8192<br />

Cx: 512 1024 2048 4096<br />

Cy: Normally, the initial value for the y-coordinate of the principal point can be set to<br />

0.<br />

ImageWidth<br />

and<br />

ImageHeight:<br />

These two parameters are determined by the used frame grabber and therefore are<br />

not calibrated.



Vx, Vy, Vz: The initial values for the x-, y-, and z-component of the motion vector depend on<br />

the image acquisition setup. Assuming a fixed camera that looks perpendicularly<br />

onto a conveyor belt, such that the y-axis of the camera coordinate system is antiparallel<br />

to the moving direction of the conveyor belt (see figure 19 on page 36),<br />

the initial values are Vx = Vz = 0. The initial value for Vy can then be determined,<br />

e.g., from a line scan image of an object with known size (e.g., calibration plate or<br />

ruler):<br />

Vy = L[m]/L[row]<br />

with:<br />

L[m] = Length of the object in object coordinates [meter]<br />

L[row] = Length of the object in image coordinates [rows]<br />

If, compared to the above setup, the camera is rotated 30 degrees around its optical axis, i.e., around the z-axis of the camera coordinate system (figure 20 on page 37), the above determined initial values must be changed as follows:

Vzx = sin(30°) · Vy
Vzy = cos(30°) · Vy
Vzz = Vz = 0

If, compared to the first setup, the camera is rotated -20 degrees around the x-axis of the camera coordinate system (figure 21 on page 38), the following initial values result:

Vxx = Vx = 0
Vxy = cos(−20°) · Vy
Vxz = sin(−20°) · Vy

The quality of the initial values for Vx, Vy, and Vz is crucial for the success of the whole calibration. If they are not accurate enough, the calibration may fail.
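As a small numeric sketch of the rule Vy = L[m]/L[row] (the numbers are assumed example values, not taken from the text): if an object of 0.1 m length covers 1000 rows in a line scan image taken with the perpendicular setup of figure 19, the initial motion vector would be

* Assumed example: an object of 0.1 m length covers 1000 image rows.
Vx := 0.0
* Vy = L[m] / L[row] = 0.1 / 1000 = 0.0001 m per scanline
Vy := 0.0001
Vz := 0.0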

The initial values for the exterior parameters are in general harder to obtain. For the <strong>HALCON</strong> calibration<br />

plate, good starting values are computed by the operator find_marks_and_pose based on the geometry<br />

and size of the projected calibration marks. Again, these values are determined for each calibration image<br />

separately. They must be concatenated into one tuple. Assume we have l calibration images. Then, the<br />

length of this tuple is l · 7 (l times the 6 exterior camera parameters together with the code for the pose<br />

type). The first 7 values correspond to the camera pose of the first image, the next 7 values to the pose<br />

of the second image, etc.<br />

If you use another calibration object the operator find_marks_and_pose cannot be used. In this case,<br />

you must determine the initial values for the exterior parameters yourself.<br />

Figure 19: Line scan camera looking perpendicularly onto a conveyor belt (the coordinate systems shown correspond to those of figure 16).

3.1.4 Determining the Interior Camera Parameters<br />

Given some initial values for the camera parameters, the known 3D locations of the calibration marks<br />

can be projected into the CCS. Then, the camera parameters can be determined such that the distance of<br />

the projections of the calibration marks and the mark locations extracted from the imagery is minimized.<br />

This minimization process will return fairly accurate values for the camera parameters. However, to<br />

obtain the camera parameters with the highest accuracy, it is essential that more than one image of the<br />

calibration plate is taken, where the plate is placed and rotated differently in each image so as to use<br />

all degrees of freedom of the exterior orientation. A typical sequence of images used for calibration is<br />

displayed in figure 22.<br />

If l images of the calibration plate are taken, the parameters to optimize are the interior parameters and<br />

l sets of the exterior parameters. Now, the aim of the optimization is to determine all these parameters<br />

such that in each of the l images the distance of the extracted mark locations and the projections of the respective<br />

3D locations is minimal. In <strong>HALCON</strong>, this is exactly what the operator camera_calibration<br />

does.


Figure 20: Line scan camera rotated around the optical axis (the coordinate systems shown correspond to those of figure 16).

camera_calibration (X, Y, Z, NRow, NCol, StartCamPar, NStartPose, 'all', CamParam, NFinalPose, Errors)

The operator camera_calibration needs the coordinates of the corresponding points in the world coordinate<br />

system and the pixel coordinate system as well as some initial values for the camera parameters.<br />

See section 3.1.1 on page 30 and section 3.1.3 on page 32 for a description on how to obtain these input<br />

values.<br />

If the parameter EstimateParams is set to 'all', the interior parameters for the used camera are determined as well as the exterior parameters for each image. If the parameter is set to 'pose', only the exterior parameters are determined. To determine just selected parameters, EstimateParams can be set to a list that contains the respective parameter names (['alpha', 'beta', 'gamma', 'transx', 'transy', 'transz', 'focus', 'kappa', 'cx', 'cy', 'sx', 'sy']). It is also possible to prevent the determination of certain parameters by adding their names with the prefix ~ to the list, e.g., if EstimateParams is set to ['all', '~focus'], all parameters but the focal length are determined.
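For instance, if the focal length of the lens is known precisely and should be kept fixed, the call from above can be repeated with a modified EstimateParams value; this is a sketch with all other arguments unchanged:

* Calibrate all parameters except the focal length.
camera_calibration (X, Y, Z, NRow, NCol, StartCamPar, NStartPose, ['all','~focus'], CamParam, NFinalPose, Errors)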

The computed average errors (Errors) give an impression of the accuracy of the calibration. The error<br />

values (deviations in x- and y-coordinates) are given in pixels.<br />

Up to now, the exterior parameters are not necessarily related to the measurement plane.<br />

Figure 21: Line scan camera rotated around the x-axis (the coordinate systems shown correspond to those of figure 16).

Figure 22: A sequence of calibration images.


3.1.5 Determining the Exterior Camera Parameters<br />


The exterior camera parameters describe the relation between the measurement plane and the camera, i.e., only if the exterior parameters are known is it possible to transform coordinates from the CCS into the coordinate system of the measurement plane and vice versa. In HALCON, the measurement plane is

defined as the plane z = 0 of the world coordinate system (WCS). The exterior camera parameters can<br />

be determined in different ways:<br />

1. Use the pose of the calibration plate present in one of the calibration images. In this case, it is not<br />

necessary to call the operator camera_calibration a second time.<br />

2. Obtain an additional calibration image where the calibration plate has been placed directly on<br />

the measurement plane. Apply find_caltab and find_marks_and_pose to extract the calibration<br />

marks. Then, use the operator camera_calibration to determine only the exterior camera<br />

parameters.<br />

3. Determine the correspondences between 3D world points and their projections in the image by<br />

yourself. Again, use the operator camera_calibration to determine the exterior camera parameters.<br />

If it is only necessary to measure accurately the dimensions of an object, regardless of the absolute<br />

position of the object in a given coordinate system, one of the first two cases can be used.<br />

The latter two cases have the advantage that the exterior camera parameters can be determined independently<br />

from the interior camera parameters. This is more flexible and might be useful if the measurements<br />

should be done in several planes from one camera only or if it is not possible to calibrate the camera in<br />

situ.<br />

In the following, these different cases are described in more detail.<br />

The first case is the easiest way of determining the exterior parameters. The calibration plate must be<br />

placed directly on the measurement plane, e.g., the assembly line, in one of the (many) images used for<br />

the determination of the interior parameters.<br />

Since the pose of the calibration plate is determined by the operator camera_calibration, you can<br />

just pick the respective pose from the output parameter NFinalPose. In this way, interior and exterior<br />

parameters are determined in one single calibration step. The following code fragment from the program<br />

hdevelop\camera_calibration_multi_image.dev is an example for this easy way of determining<br />

the exterior parameters. Here, the pose of the calibration plate in the eleventh calibration image is<br />

determined. Please note that each pose consists of seven values.<br />

NumImage := 11<br />

Pose := NFinalPose[(NumImage-1)*7:(NumImage-1)*7+6]<br />

The resulting pose would be the true pose of the measurement plane if the calibration plate were infinitely<br />

thin. Because real calibration plates have a thickness d > 0, the pose of the calibration plate is shifted<br />

by an amount −d perpendicular to the measurement plane, i.e., along the z axis of the WCS. To correct<br />

this, we need to shift the pose by d along the z axis of the WCS. To perform this shift, the operator<br />

set_origin_pose can be used. The corresponding <strong>HALCON</strong> code is:<br />


set_origin_pose (Pose, 0, 0, 0.00075, Pose)<br />

In general, the calibration plate can be oriented arbitrarily within the WCS (see figure 23). In this case,<br />

to derive the pose of the measurement plane from the pose of the calibration plate a rigid transformation<br />

is necessary. In the following example, the pose of the calibration plate is adapted by a translation along<br />

the y axis followed by a rotation around the x axis.<br />

pose_to_hom_mat3d (FinalPose, HomMat3D)<br />

hom_mat3d_translate_local (HomMat3D, 0, 3.2, 0, HomMat3DTranslate)<br />

hom_mat3d_rotate_local (HomMat3DTranslate, rad(-14), ’x’, HomMat3DAdapted)<br />

hom_mat3d_to_pose (HomMat3DAdapted, PoseAdapted)<br />

Figure 23: Relation between calibration plate and measurement plane: shown are the camera coordinate system (x^c, y^c, z^c), the calibration plate coordinate system (x^cp, y^cp, z^cp), and the world coordinate system (x^w, y^w, z^w); the measurement plane is the plane z^w = 0, and the transformations ^cH_cp, ^cH_w, and ^cpH_w relate the camera, the calibration plate, and the world coordinate system.


3.1.5 Determining the Exterior Camera Parameters 41<br />

If the advantages of using the <strong>HALCON</strong> calibration plate should be combined with the flexibility given<br />

by the separation of the interior and exterior camera parameters the second method for the determination<br />

of the exterior camera parameters can be used.<br />

At first, only the interior parameters are determined as described in section 3.1.4 on page 36. This can<br />

be done, e.g., prior to the deployment of the camera.<br />

This is shown in the example program hdevelop\camera_calibration_interior.dev, which is<br />

similar to the example program given above, except that no image is used in which the calibration plate<br />

is positioned on the object and that the calculated interior camera parameters are written to a file.<br />

<strong>Note</strong> that we do not use any of the images to derive the pose of the measurement plane.<br />

for I := 1 to NumImages by 1<br />

read_image (Image, ImgPath+’calib_’+I$’02d’)<br />

find_caltab (Image, Caltab, CaltabName, SizeGauss, MarkThresh,<br />

MinDiamMarks)<br />

find_marks_and_pose (Image, Caltab, CaltabName, StartCamPar,<br />

StartThresh, DeltaThresh, MinThresh, Alpha,<br />

MinContLength, MaxDiamMarks, RCoord, CCoord,<br />

StartPose)<br />

NStartPose := [NStartPose,StartPose]<br />

NRow := [NRow,RCoord]<br />

NCol := [NCol,CCoord]<br />

endfor<br />

camera_calibration (X, Y, Z, NRow, NCol, StartCamPar, NStartPose, ’all’,<br />

CamParam, NFinalPose, Errors)<br />

Then, the interior camera parameters can be written to a file:<br />

write_cam_par (CamParam, ’camera_parameters.dat’)<br />

Then, after installing the camera at its usage site, the exterior parameters can be determined. The only thing to be done is to take an image in which the calibration plate is placed directly on the measurement plane; from this image, the exterior parameters can be derived.

Again the operators find_caltab and find_marks_and_pose can be used to extract the calibration<br />

marks. Then, the operator camera_calibration with the parameter EstimateParams set to ’pose’<br />

determines just the pose of the calibration plate and leaves the interior camera parameters unchanged.

Alternatively, EstimateParams can be set to the tuple [’alpha’, ’beta’, ’gamma’, ’transx’, ’transy’,<br />

’transz’], which also means that the six exterior parameters will be estimated. Again, the pose must be<br />

corrected for the thickness of the calibration plate as described above.<br />

The program hdevelop\camera_calibration_exterior.dev shows how to determine the exterior<br />

camera parameters from a calibration plate that is positioned on the object’s surface.<br />

First, the interior camera parameters, the image, where the calibration plate was placed directly on the<br />

measurement plane, and the world coordinates of the calibration marks are read from file:<br />

read_cam_par (’camera_parameters.dat’, CamParam)<br />

read_image (Image, ImgPath+’calib_11’)<br />

caltab_points (CaltabName, X, Y, Z)<br />

Then, the calibration marks are extracted:


find_caltab (Image, Caltab, CaltabName, SizeGauss, MarkThresh,<br />

MinDiamMarks)<br />

find_marks_and_pose (Image, Caltab, CaltabName, CamParam, StartThresh,<br />

DeltaThresh, MinThresh, Alpha, MinContLength,<br />

MaxDiamMarks, RCoord, CCoord,<br />

InitialPoseForCalibrationPlate)<br />

Now, the actual calibration can be carried out:<br />

camera_calibration (X, Y, Z, RCoord, CCoord, CamParam,<br />

InitialPoseForCalibrationPlate, ’pose’,<br />

CamParamUnchanged, FinalPoseFromCalibrationPlate,<br />

Errors)<br />

Finally, to take the thickness of the calibration plate into account, the z value of the origin given by the<br />

camera pose must be translated by the thickness of the calibration plate:<br />

set_origin_pose (FinalPoseFromCalibrationPlate, 0, 0, 0.00075,<br />

FinalPoseFromCalibrationPlate)<br />

<strong>Note</strong> that it is very important to fix the focus of your camera if you want to separate the calibration<br />

process into two steps as described in this section, because changing the focus is equivalent to changing<br />

the focal length, which is part of the interior parameters.<br />

If it is necessary to perform the measurements within a given world coordinate system, the third case for<br />

the determination of the exterior camera parameters can be used. Here, you need to know the 3D world<br />

coordinates of at least three points that do not lie on a straight line. Then, you must determine the corresponding<br />

image coordinates of the projections of these points. Now, the operator camera_calibration<br />

with the parameter EstimateParams set to ’pose’ can be used for the determination of the exterior camera<br />

parameters.<br />

<strong>Note</strong> that in this case, no calibration plate needs to be placed on the measurement plane. This means also<br />

that the operator find_marks_and_pose cannot be used to extract the calibration marks. Therefore,<br />

you must generate the input parameter tuples NX, NY, and NZ as well as NRow and NCol yourself. Also<br />

the initial values for the pose must be set appropriately because in this case they are not determined<br />

automatically since the operator find_marks_and_pose is not used.<br />

An example for this possibility of determining the exterior parameters is given in the following program.<br />

First, the world coordinates of three points are set:<br />

X := [0,50,100]<br />

Y := [5,0,5]<br />

Z := [0,0,0]<br />

Then, the image coordinates of the projections of these points in the image are determined. In this<br />

example, they are simply set to some approximate values. In reality, they should be determined with<br />

subpixel accuracy since they define the exterior camera parameters:<br />

RCoord := [414,227,85]<br />

CCoord := [119,318,550]<br />

Now, the starting value for the pose must be set appropriately:



create_pose (-50, 25, 400, 0, 0, -30, ’Rp+T’, ’gba’, ’point’, InitialPose)<br />

Finally, the actual determination of the exterior camera parameters can be carried out:<br />

camera_calibration (X, Y, Z, RCoord, CCoord, CamParam, InitialPose, ’pose’,<br />

CamParamUnchanged, FinalPose, Errors)<br />

Also in this case, it is very important to fix the focus of your camera because changing the focus is<br />

equivalent to changing the focal length, which is part of the interior parameters.<br />

3.1.6 How to Obtain a Suitable Calibration Plate<br />

The simplest method to determine the camera parameters of a CCD camera is to use the <strong>HALCON</strong><br />

calibration plate. In this case, the whole process of finding the calibration plate, extracting the calibration<br />

marks, and determining the correspondences between the extracted calibration marks and the respective<br />

3D world coordinates can be carried out automatically. Even more important, these calibration plates are<br />

highly accurate, up to ± 150 nm (nanometers), which is a prerequisite for high accuracy applications.<br />

Therefore, we recommend obtaining such a calibration plate from the local distributor from which you purchased HALCON.

The calibration plates are available in different materials (ceramics for front light and glass for back<br />

light applications) and sizes (e.g., 0.65 × 0.65 mm 2 , 10 × 10 mm 2 , 200 × 200 mm 2 ). Thus, you can<br />

choose the one that is optimal for your application. As a rule of thumb, the width of the calibration<br />

plate should be approximately one third of the image width. For example, if the image shows an area of<br />

100 mm × 70 mm, the 30 × 30 mm 2 calibration plate would be appropriate. Detailed information about<br />

the available materials, sizes, and the accuracy can be obtained from your distributor.<br />

Each calibration plate comes with a description file. Place this file in the subdirectory calib of the folder<br />

where you installed <strong>HALCON</strong>, then you can use its file name directly in the operator caltab_points<br />

(see section 3.1.1 on page 30).<br />

For test purposes, you can create a calibration plate yourself with the operator gen_caltab. Print the resulting PostScript file and mount it on a planar and rigid surface, e.g., an aluminum plate or solid cardboard. If the printout is not mounted on a planar and rigid surface, HALCON's camera calibration will not yield meaningful results because gen_caltab assumes that the calibration marks lie within a plane. Such self-made calibration plates should only be used for test purposes, as they do not reach the high accuracy of an original HALCON calibration plate. Note that the printing process is typically not accurate enough to create calibration plates smaller than 3 cm.
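The following lines sketch how such a test plate could be generated; the number of marks, the mark distance, the diameter ratio, and the file names are example assumptions, not prescribed values.

* Hedged sketch: generate the description file and the PostScript file for a
* self-made 7x7 test plate with a mark distance of 12.5 mm (example values).
gen_caltab (7, 7, 0.0125, 0.5, 'caltab_test.descr', 'caltab_test.ps')

The generated description file can then be passed to caltab_points and find_caltab in the same way as the description file of an original calibration plate.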

3.1.7 Using Your Own Calibration Object<br />

With <strong>HALCON</strong>, you are not restricted to using a planar calibration object like the <strong>HALCON</strong> calibration<br />

plate. The operator camera_calibration is designed such that the input tuples NX, NY, and NZ for<br />

the world coordinates of the calibration marks and NRow and NCol for the image coordinates of the<br />

locations of the calibration marks within the images can contain any 3D/2D correspondences (compare<br />

section 3.1.4 on page 36).<br />

Thus, it is not important how the required 3D model marks and the corresponding extracted 2D marks<br />

are determined. You can use a 3D calibration object or even arbitrary characteristic points (natural<br />


landmarks). The only requirement is that the 3D world position of the model points is known with high<br />

accuracy.<br />

However, if you use your own calibration object, you cannot use the operators find_caltab and<br />

find_marks_and_pose anymore. Instead, you must determine the 2D locations of the model points<br />

and the correspondence to the respective 3D points as well as the initial value for the poses yourself.<br />
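As an illustration only, the following hedged sketch shows how such user-defined correspondences could be passed to camera_calibration; all coordinate values, the start pose, and the tuple lengths are placeholders, and in practice many more well-distributed correspondences (typically accumulated over several images, with one start pose per image) are required.

* Hedged sketch with placeholder values: 3D world coordinates of the model
* points (in meters) and their measured 2D image coordinates.
NX := [0.0,0.1,0.1,0.0]
NY := [0.0,0.0,0.08,0.08]
NZ := [0.0,0.0,0.0,0.02]
NRow := [105.2,98.7,310.4,322.1]
NCol := [88.3,415.6,407.9,95.0]
* Rough initial pose of the calibration object, estimated from the setup.
create_pose (0, 0, 0.6, 0, 0, 0, 'Rp+T', 'gba', 'point', StartPose)
camera_calibration (NX, NY, NZ, NRow, NCol, StartCamPar, StartPose, 'all',
                    CamParam, NFinalPose, Errors)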

3.1.8 Special information for the calibration of line scan cameras<br />

In general, the procedure for the calibration of line scan cameras is identical to the one for the calibration<br />

of area scan cameras.<br />

However, line scan imaging suffers from a high degree of parameter correlation. For example, any small<br />

rotation of the linear array around the x-axis of the camera coordinate system can be compensated by<br />

changing the y-component of the translation vector of the respective pose. Even the focal length is<br />

correlated with the scale factor Sx and with the z-component of the translation vector of the pose, i.e.,<br />

with the distance of the object from the camera.<br />

The consequences of these correlations for the calibration of line scan cameras are that some parameters<br />

cannot be determined with high absolute accuracy. Nevertheless, the set of parameters is determined<br />

consistently, which means that the world coordinates can still be measured with high accuracy.

Another consequence of the parameter correlations is that the calibration may fail in some cases where<br />

the start values for the interior camera parameters are not accurate enough. If this happens, try the<br />

following approach: In many cases, the start values for the motion vector are the most difficult to set.<br />

To achieve better start values for the parameters Vx, Vy, and Vz, reduce the number of parameters to be<br />

estimated such that the camera calibration succeeds. Try first to estimate the parameters Vx, Vy, Vz, α,<br />

β, γ, tx, ty, and tz by setting EstimateParams to [’vx’, ’vy’, ’vz’, ’alpha’, ’beta’, ’gamma’, ’transx’,<br />

’transy’, ’transz’] and if this does not work, try to set EstimateParams to [’vx’, ’vy’, ’vz’, ’transx’,<br />

’transy’, ’transz’]. Then, determine the whole set of parameters using the above determined values for<br />

Vx, Vy, and Vz as start values.<br />
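A hedged sketch of this two-stage strategy is given below; the tuple names follow the earlier examples, and the intermediate results CamParTmp, NPoseTmp, and ErrorsTmp are names introduced here only for illustration.

* First stage: estimate only the motion vector and the poses, keeping the
* remaining interior camera parameters at their start values.
camera_calibration (X, Y, Z, NRow, NCol, StartCamPar, NStartPose,
                    ['vx','vy','vz','alpha','beta','gamma','transx',
                    'transy','transz'], CamParTmp, NPoseTmp, ErrorsTmp)
* Second stage: use the result as start values for the full calibration.
camera_calibration (X, Y, Z, NRow, NCol, CamParTmp, NPoseTmp, 'all',
                    CamParam, NFinalPose, Errors)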

If this still does not work, repeat the determination of the start poses with the operator<br />

find_marks_and_pose using the above determined values for the interior camera parameters as start<br />

values. Then retry to calibrate the camera with the operator camera_calibration.<br />

If none of the above works, try to determine better start values directly from the camera setup. If possible,<br />

change the setup such that it is easier to determine appropriate start values, e.g., mount the camera such<br />

that it looks approximately perpendicularly onto the conveyor belt (see figure 19 on page 36).<br />

3.2 Transforming Image into World Coordinates and Vice Versa<br />

In this section, you learn how to obtain world coordinates from images based on the calibration data. On<br />

the one hand, it is possible to process the images as usual and then to transform the extraction results<br />

into the world coordinate system. In many cases, this will be the most efficient way of obtaining world<br />

coordinates. On the other hand, some applications may require that the segmentation itself must be<br />

carried out in images that are already transformed into the world coordinate system (see section 3.3 on<br />

page 49).



In general, the segmentation process reduces the amount of data that needs to be processed. Therefore,<br />

rectifying the segmentation results is faster than rectifying the underlying image. What is more, it is<br />

often better to perform the segmentation process directly on the original images because smoothing or<br />

aliasing effects may occur in the rectified image, which could disturb the segmentation and may lead to<br />

inaccurate results. These arguments suggest rectifying the segmentation results instead of the images.

In the following, first some general remarks on the underlying principle of the transformation of image<br />

coordinates into world coordinates are given. Then, it is described how to transform points, contours,<br />

and regions into the world coordinate system. Finally, we show that it is possible to transform world<br />

coordinates into image coordinates as well, e.g., in order to visualize information given in the world<br />

coordinate system.<br />

3.2.1 The Main Principle<br />

Given the image coordinates of one point, the goal is to determine the world coordinates of the corresponding<br />

point in the measurement plane. For this, the line of sight, i.e., a straight line from the optical<br />

center of the camera through the given point in the image plane, must be intersected with the measurement<br />

plane (see figure 24).<br />

The calibration data is necessary to transform the image coordinates into camera coordinates and finally<br />

into world coordinates.<br />

All these calculations are performed by the operators of the family ..._to_world_plane.<br />

Again, please remember that in <strong>HALCON</strong> the measurement plane is defined as the plane z = 0 with<br />

respect to the world coordinate system. This means that all points returned by the operators of the family<br />

..._to_world_plane have a z coordinate equal to zero, i.e., they lie in the plane z = 0 of the world

coordinate system.<br />

3.2.2 World Coordinates for Points<br />

The world coordinates of an image point (r, c) can be determined using the operator<br />

image_points_to_world_plane. In the following code example, the row and column coordinates<br />

of pitch lines are transformed into world coordinates.<br />

image_points_to_world_plane (CamParam, FinalPose, RowPitchLine,<br />

ColPitchLine, 1, X1, Y1)<br />

As input, the operator requires the interior and exterior camera parameters as well as the row and column<br />

coordinates of the point(s) to be transformed.<br />

Additionally, the unit in which the resulting world coordinates are to be given is specified by the parameter<br />

Scale (see also the description of the operator image_to_world_plane in section 3.3.1 on page<br />

49). This parameter is the ratio between the unit in which the resulting world coordinates are to be given<br />

and the unit in which the world coordinates of the calibration target are given (equation 28).<br />

Scale = (unit of resulting world coordinates) / (unit of world coordinates of calibration target)    (28)

Figure 24: Intersecting the line of sight with the measurement plane. (The figure shows the camera coordinate system (x_c, y_c, z_c), the image plane coordinate system (u, v), the image coordinate system (r, c), the world coordinate system (x_w, y_w, z_w), and the line of sight from the optical center through the image point P′ to the intersection point P in the measurement plane z_w = 0.)

In many cases the coordinates of the calibration target are given in meters. In this case, it is possible to<br />

set the unit of the resulting coordinates directly by setting the parameter Scale to ’m’ (corresponding<br />

to the value 1.0, which could be set alternatively for the parameter Scale), ’cm’ (0.01), ’mm’ (0.001),<br />

’microns’ (1e-6), or ’µm’ (again, 1e-6). Then, if the parameter Scale is set to, e.g., ’m’, the resulting<br />

coordinates are given in meters. If, e.g., the coordinates of the calibration target are given in µm and the


resulting coordinates have to be given in millimeters, the parameter Scale must be set to:

Scale = mm / µm = (1 · 10^-3 m) / (1 · 10^-6 m) = 1000    (29)

3.2.3 World Coordinates for Contours

If you want to convert an XLD object containing pixel coordinates into world coordinates, the operator<br />

contour_to_world_plane_xld can be used. Its parameters are similar to those of the operator<br />

image_points_to_world_plane, as can be seen from the following example program:<br />

lines_gauss (ImageReduced, Lines, 1, 3, 8, ’dark’, ’true’, ’true’, ’true’)<br />

contour_to_world_plane_xld (Lines, ContoursTrans, CamParam, PoseAdapted, 1)<br />

3.2.4 World Coordinates for Regions<br />

In <strong>HALCON</strong>, regions cannot be transformed directly into the world coordinate system. Instead, you<br />

must first convert them into XLD contours using the operator gen_contour_region_xld, then apply<br />

the transformation to these XLD contours as described in the previous section.<br />

If the regions have holes and if these holes would influence your further calculations, set the parameter<br />

Mode of the operator gen_contour_region_xld to ’border_holes’. Then, in addition to the outer<br />

border of the input region the operator gen_contour_region_xld returns the contours of all holes.<br />
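A minimal sketch of this two-step conversion follows; the names Region, CamParam, and FinalPose are assumed to be available from the preceding segmentation and calibration steps.

* Convert the region (including the borders of its holes) into XLD contours
* and transform these contours into the measurement plane (here in meters).
gen_contour_region_xld (Region, RegionContours, 'border_holes')
contour_to_world_plane_xld (RegionContours, RegionContoursWCS, CamParam,
                            FinalPose, 'm')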

3.2.5 Transforming World Coordinates into Image Coordinates<br />

In this section, the transformation between image coordinates and world coordinates is performed in<br />

the opposite direction, i.e., from world coordinates to image coordinates. This is useful if you want to<br />

visualize information given in world coordinates or it may be helpful for the definition of meaningful<br />

regions of interest (ROI).<br />

First, the world coordinates must be transformed into the camera coordinate system. For this, the homogeneous transformation matrix that maps world coordinates into camera coordinates is needed; it can easily be derived from the pose of the measurement plane with respect to the camera by the operator pose_to_hom_mat3d. The transformation itself can be carried out using the operator affine_trans_point_3d. Then, the 3D coordinates, now given in the camera coordinate system, can be projected into the image plane with the operator project_3d_point. An example program is given in the following:

First, the world coordinates of four points defining a rectangle in the WCS are defined.<br />

ROI_X_WCS := [-2,-2,112,112]<br />

ROI_Y_WCS := [0,0.5,0.5,0]<br />

ROI_Z_WCS := [0,0,0,0]<br />

Then, the transformation matrix that maps world coordinates into camera coordinates is derived from the respective pose.

pose_to_hom_mat3d (FinalPose, CCS_HomMat_WCS)<br />

Finally, the world points are transformed into the image coordinate system.<br />


affine_trans_point_3d (CCS_HomMat_WCS, ROI_X_WCS, ROI_Y_WCS, ROI_Z_WCS,<br />

CCS_RectangleX, CCS_RectangleY, CCS_RectangleZ)<br />

project_3d_point (CCS_RectangleX, CCS_RectangleY, CCS_RectangleZ,<br />

CamParamUnchanged, RectangleRow, RectangleCol)<br />

3.2.6 Compensate for Radial Distortions Only<br />

All operators discussed above automatically compensate for radial distortions. In some cases, you might<br />

want to compensate for radial distortions only without transforming results or images into world coordinates.<br />

The procedure is to specify the original interior camera parameters and those of a virtual camera that<br />

does not produce radial distortions, i.e., with κ = 0.<br />

The easiest way to obtain the interior camera parameters of the virtual camera would be to simply set κ<br />

to zero. This can be done directly by changing the respective value of the interior camera parameters.<br />

CamParVirtualFixed := CamParOriginal<br />

CamParVirtualFixed[1] := 0<br />

Alternatively, the operator change_radial_distortion_cam_par can be used with the parameter<br />

Mode set to ’fixed’ and the parameter Kappa set to 0.<br />

change_radial_distortion_cam_par (’fixed’, CamParOriginal, 0,<br />

CamParVirtualFixed)<br />

Then, for the rectification of the segmentation results, the <strong>HALCON</strong> operator<br />

change_radial_distortion_contours_xld can be used, which requires as input parameters<br />

the original and the virtual interior camera parameters.<br />

change_radial_distortion_contours_xld (Edges, EdgesRectifiedFixed,<br />

CamParOriginal, CamParVirtualFixed)<br />

This changes the visible part of the scene (see figure 25b). To obtain virtual camera parameters such that<br />

the whole image content lies within the visible part of the scene, the parameter Mode of the operator<br />

change_radial_distortion_cam_par must be set to ’fullsize’ (see figure 25c). Again, to eliminate<br />

the radial distortions, the parameter Kappa must be set to 0.<br />

change_radial_distortion_cam_par (’fullsize’, CamParOriginal, 0,<br />

CamParVirtualFullsize)<br />

If the radial distortions are eliminated in the image itself using the rectification procedure described in<br />

section 3.3.2 on page 55, the mode ’fullsize’ may lead to undefined pixels in the rectified image. The<br />

mode ’adaptive’ (see figure 25d) slightly reduces the visible part of the scene to prevent such undefined<br />

pixels.<br />

change_radial_distortion_cam_par (’adaptive’, CamParOriginal, 0,<br />

CamParVirtualAdaptive)<br />

<strong>Note</strong> that this compensation for radial distortions is not possible for line scan images because of the<br />

acquisition geometry of line scan cameras. To eliminate radial distortions from segmentation results of


line scan images, the segmentation results must be transformed into the WCS (see section 3.2.2 on page 45, section 3.2.3 on page 47, and section 3.2.4 on page 47).

Figure 25: Eliminating radial distortions: The original image overlaid with (a) edges extracted from the original image; (b) edges rectified by setting κ to zero; (c) edges rectified with mode ’fullsize’; (d) edges rectified with mode ’adaptive’.

3.3 Rectifying Images<br />

For applications like blob analysis or OCR, it may be necessary to have undistorted images. Imagine<br />

that an OCR has been trained based on undistorted image data. Then, it will not be able to recognize<br />

characters in heavily distorted images. In such a case, the image data must be rectified, i.e., the radial<br />

and perspective distortions must be eliminated before the OCR can be applied.<br />

3.3.1 Transforming Images into the WCS<br />

The operator image_to_world_plane rectifies an image by transforming it into the measurement plane,<br />

i.e., the plane z = 0 of the WCS. The rectified image shows no radial and no perspective distortions. It<br />


corresponds to an image captured by a camera that produces no radial distortions and that looks perpendicularly onto the measurement plane.

image_to_world_plane (Image, ImageMapped, CamParam, PoseForCenteredImage,<br />

WidthMappedImage, HeightMappedImage,<br />

ScaleForCenteredImage, ’bilinear’)<br />

If more than one image must be rectified, a projection map can be determined with<br />

the operator gen_image_to_world_plane_map, which is used analogously to the operator<br />

image_to_world_plane, followed by the actual transformation of the images, which is carried out<br />

by the operator map_image.<br />

gen_image_to_world_plane_map (Map, CamParam, PoseForCenteredImage,<br />

WidthOriginalImage, HeightOriginalImage,<br />

WidthMappedImage, HeightMappedImage,<br />

ScaleForCenteredImage, ’bilinear’)<br />

map_image (Image, Map, ImageMapped)<br />

The size of the rectified image can be chosen with the parameters Width and Height for the operator<br />

image_to_world_plane and with the parameters WidthMapped and HeightMapped for the operator<br />

gen_image_to_world_plane_map. The size of the rectified image must be given in pixels.<br />

The pixel size of the rectified image is specified by the parameter Scale (see also the description of<br />

the operator image_points_to_world_plane in section 3.2.2 on page 45). This parameter is the ratio<br />

between the pixel size of the rectified image and the unit in which the world coordinates of the calibration<br />

target are given (equation 30).<br />

Scale = (pixel size of rectified image) / (unit of world coordinates of calibration target)    (30)

In many cases the coordinates of the calibration targets are given in meters. In this case, it is possible to<br />

set the pixel size directly by setting the parameter Scale to ’m’ (corresponding to the value 1.0, which<br />

could be set alternatively for the parameter Scale), ’cm’ (0.01), ’mm’ (0.001), ’microns’ (1e-6), or ’µm’<br />

(again, 1e-6). Then, if the parameter Scale is set to, e.g., ’µm’, one pixel of the rectified image has a<br />

size that corresponds to an area of 1 µm × 1 µm in the world. The parameter Scale should be chosen<br />

such that in the center of the area of interest the pixel size of the input image and of the rectified image<br />

is similar. Large scale differences would lead to aliasing or smoothing effects. See below for examples<br />

of how the scale can be determined.<br />

The parameter Interpolation specifies whether bilinear interpolation (’bilinear’) should be applied<br />

between the pixels in the input image or whether the gray value of the nearest neighboring pixel (’none’)<br />

should be used.<br />

The rectified image ImageWorld is positioned such that its upper left corner is located exactly at the<br />

origin of the WCS and that its column axis is parallel to the x-axis of the WCS. Since the WCS is defined<br />

by the exterior camera parameters CamPose, the position of the rectified image ImageWorld can be

translated by applying the operator set_origin_pose to the exterior camera parameters. Arbitrary<br />

transformations can be applied to the exterior camera parameters based on homogeneous transformation<br />

matrices. See below for examples of how the exterior camera parameters can be set.<br />

In figure 26, the WCS has been defined such that the upper left corner of the rectified image corresponds<br />

to the upper left corner of the input image. To illustrate this, in figure 26, the full domain of the rectified<br />

image, transformed into the virtual image plane of the input image, is displayed. As can be seen, the upper left corner of the input image and of the projection of the rectified image are identical.

Figure 26: Projection of the image into the measurement plane. (The figure shows the camera coordinate system (x_c, y_c, z_c), the image coordinate system (r, c), the world coordinate system (x_w, y_w, z_w), the measurement plane z_w = 0, and the width, height, and scale of the rectified image.)

<strong>Note</strong> that it is also possible to define the WCS such that the rectified image does not lie or lies only partly<br />

within the imaged area. The domain of the rectified image is set such that it contains only those pixels<br />

that lie within the imaged area, i.e., for which gray value information is available. In figure 27, the WCS<br />

has been defined such that the upper part of the rectified image lies outside the imaged area. To illustrate<br />

this, the part of the rectified image for which no gray value information is available is displayed dark<br />

gray. Also in figure 27, the full domain of the rectified image, transformed into the virtual image plane<br />


of the input image, is displayed. It can be seen that for the upper part of the rectified image no image<br />

information is available.<br />

Figure 27: Projection of the image into the measurement plane with part of the rectified image lying outside the image area.

If several images must be rectified using the same camera parameters, the operator

gen_image_to_world_plane_map in combination with map_image is much more efficient than<br />

the operator image_to_world_plane because the transformation must be determined only once. In<br />

this case, a projection map that describes the transformation between the image plane and the world<br />

plane is generated first by the operator gen_image_to_world_plane_map. Then, this map is used by<br />

the operator map_image to transform the image very efficiently.
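The following hedged sketch illustrates this reuse for an image sequence; the image sizes, the scale, the file names, and the number of images are assumptions chosen only for illustration.

* Determine the projection map once ...
gen_image_to_world_plane_map (Map, CamParam, Pose, 652, 494, 652, 494,
                              0.0001, 'bilinear')
* ... and reuse it for every image of the sequence.
for I := 1 to NumImages by 1
    read_image (Image, ImgPath+'part_'+I$'02d')
    map_image (Image, Map, ImageMapped)
endfor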



The following example from hdevelop\transform_image_into_wcs.dev shows how to perform<br />

the transformation of images into the world coordinate system using the operators<br />

gen_image_to_world_plane_map together with map_image as well as the operator<br />

image_to_world_plane.<br />

In the first part of the example program the parameters Scale and CamPose are set such that a given<br />

point appears in the center of the rectified image and that in the surroundings of this point the scale of<br />

the rectified image is similar to the scale of the original image.<br />

First, the size of the rectified image is defined.<br />

WidthMappedImage := 652<br />

HeightMappedImage := 494<br />

Then, the scale is determined based on the ratio of the distance between points in the WCS and of the<br />

respective distance in the ICS.<br />

Dist_ICS := 1<br />

image_points_to_world_plane (CamParam, Pose, CenterRow, CenterCol, 1,<br />

CenterX, CenterY)<br />

image_points_to_world_plane (CamParam, Pose, CenterRow+Dist_ICS,<br />

CenterCol, 1, BelowCenterX, BelowCenterY)<br />

image_points_to_world_plane (CamParam, Pose, CenterRow,<br />

CenterCol+Dist_ICS, 1, RightOfCenterX,<br />

RightOfCenterY)<br />

distance_pp (CenterY, CenterX, BelowCenterY, BelowCenterX,<br />

Dist_WCS_Vertical)<br />

distance_pp (CenterY, CenterX, RightOfCenterY, RightOfCenterX,<br />

Dist_WCS_Horizontal)<br />

ScaleVertical := Dist_WCS_Vertical/Dist_ICS<br />

ScaleHorizontal := Dist_WCS_Horizontal/Dist_ICS<br />

ScaleForCenteredImage := (ScaleVertical+ScaleHorizontal)/2.0<br />

Now, the pose of the measurement plane is modified such that a given point will be displayed in the<br />

center of the rectified image.<br />

DX := CenterX-ScaleForCenteredImage*WidthMappedImage/2.0<br />

DY := CenterY-ScaleForCenteredImage*HeightMappedImage/2.0<br />

DZ := 0<br />

set_origin_pose (Pose, DX, DY, DZ, PoseForCenteredImage)<br />

These calculations are implemented in the HDevelop procedure<br />

procedure parameters_image_to_world_plane_centered (: : CamParam, Pose,<br />

CenterRow, CenterCol,<br />

WidthMappedImage,<br />

HeightMappedImage:<br />

ScaleForCenteredImage,<br />

PoseForCenteredImage)<br />

which is part of the example program hdevelop\transform_image_into_wcs.dev (see appendix B.2<br />

on page 138).<br />


Finally, the image can be transformed.<br />

gen_image_to_world_plane_map (Map, CamParam, PoseForCenteredImage,<br />

WidthOriginalImage, HeightOriginalImage,<br />

WidthMappedImage, HeightMappedImage,<br />

ScaleForCenteredImage, ’bilinear’)<br />

map_image (Image, Map, ImageMapped)<br />

The second part of the example program hdevelop\transform_image_into_wcs.dev shows how to<br />

set the parameters Scale and CamPose such that the entire image is visible in the rectified image.<br />

First, the image coordinates of the border of the original image are transformed into world coordinates.<br />

full_domain (Image, ImageFull)<br />

get_domain (ImageFull, Domain)<br />

gen_contour_region_xld (Domain, ImageBorder, ’border’)<br />

contour_to_world_plane_xld (ImageBorder, ImageBorderWCS, CamParam,<br />

Pose, 1)<br />

Then, the extent of the image in world coordinates is determined.<br />

smallest_rectangle1_xld (ImageBorderWCS, MinY, MinX, MaxY, MaxX)<br />

ExtentX := MaxX-MinX<br />

ExtentY := MaxY-MinY<br />

The scale is the ratio of the extent of the image in world coordinates and of the size of the rectified image.<br />

ScaleX := ExtentX/WidthMappedImage<br />

ScaleY := ExtentY/HeightMappedImage<br />

Now, the maximum value must be selected as the final scale.<br />

ScaleForEntireImage := max([ScaleX,ScaleY])<br />

Finally, the origin of the pose must be translated appropriately.<br />

set_origin_pose (Pose, MinX, MinY, 0, PoseForEntireImage)<br />

These calculations are implemented in the HDevelop procedure<br />

procedure parameters_image_to_world_plane_entire (Image: : CamParam, Pose,<br />

WidthMappedImage,<br />

HeightMappedImage:<br />

ScaleForEntireImage,<br />

PoseForEntireImage)<br />

which is part of the example program hdevelop\transform_image_into_wcs.dev (see appendix B.3<br />

on page 139).<br />

If the object is not planar the projection map that is needed by the operator map_image may be determined<br />

by the operator gen_grid_rectification_map, which is described in section 9.3 on page<br />

127.<br />

If only the radial distortions should be eliminated the projection map can be determined by the operator<br />

gen_radial_distortion_map, which is described in the following section.


3.3.2 Compensate for Radial Distortions Only<br />


The principle of the compensation for radial distortions has already been described in section 3.2.6 on page

48.<br />

If only one image must be rectified the operator change_radial_distortion_image can be used. It<br />

is used analogously to the operator change_radial_distortion_contours_xld described in section<br />

3.2.6, with the only exception that a region of interest (ROI) can be defined with the parameter<br />

Region.<br />

change_radial_distortion_image (GrayImage, ROI, ImageRectifiedAdaptive,<br />

CamParOriginal, CamParVirtualAdaptive)<br />

Again, the interior parameters of the virtual camera (with κ = 0) can be determined by setting only<br />

κ to zero (see figure 28b) or by using the operator change_radial_distortion_cam_par with the<br />

parameter Mode set to ’fixed’ (equivalent to setting κ to zero; see figure 28b), ’adaptive’ (see figure 28c),<br />

or ’fullsize’ (see figure 28d).<br />

Figure 28: Eliminating radial distortions: (a) The original image; (b) the image rectified by setting κ to zero; (c) the image rectified with mode ’fullsize’; (d) the image rectified with mode ’adaptive’.

If more than one image must be rectified, a projection map can be determined with<br />

the operator gen_radial_distortion_map, which is used analogously to the operator<br />


change_radial_distortion_image, followed by the actual transformation of the images, which is<br />

carried out by the operator map_image, described in section 3.3.1 on page 49. If a ROI is to be specified,<br />

it must be rectified separately (see section 3.2.4 on page 47).<br />

gen_radial_distortion_map (MapFixed, CamParOriginal, CamParVirtualFixed,<br />

’bilinear’)<br />

map_image (GrayImage, MapFixed, ImageRectifiedFixed)<br />

<strong>Note</strong> that this compensation for radial distortions is not possible for line scan images because of the<br />

acquisition geometry of line scan cameras. To eliminate radial distortions from line scan images, the<br />

images must be transformed into the WCS (see section 3.3.1 on page 49).<br />

3.4 Inspection of Non-Planar Objects<br />

<strong>Note</strong> that the measurements described so far will only be accurate if the object to be measured is planar,<br />

i.e., if it has a flat surface. If this is not the case the perspective projection of the pinhole camera (see<br />

equation 22 on page 20) will make the parts of the object that lie closer to the camera appear bigger than<br />

the parts that lie farther away. In addition, the respective world coordinates are displaced systematically.<br />

If you want to measure the top side of objects that have a flat surface and a significant thickness that is equal for all objects, it is best to place the calibration plate onto one of these objects during calibration.

With this, you can make sure that the optical rays are intersected with the correct plane.<br />

The displacement that results from deviations of the object surface from the measurement plane can be<br />

estimated very easily. Figure 29 shows a vertical section of a typical measurement configuration. The<br />

measurement plane is drawn as a thick line, the object surface as a dotted line. <strong>Note</strong> that the object surface<br />

does not correspond to the measurement plane in this case. The deviation of the object surface from the<br />

measurement plane is indicated by ∆z, the distance of the projection center from the measurement plane<br />

by z, and the displacement by ∆r. The point N indicates the perpendicular projection of the projection<br />

center (P C) onto the measurement plane.<br />

For the determination of the world coordinates of point Q, which lies on the object surface, the optical<br />

ray from the projection center of the camera through Q ′ , which is the projection of Q into the<br />

image plane, is intersected with the measurement plane. For this reason, the operators of the family<br />

..._to_world_plane do not return the world coordinates of Q, but the world coordinates of point P ,<br />

which is the perspective projection of point Q ′ onto the measurement plane.<br />

If we know the distance r from P to N, the distance z, which is the shortest distance from the projection<br />

center to the measurement plane, and the deviation ∆z of the object’s surface from the measurement<br />

plane, the displacement ∆r can be calculated by:<br />

∆r = ∆z · r / z    (31)

Often, it will be sufficient to have just a rough estimate for the value of ∆r. Then, the values r, z, and<br />

∆z can be approximately determined directly from the measurement setup.<br />
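For example, with values chosen purely for illustration, say z = 500 mm, r = 100 mm, and ∆z = 2 mm, the displacement amounts to ∆r = 2 mm · 100 mm / 500 mm = 0.4 mm.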

If you need to determine ∆r more precisely, you first have to calibrate the camera. Then you have to<br />

select a point Q ′ in the image for which you want to know the displacement ∆r. The transformation<br />

of Q ′ into the WCS using the operator image_points_to_world_plane yields the world coordinates<br />

of point P . Now, you need to derive the world coordinates of the point N. An easy way to<br />

do this is to transform the camera coordinates of the projection center PC, which are (0, 0, 0)^T, into the world coordinate system, using the operator affine_trans_point_3d.

Figure 29: Displacement ∆r caused by a deviation of the object surface from the measurement plane. (The figure shows a vertical section with the projection center PC, its perpendicular projection N onto the measurement plane, a point Q on the object surface with its image projection Q′ = P′, the corresponding intersection point P on the measurement plane, the distances r and z, and the deviations ∆z and ∆r.)

To derive the homogeneous

transformation matrix needed for this (the matrix that maps camera coordinates into world coordinates), first generate the matrix that maps world coordinates into camera coordinates from the pose of the measurement plane via the operator pose_to_hom_mat3d and then invert the resulting homogeneous transformation matrix (hom_mat3d_invert). Because N is the perpendicular projection of PC onto the measurement plane, its x and y world coordinates are equal to the respective world coordinates of PC and its z coordinate

is equal to zero. Now, r and z can be derived as follows: r is the distance from P to N, which can be<br />

calculated by the operator distance_pp; z is simply the z coordinate of P C, given in the WCS.<br />

The following <strong>HALCON</strong> program (hdevelop\height_displacement.dev) shows how to implement<br />

this approach. First, the camera parameters are read from file.<br />

read_cam_par (’camera_parameters.dat’, CamParam)<br />

read_pose (’pose_from_three_points.dat’, Pose)<br />

Then, the deviation of the object surface from the measurement plane is set.<br />

DeltaZ := 2<br />

Finally, the displacement is calculated, according to the method described above.<br />


get_mbutton (WindowHandle, RowQ, ColumnQ, _)<br />

image_points_to_world_plane (CamParam, Pose, RowQ, ColumnQ, 1, WCS_PX,<br />

WCS_PY)<br />

pose_to_hom_mat3d (Pose, CCS_HomMat_WCS)<br />

hom_mat3d_invert (CCS_HomMat_WCS, WCS_HomMat_CCS)<br />

affine_trans_point_3d (WCS_HomMat_CCS, 0, 0, 0, WCS_PCX, WCS_PCY, WCS_PCZ)<br />

distance_pp (WCS_PX, WCS_PY, WCS_PCX, WCS_PCY, r)<br />

z := fabs(WCS_PCZ)<br />

DeltaR := DeltaZ*r/z<br />

Assuming a constant ∆z, the following conclusions can be drawn for ∆r:<br />

• ∆r increases with increasing r.<br />

• If the measurement plane is more or less perpendicular to the optical axis, ∆r increases towards<br />

the image borders.<br />

• At the point N, ∆r is always equal to zero.<br />

• ∆r increases the more the measurement plane is tilted with respect to the optical axis.<br />

The maximum acceptable deviation of the object’s surface from the measurement plane, given a maximum<br />

value for the resulting displacement, can be derived by the following formula:<br />

∆z = ∆r · z / r    (32)

The values for r and z can be determined as described above.<br />
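For instance, continuing the illustrative numbers from above (z = 500 mm, r = 100 mm), if a displacement of at most ∆r = 0.1 mm is acceptable, the object surface may deviate by up to ∆z = 0.1 mm · 500 mm / 100 mm = 0.5 mm from the measurement plane.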

If you want to inspect an object whose surface consists of several parallel planes, you can first use equation 32 to evaluate whether the measurement errors stemming from the displacements are acceptable for your project. If the displacements are too large, you can calibrate the camera such that the measurement plane corresponds to, e.g., the uppermost plane of the object. Then you can derive a pose for each plane that is parallel to the uppermost plane simply by applying the operator set_origin_pose. This approach is also useful if objects of different thickness may appear on the assembly line. If it is possible to classify these objects into classes corresponding to their thickness, you can select the appropriate pose for each object and thus derive accurate world coordinates for each object.
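A hedged sketch of this idea is shown below; the pose name, the thickness classes, and the sign convention (the z axis of the WCS pointing away from the camera into the object, as it does when the uppermost plane was used for calibration) are assumptions made only for illustration.

* Hypothetical distances (in meters) of the parallel planes below the
* uppermost plane that was used for calibration.
DistBelowTop := [0.0,0.005,0.012]
* Select the offset of the current object class (here: class 1) and shift
* the measurement plane accordingly.
set_origin_pose (PoseUppermost, 0, 0, DistBelowTop[1], PoseForThisObject)
image_points_to_world_plane (CamParam, PoseForThisObject, Row, Col, 'm', X, Y)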

<strong>Note</strong> that if the plane in which the object lies is severely tilted with respect to the optical axis, and if the<br />

object has a significant thickness, the camera will likely see some parts of the object that you do not want<br />

to measure. For example, if you want to measure the top side of a cube and the plane is tilted, you will<br />

see the side walls of the cube as well, and therefore might measure the wrong dimensions. Therefore,<br />

it is usually best to align the camera so that its optical axis is perpendicular to the plane in which the<br />

objects are measured. If the objects do not have significant thickness, you can measure them accurately<br />

even if the plane is tilted.<br />

What is more, it is even possible to derive world coordinates for an object’s surface that consists of<br />

several non-parallel planes if the relation between the individual planes is known. In this case, you may<br />

define the relative pose of the tilted plane with respect to an already known measurement plane.<br />

RelPose := [0,3.2,0,-14,0,0,0]<br />

Then, you can transform the known pose of the measurement plane into the pose of the tilted plane.<br />



pose_to_hom_mat3d (FinalPose, HomMat3D)<br />

pose_to_hom_mat3d (RelPose, HomMat3DRel)<br />

hom_mat3d_compose (HomMat3D, HomMat3DRel, HomMat3DAdapted)<br />

hom_mat3d_to_pose (HomMat3DAdapted, PoseAdapted)<br />


Alternatively, you can use the operators of the family hom_mat3d_..._local to adapt the pose.<br />

hom_mat3d_translate_local (HomMat3D, 0, 3.2, 0, HomMat3DTranslate)<br />

hom_mat3d_rotate_local (HomMat3DTranslate, rad(-14), ’x’, HomMat3DAdapted)<br />

hom_mat3d_to_pose (HomMat3DAdapted, PoseAdapted)<br />

Now, you can obtain world coordinates for points lying on the tilted plane, as well.<br />

contour_to_world_plane_xld (Lines, ContoursTrans, CamParam, PoseAdapted, 1)<br />

If the object is too complex to be approximated by planes, or if the relations between the planes are

not known, it is not possible to perform precise measurements in world coordinates using the methods<br />

described in this section. In this case, it is necessary to use two cameras and to apply the <strong>HALCON</strong><br />

stereo operators described in section 7 on page 88.<br />

4 Calibrated Mosaicking<br />

Some objects are too large to be covered by one single image. Multiple images that cover different parts<br />

of the object must be taken in such cases. You can measure precisely across the different images if<br />

the cameras are calibrated and their exterior parameters are known with respect to one common world<br />

coordinate system.<br />

It is even possible to merge the individual images into one larger image that covers the whole object. This<br />

is done by rectifying the individual images with respect to the same measurement plane (see section 3.3.1<br />

on page 49). In the resulting image, you can measure directly in world coordinates.<br />

<strong>Note</strong> that the 3D coordinates of objects are derived based on the same principle as described in section 3<br />

on page 27, i.e., a measurement plane that coincides with the object surface must be defined. Although<br />

two or more cameras are used, this is no stereo approach. For more information on 3D machine vision<br />

with a binocular stereo system, please refer to section 7 on page 88.<br />

If the resulting image is not intended to serve for high-precision measurements in world coordinates, you<br />

can generate it using the mosaicking approach described in section 5 on page 69. With this approach, it<br />

is not necessary to calibrate the cameras.<br />

A setup for generating a high-precision mosaic image from two cameras is shown in figure 30. The<br />

cameras are mounted such that the resulting pair of images has a small overlap. The cameras are first<br />

calibrated and then the images are merged together into one larger image. All further explanations within<br />

this section refer to such a two-camera setup.<br />

Typically, the following steps must be carried out:<br />

1. Determination of the interior camera parameters for each camera separately.<br />

2. Determination of the exterior camera parameters, using one calibration object, so that the relation between the cameras can be determined.

3. Merge the images into one larger image that covers the whole object.

Figure 30: Two-camera setup.

4.1 Setup<br />

Two or more cameras must be mounted on a stable platform such that each image covers a part of the<br />

whole scene. The cameras can have an arbitrary orientation, i.e., it is not necessary that they are looking<br />

parallel or perpendicular onto the object surface.<br />

To set up focus, illumination, and overlap appropriately, use a big reference object that covers all fields of view. So that the images can be merged into one larger image, they must have some overlap (see

figure 31 for an example). The overlapping area can be even smaller than depicted in figure 31, since the<br />

overlap is only necessary to ensure that there are no gaps in the resulting combined image.


Figure 31: Overlapping images. (The brackets in the figure mark the overlapping area of the two images.)

4.2 Calibration

The calibration of the images can be broken down into two separate steps.

The first step is to determine the interior camera parameters for each of the cameras in use. This can be<br />

done for each camera independently, as described in section 3.1.4 on page 36.<br />

The second step is to determine the exterior camera parameters for all cameras. Because the final coordinates<br />

should refer to one world coordinate system for all images, a big calibration object that appears<br />

in all images has to be used. We propose to use a calibration object like the one displayed in figure 32,<br />

which consists of as many calibration plates as the number of cameras that are used.<br />

For the determination of the exterior camera parameters, it is sufficient to use one calibration image from<br />

each camera only. <strong>Note</strong> that the calibration object must not be moved in between the acquisition of the<br />

individual images. Ideally, the images are acquired simultaneously.<br />

In each image, at least three points of the calibration object, i.e., points for which the world coordinates<br />

are known, have to be measured in the images. Based on these point correspondences, the operator<br />

camera_calibration can determine the exterior camera parameters for each camera. See section 3.1.5<br />

on page 39 for details.<br />

The calibration is easy if standard <strong>HALCON</strong> calibration plates mounted on some kind of carrier plate<br />

are used such that in each image one calibration plate is completely visible. An example for such a<br />

calibration object for a two-camera setup is given in figure 32. The respective calibration images for the<br />

determination of the exterior camera parameters are shown in figure 33. <strong>Note</strong> that the relative position<br />

of the calibration plates with respect to each other must be known precisely. This can be done with the<br />

pose estimation described in section 6 on page 84.<br />

<strong>Note</strong> also that only the relative position of the calibration marks among each other shows the high accuracy<br />

stated in section 3.1.6 on page 43 but not the borders of the calibration plate. The rows of calibration<br />

Mosaicking I


62 <strong>Application</strong> <strong>Note</strong> on 3D Machine Vision<br />

marks may be slanted with respect to the border of the calibration plate and even the distance of the calibration<br />

marks from the border of the calibration plate may vary. Therefore, aligning the calibration plates<br />

along their boundaries may result in a shift in x- and y-direction with respect to the coordinate system of<br />

the calibration plate in its initial position.<br />

Figure 32: Calibration object for two-camera setup.<br />

Figure 33: Calibration images for two-camera setup.<br />

The world coordinates of the calibration marks of each calibration plate can be read from the respective<br />

calibration plate description file.<br />

caltab_points (CaltabName, X, Y, Z)<br />

If an arbitrary calibration object is used, these coordinates must be determined precisely in advance. For<br />

<strong>HALCON</strong> calibration plates, the image coordinates of the calibration marks as well as the initial values<br />

for the poses can be determined easily with the operators find_caltab and find_marks_and_pose.



min_max_gray (Image1, Image1, 3, Min, _, _)<br />

find_caltab (Image1, Caltab, CaltabName, 3, min([Min+40,200]), 5)<br />

find_marks_and_pose (Image1, Image1, CaltabName, CamParam1, 128, 10, 18,<br />

0.7, 15, 100, RCoord1, CCoord1, StartPose1)<br />

min_max_gray (Image2, Image2, 3, Min, _, _)<br />

find_caltab (Image2, Caltab, CaltabName, 3, min([Min+40,200]), 5)<br />

find_marks_and_pose (Image2, Image2, CaltabName, CamParam2, 128, 10, 18,<br />

0.7, 15, 100, RCoord2, CCoord2, StartPose2)<br />

The following code fragment shows the calibration of the two-camera setup. It assumes that the interior

camera parameters of both cameras are already known.<br />

camera_calibration (X, Y, Z, RCoord1, CCoord1, CamParam1, StartPose1,<br />

’pose’, _, Pose1, Errors1)<br />

camera_calibration (X, Y, Z, RCoord2, CCoord2, CamParam2, StartPose2,<br />

’pose’, _, Pose2, Errors2)<br />

4.3 Merging the Individual Images into One Larger Image<br />

First, the individual images must be rectified, i.e., transformed so that they exactly fit together. This

can be achieved by using the operators gen_image_to_world_plane_map and map_image. Then,<br />

the mosaic image can be generated by the operator tile_images, which tiles multiple images into one<br />

larger image. These steps are visualized in figure 34.<br />

The operators gen_image_to_world_plane_map and map_image are described in section 3.3.1 on<br />

page 49. In the following, we will only discuss the problem of defining the appropriate image detail, i.e.,<br />

the position of the upper left corner and the size of the rectified images. Again, the description is based<br />

on the two-camera setup.<br />

4.3.1 Definition of the Rectification of the First Image<br />

For the first (here: left) image, the determination of the necessary shift of the pose is straightforward.<br />

You can define the upper left corner of the rectified image in image coordinates, e.g., interactively or, as<br />

in the example program, based on a preselected border width.<br />

ULRow := HeightImage1*BorderInPercent/100.0<br />

ULCol := WidthImage1*BorderInPercent/100.0<br />

Then, this point must be transformed into world coordinates.<br />

image_points_to_world_plane (CamParam1, Pose1, ULRow, ULCol, ’m’, ULX, ULY)<br />

The resulting coordinates can be used directly, together with the shift that compensates the thickness of<br />

the calibration plate (see section 3.1.5 on page 39) to modify the origin of the world coordinate system<br />

in the left image.<br />

set_origin_pose (Pose1, ULX, ULY, DiffHeight, PoseNewOrigin1)<br />

This means that we shift the origin of the world coordinate system from the center of the calibration plate<br />

to the position that defines the upper left corner of the rectified image (figure 35).<br />




Figure 34: Image rectification and tiling (map_image applied to each image, followed by tile_images).

The size of the rectified image, i.e., its width and height, can be determined from points originally defined<br />

in image coordinates, too. In addition, the desired pixel size of the rectified image must be specified.<br />

PixelSize := 0.0001<br />

For the determination of the height of the rectified image we need to define a point that lies near the<br />

lower border of the first image.<br />

LowerRow := HeightImage1*(100-BorderInPercent)/100.0<br />

Again, this point must be transformed into the world coordinate system.


Figure 35: Definition of the upper left corner of the first rectified image.

image_points_to_world_plane (CamParam1, Pose1, LowerRow, ULCol, ’m’, _,<br />

LowerY)<br />

The height can be determined as the vertical distance between the upper left point and the point near the<br />

lower image border, expressed in pixels of the rectified image.<br />

HeightRect := int((LowerY-ULY)/PixelSize)<br />

Analogously, the width can be determined from a point that lies in the overlapping area of the two images,<br />

i.e., near the right border of the first image.<br />

RightCol := WidthImage1*(100-OverlapInPercent/2.0)/100.0<br />

image_points_to_world_plane (CamParam1, Pose1, ULRow, RightCol, ’m’,<br />

RightX, _)<br />

WidthRect := int((RightX-ULX)/PixelSize)<br />

Note that the definitions of the image points described above, from which the upper left corner and the size of the rectified image are derived, assume that the x- and y-axes of the world coordinate system are approximately aligned with the column and row axes of the first image. This can be achieved by placing the calibration plate in the first image approximately aligned with the image borders. Otherwise, the distances between the points mentioned above are not meaningful, and the upper left corner and the size of the rectified image must be determined in a manner adapted to the configuration at hand.

With the shifted pose and the size of the rectified image, the rectification map for the first image can be<br />

derived.<br />




gen_image_to_world_plane_map (MapSingle1, CamParam1, PoseNewOrigin1, Width,<br />

Height, WidthRect, HeightRect, PixelSize,<br />

’bilinear’)<br />

4.3.2 Definition of the Rectification of the Second Image<br />

The second image must be rectified such that it fits exactly to the right of the first rectified image. This<br />

means that the upper left corner of the second rectified image must be identical with the upper right<br />

corner of the first rectified image. Therefore, we need to know the coordinates of the upper right corner<br />

of the first rectified image in the coordinate system that is defined by the calibration plate in the second<br />

image.<br />

First, we express the upper right corner of the first rectified image in the world coordinate system that<br />

is defined by the calibration plate in the first image. It can be determined by a transformation from<br />

the origin into the upper left corner of the first rectified image (a translation in the example program)<br />

followed by a translation along the upper border of the first rectified image. Together with the shift that<br />

compensates the thickness of the calibration plate, this transformation is represented by the homogeneous<br />

transformation matrix cp1 Hur1 (see figure 36), which can be defined in HDevelop by:<br />

hom_mat3d_translate_local (HomMat3DIdentity, ULX+PixelSize*WidthRect, ULY,<br />

DiffHeight, cp1Hur1)<br />

Figure 36: Definition of the upper right corner of the first rectified image.

Then, we need the transformation between the two calibration plates of the calibration object. The<br />

homogeneous transformation matrix cp1 Hcp2 describes how the world coordinate system defined by<br />

the calibration plate in the first image is transformed into the world coordinate system defined by the



calibration plate in the second image (figure 37). This transformation must be known beforehand from a<br />

precise measurement of the calibration object.<br />

Figure 37: Transformation between the two world coordinate systems, each defined by the respective calibration plate.

From these two transformations, it is easy to derive the transformation that transforms the world coordinate<br />

system of the second image such that its origin lies in the upper left corner of the second rectified<br />

image. For this, the two transformations have to be combined appropriately (see figure 38):

$^{cp2}H_{ul2} = {}^{cp2}H_{cp1} \cdot {}^{cp1}H_{ur1}$   (33)
$^{cp2}H_{ul2} = ({}^{cp1}H_{cp2})^{-1} \cdot {}^{cp1}H_{ur1}$   (34)

This can be implemented in HDevelop as follows:

hom_mat3d_invert (cp1Hcp2, cp2Hcp1)
hom_mat3d_compose (cp2Hcp1, cp1Hur1, cp2Hul2)

With this, the pose of the calibration plate in the second image can be modified such that the origin of<br />

the world coordinate system lies in the upper left corner of the second rectified image:<br />

pose_to_hom_mat3d (Pose2, cam2Hcp2)<br />

hom_mat3d_compose (cam2Hcp2, cp2Hul2, cam2Hul2)<br />

hom_mat3d_to_pose (cam2Hul2, PoseNewOrigin2)<br />

With the resulting new pose and the size of the rectified image, which can be the same as for the first<br />

rectified image, the rectification map for the second image can be derived.<br />




Figure 38: Definition of the upper left corner of the second rectified image.

gen_image_to_world_plane_map (MapSingle2, CamParam2, PoseNewOrigin2, Width,<br />

Height, WidthRect, HeightRect, PixelSize,<br />

’bilinear’)<br />

4.3.3 Rectification of the Images<br />

Once the rectification maps are created, every image pair from the two-camera setup can be rectified and<br />

tiled very efficiently. The resulting mosaic image consists of the two rectified images and covers a part<br />

as indicated in figure 39.<br />

The rectification is carried out by the operator map_image.<br />

map_image (Image1, MapSingle1, RectifiedImage1)<br />

map_image (Image2, MapSingle2, RectifiedImage2)<br />

This transforms the two images displayed in figure 40 into the two rectified images that are shown in

figure 41.<br />

As a preparation for the tiling, the rectified images must be concatenated into one tuple, which then<br />

contains both images.<br />

Concat := [RectifiedImage1,RectifiedImage2]<br />

Then the two images can be tiled.


tile_images (Concat, Combined, 2, 'vertical')

The resulting mosaic image is displayed in figure 42.

Figure 39: The position of the final mosaic image.
Figure 40: Two test images acquired with the two-camera setup.
Figure 41: Rectified images.
Figure 42: Mosaic image.

5 Uncalibrated Mosaicking

If you need an image of a large object, but the field of view of the camera does not allow you to cover the entire object with the desired resolution, you can use image mosaicking to generate a large image of the entire object from a sequence of overlapping images of parts of the object.

An example for such an application is given in figure 43. On the left side, six separate images are<br />

displayed stacked upon each other. On the right side, the mosaic image generated from the six separate<br />

images is shown. <strong>Note</strong> that the folds visible in the image do not result from the mosaicking. They are<br />

due to some degradations on the PCB, which can be seen already in the separate images.<br />

The mosaicking approach described in this section is designed for applications where it is not necessary<br />

to achieve the high-precision mosaic images as described in section 4 on page 59. The advantages<br />

compared to this approach are that no camera calibration is necessary and that the individual images can<br />

be arranged automatically.<br />

The following example program (hdevelop\mosaicking.dev) generates the mosaic image displayed<br />

in figure 49 on page 76. First, the images are read from file and collected in one tuple.


Figure 43: A first example for image mosaicking.

Images := []<br />

for J := 1 to 10 by 1<br />

read_image (Image, ImgPath+ImgName+J$’02’)<br />

Images := [Images,Image]<br />

endfor<br />

Then, the image pairs must be defined, i.e., which image should be mapped to which image.<br />

From := [1,2,3,4,6,7,8,9,3]<br />

To := [2,3,4,5,7,8,9,10,8]<br />

Now, characteristic points must be extracted from the images, which are then used for the matching between the image pairs. The resulting projective transformation matrices¹ must be accumulated.

¹ A projective transformation matrix describes a perspective projection. It consists of 3×3 values. If the last row contains the values [0,0,1], it corresponds to a homogeneous transformation matrix of HALCON and therefore describes an affine transformation.

Num := |From|<br />

ProjMatrices := []<br />

for J := 0 to Num-1 by 1<br />

F := From[J]<br />

T := To[J]<br />

ImageF := Images[F]<br />

ImageT := Images[T]<br />

points_harris (ImageF, SigmaGrad, SigmaSmooth, Alpha, Threshold,<br />

RowFAll, ColFAll)<br />

points_harris (ImageT, SigmaGrad, SigmaSmooth, Alpha, Threshold,<br />

RowTAll, ColTAll)<br />

proj_match_points_ransac (ImageF, ImageT, RowFAll, ColFAll, RowTAll,<br />

ColTAll, ’sad’, MaskSize, RowMove, ColMove,<br />

RowTolerance, ColTolerance, Rotation,<br />

MatchThreshold, ’gold_standard’,<br />

DistanceThreshold, RandSeed, ProjMatrix,<br />

Points1, Points2)<br />

ProjMatrices := [ProjMatrices,ProjMatrix]<br />

endfor<br />

Finally, the image mosaic can be generated.<br />

gen_projective_mosaic (Images, MosaicImage, StartImage, From, To,<br />

ProjMatrices, StackingOrder, ’false’,<br />

MosaicMatrices2D)<br />

<strong>Note</strong> that image mosaicking is a tool for a quick and easy generation of large images from several<br />

overlapping images. For this task, it is not necessary to calibrate the camera. If you need a high-precision<br />

image mosaic, you should use the method described in section 4 on page 59.<br />

In the following sections, the individual steps for the generation of a mosaic image are described.<br />

5.1 Rules for Taking Images for a Mosaic Image<br />

The following rules for the acquisition of the separate images should be considered:<br />



• The images must overlap each other.<br />


• The overlapping area of the images must be textured in order to allow the automatic matching<br />

process to identify identical points in the images. The lack of texture in some overlapping areas<br />

may be overcome by an appropriate definition of the image pairs (see section 5.2 on page 75). If<br />

the whole object shows little texture, the overlapping areas should be chosen larger.<br />

• Overlapping images must have approximately the same scale. In general, the scale differences<br />

should not exceed 5-10 %.<br />

• The images should be radiometrically similar, at least in the overlapping areas, because no radiometric adaptation of the images is carried out. Otherwise, i.e., if the brightness differs strongly between neighboring images, the seams between them will be clearly visible, as can be seen in figure 44.

Figure 44: A second example for image mosaicking.

The images are mapped onto a common image plane using a projective transformation. Therefore, to<br />

generate a geometrically accurate image mosaic from images of non-flat objects, the separate images<br />

must be acquired from approximately the same point of view, i.e., the camera can only be rotated around<br />

its optical center (see figure 45).<br />

When dealing with flat objects, it is possible to acquire the images from arbitrary positions and with<br />

arbitrary orientations if the scale difference between the overlapping images is not too large (figure 46).<br />

The radial distortions of the images are not compensated by the mosaicking process. Therefore, if<br />

radial distortions are present in the images, they cannot be mosaicked with high accuracy, i.e., small<br />

distortions at the seams between neighboring images cannot be prevented (see figure 50 on page 77). To<br />




eliminate this effect, the radial distortions can be compensated before starting the mosaicking process (see section 3.3.2 on page 55).

Figure 45: Image acquisition for non-flat objects (camera in three orientations around a common optical center).
Figure 46: Image acquisition for flat objects (camera in three positions above the object surface).

If processing time is an issue, it is advisable to acquire the images in the same orientation, i.e., neither the camera nor the object should be rotated around the optical axis too much. Then, the rotation range can be restricted for the matching process (see section 5.4 on page 79 and the sketch below).
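The following minimal sketch indicates how such prior knowledge could be passed to proj_match_points_ransac; it reuses the variable names of the example program, and the numeric tolerances are placeholders rather than values taken from it.

* Sketch: restrict the search space when the images were acquired in
* (nearly) the same orientation. The tolerances are placeholder values.
RowMove := 0
ColMove := 0
RowTolerance := 200
ColTolerance := 200
* Images are roughly aligned, so only rotations close to zero are expected.
Rotation := 0
proj_match_points_ransac (ImageF, ImageT, RowFAll, ColFAll, RowTAll,
                          ColTAll, 'sad', MaskSize, RowMove, ColMove,
                          RowTolerance, ColTolerance, Rotation,
                          MatchThreshold, 'gold_standard',
                          DistanceThreshold, RandSeed, ProjMatrix,
                          Points1, Points2)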


5.2 Definition of Overlapping Image Pairs<br />


As shown in the introductory example, it is necessary to define the overlapping image pairs between<br />

which the transformation is to be determined. The successive matching process will be carried out for<br />

these image pairs only.<br />

Figure 47: Two configurations of overlapping images: (a) three images in a row; (b) four images in a 2×2 arrangement.

Figure 47 shows two configurations of separate images. For configuration (a), the definition of the image<br />

pairs is simply (1,2) and (2,3), which can be defined in HDevelop as:<br />

From:=[1,2]<br />

To:=[2,3]<br />

In any case, it is important to ensure that each image is “connected” to all the other images. For

example, for configuration (b) of figure 47, it is not possible to define the image pairs as (1,2) and (3,4),<br />

only, because images 1 and 2 would not be connected to images 3 and 4. In this case, it would, e.g., be<br />

possible to define the three image pairs (1,2), (1,3), and (2,4):<br />

From:=[1,1,2]<br />

To:=[2,3,4]<br />

Assuming there is no texture in the overlapping area of image two and four, the matching could be carried<br />

out between images three and four instead:<br />

From:=[1,1,3]<br />

To:=[2,3,4]<br />

If a larger number of separate images is mosaicked, or if the image configuration is similar to the one displayed in figure 48, with elongated rows of overlapping images, it is important to arrange the image pair configuration carefully. Otherwise, some images may not fit together precisely. This happens because the transformations between the images cannot be determined

with perfect accuracy because of very small errors in the point coordinates due to noise. These errors are<br />

propagated from one image to the other.<br />




Figure 48: A configuration of ten overlapping images (1–5 in the upper row, 6–10 in the lower row).

Figure 49 shows such an image sequence of ten images of a BGA and the resulting mosaic image.<br />

Figure 50 shows a cut-out of that mosaic image. It depicts the seam between image 5 and image 10 for<br />

two image pair configurations, using the original images and the images where the radial distortions have<br />

been eliminated, respectively. The position of the cut-out is indicated in figure 49 by a rectangle.<br />

Figure 49: Ten overlapping images and the resulting (rigid) mosaic image.

First, the matching has been carried out in the two image rows separately and the two rows are connected<br />

via image pair 1 → 6:


From:=[1,2,3,4,6,7,8,9,1]
To:=[2,3,4,5,7,8,9,10,6]

Figure 50: Seam between image 5 and image 10 for various configurations (unfavorable vs. good configuration; with radial distortions vs. radial distortions eliminated).

In this configuration the two neighboring images 5 and 10 are connected along a relatively long path<br />

(figure 51).<br />

Figure 51: Unfavorable configuration of image pairs.

To improve the geometrical accuracy of the image mosaic, the connection between the two image rows could instead be established by the image pair (3,8), as visualized in figure 52.

Figure 52: Good configuration of image pairs.

This can be achieved by defining the image pairs as follows.<br />

From:=[1,2,3,4,6,7,8,9,3]<br />

To:=[2,3,4,5,7,8,9,10,8]<br />

As can be seen in figure 50, now the neighboring images fit better.<br />

Recapitulating, there are three basic rules for the arrangement of the image pairs:<br />

Take care that<br />

1. each image is connected to all the other images.<br />




2. the path along which neighboring images are connected is not too long.<br />

3. the overlapping areas of image pairs are large enough and contain enough texture to ensure a<br />

proper matching.<br />

In principle, it is also possible to define more image pairs than required (number of images minus one).<br />

However, then it cannot be controlled which pairs are actually used. Therefore, we do not recommend<br />

this approach.<br />

5.3 Detection of Characteristic Points<br />

<strong>HALCON</strong> provides you with various operators for the extraction of characteristic points (interest points).<br />

The most important of these operators are<br />

• points_foerstner<br />

• points_harris<br />

• points_sojka<br />

• saddle_points_sub_pix<br />

All of these operators can determine the coordinates of interest points with subpixel accuracy.<br />

In figure 53, a test image together with typical results of these interest operators is displayed.<br />

The operator points_foerstner classifies the interest points into two categories: junction-like features<br />

and area-like features. The results are very reproducible even in images taken from a different point of<br />

view. Therefore, it is very well suited for the extraction of points for the subsequent matching. It is very<br />

accurate but computationally the most expensive operator out of the four interest operators presented in<br />

this section.<br />

The results of the operator points_harris are very reproducible, too. Admittedly, the points extracted<br />

by the operator points_harris are sometimes not meaningful to a human, e.g., they often<br />

lie slightly beside a corner or an eye-catching image structure. Nevertheless, it is faster than the operator<br />

points_foerstner.<br />

The operator points_sojka is specialized in the extraction of corner points. It is the fastest out of the<br />

four operators presented in this section.<br />

The operator saddle_points_sub_pix is designed especially for the extraction of saddle points, i.e.,<br />

points whose image intensity is minimal along one direction and maximal along a different direction.<br />

The number of interest points influences the execution time and the result of the subsequent matching process. The more interest points are used, the longer the matching takes. If too few points are used, the probability of an erroneous result increases.

In most cases, the default parameters of the interest operators need not be changed. Only if too many<br />

or too few interest points are found adaptations of the parameters might be necessary. For a description<br />

of the parameters, please refer to the respective pages of the reference manual (points_foerstner,<br />

points_harris, points_sojka, saddle_points_sub_pix).
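As a minimal sketch (not part of the original example program), interest points could be extracted as follows; the numeric parameters are typical placeholder values that may need tuning for your images.

* Sketch: extract Harris interest points from one image of the sequence.
* The parameters (gradient sigma, smoothing sigma, weighting factor,
* threshold) are placeholder values, not settings from the example program.
read_image (Image, ImgPath+ImgName+'01')
points_harris (Image, 0.7, 2, 0.04, 1000, Row, Col)
* The number of extracted points directly influences the matching time.
NumPoints := |Row|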


Figure 53: Comparison of typical results of interest operators. (a) Test image; (b) Förstner, junctions; (c) Förstner, area; (d) Harris; (e) Sojka; (f) Saddle points.

5.4 Matching of Characteristic Points in Overlapping Areas and Determination<br />

of the Transformation between the Images<br />

The most demanding task during the generation of an image mosaic is the matching process. The operator<br />

proj_match_points_ransac is able to perform the matching even if the two images are shifted<br />

and rotated arbitrarily.<br />

proj_match_points_ransac (ImageF, ImageT, RowFAll, ColFAll, RowTAll,<br />

ColTAll, ’sad’, MaskSize, RowMove, ColMove,<br />

RowTolerance, ColTolerance, Rotation,<br />

MatchThreshold, ’gold_standard’,<br />

DistanceThreshold, RandSeed, ProjMatrix,<br />

Points1, Points2)<br />

The only requirement is that the images should have approximately the same scale. If information about<br />

shift and rotation is available it can be used to restrict the search space, which speeds up the matching<br />

process and makes it more robust.<br />

In case the matching fails, ensure that there are enough characteristic points and that the search space<br />

and the maximum rotation are defined appropriately.<br />




If the images that should be mosaicked contain repetitive patterns, like the two images of a BGA shown<br />

in figure 54a, it may happen that the matching does not work correctly. In the resulting erroneous mosaic<br />

image, the separate images may not fit together or may be heavily distorted. To achieve a correct matching<br />

result for such images, it is important to provide initial values for the shift between the images with<br />

the parameters RowMove and ColMove. In addition, the search space should be restricted to an area that<br />

contains only one instance of the repetitive pattern, i.e., the values of the parameters RowTolerance<br />

and ColTolerance should be chosen smaller than the distance between the instances of the repetitive<br />

pattern. With this, it is possible to obtain proper mosaic images, even for objects like BGAs (see<br />

figure 54b).<br />

Figure 54: Separate images (a) and mosaic image (b) of a BGA.

For a detailed description of the other parameters, please refer to the reference manual<br />

(proj_match_points_ransac).<br />

The results of the operator proj_match_points_ransac are the projective transformation matrix and<br />

the two tuples Points1 and Points2 that contain the indices of the matched input points from the two<br />

images.<br />

The projective transformation matrices resulting from the matching between the image pairs must be<br />

accumulated.<br />

ProjMatrices := [ProjMatrices,ProjMatrix]<br />

Alternatively, if it is known that the mapping between the images is a rigid 2D transformation, the<br />

operator proj_match_points_ransac can be used to determine the point correspondences only, since<br />

it returns the indices of the corresponding points in the tuples Points1 and Points2. With this, the<br />

corresponding point coordinates can be selected.<br />



RowF := subset(RowFAll,Points1)<br />

ColF := subset(ColFAll,Points1)<br />

RowT := subset(RowTAll,Points2)<br />

ColT := subset(ColTAll,Points2)<br />


Then, the rigid transformation between the image pair can be determined with the operator<br />

vector_to_rigid. <strong>Note</strong> that we have to add 0.5 to the coordinates to make the extracted pixel positions<br />

fit the coordinate system that is used by the operator gen_projective_mosaic.<br />

vector_to_rigid (RowF+0.5, ColF+0.5, RowT+0.5, ColT+0.5, HomMat2D)<br />

Because gen_projective_mosaic expects a 3×3 transformation matrix, but vector_to_rigid returns<br />

a 2×3 matrix, we have to add the last row [0,0,1] to the transformation matrix before we can<br />

accumulate it.<br />

ProjMatrix := [HomMat2D,0,0,1]<br />

ProjMatrices := [ProjMatrices,ProjMatrix]<br />

5.5 Generation of the Mosaic Image<br />

Once the transformations between the image pairs are known the mosaic image can be generated with<br />

the operator gen_projective_mosaic.<br />

gen_projective_mosaic (Images, MosaicImage, StartImage, From, To,<br />

ProjMatrices, StackingOrder, ’false’,<br />

MosaicMatrices2D)<br />

It requires the images to be given in a tuple. All images are projected into the image plane of a so-called<br />

start image. The start image can be defined by its position in the image tuple (starting with 1) with the<br />

parameter StartImage.<br />

Additionally, the image pairs must be specified together with the corresponding transformation matrices.<br />

The order in which the images are added to the mosaic image can be specified with the parameter<br />

StackingOrder. The first index in this array will end up at the bottom of the image stack while the<br />

last one will be on top. If ’default’ is given instead of an array of integers, the canonical order (the order<br />

in which the images are given) will be used.<br />

If the domains of the images should be transformed as well, the parameter TransformRegion must be<br />

set to ’true’.<br />

The output parameter MosaicMatrices2D contains the projective 3×3 transformation matrices for<br />

the mapping of the separate images into the mosaic image. These matrices can, e.g., be used to<br />

transform features extracted from the separate images into the mosaic image by using the operators<br />

projective_trans_pixel, projective_trans_region, projective_trans_contour_xld, or<br />

projective_trans_image.<br />
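For example, a point measured in one of the separate images could be transferred into the mosaic roughly as sketched below; that MosaicMatrices2D stores nine values (one 3×3 matrix, row by row) per input image is an assumption that should be checked against the reference manual.

* Sketch: map a pixel from separate image J (1-based index) into the mosaic.
* Assumption: MosaicMatrices2D concatenates one 3x3 matrix per input image.
J := 3
HomMat2D := MosaicMatrices2D[(J-1)*9:(J-1)*9+8]
* Example point in image J (placeholder coordinates).
RowJ := 120.0
ColJ := 240.0
projective_trans_pixel (HomMat2D, RowJ, ColJ, RowMosaic, ColMosaic)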




5.6 Bundle Adjusted Mosaicking<br />

It is also possible to generate the mosaic based on the matching results of all overlapping image pairs. The<br />

transformation matrices between the images are then determined together within one bundle adjustment.<br />

For this, the operators bundle_adjust_mosaic and gen_bundle_adjusted_mosaic are used.<br />

The main advantage of the bundle adjusted mosaicking compared with the mosaicking based on single<br />

image pairs is that the bundle adjustment determines the geometry of the mosaic as robustly as possible.<br />

Typically, this leads to more accurate results. Another advantage is that there is no need to figure out<br />

a good pair configuration, you simply pass the matching results of all overlapping image pairs. What<br />

is more, it is possible to define the class of transformations that is used for the transformation between<br />

the individual images. A disadvantage of the bundle adjusted mosaicking is that it takes more time to<br />

perform the matching between all overlapping image pairs instead of just using a subset. Furthermore, if<br />

the matching between two images was erroneous, sometimes the respective image pair is difficult to find<br />

in the set of all image pairs.<br />

With this, it is obvious that the bundle adjustment is worthwhile if there are multiple overlaps between<br />

the images, i.e., if there are more than n − 1 overlapping image pairs, with n being the number of<br />

images. Another reason for using the bundle adjusted mosaicking is the possibility to define the class of<br />

transformations.<br />

The example program hdevelop\bundle_adjusted_mosaicking.dev shows how to generate the<br />

bundle adjusted mosaic from the ten images of the BGA displayed in figure 49 on page 76. The design<br />

of the program is very similar to that of the example program hdevelop\mosaicking.dev, which<br />

is described in the introduction of section 5 on page 69. The main differences are that<br />

• the matching is carried out between all overlapping images,<br />

• in addition to the projective transformation matrices also the coordinates of the corresponding<br />

points must be accumulated, and<br />

• the operator gen_projective_mosaic is replaced with the operators bundle_adjust_mosaic<br />

and gen_bundle_adjusted_mosaic.<br />

First, the matching is carried out between all overlapping image pairs, which can be defined as follows:<br />

From := [1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,5,5]<br />

To := [6,7,2,6,7,8,3,7,8,9,4,8,9,10,5,9,10]<br />

In addition to the accumulation of the projective transformation matrices, as described in section 5.4 on<br />

page 79, also the coordinates of the corresponding points as well as the number of corresponding points<br />

must be accumulated.<br />

Rows1 := [Rows1,subset(RowFAll,Points1)]<br />

Cols1 := [Cols1,subset(ColFAll,Points1)]<br />

Rows2 := [Rows2,subset(RowTAll,Points2)]<br />

Cols2 := [Cols2,subset(ColTAll,Points2)]<br />

NumCorrespondences := [NumCorrespondences,|Points1|]<br />

This data is needed by the operator bundle_adjust_mosaic, which determines the bundle adjusted<br />

transformation matrices.


bundle_adjust_mosaic (10, StartImage, From, To, ProjMatrices, Rows1,

Cols1, Rows2, Cols2, NumCorrespondences,<br />

Transformation, MosaicMatrices2D, Rows, Cols,<br />

Error)<br />


The parameter Transformation defines the class of transformations that is used for the transformation<br />

between the individual images. Possible values are ’projective’, ’affine’, ’similarity’, and ’rigid’. Thus, if<br />

you know, e.g., that the camera looks perpendicular onto a planar object and that the camera movement<br />

between the images is restricted to rotations and translations in the object plane, you can choose the<br />

transformation class ’rigid’. If translations may also occur in the direction perpendicular to the object<br />

plane, you must use ’similarity’ because this transformation class allows scale differences between the<br />

images. If the camera looks tilted onto the object, the transformation class ’projective’ must be used,<br />

which can be approximated by the transformation class ’affine’. Figure 55 shows cut-outs of the resulting<br />

mosaic images. They depict the seam between image 5 and image 10. The mosaic images have been<br />

created using the images where the radial distortions have been eliminated. The position of the cut-out<br />

within the whole mosaic image is indicated by the rectangle in figure 49 on page 76.<br />

Finally, with the transformation matrices MosaicMatrices2D, which are determined by<br />

the operator bundle_adjust_mosaic, the mosaic can be generated with the operator<br />

gen_bundle_adjusted_mosaic.<br />

gen_bundle_adjusted_mosaic (Images, MosaicImage, MosaicMatrices2D,<br />

StackingOrder, TransformRegion, TransMat2D)<br />

Figure 55: Seam between image 5 and image 10 for different classes of transformations: (a) projective, (b) affine, (c) similarity, and (d) rigid.

5.7 Spherical Mosaicking<br />

The methods described in the previous sections arranged the images on a plane. As the name suggests,<br />

using spherical mosaicking you can arrange them on a sphere instead. <strong>Note</strong> that this method can only be<br />

used if the camera is only rotated around its optical center or zoomed. If the camera movement includes<br />

a translation or if the rotation is not performed exactly around the optical center, the resulting mosaic<br />

image will not be accurate and can therefore not be used for high-accuracy applications.<br />

To create a spherical mosaic, you first perform the matching as described in the previous sections to<br />

determine the projective transformation matrices between the individual images. This information is<br />

the input for the operator stationary_camera_self_calibration, which determines the interior<br />




camera parameters of the camera and the rotation matrices for each image. Based on this information, the<br />

operator gen_spherical_mosaic then creates the mosaic image. Please have a look at the HDevelop<br />

example program hdevelop\Tools\Calibration\stationary_camera_self_calibration.dev<br />

for more information about how to use these operators.<br />

6 Pose Estimation of Known 3D Objects With a Single Camera<br />

With <strong>HALCON</strong>, it is possible to determine the pose of known 3D objects with a single camera. This is,<br />

e.g., necessary if you want to pick up objects that may be placed in an arbitrary position and orientation.<br />

Such an example application is described in section 8.4.1 on page 120. Besides the pose estimation for<br />

general 3D objects, <strong>HALCON</strong> offers an alternative approach for the determination of circle poses.<br />

6.1 Pose Estimation for General 3D Objects<br />

The basic idea is that the pose of the object, i.e., the exterior camera parameters with respect to the object,<br />

can be determined by a call of the operator camera_calibration.<br />

The individual steps are illustrated based on the example program<br />

hdevelop\pose_of_known_3d_object.dev, which determines the pose of a metal part with<br />

respect to a given world coordinate system.<br />

First, the camera must be calibrated, i.e., the interior camera parameters and, if the pose of the object<br />

is to be determined relative to a given world coordinate system, the exterior camera parameters must<br />

be determined. See section 3.1 on page 29 for a detailed description of the calibration process. The<br />

world coordinate system can either be identical to the calibration plate coordinate system belonging<br />

to the calibration plate from one of the calibration images, or it can be modified such that it fits to<br />

some given reference coordinate system (figure 56). This can be achieved, e.g., by using the operator<br />

set_origin_pose<br />

set_origin_pose (PoseOfWCS, -0.0568, 0.0372, 0, PoseOfWCS)<br />

or if other transformations than translations are necessary, via homogeneous transformation matrices<br />

(section 2.1 on page 8).<br />

pose_to_hom_mat3d (PoseOfWCS, camHwcs)<br />

hom_mat3d_rotate_local (camHwcs, rad(180), ’x’, camHwcs)<br />

hom_mat3d_to_pose (camHwcs, PoseOfWCS)<br />

With the homogeneous transformation matrix c Hw , which corresponds to the pose of the world coordinate<br />

system, world coordinates can be transformed into camera coordinates.<br />

Then, the pose of the object can be determined from at least three points (control points) for which both<br />

the 3D object coordinates and the 2D image coordinates are known.<br />

The 3D coordinates of the control points need to be determined only once. They must be given in a<br />

coordinate system that is attached to the object. You should choose points that can be extracted easily


and accurately from the images. The 3D coordinates of the control points are then stored in three tuples, one for the x coordinates, one for the y coordinates, and the last one for the z coordinates.

Figure 56: Determination of the pose of the world coordinate system (camera coordinate system (x^c, y^c, z^c), calibration plate coordinate system (x^cp, y^cp, z^cp), and world coordinate system (x^w, y^w, z^w)).

In each image from which the pose of the object should be determined, the control points must be<br />

extracted. This task depends heavily on the object and on the possible poses of the object. If it is known<br />

that the object will not be tilted with respect to the camera, the detection can, e.g., be carried out by shape-based matching (for a detailed description of shape-based matching, please refer to the Application Note on Shape-Based Matching).

Once the image coordinates of the control points are determined, they must be stored in two tuples that<br />

contain the row and the column coordinates, respectively. <strong>Note</strong> that the 2D image coordinates of the<br />

control points must be stored in the same order as the 3D coordinates.<br />

In the example program, the centers of the three holes of the metal part are used as control points. Their<br />




image coordinates are determined with the HDevelop procedure<br />

procedure determine_control_points (Image: : : RowCenter, ColCenter)<br />

which is part of the example program hdevelop\pose_of_known_3d_object.dev.<br />

Now, the operator camera_calibration can be used to determine the pose of the object. For this, the<br />

3D object coordinates and the 2D image coordinates of the control points must be passed to the operator<br />

camera_calibration via the parameters NX, NY, and NZ (3D object coordinates) as well as NRow<br />

and NCol (2D image coordinates). The known interior camera parameters are given via the parameter<br />

StartCamParam and an initial pose of the object with respect to the camera must be specified by the<br />

parameter NStartPose. The parameter EstimateParams must be set to ’pose’.<br />

camera_calibration (ControlX, ControlY, ControlZ, RowCenter, ColCenter,<br />

CamParam, StartPose, ’pose’, _, PoseOfObject, Errors)<br />

You can determine the initial pose of the object from one image where the object is in a typical position<br />

and orientation. Place the <strong>HALCON</strong> calibration plate on the object and apply the operators<br />

find_caltab and find_marks_and_pose (see section 3.1.1 on page 30). The resulting estimation<br />

for the pose can be used as initial pose, even for images where the object appears slightly shifted or<br />

rotated. In the example program, the (initial) pose of the calibration plate that was used for the definition<br />

of the WCS is used as initial pose of the object.<br />
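A minimal sketch of this one-time step could look as follows; it reuses the parameter values shown for the calibration images in section 4.2, and the image name is a placeholder.

* Sketch: estimate an initial object pose once from a reference image in
* which a calibration plate is placed on the object.
* 'object_with_caltab' is a placeholder file name; the numeric parameters
* are taken over from the calibration example earlier in this note.
read_image (RefImage, 'object_with_caltab')
min_max_gray (RefImage, RefImage, 3, Min, _, _)
find_caltab (RefImage, Caltab, CaltabName, 3, min([Min+40,200]), 5)
find_marks_and_pose (RefImage, RefImage, CaltabName, CamParam, 128, 10, 18,
                     0.7, 15, 100, RCoord, CCoord, StartPose)
* StartPose can now be passed as NStartPose to camera_calibration.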

If both the pose of the world coordinate system and the pose of the object coordinate system are known<br />

with respect to the camera coordinate system (see figure 57), it is easy to determine the transformation<br />

matrices for the transformation of object coordinates into world coordinates and vice versa:<br />

w Ho = w Hc · c Ho (35)<br />

= ( c Hw ) −1 · c Ho (36)<br />

where w Ho is the homogeneous transformation matrix for the transformation of object coordinates into<br />

world coordinates and c Hw and c Ho are the homogeneous transformation matrices corresponding to<br />

the pose of the world coordinate system and the pose of the object coordinate system, respectively, each<br />

with respect to the camera coordinate system.<br />

The transformation matrix for the transformation of world coordinates into object coordinates can be<br />

derived by:<br />

o Hw = ( w Ho) −1<br />

The calculations described above can be implemented in HDevelop as follows. First, the homogeneous<br />

transformation matrices are derived from the respective poses.<br />

pose_to_hom_mat3d (PoseOfWCS, camHwcs)<br />

pose_to_hom_mat3d (PoseOfObject, camHobj)<br />

Then, the transformation matrix for the transformation of object coordinates into world coordinates is<br />

derived.<br />

(37)


Figure 57: Pose of the object coordinate system and transformation between object coordinates and world coordinates (camera coordinate system (x^c, y^c, z^c), object coordinate system (x^o, y^o, z^o), and world coordinate system (x^w, y^w, z^w)).

hom_mat3d_invert (camHwcs, wcsHcam)<br />

hom_mat3d_compose (wcsHcam, camHobj, wcsHobj)<br />

Now, known object coordinates can be transformed into world coordinates with:<br />

affine_trans_point_3d (wcsHobj, CornersXObj, CornersYObj, CornersZObj,<br />

CornersXWCS, CornersYWCS, CornersZWCS)<br />

In the example program hdevelop\pose_of_known_3d_object.dev, the world coordinates of the<br />

four corners of the rectangular hole of the metal part are determined from their respective object coordinates.<br />

The object coordinate system and the world coordinate system are visualized as well as the<br />




respective coordinates for the four points (see figure 58).<br />

Figure 58: Object coordinates and world coordinates for the four corners of the rectangular hole of the metal part:
1: object (6.39, 26.78, 0.00) mm, world (44.98, 31.58, 2.25) mm
2: object (6.27, 13.62, 0.00) mm, world (49.55, 19.62, 5.28) mm
3: object (17.62, 13.60, 0.00) mm, world (60.12, 23.73, 5.20) mm
4: object (17.68, 26.66, 0.00) mm, world (55.54, 35.58, 2.20) mm

6.2 Pose Estimation for 3D Circles<br />

HALCON offers an alternative approach to estimate the pose of 3D circles, which can be applied with less effort than the general approach. It is based on the known geometric behavior of perspectively distorted circles: 3D circles are usually represented as ellipses in the image. Using the extracted 2D ellipse of a 3D circle together with the interior camera parameters and the known radius of the circle, the two possible 3D poses of the circle (having the same position but opposite orientations) can be obtained easily using the operator get_circle_pose. The HDevelop example HALCONROOT\examples\hdevelop\Applications\Calibration\3d_position_of_circles.dev shows in detail how to apply the approach.
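A rough sketch of this workflow is given below; the edge extraction parameters are placeholders, and the exact argument order of get_circle_pose is an assumption from memory that should be verified against the reference manual or the example program mentioned above.

* Sketch: estimate the two candidate poses of a 3D circle with known
* radius (here 10 mm). All numeric parameters are placeholders, and the
* get_circle_pose signature is assumed, not copied from the example.
read_image (Image, 'part_with_circle')
edges_sub_pix (Image, Edges, 'canny', 1.5, 20, 40)
* Keep only sufficiently long contours, i.e., candidates for the ellipse.
select_contours_xld (Edges, CircleEdge, 'contour_length', 200, 100000,
                     -0.5, 0.5)
Radius := 0.01
get_circle_pose (CircleEdge, CamParam, Radius, 'pose', Pose1, Pose2)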

7 3D Machine Vision With a Binocular Stereo System<br />

With a binocular stereo system, it is possible to derive 3D information of the surface of arbitrarily shaped<br />

objects. Figure 59 shows an image of a stereo camera system, the resulting stereo image pair, and the<br />



height map that has been derived from the images.<br />

The most important requirements for acquiring a pair of stereo images are that the images<br />

• show the same object,<br />

• at the same time, but<br />

• taken from different positions.<br />


The images must be calibrated and rectified. Thereafter, the 3D information can be determined either in the form of disparities or as the distance of the object surface from the stereo camera system. The 3D information is available as images that encode the disparities or the distances, respectively.

Additionally, it is possible to directly determine 3D coordinates for points of the object’s surface.<br />

<strong>Application</strong>s for a binocular stereo system comprise, but are not limited to, completeness checks, inspection<br />

of ball grid arrays, etc.<br />

The example programs used in this section are<br />

• hdevelop\stereo_calibration.dev<br />

• hdevelop\height_above_reference_plane_from_stereo.dev<br />

• hdevelop\3d_information_for_selected_points.dev<br />

As an alternative to fully calibrated stereo, <strong>HALCON</strong> offers the so-called uncalibrated stereo vision.<br />

Here, the relation between the cameras is determined from the scene itself, i.e., without needing a special<br />

calibration object. Please refer to section 7.5 on page 108 for more information about this method.<br />

7.1 The Principle of Stereo Vision<br />

The basic principle of binocular stereo vision is very easy to understand. Assume the simplified configuration<br />

of two parallel looking <strong>1D</strong> cameras with identical interior parameters as shown in figure 60.<br />

Furthermore, the basis, i.e., the straight line connecting the two optical centers of the two cameras, is<br />

assumed to be coincident with the x-axis of the first camera.<br />

Then, the image plane coordinates of the projections of the point $P(x^c, z^c)$ into the two images can be expressed by

$u_1 = f \cdot \frac{x^c}{z^c}$   (38)

$u_2 = f \cdot \frac{x^c - b}{z^c}$   (39)

where f is the focal length and b the length of the basis.




Figure 59: Top: Stereo camera system; Center: Stereo image pair; Bottom: Height map.


Figure 60: Vertical section of a binocular stereo camera system.

The pair of image points that results from the projection of one object point into the two images is often<br />

referred to as conjugate points or homologous points.<br />

The difference between the two image locations of the conjugate points is called the disparity d:

$d = u_1 - u_2 = \frac{f \cdot b}{z^c}$   (40)

Given the camera parameters and the image coordinates of two conjugate points, the $z^c$ coordinate of the corresponding object point P, i.e., its distance from the stereo camera system, can be computed by

$z^c = \frac{f \cdot b}{d}$   (41)

Note that the interior camera parameters of both cameras and the relative pose of the second camera in relation to the first camera are necessary to determine the distance of P from the stereo camera system.
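As a purely illustrative example (numbers not taken from the text): with f = 8 mm, b = 100 mm, and a measured disparity of d = 0.2 mm in metric image plane coordinates, equation (41) yields $z^c = \frac{8\,\mathrm{mm} \cdot 100\,\mathrm{mm}}{0.2\,\mathrm{mm}} = 4000\,\mathrm{mm}$, i.e., the object point lies 4 m in front of the stereo camera system.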




Thus, the tasks to be solved for stereo vision are:<br />

1. Determination of the camera parameters<br />

2. Determination of conjugate points<br />

The first task is achieved by the calibration of the binocular stereo camera system, which is described in<br />

section 7.3. This calibration is quite similar to the calibration of a single camera, described in section 3.1<br />

on page 29.<br />

The second task is the so-called stereo matching process, which in <strong>HALCON</strong> is just a call of the operator<br />

binocular_disparity or binocular_distance, respectively. These operators are described in<br />

section 7.4, together with the operators doing all the necessary calculations to obtain world coordinates<br />

from the stereo images.<br />

7.2 The Setup of a Stereo Camera System<br />

The stereo camera system consists of two cameras looking at the same object from different positions<br />

(see figure 61).<br />

It is very important to ensure that neither the interior camera parameters (e.g., the focal length) nor<br />

the relative pose (e.g., the distance between the two cameras) of the two cameras changes during the<br />

calibration process or between the calibration process and the ensuing application of the calibrated stereo<br />

camera system. Therefore, it is advisable to mount the two cameras on a stable platform.<br />

The manner in which the cameras are placed influences the accuracy of the results that is achievable with<br />

the stereo camera system.<br />

The distance resolution $\Delta z$, i.e., the accuracy with which the distance z of the object surface from the stereo camera system can be determined, can be expressed by

$\Delta z = \frac{z^2}{f \cdot b} \cdot \Delta d$   (42)

To achieve a high distance resolution, the setup should be chosen such that the length b of the basis as<br />

well as the focal length f are large, and that the stereo camera system is placed as close as possible<br />

to the object. In addition, the distance resolution depends directly on the accuracy ∆d with which the<br />

disparities can be determined. Typically, the disparities can be determined with an accuracy of 1/5 up to<br />

1/10 pixel, which corresponds to approximately 1 µm for a camera with 7.4 µm pixel size.<br />

In figure 62, the distance resolutions that are achievable in the ideal case are plotted as a function of the<br />

distance for four different configurations of focal lengths and base lines, assuming ∆d to be 1 µm.<br />
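As a quick plausibility check of equation (42) with illustrative numbers: for b = 10 cm, f = 8 mm, z = 1 m, and $\Delta d = 1\,\mu\mathrm{m}$, the distance resolution is $\Delta z = \frac{(1\,\mathrm{m})^2}{0.008\,\mathrm{m} \cdot 0.1\,\mathrm{m}} \cdot 1\,\mu\mathrm{m} \approx 1.25\,\mathrm{mm}$, which is consistent with the corresponding curve in figure 62.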

<strong>Note</strong> that if the ratio between b and z is very large, problems during the stereo matching process may<br />

occur, because the two images of the stereo pair differ too much. The maximum reasonable ratio b/z<br />

depends on the surface characteristics of the object. In general, objects with little height differences can<br />

be imaged with a higher ratio b/z, whereas objects with larger height differences should be imaged with<br />

a smaller ratio b/z.<br />

In any case, to ensure a stable calibration the overlapping area of the two stereo images should be as<br />

large as possible and the cameras should be approximately aligned, i.e., the rotation around the optical<br />

axis should not differ too much between the two cameras.


Figure 61: Stereo camera system (two cameras with optical centers, virtual image planes, and an object point P).

7.3 Calibrating the Stereo Camera System<br />



As mentioned above, the calibration of the binocular stereo camera system is very similar to the calibration<br />

of a single camera (section 3.1 on page 29). The major differences are that the calibration plate<br />

must be placed such that it lies completely within both images of each stereo image pair and that the<br />

calibration of both images is carried out simultaneously within the operator binocular_calibration.<br />

Finally, the stereo images must be rectified in order to facilitate all further calculations.<br />

In this section only a brief description of the calibration process is given. More details can be found in<br />

section 3.1 on page 29. Only the stereo-specific parts of the calibration process are described in depth.<br />





Figure 62: Distance resolution [mm] plotted over the distance [m] for four configurations: b = 10 cm / f = 8 mm, b = 10 cm / f = 12.5 mm, b = 20 cm / f = 8 mm, and b = 20 cm / f = 12.5 mm.

7.3.1 Rules for Taking Calibration Images<br />

For the calibration of the stereo camera system, multiple stereo images of the calibration plate are necessary,<br />

where the calibration plate must be completely visible in both images of each stereo image pair.<br />

A typical sequence of stereo calibration image pairs is displayed in figure 63.<br />

The rules for taking the calibration images for the single camera calibration (see section 3.1.2 on page<br />

31) apply accordingly.<br />

In general, the overlapping area of the two stereo images is smaller than the field of view of each individual<br />

camera. Therefore, it is not possible to place the calibration plate in all areas of the field of view<br />

of each camera. Nevertheless, it is very important to place the calibration plate in the multiple images<br />

such that the whole overlapping area is covered as well as possible.<br />

7.3.2 Camera Calibration Input<br />

As in the case of the single camera calibration, the input parameters for the camera calibration can be<br />

grouped into two categories:


Figure 63: A sequence of eleven stereo calibration image pairs.<br />


1. Corresponding points, given in world coordinates as well as in image coordinates of both images<br />

2. Initial values for the camera parameters of both cameras<br />

The corresponding points as well as the initial values for the camera parameters are determined similar<br />

to the single camera calibration by the use of the operators find_caltab and find_marks_and_pose.<br />

Both operators are described in detail in section 3.1.1 on page 30. The only difference is that the operators<br />

must be applied to both stereo images separately.<br />

The initial values for the interior camera parameters can be determined as explained in section 3.1.3 on<br />

page 32.<br />

7.3.3 Determining the Camera Parameters<br />

The actual calibration of the stereo camera system is carried out with the operator<br />

binocular_calibration.<br />

binocular_calibration (X, Y, Z, RowsL, ColsL, RowsR, ColsR, StartCamParL,<br />

StartCamParR, StartPosesL, StartPosesR, ’all’,<br />

CamParamL, CamParamR, NFinalPoseL, NFinalPoseR,<br />

cLPcR, Errors)<br />

The only differences to the operator camera_calibration, described in section 3.1.4 on page 36 are<br />

that the operator binocular_calibration needs the image coordinates of the calibration marks from<br />

both images of each stereo pair, that it needs the initial values for the camera parameters of both cameras,<br />

and that it also returns the relative pose of the second camera in relation to the first camera.<br />

<strong>Note</strong> that it is assumed that the parameters of the first image stem from the left image and the parameters<br />

of the second image stem from the right image, whereas the notations ’left’ and ’right’ refer to the line<br />




of sight of the two cameras. If the images are used in the reverse order, they will appear upside down<br />

after the rectification (see section 7.3.4 for an explanation of the rectification step).<br />

Once the stereo camera system is calibrated, it should be left unchanged. If, however, the focus of one camera was modified, it is necessary to determine the interior camera parameters of that camera again (binocular_calibration with EstimateParams set to 'cam_par1' or 'cam_par2', respectively). In case one camera has been shifted or rotated with respect to the other camera, the relative pose of the second camera in relation to the first camera must be determined again (binocular_calibration with EstimateParams set to 'pose_rel'). Furthermore, if the point of view of the cameras allows only stereo images with marginal overlap or the acquisition of an insufficient number of calibration images, binocular_calibration can also be applied to already calibrated cameras (with EstimateParams set to 'pose_rel').
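A minimal sketch of such a re-calibration of the relative pose only, assuming the interior camera parameters from the previous calibration are passed as start values; all variable names are placeholders:

* Sketch: re-determine only the relative pose of an already calibrated stereo
* system; the calibrated interior parameters are reused as start values and are
* kept fixed because EstimateParams is set to 'pose_rel'.
binocular_calibration (X, Y, Z, RowsL, ColsL, RowsR, ColsR, CamParamL,
                       CamParamR, StartPosesL, StartPosesR, 'pose_rel',
                       CamParamL, CamParamR, NFinalPoseL, NFinalPoseR,
                       cLPcR, Errors)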

7.3.4 Rectifying the Stereo Images

After the rectification of stereo images, conjugate points lie on the same row in both rectified images. In figure 64, the original images of a stereo pair are shown, where the two cameras are heavily rotated with respect to each other. The corresponding rectified images are displayed in figure 65.

Figure 64: Original stereo images.

The rectification itself is carried out using the operators gen_binocular_rectification_map and map_image.

gen_binocular_rectification_map (MapL, MapR, CamParamL, CamParamR, cLPcR,
                                 1, 'geometric', 'bilinear', RectCamParL,
                                 RectCamParR, CamPoseRectL, CamPoseRectR,
                                 RectLPosRectR)

The operator gen_binocular_rectification_map requires the interior camera parameters of both cameras and the relative pose of the second camera in relation to the first camera. This data is returned by the operator binocular_calibration.

Figure 65: Rectified stereo images.

The parameter SubSampling can be used to change the size and resolution of the rectified images with respect to the original images. A value of 1 indicates that the rectified images will have the same size as the original images. Larger values lead to smaller images with a resolution reduced by the given factor; smaller values lead to larger images.

Reducing the image size has the effect that the subsequent stereo matching process runs faster, but also that fewer details are visible in the result. In general, it is recommended not to use values below 0.5 or above 2. Otherwise, smoothing or aliasing effects occur, which may disturb the matching process.

The rectification process can be described as projecting the original images onto a common rectified image plane. The method to define this plane can be selected with the parameter Method. So far, only the method 'geometric' is available, in which the orientation of the common rectified image plane is defined by the cross product of the base line and the line of intersection of the two original image planes.

The rectified images can be thought of as being acquired by a virtual stereo camera system, called the rectified stereo camera system, as displayed in figure 66. The optical centers of the rectified cameras are the same as for the real cameras, but the rectified cameras are rotated such that they look parallel and their x-axes are collinear. In addition, both rectified cameras have the same focal length. Therefore, the two image planes coincide. Note that the principal point of the rectified images, which is the origin of the image plane coordinate system, may lie outside the image.

The parameter Interpolation specifies whether bilinear interpolation ('bilinear') should be applied between the pixels of the input images or whether the gray value of the nearest pixel ('none') should be used. Bilinear interpolation yields smoother rectified images, whereas using the nearest neighbor is faster.

The operator returns the rectification maps and the camera parameters of the virtual, rectified cameras. Finally, the operator map_image can be applied to both stereo images using the respective rectification map generated by the operator gen_binocular_rectification_map.

map_image (ImageL, MapL, ImageRectifiedL)
map_image (ImageR, MapR, ImageRectifiedR)

Figure 66: Rectified stereo camera system.

If the calibration was erroneous, the rectification will produce wrong results. This can be checked very easily by comparing the row coordinates of conjugate points selected from the two rectified images. If the row coordinates of conjugate points differ between the two rectified images, the images are not rectified correctly. In this case, you should check the calibration process carefully. An incorrectly rectified image pair may look like the one displayed in figure 67.

7.4 Obtaining 3D Information from Images

There are many possibilities to derive 3D information from rectified stereo images. If only non-metrical information about the surface of an object is needed, it may suffice to determine the disparities within the overlapping area of the stereo image pair by using the operator binocular_disparity.

If metrical information is required, the operator binocular_distance can be used to extract the distance of the object surface from the stereo camera system (see section 7.4.3 on page 104 for the definition of the distance).

Figure 67: Incorrectly rectified stereo images.

To derive metrical information for selected points only, the operators disparity_to_distance or disparity_to_point_3d can be used. The first of these two operators calculates the distance z of points from the stereo camera system based on their disparity. The second operator calculates the x, y, and z coordinates from the row and column position of a point in the first rectified image and from its disparity.

Alternatively, the operator intersect_lines_of_sight can be used to calculate the x, y, and z coordinates of selected points. It does not require the disparities to be determined in advance; only the image coordinates of the conjugate points need to be given, together with the camera parameters. This operator can also handle image coordinates of the original stereo images, so the rectification can be omitted. However, you must determine the conjugate points yourself.

Note that all operators that deal with disparities or distances require all inputs to be based on the rectified images. This holds for the image coordinates as well as for the camera parameters.

7.4.1 Rules for Taking Stereo Image Pairs

The 3D coordinates of each object point are derived by intersecting the lines of sight of the respective conjugate image points. The conjugate points are determined by an automatic matching process. This matching process has some properties that should be accounted for during the image acquisition.

For each point of the first image, the conjugate point in the second image must be determined. This point matching relies on the availability of texture. The conjugate points cannot be determined correctly in areas without sufficient texture (figure 68).

If the images contain repetitive patterns, the matching process may be confused, since in this case many points look alike. In order to make the matching process fast and reliable, the stereo images are rectified such that pairs of conjugate points always have identical row coordinates in the rectified images, i.e., the search space in the second rectified image is reduced to a line. With this, repetitive patterns can disturb the matching process only if they are parallel to the rows of the rectified images (figure 69).

Figure 68: Rectified stereo images and matching result in a poorly textured area (regions where the matching process failed are displayed white).

The following rules for the acquisition of the stereo image pairs should be considered:

• Do not change the camera setup between the acquisition of the calibration images and the acquisition of the stereo images of the object to be investigated.

• Ensure a proper illumination of the object and avoid reflections.

• If the object shows no texture, consider projecting texture onto it.

• Place the object such that repetitive patterns are not aligned with the rows of the rectified images.

7.4.2 Determining Disparities

Disparities are an indicator for the distance z of object points from the stereo camera system, since points with equal disparities also have equal distances z (equation 41 on page 91). Therefore, if it is only necessary to know whether there are locally high objects, it suffices to derive the disparities. This is done by using the operator binocular_disparity.


Figure 69: Rectified stereo images with repetitive patterns aligned to the image rows and a cutout of the matching result (regions where the matching process failed are displayed white).

binocular_disparity (ImageRectifiedL, ImageRectifiedR, DisparityImage,
                     ScoreImageDisparity, 'ncc', MaskWidth, MaskHeight,
                     TextureThresh, MinDisparity, MaxDisparity,
                     NumLevels, ScoreThresh, 'left_right_check',
                     'interpolation')

The operator requires the two rectified images as input. The disparities are only derived for those conjugate points that lie within the respective image domain in both images. Thus, it is possible to speed up the calculation of the disparities if the image domain of at least one of the two rectified images is reduced to a region of interest, e.g., by using the operator reduce_domain, as sketched below.
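The following lines are a minimal sketch of this speed-up; the rectangle coordinates and the numeric matching parameters are arbitrary placeholder values, not taken from the example programs.

* Sketch: restrict the matching to a region of interest in the first rectified image.
* All numeric values are placeholders.
gen_rectangle1 (ROI, 100, 150, 400, 600)
reduce_domain (ImageRectifiedL, ROI, ImageRectifiedLReduced)
binocular_disparity (ImageRectifiedLReduced, ImageRectifiedR, DisparityImage,
                     ScoreImageDisparity, 'ncc', 11, 11, 5, MinDisparity,
                     MaxDisparity, 2, 0.5, 'left_right_check', 'interpolation')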

Several parameters can be used to control the behavior of the matching process that is performed by the operator binocular_disparity to determine the conjugate points:

With the parameter Method, the matching function is selected. The methods 'sad' (summed absolute differences) and 'ssd' (summed squared differences) compare the gray values of the pixels within a matching window directly, whereas the method 'ncc' (normalized cross correlation) compensates for the mean gray value and its variance within the matching window. Therefore, if the two images differ in brightness and contrast, the method 'ncc' should be preferred. However, since the internal computations are less complex for the methods 'sad' and 'ssd', they are faster than the method 'ncc'.

The width and height of the matching window can be set independently with the parameters MaskWidth and MaskHeight. The values should be odd numbers; otherwise they will be increased by one. A larger matching window leads to a smoother disparity image, but may result in the loss of small details. In contrast, the results of a smaller matching window tend to be noisy, but they show more spatial details.

Because the matching process relies on the availability of texture, low-textured areas can be excluded from the matching process. The parameter TextureThresh defines the minimum allowed variance within the matching window. For areas where the texture is too low, no disparities will be determined.

The parameters MinDisparity and MaxDisparity define the minimum and maximum disparity values. They are used to restrict the search space for the matching process. If the specified disparity range does not contain the actual range of the disparities, the conjugate points cannot be found correctly; the disparities will therefore be incomplete and erroneous. On the other hand, if the disparity range is specified too large, the matching process will be slower and the probability of mismatches rises.

Therefore, it is important to set the parameters MinDisparity and MaxDisparity carefully. There are several possibilities to determine the appropriate values:

• If you know the minimum and maximum distance of the object from the stereo camera system (section 7.4.3 on page 104), you can use the operator distance_to_disparity to determine the respective disparity values.

• You can also determine these values directly from the rectified images. For this, you should display the two rectified images and measure the approximate column coordinates of the point N, which is nearest to the stereo camera system (N_col^image1 and N_col^image2), and of the point F, which is farthest away (F_col^image1 and F_col^image2), each in both rectified images. The values for the definition of the disparity range can then be calculated as follows (see also the sketch below):

  MinDisparity = N_col^image2 − N_col^image1    (43)
  MaxDisparity = F_col^image2 − F_col^image1    (44)
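As an illustration, the following sketch determines the disparity range in both ways. All numeric values and variable names are placeholders; in particular, the call to distance_to_disparity assumes that its parameter order mirrors that of disparity_to_distance shown in section 7.4.4, which is an assumption made for this sketch.

* Sketch: determine the disparity search range (placeholder values).
* Variant 1: from the known minimum and maximum object distances in meters.
distance_to_disparity (RectCamParL, RectCamParR, RectLPosRectR, [0.3, 0.5],
                       Disparities)
MinDisparity := min(Disparities)
MaxDisparity := max(Disparities)
* Variant 2: from column coordinates measured manually in the two rectified
* images (equations 43 and 44).
NCol1 := 310
NCol2 := 230
FCol1 := 420
FCol2 := 380
MinDisparity := NCol2 - NCol1
MaxDisparity := FCol2 - FCol1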

The operator binocular_disparity uses image pyramids to improve the matching speed. The disparity range specified by the parameters MinDisparity and MaxDisparity is only used on the uppermost pyramid level, indicated by the parameter NumLevels. Based on the matching results on that level, the disparity range for the matching on the next lower pyramid levels is adapted automatically.

The benefits with respect to the execution time are greatest if the objects have different regions between which the appropriate disparity range varies strongly. However, take care that the value for NumLevels is not set too large, as otherwise the matching process may fail because of a lack of texture on the uppermost pyramid level.

The parameter ScoreThresh specifies which matching scores are acceptable. Points for which the matching score is not acceptable are excluded from the results, i.e., the resulting disparity image has a reduced domain that comprises only the accepted points.

Note that the value for ScoreThresh must be set according to the matching function selected via Method. The two methods 'sad' (0 ≤ score ≤ 255) and 'ssd' (0 ≤ score ≤ 65025) return lower matching scores for better matches. In contrast, the method 'ncc' (-1 ≤ score ≤ 1) returns higher values for better matches, where a score of zero indicates that the two matching windows are totally different and a score of minus one denotes that the second matching window is exactly inverse to the first matching window.

The parameter Filter can be used to activate a downstream filter that increases the reliability of the resulting disparities. Currently, it is possible to select the method 'left_right_check', which verifies the matching results based on a second matching in the reverse direction. Only if both matching results correspond to each other are the resulting conjugate points accepted. In some cases, this may lead to gaps in the disparity image, even in well-textured areas, as this verification is very strict. If you do not want to verify the matching results based on the 'left_right_check', set the parameter Filter to 'none'.


The subpixel refinement of the disparities is switched on by setting the parameter SubDisparity to 'interpolation'. It is switched off by setting the parameter to 'none'.

The results of the operator binocular_disparity are the two images Disparity and Score, which contain the disparities and the matching scores, respectively. In figure 70, a rectified stereo image pair is displayed, from which the disparity and score images displayed in figure 71 were derived.

Figure 70: Rectified stereo images.

Both resulting images refer to the image geometry of the first rectified image, i.e., the disparity for the point (r,c) of the first rectified image is the gray value at the position (r,c) of the disparity image. The disparity image can, e.g., be used to extract the components of the board, which would be more difficult in the original images, i.e., without the use of 3D information.
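The following sketch illustrates both uses. The pixel coordinates and the disparity thresholds are placeholder values, and segmenting by a simple disparity threshold is only one possible approach, not the one used in the example program.

* Sketch: query the disparity of a single pixel (placeholder coordinates).
get_grayval (DisparityImage, 240, 320, DisparityAtPoint)
* Sketch: roughly segment elevated components by thresholding the disparity
* image (placeholder threshold values).
threshold (DisparityImage, ElevatedRegions, -20, -10)
connection (ElevatedRegions, ComponentCandidates)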

Figure 71: Disparity image (left) and score image (right).

In figure 71, areas where the matching did not succeed, i.e., undefined regions of the images, are displayed white in the disparity image and black in the score image.

7.4.3 Determining Distances

The distance of an object point from the stereo camera system is defined as its distance from the x-y-plane of the coordinate system of the first rectified camera. It can be determined with the operator binocular_distance, which is used analogously to the operator binocular_disparity described in the previous section.

binocular_distance (ImageRectifiedL, ImageRectifiedR, DistanceImage,
                    ScoreImageDistance, RectCamParL, RectCamParR,
                    RectLPosRectR, 'ncc', MaskWidth, MaskHeight,
                    TextureThresh, MinDisparity, MaxDisparity,
                    NumLevels, ScoreThresh, 'left_right_check',
                    'interpolation')

The three additional parameters, namely the camera parameters of the rectified cameras as well as the relative pose of the second rectified camera in relation to the first rectified camera, can be taken directly from the output of the operator gen_binocular_rectification_map.

Figure 72 shows the distance image and the respective score image for the rectified stereo pair of figure 70 on page 103. Because the distance is calculated directly from the disparities and from the camera parameters, the distance image looks similar to the disparity image (figure 71). What is more, the score images are identical, since the underlying matching process is identical.

Figure 72: Distance image (left) and score image (right).

It can be seen from figure 72 that the distance of the board changes continuously from left to right. The reason is that, in general, the x-y-plane of the coordinate system of the first rectified camera will be tilted with respect to the object surface (see figure 73).

If one reference plane of the object surface must have a constant distance value of, e.g., zero, the tilt can be compensated easily: First, at least three points that lie on the reference plane must be defined. These points are used to determine the orientation of the (tilted) reference plane in the distance image; therefore, they should be selected such that they enclose the region of interest in the distance image. Then, a distance image of the (tilted) reference plane can be simulated and subtracted from the distance image of the object. Finally, the distance values themselves must be adapted by scaling them with the cosine of the angle between the tilted and the corrected reference plane.


Figure 73: Distances of the object surface from the x-y-plane of the coordinate system of the first rectified camera.

These calculations are carried out in the procedure

procedure tilt_correction (DistanceImage, RegionDefiningReferencePlane: DistanceImageCorrected: : )

which is part of the example program hdevelop\height_above_reference_plane_from_stereo.dev (appendix B.4 on page 140). In principle, this procedure can also be used to correct the disparity image, but note that you must not use the corrected disparity values as input to any operators that derive metric information.
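A call to this procedure might look as follows. The way the reference region is built here (three small circular regions around manually chosen points with placeholder coordinates) is only one possible construction and is not taken from the example program.

* Sketch: define the reference plane by three small regions around points that
* lie on it (placeholder coordinates) and correct the tilted distance image.
gen_circle (ReferencePoints, [100, 120, 420], [80, 560, 320], [5, 5, 5])
union1 (ReferencePoints, RegionDefiningReferencePlane)
tilt_correction (DistanceImage, RegionDefiningReferencePlane, DistanceImageCorrected)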

If the reference plane is the ground plane of the object, an inversion of the distance image generates an image that encodes the heights above the ground plane. Such an image is displayed on the left-hand side of figure 74.

Objects of different height above or below the ground plane can be segmented easily using a simple threshold, with the minimal and maximal values given directly in units of the world coordinate system, e.g., meters. The image on the right-hand side of figure 74 shows the result of such a segmentation, which can be carried out on the corrected distance image or on the image of the heights above the ground plane.
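A minimal sketch of such a metric segmentation, using the height classes from figure 74 converted to meters; the image and region names are placeholders.

* Sketch: segment objects by their height above the ground plane.
* Thresholds are given in meters and correspond to the classes of figure 74.
threshold (HeightAboveGroundPlane, Class1, 0.0, 0.0004)
threshold (HeightAboveGroundPlane, Class2, 0.0004, 0.0015)
threshold (HeightAboveGroundPlane, Class3, 0.0015, 0.0025)
threshold (HeightAboveGroundPlane, Class4, 0.0025, 0.005)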

Figure 74: Left: Height above the reference plane; Right: Segmentation of high objects (white: 0-0.4 mm, light gray: 0.4-1.5 mm, dark gray: 1.5-2.5 mm, black: 2.5-5 mm).

7.4.4 Determining Distances or 3D Coordinates for Selected Points

If only the distances or the 3D coordinates of selected points should be determined, the operators disparity_to_distance, disparity_to_point_3d, or intersect_lines_of_sight can be used.

The operator disparity_to_distance simply transforms given disparity values into the respective distance values. For example, if you want to know the minimum and maximum distance of the object from the stereo camera system, you can determine the minimum and maximum disparity values from the disparity image and transform them into distances.

min_max_gray (CaltabL, Disparity, 0, MinDisparity, MaxDisparity, _)
disparity_to_distance (RectCamParL, RectCamParR, RectLPosRectR,
                       [MinDisparity,MaxDisparity], MinMaxDistance)

This transformation is constant for the entire rectified image, i.e., all points having the same disparity have the same distance from the x-y-plane of the coordinate system of the first rectified camera. Therefore, besides the camera parameters of the rectified cameras, only the disparity values need to be given.

To calculate the x, y, and z coordinates of points, two different operators are available: The operator disparity_to_point_3d derives the 3D coordinates from image coordinates and the respective disparities, whereas the operator intersect_lines_of_sight uses the image coordinates from the two stereo images to determine the 3D position of points.

The operator disparity_to_point_3d requires the camera parameters of the two rectified cameras as well as the image coordinates and the disparities of the selected points.

disparity_to_point_3d (RectCamParL, RectCamParR, RectLPosRectR, RL, CL,
                       DisparityOfSelectedPoints, X_CCS_FromDisparity,
                       Y_CCS_FromDisparity, Z_CCS_FromDisparity)

The x, y, and z coordinates are returned in the coordinate system of the first rectified camera.


The operator intersect_lines_of_sight determines the x, y, and z coordinates of points from the image coordinates of the respective conjugate points. Note that you must determine the image coordinates of the conjugate points yourself.

intersect_lines_of_sight (RectCamParL, RectCamParR, RectLPosRectR, RL, CL,
                          RR, CR, X_CCS_FromIntersect, Y_CCS_FromIntersect,
                          Z_CCS_FromIntersect, Dist)

The x, y, and z coordinates are returned in the coordinate system of the first (rectified) camera.

The operator can also handle image coordinates of the original stereo images, so the rectification can be omitted. In this case, the camera parameters of the original stereo cameras have to be given instead of the parameters of the rectified cameras, as sketched below.
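A sketch of this variant, assuming the image coordinates (RLOrig, CLOrig, RROrig, CROrig) were extracted from the original images and that CamParamL, CamParamR, and the relative pose cLPcR stem from binocular_calibration; the variable names are placeholders.

* Sketch: intersect the lines of sight using coordinates from the original
* (unrectified) images; the original camera parameters and the relative pose
* from binocular_calibration are passed instead of the rectified ones.
intersect_lines_of_sight (CamParamL, CamParamR, cLPcR, RLOrig, CLOrig,
                          RROrig, CROrig, X_CCS, Y_CCS, Z_CCS, Dist)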

It is possible to transform the x, y, and z coordinates determined by the latter two operators from the coordinate system of the first (rectified) camera into a given coordinate system WCS, e.g., a coordinate system with respect to the building plan of, say, a factory building. For this, a homogeneous transformation matrix that describes the transformation between the two coordinate systems is needed.

This homogeneous transformation matrix can be determined in various ways. The easiest way is to take an image of a HALCON calibration plate with the first camera only. If the 3D coordinates refer to the rectified camera coordinate system, the image must be rectified as well. Then, the pose of the calibration plate in relation to the first (rectified) camera can be determined using the operators find_caltab, find_marks_and_pose, and camera_calibration.

find_caltab (ImageRectifiedL, CaltabL, 'caltab_30mm.descr', SizeGauss,
             MarkThresh, MinDiamMarks)
find_marks_and_pose (ImageRectifiedL, CaltabL, 'caltab_30mm.descr',
                     RectCamParL, StartThresh, DeltaThresh, MinThresh,
                     Alpha, MinContLength, MaxDiamMarks, RCoordL, CCoordL,
                     StartPoseL)
camera_calibration (X, Y, Z, RCoordL, CCoordL, RectCamParL, StartPoseL,
                    'pose', _, PoseOfCalibrationPlate, Errors)

The resulting pose can be converted into a homogeneous transformation matrix.

pose_to_hom_mat3d (PoseOfCalibrationPlate, HomMat3d_WCS_to_RectCCS)

If necessary, the transformation matrix can be modified with the operators hom_mat3d_rotate_local, hom_mat3d_translate_local, and hom_mat3d_scale_local.

hom_mat3d_translate_local (HomMat3d_WCS_to_RectCCS, 0.01125, -0.01125, 0,
                           HomMat3DTranslate)
hom_mat3d_rotate_local (HomMat3DTranslate, rad(180), 'y',
                        HomMat3d_WCS_to_RectCCS)

The homogeneous transformation matrix must be inverted in order to represent the transformation from the (rectified) camera coordinate system into the WCS.

hom_mat3d_invert (HomMat3d_WCS_to_RectCCS, HomMat3d_RectCCS_to_WCS)

Finally, the 3D coordinates can be transformed using the operator affine_trans_point_3d.

affine_trans_point_3d (HomMat3d_RectCCS_to_WCS, X_CCS_FromIntersect,
                       Y_CCS_FromIntersect, Z_CCS_FromIntersect, X_WCS,
                       Y_WCS, Z_WCS)

The homogeneous transformation matrix can also be determined from three specific points. If the origin of the WCS, a point on its x-axis, and a third point that lies in the x-y-plane, e.g., directly on the y-axis, are given, the transformation matrix can be determined using the procedure

procedure gen_hom_mat3d_from_three_points (: : Origin, PointOnXAxis, PointInXYPlane: HomMat3d)

which is part of the example program hdevelop\3d_information_for_selected_points.dev. The resulting homogeneous transformation matrix can be used as input for the operator affine_trans_point_3d, as shown above.

7.5 Uncalibrated Stereo Vision

Similar to uncalibrated mosaicking (see section 5 on page 69), HALCON also provides an “uncalibrated” version of stereo vision, which derives information about the cameras from the scene itself by matching characteristic points. The main advantage of this method is that no special calibration object is needed. The main disadvantage is that, without a precisely known calibration object, the accuracy of the results depends strongly on the content of the scene, i.e., the accuracy degrades if the scene does not contain enough 3D information or if the extracted characteristic points in the two images do not precisely correspond to the same world points, e.g., due to occlusion.

In fact, HALCON provides two versions of uncalibrated stereo: Without any knowledge about the cameras and the scene, you can rectify the stereo images and perform a segmentation similar to the method described in section 7.4.3 on page 104. For this, you first use the operator match_fundamental_matrix_ransac, which determines the so-called fundamental matrix. This matrix models the cameras and their relation; but in contrast to the model described in section 7.1 on page 89, interior and exterior parameters are not available separately, thus no metric information can be derived. The fundamental matrix is then passed on to the operator gen_binocular_proj_rectification, which is the “uncalibrated” version of the operator gen_binocular_rectification_map. With the output of this operator, i.e., the rectification maps, you can then proceed to rectify the images as described in section 7.3.4 on page 96. Because the relative pose of the cameras is not known, you cannot generate the distance image and segment it as described in section 7.4.3 on page 104. The HDevelop example program hdevelop\Applications\Stereo\board_segmentation_uncalib.dev shows an alternative approach that can be used if the reference plane appears dominant in the images, i.e., if many correspondences are found on it.

Because no calibration object is needed, this method can be used to perform stereo vision with a single camera. Note, however, that the method assumes that there are no radial distortions in the images. Therefore, the accuracy of the results degrades if the lens has significant radial distortions.

If the interior parameters of the camera are known, you can determine the relative pose between the cameras using the operator match_rel_pose_ransac and then use all the stereo methods described for the fully calibrated case. There is, however, a limitation: The determined pose is relative in a second sense, because it can be determined only up to a scale factor. The reason for this is that without any knowledge about the scene, the algorithm cannot know whether the points in the scene are farther away or whether the cameras are farther apart, because the effect in the image is the same in both cases. If you have additional information about the scene, you can resolve this ambiguity and determine the “real” relative pose. This method is shown in the HDevelop example program hdevelop\Tools\Stereo\uncalib_stereo_boxes.dev.

8 Robot Vision

A typical application area for 3D machine vision is robot vision, i.e., whenever robots are equipped with cameras that supply information about the parts to manufacture. Such systems are also called “hand-eye systems” because the robotic “hand” is guided by mechanical “eyes”.

In order to use the information extracted by the camera, it must be transformed into the coordinate system of the robot. Thus, besides calibrating the camera(s), you must also calibrate the hand-eye system, i.e., determine the transformation between camera and robot coordinates. The following sections explain how to perform this hand-eye calibration with HALCON.

Figure 75: Robot vision scenarios: (a) moving camera, (b) stationary camera.

Please note that in order to use HALCON's hand-eye calibration, the camera must observe the workspace of the robot. Figure 75 depicts the two possible scenarios: The camera can either be mounted on the robot and is moved to different positions by it, or it can be stationary. If the camera does not observe the workspace of the robot, e.g., if the camera observes parts on a conveyor belt, which are then handled by a robot further down the line, you must determine the relative pose of robot and camera by different means.

The calibration result can be used for different tasks. Typically, the results of machine vision, e.g., the position of a part, are to be transformed from camera into robot coordinates to create the appropriate robot commands, e.g., to grasp the part. Section 8.4 on page 119 describes such an application. Another possible application for hand-eye systems with a moving camera is to transform the information extracted from different camera poses into a common coordinate system.

Note that HALCON's hand-eye calibration is not restricted to systems with a “hand”, i.e., a manipulator. You can also use it to calibrate cameras mounted on a pan-tilt head or surveillance cameras that rotate to observe a large area. Both systems correspond to a camera mounted on a robot; the calibration then allows you to accumulate visual information from different camera poses.

Further note that, although in the following only systems with a single camera are described, you can of course also use a stereo camera system. In this case, you typically calibrate only the relation of the robot to one of the cameras, because the relation between the cameras is determined by the stereo calibration (see section 7.3.3 on page 95).

Figure 76: Chain of transformations for a moving camera system.

Figure 77: Chain of transformations for a stationary camera system.

8.1 The Principle of Hand-Eye Calibration

Like the camera calibration (see section 3.1 on page 29), the hand-eye calibration is based on providing multiple images of a known calibration object. But in contrast to the camera calibration, here the calibration object is not moved manually but by the robot, which moves either the calibration object in front of a stationary camera or the camera over a stationary calibration object. The pose, i.e., the position and orientation, of the robot in each calibration image must be known with high accuracy!

This results in a chain of coordinate transformations (see figure 76 and figure 77). In this chain, two transformations (poses) are known: the robot pose base Htool and the pose of the calibration object in camera coordinates cam Hcal, which is determined from the calibration images before starting the hand-eye calibration. The hand-eye calibration then estimates the other two poses, i.e., the relation between the robot and the camera and between the robot and the calibration object, respectively.

Note that the chain consists of different poses depending on the used scenario. For a moving camera, the pose of the robot tool in camera coordinates and the pose of the calibration object in robot base coordinates are determined (see figure 76 on page 110):

cam Hcal = cam Htool · tool Hbase · base Hcal    (45)

Please note that in this chain the inverse robot pose, i.e., the pose of the robot base in tool coordinates, is used.

For a stationary camera, the pose of the robot base in camera coordinates and the pose of the calibration object in robot tool coordinates are determined (see figure 77 on page 110):

cam Hcal = cam Hbase · base Htool · tool Hcal    (46)

The hand-eye calibration is performed with a single call to the operator hand_eye_calibration:

hand_eye_calibration (NX, NY, NZ, NRow, NCol, MPointsOfImage, MRelPoses,
                      BaseStartPose, CamStartPose, CamParam, 'all',
                      'CountIterations', 100, 0.0005, BaseFinalPose,
                      CamFinalPose, NumErrors)

Let's have a brief look at the parameters; the referenced sections contain more detailed information:

• NX, NY, NZ, NRow, NCol, MPointsOfImage (see section 8.2.1)
  As for the camera calibration, you must supply 3D model points and their corresponding image points. Note, however, that for the hand-eye calibration the 3D point coordinates must be supplied for each image, together with the number of 3D points visible in each image. This requirement might seem tedious if you use the standard calibration plate, because then the same points are visible in each image, but it offers more flexibility for users of other calibration objects.

• MRelPoses (see section 8.2.2 on page 113)
  This parameter contains the poses of the robot in each calibration image.

• BaseStartPose, CamStartPose (see section 8.2.3 on page 114)
  For the poses that are to be calibrated, you must provide start values.

• CamParam
  In this parameter you pass the interior camera parameters. In the HDevelop example programs hdevelop\handeye_movingcam_calibration.dev and hdevelop\handeye_stationarycam_calibration.dev, the camera is calibrated using the calibration images acquired for the hand-eye calibration. For detailed information about obtaining the interior camera parameters, please refer to section 3.1.4 on page 36.

• ToEstimate
  This parameter allows you to choose which pose parameters are to be calibrated and which are to stay the same as in the start values.

• StopCriterion, MaxIterations, MinError
  With these parameters you can specify when the calibration algorithm should stop.

• BaseFinalPose, CamFinalPose, NumErrors
  These are the calibration results. How to use them in robot vision applications is described in section 8.4 on page 119.

Besides the coordinate systems described above, two others may be of interest in a robot vision application: First, results sometimes must be transformed into a reference (world) coordinate system. You can define such a coordinate system easily based on a calibration image. Secondly, especially if the robot system uses different tools, it might be useful to place the tool coordinate system used in the calibration at the mounting point of the tools and introduce additional coordinate systems at the respective tool center points. The example application in section 8.4 on page 119 shows how to handle both cases.

8.2 Determining Suitable Input Parameters

Below, we show how to determine values for the input parameters of hand_eye_calibration. The code examples stem from the HDevelop programs hdevelop\handeye_movingcam_calibration.dev and hdevelop\handeye_stationarycam_calibration.dev, which perform the calibration of hand-eye systems with a moving and a stationary camera, respectively. The programs stop after processing each calibration image; press Run to continue.

Section 8.2.4 on page 116 explains how to check the input parameters before performing the calibration.

8.2.1 Corresponding World and Image Points

As for the camera calibration, you must supply 3D model points and their corresponding image points (compare section 3.1.1 on page 30). First, we create empty tuples to accumulate data from all calibration images:

NRow := []
NCol := []
NX := []
NY := []
NZ := []
MPointsOfImage := []

When using the standard calibration plate, the 3D model points, i.e., the 3D coordinates of the calibration marks, can be read from the description file:

caltab_points (CalTabFile, X, Y, Z)

In each calibration image, we then locate the calibration plate and extract the image coordinates of the calibration marks. Please note that for the hand-eye calibration we strongly recommend to use the asymmetric calibration plate introduced with HALCON 7.1 (see section 3.1.6 on page 43). If even in a single calibration image the pose of the old, symmetric calibration plate is estimated wrongly because it is rotated by more than 90 degrees, the calibration will fail!
because it is rotated by more than 90 degrees, the calibration will fail!


for i := 0 to NumImages-1 by 1
    read_image (Image, ImageNameStart+i$'02d')
    find_caltab (Image, Caltab, CalTabFile, SizeGauss, MarkThresh,
                 MinDiamMarks)
    find_marks_and_pose (Image, Caltab, CalTabFile, StartCamParam,
                         StartThresh, DeltaThresh, MinThresh, Alpha,
                         MinContLength, MaxDiamMarks, RCoord, CCoord,
                         StartPose)

Finally, the corresponding coordinates are accumulated in the tuples:

    NRow := [NRow,RCoord]
    NCol := [NCol,CCoord]
    NX := [NX,X]
    NY := [NY,Y]
    NZ := [NZ,Z]
    MPointsOfImage := [MPointsOfImage,49]
endfor

Note that the 3D coordinates and the number of marks per image are accumulated even if they don't change between images. As already explained, the possibility to use different model points in each image is not necessary when using the standard calibration plate, but it can be very useful if you use your own calibration object, especially if it is three-dimensional.

8.2.2 Robot Poses

For each of the calibration images, you must specify the corresponding pose of the robot. Note that the accuracy of the poses is critical for obtaining an accurate hand-eye calibration. There are two ways to “feed” the poses into HALCON: In many cases, you will simply read them from the robot control unit and then enter them into your HALCON program manually. For this, you can use the HDevelop example program hdevelop\handeye_create_robot_poses.dev, which lets you input the poses in a text window and writes them into files.

As an alternative, if the robot has a serial interface, you can also send them via this connection to your HALCON program (see the section “System ⊲ Serial” in the Reference Manual for information about communicating via the serial interface).

In both cases, you then convert the data into HALCON 3D poses using the operator create_pose. As described in section 2.1.5 on page 17 (and in the Reference Manual entry for create_pose), you can specify a pose in more than one way, because the orientation can be described by different sequences of rotations. Therefore, you must first check which sequence is used by your robot system. In many cases, it will correspond to

Rabg = Rz(RotZ) · Ry(RotY) · Rx(RotX)    (47)

If this is the case, select the value 'abg' for the parameter OrderOfRotation of create_pose; for the inverse order, select 'gba'.
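As an illustration, a pose with this rotation order might be created and saved as follows. The numeric values are placeholders, and the values 'Rp+T' and 'point' for the remaining parameters are assumptions about a typical call, not taken from the example programs.

* Sketch: create a robot pose from a translation (in meters) and rotation
* angles (in degrees) in 'abg' order, then save it for the calibration program.
* All numeric values are placeholders.
create_pose (0.25, -0.1, 0.4, 10.0, 20.0, 30.0, 'Rp+T', 'abg', 'point',
             RobotPose)
write_pose (RobotPose, 'robot_pose_00.dat')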

If your robot system uses yet another sequence, you cannot use create_pose but must create a corresponding homogeneous transformation matrix and convert it into a pose using hom_mat3d_to_pose. If, e.g., your robot system uses the following sequence of rotations, where the rotations are performed around the z-axis, then around the y-axis, and finally again around the z-axis,

Rzyz = Rz(Rl) · Ry(Rm) · Rz(Rr)    (48)

the pose can be created with the following code:

hom_mat3d_identity (HomMat3DIdentity)
hom_mat3d_translate (HomMat3DIdentity, Tx, Ty, Tz, HomMat3DTranslate)
hom_mat3d_rotate_local (HomMat3DTranslate, rad(Rl), 'z', HomMat3DT_Rl)
hom_mat3d_rotate_local (HomMat3DT_Rl, rad(Rm), 'y', HomMat3DT_Rl_Rm)
hom_mat3d_rotate_local (HomMat3DT_Rl_Rm, rad(Rr), 'z', HomMat3D)
hom_mat3d_to_pose (HomMat3D, Pose)

Note that the rotation operators expect angles to be given in radians, whereas create_pose expects them in degrees!

The example program hdevelop\handeye_create_robot_poses.dev allows you to enter poses of the three types described above. If your robot system uses yet another sequence of rotations, you can easily extend the program by modifying (or copying and adapting) the code for ZYZ poses.

The HDevelop example programs hdevelop\handeye_movingcam_calibration.dev and hdevelop\handeye_stationarycam_calibration.dev read the robot pose files in the loop that processes the calibration images. Before this, an empty tuple is created to accumulate the poses:

MRelPoses := []

For each calibration image, the pose of the robot is read from file using read_pose and accumulated in the tuple:

read_pose (DataNameStart+'robot_pose_'+i$'02d'+'.dat', TmpRobotPose)
MRelPoses := [MRelPoses,TmpRobotPose]

If you are using a hand-eye system with a moving camera, you must invert the pose (compare the chain of transformations in figure 76 on page 110):

hom_mat3d_invert (base_H_tool, tool_H_base)
hom_mat3d_to_pose (tool_H_base, RobotPoseInverse)
MRelPoses := [MRelPoses,RobotPoseInverse]

8.2.3 Start Values BaseStartPose and CamStartPose

Depending on the used hand-eye configuration, the start values BaseStartPose and CamStartPose correspond to the following poses:

Moving camera (see figure 76 on page 110):
  BaseStartPose = pose of the calibration object in robot base coordinates (base Hcal)
  CamStartPose = pose of the robot tool in camera coordinates (cam Htool)

Stationary camera (see figure 77 on page 110):
  BaseStartPose = pose of the calibration object in robot tool coordinates (tool Hcal)
  CamStartPose = pose of the robot base in camera coordinates (cam Hbase)

Please note that the parameter names are misleading for stationary cameras!

We recommend to create these poses using create_pose and to save them in files using write_pose, so that the calibration program can read them in later.

In fact, you need a start value only for one of the two poses; the other can be computed from one of the calibration images. This means that you can pick the pose that is easier to determine and let HALCON compute the other one for you. More precisely, we recommend to pick the pose whose orientation is easier to determine, because the hand-eye calibration may fail if there are large errors in the start values for the orientation.

The main idea is to exploit the fact that the two poses for which we need start values are connected via the already described chain of transformations (see equation 45 and equation 46 on page 111).

If the camera is stationary, typically the pose of the calibration object in tool coordinates (tool Hcal, BaseStartPose) is easier to determine, especially if the camera is oriented at an angle as in the hand-eye system depicted in figure 78: Here, the relation between tool and calibration plate is a pure translation. To determine cam Hbase, we rewrite equation 46 as follows:

cam Hbase = cam Hcal · (base Htool · tool Hcal)^-1 = cam Hcal · cal Htool · tool Hbase    (49)

In the example program hdevelop\handeye_stationarycam_calibration.dev, this computation is performed by the following HDevelop procedure:

procedure calc_cam_start_pose_stationarycam (: : CalplatePose, BaseStartPose, RobotPose: CamStartPose)
pose_to_hom_mat3d (BaseStartPose, tool_H_calplate)
hom_mat3d_invert (tool_H_calplate, calplate_H_tool)
pose_to_hom_mat3d (RobotPose, base_H_tool)
hom_mat3d_invert (base_H_tool, tool_H_base)
pose_to_hom_mat3d (CalplatePose, cam_H_calplate)
hom_mat3d_compose (cam_H_calplate, calplate_H_tool, cam_H_tool)
hom_mat3d_compose (cam_H_tool, tool_H_base, cam_H_base)
hom_mat3d_to_pose (cam_H_base, CamStartPose)
return ()

In the example programs, the pose of the calibration plate in camera coordinates (cam Hcal) is determined when calibrating the camera itself using the operator camera_calibration. If you performed the camera calibration separately with other calibration images, you can use the poses determined by find_marks_and_pose (see section 8.2.1 on page 112) instead.

If the camera is mounted on the robot, it is typically easy to determine a start value for the pose of the calibration plate in robot base coordinates, i.e., BaseStartPose. Thus, we “solve” equation 45 for the other pose (cam Htool):

cam Htool = cam Hcal · (tool Hbase · base Hcal)^-1 = cam Hcal · cal Hbase · base Htool    (50)

Figure 78: Example hand-eye system with a stationary camera: coordinate systems (a) of robot and camera, (b) with calibration plate.

In the example program hdevelop\handeye_movingcam_calibration.dev, this computation is performed by the HDevelop procedure

procedure calc_cam_start_pose_movingcam (: : CalplatePose, BaseStartPose, RobotPoseInverse: CamStartPose)
pose_to_hom_mat3d (BaseStartPose, base_H_calplate)
hom_mat3d_invert (base_H_calplate, calplate_H_base)
pose_to_hom_mat3d (RobotPoseInverse, tool_H_base)
hom_mat3d_invert (tool_H_base, base_H_tool)
pose_to_hom_mat3d (CalplatePose, cam_H_calplate)
hom_mat3d_compose (cam_H_calplate, calplate_H_base, cam_H_base)
hom_mat3d_compose (cam_H_base, base_H_tool, cam_H_tool)
hom_mat3d_to_pose (cam_H_tool, CamStartPose)
return ()

Both example programs also contain procedures for determining a start value for the other pose to calibrate.

8.2.4 Checking the Input Parameters

Before performing the hand-eye calibration as described in the following section, we recommend to (let HALCON) check the input parameters.


Figure 79: HALCON calibration plate and its coordinate system.

You can check the point correspondences determined in section 8.2.1 on page 112 by displaying them overlaid on the calibration images. If you use the HALCON calibration plate, you can call the procedure visualize_results_of_find_marks_and_pose (see appendix B.5 on page 140) as shown in the example programs. This procedure displays the centers of the calibration marks and the coordinate system corresponding to the estimated pose of the calibration plate. As already noted, it is very important to check that the coordinate system is oriented correctly. Figure 79 shows a correctly found asymmetric calibration plate.

The robot poses can be checked by computing the start values for the poses to calibrate in all calibration images as described in the previous section. The pose parameters, especially the rotation parameters, should not change noticeably from image to image. If they do, we recommend to check whether you used the correct pose type, i.e., the correct sequence of rotations.

The following code stems from the example hdevelop\handeye_movingcam_calibration.dev. First, we open inspection windows using dev_inspect_ctrl to view the computed start values for the poses we want to calibrate:

dev_inspect_ctrl (TmpCamStartPose)
dev_inspect_ctrl (TmpBaseStartPose)

For each calibration image, we then extract the corresponding input parameters from the tuples in which they have been accumulated, using a procedure (see appendix B.8 on page 142). With this data, the start values for the poses are calculated as described in section 8.2.3 on page 114. The poses are automatically displayed in the inspection windows. Finally, the coordinate system of the calibration plate is displayed by a procedure shown in appendix B.6 on page 141:

select_values_for_ith_image (NRow, NCol, NX, NY, NZ, NFinalPose,
                             MRelPoses, i, TmpRows, TmpCols, TmpX,
                             TmpY, TmpZ, TmpFinalPose, TmpRobotPose)
calc_base_start_pose_movingcam (TmpFinalPose, CamStartPose,
                                TmpRobotPose, TmpBaseStartPose)
calc_cam_start_pose_movingcam (TmpFinalPose, BaseStartPose,
                               TmpRobotPose, TmpCamStartPose)
display_calplate_coordinate_system (CalTabFile, TmpFinalPose, CamParam,
                                    WindowHandle)

After the last image, the inspection windows are closed again using dev_close_inspect_ctrl:

dev_close_inspect_ctrl (TmpCamStartPose)
dev_close_inspect_ctrl (TmpBaseStartPose)

8.3 Performing the Calibration

Similar to the camera calibration, the main effort lies in collecting the input data. The calibration itself is performed with a single operator call:

hand_eye_calibration (NX, NY, NZ, NRow, NCol, MPointsOfImage, MRelPoses,
                      BaseStartPose, CamStartPose, CamParam, 'all',
                      'CountIterations', 100, 0.0005, BaseFinalPose,
                      CamFinalPose, NumErrors)

Typically, you then save the calibrated poses in files so that your robot vision application can read them at a later time. The following code does so for a system with a moving camera:

write_pose (CamFinalPose, DataNameStart+'final_pose_cam_tool.dat')
write_pose (BaseFinalPose, DataNameStart+'final_pose_base_calplate.dat')

Of course, you should check whether the calibration was successful by looking at the output parameter NumErrors, which is a measure for the accuracy of the pose parameters. It contains the deviation of the model points in meters for each iteration. In the example programs, the parameter is displayed in an inspection window with the following line; you can scroll down to the last entry to get the final error.

dev_inspect_ctrl (NumErrors)

Similarly to checking the input parameters (see section 8.2.4 on page 116), the example programs then visualize the calibrated poses by displaying the coordinate system of the calibration plate in each calibration image. But now we compute the pose of the calibration plate in camera coordinates based on the calibrated poses. For a moving camera system, this corresponds to the following code (compare equation 45 on page 111):


* CalplatePose = cam_H_calplate = cam_H_tool * tool_H_base * base_H_calplate
pose_to_hom_mat3d (BaseFinalPose, base_H_calplate)
pose_to_hom_mat3d (CamFinalPose, cam_H_tool)
pose_to_hom_mat3d (RobotPoseInverse, tool_H_base)
hom_mat3d_compose (cam_H_tool, tool_H_base, cam_H_base)
hom_mat3d_compose (cam_H_base, base_H_calplate, cam_H_calplate)
hom_mat3d_to_pose (cam_H_calplate, CalplatePose)

This code is encapsulated in a procedure, which is called in a loop over all images:

for i := 0 to NumImages-1 by 1
    read_image (Image, ImageNameStart+i$'02d')
    TmpRobotPoseInverse := MRelPoses[i*7:i*7+6]
    calc_calplate_pose_movingcam (BaseFinalPose, CamFinalPose,
                                  TmpRobotPoseInverse, TmpCalplatePose)
    display_calplate_coordinate_system (CalTabFile, TmpCalplatePose,
                                        CamParam, WindowHandle)
endfor


The corresponding procedure for a stationary camera system is listed in appendix B.14 on page 144.

8.4 Using the Calibration Data

Typically, you use the result of the hand-eye calibration to transform the results of machine vision from camera coordinates into robot coordinates to generate the appropriate robot commands, e.g., to grasp an object whose position has been determined in an image as in the application described in section 8.4.1. For a stationary camera, this transformation corresponds to the following equation (compare figure 77 on page 110):

    base_H_obj = base_H_cam · cam_H_obj = (CamFinalPose)^-1 · cam_H_obj                              (51)

For a moving camera system, the equation also contains the pose of the robot (compare figure 76 on page 110):

    base_H_obj = base_H_tool · tool_H_cam · cam_H_obj = base_H_tool · (CamFinalPose)^-1 · cam_H_obj  (52)
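Equation 52 translates directly into HDevelop code. The following lines are a minimal sketch for a moving camera system; the variables ObjInCamPose (the pose of the object in camera coordinates), RobotPose (base_H_tool, the current pose of the tool in base coordinates), and ObjInBasePose are hypothetical names introduced only for this sketch:

* base_H_obj = base_H_tool * tool_H_cam * cam_H_obj
pose_to_hom_mat3d (RobotPose, base_H_tool)
pose_to_hom_mat3d (CamFinalPose, cam_H_tool)
hom_mat3d_invert (cam_H_tool, tool_H_cam)
pose_to_hom_mat3d (ObjInCamPose, cam_H_obj)
hom_mat3d_compose (base_H_tool, tool_H_cam, base_H_cam)
hom_mat3d_compose (base_H_cam, cam_H_obj, base_H_obj)
hom_mat3d_to_pose (base_H_obj, ObjInBasePose)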

The 3D pose of the object in camera coordinates can stem from different sources:

• With a binocular stereo system, you can determine the 3D pose of unknown objects directly (see section 7 on page 88).

• With only a single camera, you can use pose estimation to determine the 3D pose of known objects (see section 8.4.1 for an example application and section 6 on page 84 for more details on pose estimation).

• With a single camera, you can determine the 3D coordinates of unknown objects if you know that the object points lie in a known plane (see section 8.4.1 for an example application and section 3.2 on page 44 for more details on determining 3D points in a known plane).


Figure 80: (a) Determining the 3D pose for grasping a nut (labels in the figure: C1-C4: corner points; G1 & G2: grasping points; grasping pose; reference coordinate system); (b) robot at grasping pose.

8.4.1 Example Application with a Stationary Camera: Grasping a Nut

This section describes an example application realized with the hand-eye system depicted in figure 78 on page 116. The task is to localize a nut and determine a suitable grasping pose for the robot (see figure 80). The HDevelop example program hdevelop\handeye_stationarycam_grasp_nut.dev performs the machine vision part and transforms the resulting pose into robot coordinates using the calibration data determined with hdevelop\handeye_stationarycam_calibration.dev as described in the previous sections. As you will see, using the calibration data is the shortest part of the program; its main part is devoted to machine vision.

First, the calibration data is read from files; for later computations, the poses are converted into homogeneous transformation matrices:

read_cam_par (DataNameStart+'final_campar.dat', CamParam)
read_pose (DataNameStart+'final_pose_cam_base.dat', PoseCamBase)
pose_to_hom_mat3d (PoseCamBase, cam_H_base)
read_pose (DataNameStart+'final_pose_tool_calplate.dat', PoseToolCalplate)
pose_to_hom_mat3d (PoseToolCalplate, tool_H_calplate)

In the hand-eye system used here, the tool coordinate system used in the calibration process is located at the mounting point of the tool; therefore, an additional coordinate system is needed between the fingers of the gripper (see figure 78a on page 116). Its pose in tool coordinates is also read from file:

read_pose (DataNameStart+'pose_tool_gripper.dat', PoseToolGripper)
pose_to_hom_mat3d (PoseToolGripper, tool_H_gripper)

Now, a reference coordinate system is defined based on one of the calibration images. In this image, the calibration plate has been placed into the plane on top of the nut. This allows us to determine the 3D coordinates of extracted image points on the nut with a single camera and without knowing the dimensions of the nut. The code for defining the reference coordinate system is contained in a procedure, which is listed in appendix B.15 on page 144:



define_reference_coord_system (ImageNameStart+'calib3cm_00', CamParam,
                               CalplateFile, WindowHandle, PoseRef)
pose_to_hom_mat3d (PoseRef, cam_H_ref)

The following code extracts grasping points on two opposite sides of the nut. The nut is found with simple blob analysis; its boundary is converted into XLD contours:

threshold (Image, BrightRegion, 60, 255)
connection (BrightRegion, BrightRegions)
select_shape (BrightRegions, Nut, 'area', 'and', 500, 99999)
fill_up (Nut, NutFilled)
gen_contour_region_xld (NutFilled, NutContours, 'border')

The contours are then processed to find long, parallel straight line segments; their corners are accumulated in tuples:

segment_contours_xld (NutContours, LineSegments, 'lines', 5, 4, 2)
fit_line_contour_xld (LineSegments, 'tukey', -1, 0, 5, 2, RowBegin,
                      ColBegin, RowEnd, ColEnd, Nr, Nc, Dist)
Lines := []
for i := 0 to |RowBegin|-1 by 1
    gen_contour_polygon_xld (Contour, [RowBegin[i],RowEnd[i]],
                             [ColBegin[i],ColEnd[i]])
    Lines := [Lines,Contour]
endfor
gen_polygons_xld (Lines, Polygon, 'ramer', 2)
gen_parallels_xld (Polygon, ParallelLines, 50, 100, rad(10), 'true')
get_parallels_xld (ParallelLines, Row1, Col1, Length1, Phi1, Row2, Col2,
                   Length2, Phi2)
CornersRow := [Row1[0], Row1[1], Row2[0], Row2[1]]
CornersCol := [Col1[0], Col1[1], Col2[0], Col2[1]]

The grasping pose is calculated in 3D coordinates. For this, the 3D coordinates of the corner points in the reference coordinate system are determined using the operator image_points_to_world_plane. The origin of the grasping pose lies in the middle of the corners:

image_points_to_world_plane (CamParam, PoseRef, CornersRow, CornersCol,
                             'm', CornersX_ref, CornersY_ref)
CenterPointX_ref := sum(CornersX_ref)*0.25
CenterPointY_ref := sum(CornersY_ref)*0.25

The grasping pose is oriented almost like the reference coordinate system, only rotated around the z-axis so that the gripper "fingers" are parallel to the sides of the nut. To calculate the rotation angle, first the grasping points in the middle of the sides are determined. Their angle can directly be used as the rotation angle:


GraspPointsX_ref := [(CornersX_ref[0]+CornersX_ref[1])*0.5,
                     (CornersX_ref[2]+CornersX_ref[3])*0.5]
GraspPointsY_ref := [(CornersY_ref[0]+CornersY_ref[1])*0.5,
                     (CornersY_ref[2]+CornersY_ref[3])*0.5]
GraspPhiZ_ref := atan((GraspPointsY_ref[1]-GraspPointsY_ref[0])/
                      (GraspPointsX_ref[1]-GraspPointsX_ref[0]))
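Note that the division is undefined if the two grasping points happen to lie exactly above each other in the reference plane. If this can occur in your setup, the angle can be computed more robustly with atan2; this is only an alternative sketch, not part of the original example program:

* robust alternative using atan2(y, x)
GraspPhiZ_ref := atan2(GraspPointsY_ref[1]-GraspPointsY_ref[0],
                       GraspPointsX_ref[1]-GraspPointsX_ref[0])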

With the origin and rotation angle, the grasping pose is first determined in the reference coordinate system and then transformed into camera coordinates:

hom_mat3d_identity (HomMat3DIdentity)
hom_mat3d_rotate (HomMat3DIdentity, GraspPhiZ_ref, 'z', 0, 0, 0,
                  HomMat3D_RZ_Phi)
hom_mat3d_translate (HomMat3D_RZ_Phi, CenterPointX_ref, CenterPointY_ref,
                     0, ref_H_grasp)
hom_mat3d_compose (cam_H_ref, ref_H_grasp, cam_H_grasp)

Alternatively, the example also shows how to calculate the grasping pose using pose estimation (see section 6 on page 84 for a detailed description). This method can be used when points on the object are known. In the example, we specify the 3D coordinates of the corners of the nut:

NX := [0.009, -0.009, -0.009, 0.009]
NY := [0.009, 0.009, -0.009, -0.009]

The grasping pose is then calculated by simply calling the operator camera_calibration, using the reference coordinate system as the start pose. Before, however, the image coordinates of the corners must be sorted such that the first one lies close to the x-axis of the reference coordinate system. Otherwise, the orientation of the reference coordinate system would differ too much from the grasping pose and the pose estimation would fail.

sort_corner_points (CornersRow, CornersCol, WindowHandle, NRow, NCol)
camera_calibration (NX, NY, NZ, NRow, NCol, CamParam, PoseRef, 'pose',
                    CamParam, PoseCamNut, Errors)

display_3d_coordinate_system (PoseCamGripper, CamParam, 0.01, WindowHandle,
                              'green')

The result of both methods is displayed in figure 80a on page 120.

Now comes the moment to use the results of the hand-eye calibration: The grasping pose is transformed into robot coordinates with the formula shown in equation 51 on page 119:

hom_mat3d_invert (cam_H_base, base_H_cam)
hom_mat3d_compose (base_H_cam, cam_H_grasp, base_H_grasp)

As already mentioned, the tool coordinate system used in the calibration process is placed at the mounting point of the tool, not between the fingers of the gripper. Thus, the pose of the tool in gripper coordinates must be added to the chain of transformations to obtain the pose of the tool in base coordinates:

hom_mat3d_invert (tool_H_gripper, gripper_H_tool)
hom_mat3d_compose (base_H_grasp, gripper_H_tool, base_H_tool)

Finally, the pose is converted into the type used by the robot controller:

hom_mat3d_to_pose (base_H_tool, PoseRobotGrasp)
convert_pose_type (PoseRobotGrasp, 'Rp+T', 'abg', 'point',
                   PoseRobotGrasp_ZYX)

Figure 80b on page 120 shows the robot at the grasping pose.

9 Rectification of Arbitrary Distortions


For many applications like OCR or bar code reading, distorted images must be rectified prior to the extraction of information. The distortions may be caused by the perspective projection and the radial lens distortions, as well as by non-radial lens distortions, a non-flat object surface, or any other reason. In the first two cases, i.e., if the object surface is flat and the camera shows only radial distortions, the rectification can be carried out very precisely as described in section 3.3.1 on page 49. For the remaining cases, a piecewise bilinear rectification can be carried out. In HALCON, this kind of rectification is called grid rectification.

The following example (hdevelop\grid_rectification_ruled_surface.dev) shows how the grid rectification can be used to rectify the image of a cylindrically shaped object (figure 81). In the rectified image (figure 81b), the bar code could be read correctly, which was not possible in the original image (figure 81a).

Figure 81: Cylindrical object: a) Original image; b) rectified image.

The main idea of the grid rectification is that the mapping for the rectification is determined from an image of the object, where the object is covered by a known pattern.

First, this pattern, which is called rectification grid, must be created with the operator create_rectification_grid.


create_rectification_grid (WidthOfGrid, NumSquares,
                           'rectification_grid.ps')

The resulting PostScript file must be printed. An example for such a rectification grid is shown in figure 82a.

Figure 82: a) Example of a rectification grid. b) Cylindrical object wrapped with the rectification grid.

Now, the object must be wrapped with the rectification grid and an image of the wrapped object must be taken (figure 82b).

From this image, the mapping that describes the transformation from the distorted image into the rectified image can be derived. For this, first, the rectification grid must be extracted. Then, the rectification map is derived from the distorted grid. This can be achieved by the following lines of code:

find_rectification_grid (Image, GridRegion, MinContrast, Radius)
reduce_domain (Image, GridRegion, ImageReduced)
saddle_points_sub_pix (ImageReduced, 'facet', SigmaSaddlePoints, Threshold,
                       Row, Col)
connect_grid_points (ImageReduced, ConnectingLines, Row, Col,
                     SigmaConnectGridPoints, MaxDist)
gen_grid_rectification_map (ImageReduced, ConnectingLines, Map, Meshes,
                            GridSpacing, 0, Row, Col)

Using the derived map, any image that shows the same distortions can be rectified such that the parts that were covered by the rectification grid appear undistorted in the rectified image (figure 81b). This mapping is performed by the operator map_image.

map_image (ImageReduced, Map, ImageMapped)

In the following section, the basic principle of the grid rectification is described. Then, some hints for taking images of the rectification grid are given. In section 9.3 on page 127, the use of the involved HALCON operators is described in more detail based on the above example application. Finally, it is described briefly how to use self-defined grids for the generation of rectification maps.


9.1 Basic Principle


The basic principle of the grid rectification is that a mapping from the distorted image into the rectified image is determined from a distorted image of the rectification grid whose undistorted geometry is well known: The black and white fields of the printed rectification grid are squares (figure 83).

Figure 83: Rectification grid.

In the distorted image, the black and white fields do not appear as squares (figure 84a) because of the non-planar object surface, the perspective distortions, and the lens distortions.

To determine the mapping for the rectification of the distorted image, the distorted rectification grid must be extracted. For this, first, the corners of the black and white fields must be extracted with the operator saddle_points_sub_pix (figure 84b). These corners must be connected along the borders of the black and white fields with the operator connect_grid_points (figure 84c). Finally, the connecting lines must be combined into meshes (figure 84d) with the operator gen_grid_rectification_map, which also determines the mapping for the rectification of the distorted image.

If you want to use a self-defined grid, the grid points must be defined by yourself. Then, the operator gen_arbitrary_distortion_map can be used to determine the mapping (see section 9.4 on page 130 for an example).

The mapping is determined such that the distorted rectification grid will be mapped into its original undistorted geometry (figure 85). With this mapping, any image that shows the same distortions can be rectified easily with the operator map_image. Note that within the meshes a bilinear interpolation is carried out. Therefore, it is important to use a rectification grid with an appropriate grid size (see section 9.2 for details).

9.2 Rules for Taking Images of the Rectification Grid

If you want to achieve accurate results, please follow the rules given in this section:


Figure 84: Distorted rectification grid: a) Image; b) extracted corners of the black and white fields; c) lines that connect the corners; d) extracted rectification grid.

Figure 85: Mapping of the distorted rectification grid (a) into the undistorted rectification grid (b).



• The image must not be overexposed or underexposed: otherwise, the extraction of the corners of the black and white fields of the rectification grid may fail.

• The contrast between the bright and the dark fields of the rectification grid should be as high as possible.

• Ensure that the rectification grid is homogeneously illuminated.

• The images should contain as little noise as possible.

• The border length of the black and white fields should be at least 10 pixels.

In addition to these few rules for the taking of the images of the rectification grid, it is very important to use a rectification grid with an appropriate grid size because the mapping is determined such that within the meshes of the rectification grid a bilinear interpolation is applied. Because of this, non-linear distortions within the meshes cannot be eliminated.

The use of a rectification grid that is too coarse (figure 86a), i.e., whose grid size is too large, leads to errors in the rectified image (figure 86b).

Figure 86: Cylindrical object covered with a very coarse rectification grid: a) Distorted image; b) rectified image.

If it is necessary to fold the rectification grid, it should be folded along the borders of the black and white fields. Otherwise, i.e., if the fold crosses these fields (figure 87a), the rectified image (figure 87b) will contain distortions because of the bilinear interpolation within the meshes.

9.3 Machine Vision on Ruled Surfaces

In this section, the rectification of images of ruled surfaces is described in detail. Again, the example of the cylindrically shaped object (hdevelop\grid_rectification_ruled_surface.dev) is used to explain the involved operators.

First, the operator create_rectification_grid is used to create a suitable rectification grid.


Figure 87: Rectification grid folded across the borders of the black and white fields: a) Distorted image; b) rectified image.

create_rectification_grid (WidthOfGrid, NumSquares,
                           'rectification_grid.ps')

The parameter WidthOfGrid defines the effectively usable size of the rectification grid in meters and the parameter NumSquares sets the number of squares (black and white fields) per row. The rectification grid is written to the PostScript file that is specified by the parameter GridFile.
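For illustration only, a call with concrete values might look as follows; the numbers are assumptions and must be chosen such that the printed fields are large enough for your camera setup (see section 9.2):

create_rectification_grid (0.17, 17, 'rectification_grid.ps')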

To determine the mapping, an image of the rectification grid, wrapped around the object, must be taken as described in section 9.2 on page 125. Figure 88a shows an image of a cylindrical object and figure 88b shows the same object wrapped by the rectification grid.

Then, the rectification grid is searched in this image with the operator find_rectification_grid.

find_rectification_grid (Image, GridRegion, MinContrast, Radius)

The operator find_rectification_grid extracts image areas with a contrast of at least MinContrast and fills up the holes in these areas. Note that in this case, contrast is defined as the gray value difference of neighboring pixels in a slightly smoothed copy of the image (Gaussian smoothing with σ = 1.0). Therefore, the value for the parameter MinContrast must be set significantly lower than the gray value difference between the black and white fields of the rectification grid. Small areas of high contrast are then eliminated by an opening with the radius Radius. The resulting region is used to restrict the search space for the following steps with the operator reduce_domain (see figure 89a).
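As an assumed example only: if the dark and bright grid fields differ by roughly 100 gray values, a parameterization along the following lines might be suitable; both values depend strongly on the imaging conditions:

find_rectification_grid (Image, GridRegion, 75, 10)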

reduce_domain (Image, GridRegion, ImageReduced)

The corners of the black and white fields appear as saddle points in the image. They can be extracted with the operator saddle_points_sub_pix (see figure 89b).

saddle_points_sub_pix (ImageReduced, 'facet', SigmaSaddlePoints, Threshold,
                       Row, Col)

Figure 88: Cylindrical object: a) Without and b) with rectification grid.

The parameter Sigma controls the amount of Gaussian smoothing that is carried out before the actual extraction of the saddle points. Which point is accepted as a saddle point is based on the value of the parameter Threshold. If Threshold is set to higher values, fewer but more distinct saddle points are returned than if Threshold is set to lower values. The filter method that is used for the extraction of the saddle points can be selected by the parameter Filter. It can be set to 'facet' or 'gauss'. The method 'facet' is slightly faster. The method 'gauss' is slightly more accurate but tends to be more sensitive to noise.

Figure 89: Distorted rectification grid: a) Image reduced to the extracted area of the rectification grid; b) extracted corners of the black and white fields; c) lines that connect the corners.

To generate a representation of the distorted rectification grid, the extracted saddle points must be connected along the borders of the black and white fields (figure 89c). This is done with the operator connect_grid_points.

connect_grid_points (ImageReduced, ConnectingLines, Row, Col,
                     SigmaConnectGridPoints, MaxDist)

Again, the parameter Sigma controls the amount of Gaussian smoothing that is carried out before the extraction of the borders of the black and white fields. When a tuple of three values [sigma_min, sigma_max, sigma_step] is passed instead of only one value, the operator connect_grid_points tests every sigma within the given range from sigma_min to sigma_max with a step size of sigma_step and chooses the sigma that causes the largest number of connecting lines. The same happens when a tuple of only two values sigma_min and sigma_max is passed; in this case, however, a fixed step size of 0.05 is used. The parameter MaxDist defines the maximum distance with which an edge may be linked to the respectively closest saddle point. This helps to overcome the problem that edge detectors typically return inaccurate results in the proximity of edge junctions. Figure 90 shows the connecting lines if the parameter MaxDist has been selected inappropriately: In figure 90a, MaxDist has been selected too small, whereas in figure 90b, it has been selected too large.
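For instance, instead of a single sigma, a range can be passed so that connect_grid_points selects the best value automatically; the range below is only an assumed example:

* test sigmas from 0.7 to 1.5 with a step size of 0.05
connect_grid_points (ImageReduced, ConnectingLines, Row, Col,
                     [0.7,1.5,0.05], MaxDist)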

Figure 90: Connecting lines: Parameter MaxDist selected a) too small and b) too large.

Then, the rectification map is determined from the distorted grid with the operator gen_grid_rectification_map.

gen_grid_rectification_map (ImageReduced, ConnectingLines, Map, Meshes,
                            GridSpacing, 0, Row, Col)

The parameter GridSpacing defines the size of the grid meshes in the rectified image. Each of the black and white fields is projected onto a square of GridSpacing × GridSpacing pixels. The parameter Rotation controls the orientation of the rectified image. The rectified image can be rotated by 0, 90, 180, or 270 degrees, or it is rotated such that the black circular mark is left of the white circular mark if Rotation is set to 'auto'.
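If the orientation of the grid in the image is not known beforehand, the call can be written with Rotation set to 'auto' as described above; a sketch of this variant:

gen_grid_rectification_map (ImageReduced, ConnectingLines, Map, Meshes,
                            GridSpacing, 'auto', Row, Col)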

Using the derived rectification map, any image that shows the same distortions can be rectified very fast with the operator map_image (see figure 91). Note that the objects must appear at exactly the same position in the distorted images.

map_image (ImageReduced, Map, ImageMapped)

9.4 Using Self-Defined Rectification Grids

Up to now, we have used the predefined rectification grid together with the appropriate operators for its segmentation. In this section, an alternative to this approach is presented. You can arbitrarily define the rectification grid by yourself, but note that in this case you must also carry out the segmentation by yourself.


Figure 91: Rectified images: a) Rectification grid; b) object.

This example shows how the grid rectification can be used to generate arbitrary distortion maps based on self-defined grids.

The example application is a print inspection. It is assumed that some parts are missing and that smudges are present. In addition, lines may be vertically shifted, e.g., due to an inaccurate paper transport, i.e., distortions in the vertical direction of the printed document may be present. These distortions should not result in a rejection of the tested document. Therefore, it is not possible to simply compute the difference image between a reference image and the image that must be checked.

Figure 92a shows the reference image and figure 92b the test image that must be checked.

In a first step, the displacements between the lines in the reference document and the test document are determined. With this, the rectification grid is defined. The resulting rectification map is applied to the reference image to transform it into the geometry of the test image. Now, the difference image of the mapped reference image and the test image can be computed.

The example program (hdevelop\grid_rectification_arbitrary_distortion.dev) uses the component-based matching to determine corresponding points in the reference image and the test image. First, the component model is generated with the operator create_component_model. Then, the corresponding points are searched in the test image with the operator find_component_model.

Based on the corresponding points of the reference and the test image (RowRef, ColRef, RowTest, and ColTest), the coordinates of the grid points of the distorted grid are determined. In this example, the row and column coordinates can be determined independently from each other because only the row coordinates are distorted.

Figure 92: Images of one page of a document: a) Reference image; b) test image that must be checked (marked in the figure: smudges, displaced lines, missing line).

Note that the upper left grid point of the undistorted grid is assumed to have the coordinates (-0.5, -0.5). This means that the corresponding grid point of the distorted grid will be mapped to the point (-0.5, -0.5). Because there are only vertical distortions in this example, the column coordinates of the distorted grid are equidistant, starting at the value -0.5.

GridSpacing := 10
ColShift := mean(ColTest-ColRef)
RefGridColValues := []
for HelpCol := -0.5 to WidthTest+GridSpacing by GridSpacing
    RefGridColValues := [RefGridColValues, HelpCol+ColShift]
endfor

The row coordinates of the distorted grid are determined by a linear interpolation between the above determined pairs of corresponding row coordinates.

MinValue := 0
MaxValue := HeightTest+GridSpacing
sample_corresponding_values (RowTest, RowRef-0.5, MinValue, MaxValue,
                             GridSpacing, RefGridRowValues)

The interpolation is performed within the procedure

procedure sample_corresponding_values (: : Values, CorrespondingValues,
                                        MinValue, MaxValue,
                                        InterpolationInterval:
                                        SampledCorrespondingValues)

which is part of the example program hdevelop\grid_rectification_arbitrary_distortion.dev.
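The following lines are a minimal sketch of such a piecewise linear interpolation, assuming that Values is sorted in strictly ascending order and contains at least two entries; the actual implementation shipped with the example program may differ:

SampledCorrespondingValues := []
for Sample := MinValue to MaxValue by InterpolationInterval
    * select the segment [Values[j], Values[j+1]] that contains the sample;
    * samples outside the covered range are extrapolated with the border segment
    j := 0
    for k := 0 to |Values|-3 by 1
        if (Sample >= Values[k+1])
            j := k+1
        endif
    endfor
    T := real(Sample-Values[j])/(Values[j+1]-Values[j])
    Interpolated := CorrespondingValues[j]+T*(CorrespondingValues[j+1]-CorrespondingValues[j])
    SampledCorrespondingValues := [SampledCorrespondingValues,Interpolated]
endfor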

Now, the distorted grid is generated row by row.

RefGridRow := []
RefGridCol := []
Ones := gen_tuple_const(|RefGridColValues|, 1)
for r := 0 to |RefGridRowValues|-1 by 1
    RefGridRow := [RefGridRow, RefGridRowValues[r]*Ones]
    RefGridCol := [RefGridCol, RefGridColValues]
endfor

The operator gen_arbitrary_distortion_map uses this distorted grid to derive the rectification map that maps the reference image into the geometry of the test image.²

gen_arbitrary_distortion_map (Map, GridSpacing, RefGridRow, RefGridCol,
                              |RefGridColValues|, WidthRef, HeightRef)

With this rectification map, the reference image can be transformed into the geometry of the test image. Note that the size of the mapped image depends on the number of grid cells and on the size of one grid cell, which must be defined by the parameter GridSpacing. Possibly, the size of the mapped reference image must be adapted to the size of the test image.

map_image (ImageRef, Map, ImageMapped)
crop_part (ImageMapped, ImagePart, 0, 0, WidthTest, HeightTest)

Finally, the test image can be subtracted from the mapped reference image.

sub_image (ImagePart, ImageTest, ImageSub, 1, 128)

Figure 93 shows the resulting difference image. In this case, missing parts appear dark while the smudges appear bright.

The differences between the test image and the reference image can now be extracted easily from the difference image with the operator threshold. If the difference image is not needed, e.g., for visualization purposes, the differences can be derived directly from the test image and the reference image with the operator dyn_threshold.
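Both variants might look like the following lines; the gray value thresholds and the offset are assumptions that depend on the print contrast and on the offset of 128 used in sub_image above, and the region names are introduced only for this sketch:

* variant 1: extract dark and bright deviations from the difference image
threshold (ImageSub, DarkDifferences, 0, 100)
threshold (ImageSub, BrightDifferences, 156, 255)
* variant 2: compare the test image and the mapped reference image directly
dyn_threshold (ImageTest, ImagePart, Differences, 30, 'not_equal')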

Figure 94 shows the differences in a cut-out of the reference image (figure 94a) and of the test image (figure 94b). The markers near the left border of figure 94b indicate the vertical position of the components that were used for the determination of the corresponding points. Vertical shifts of the components with respect to the reference image are indicated by a vertical line of the respective length that is attached to the respective marker. All other differences that could be detected between the test image and the reference image are encircled.

² In this case, the reference image is mapped into the geometry of the test image to facilitate the marking of the differences in the test image. Obviously, the rectification grid can also be defined such that the test image is mapped into the geometry of the reference image.


Figure 93: Difference image: a) The entire image overlaid with a rectangle that indicates the position of the cut-out. b) A cut-out.


Figure 94: Cut-out of the reference and the checked test image with the differences marked in the test image: a) Reference image; b) checked test image.


Appendix

A The HALCON Calibration Plate

Figure 95a shows a HALCON calibration plate. Note that it has an asymmetric pattern in the upper left corner. This pattern ensures that the pose of the calibration plate can be determined uniquely.

Old calibration plates do not have this pattern (see figure 95b). This may lead to problems if, e.g., a stereo camera or hand-eye system must be calibrated, because the poses must be determined uniquely for this. To overcome this problem, you can make an asymmetric calibration plate out of your old calibration plate by marking one corner. Pay attention that the asymmetric pattern is not too close to the circular calibration marks because otherwise it could have an influence on the geometric accuracy of the calibration result.

Figure 95: (a) The HALCON calibration plate with the asymmetric pattern in the upper left corner; (b) an old calibration plate that does not have the asymmetric pattern.

There are two different types of calibration plate description files, which typically lie in the subdirectory calib of the folder where you installed HALCON: The standard description files for calibration plates that have the asymmetric pattern and the old description files for calibration plates without the asymmetric pattern. The old description files are indicated with the suffix old (orientation-less description).

The behavior of the operator find_marks_and_pose depends on the combination of the used calibration plate and the specified calibration plate description file:

Calibration plate  Description file  Behavior of find_marks_and_pose
asymmetric         asymmetric        The pose will be determined uniquely.
asymmetric         old               The pose will be determined such that the x-axis points
                                     to the right and the y-axis points downwards.
old                asymmetric        The operator find_marks_and_pose returns an error
                                     because it cannot find the asymmetric pattern.
old                old               The pose will be determined such that the x-axis points
                                     to the right and the y-axis points downwards.



B HDevelop Procedures Used in this Application Note

B.1 gen_hom_mat3d_from_three_points

procedure gen_hom_mat3d_from_three_points (: : Origin, PointOnXAxis,
                                            PointInXYPlane: HomMat3d)
XAxis := [PointOnXAxis[0]-Origin[0],PointOnXAxis[1]-Origin[1],
          PointOnXAxis[2]-Origin[2]]
XAxisNorm := XAxis/sqrt(sum(XAxis*XAxis))
VectorInXYPlane := [PointInXYPlane[0]-Origin[0],
                    PointInXYPlane[1]-Origin[1],
                    PointInXYPlane[2]-Origin[2]]
cross_product (XAxisNorm, VectorInXYPlane, ZAxis)
ZAxisNorm := ZAxis/sqrt(sum(ZAxis*ZAxis))
cross_product (ZAxisNorm, XAxisNorm, YAxisNorm)
HomMat3d_WCS_to_RectCCS := [XAxisNorm[0],YAxisNorm[0],ZAxisNorm[0],
                            Origin[0],XAxisNorm[1],YAxisNorm[1],
                            ZAxisNorm[1],Origin[1],XAxisNorm[2],
                            YAxisNorm[2],ZAxisNorm[2],Origin[2]]
hom_mat3d_invert (HomMat3d_WCS_to_RectCCS, HomMat3d)
return ()
end

This procedure uses the procedure

procedure cross_product (: : V1, V2: CrossProduct)
CrossProduct := [V1[1]*V2[2]-V1[2]*V2[1],V1[2]*V2[0]-V1[0]*V2[2],
                 V1[0]*V2[1]-V1[1]*V2[0]]
return ()
end


B.2 parameters_image_to_world_plane_centered

procedure parameters_image_to_world_plane_centered (: : CamParam, Pose,
                                                     CenterRow, CenterCol,
                                                     WidthMappedImage,
                                                     HeightMappedImage:
                                                     ScaleForCenteredImage,
                                                     PoseForCenteredImage)
* Determine the scale for the mapping
* (here, the scale is determined such that in the
* surroundings of the given point the image scale of the
* mapped image is similar to the image scale of the original image)
Dist_ICS := 1
image_points_to_world_plane (CamParam, Pose, CenterRow, CenterCol, 1,
                             CenterX, CenterY)
image_points_to_world_plane (CamParam, Pose, CenterRow+Dist_ICS,
                             CenterCol, 1, BelowCenterX, BelowCenterY)
image_points_to_world_plane (CamParam, Pose, CenterRow,
                             CenterCol+Dist_ICS, 1, RightOfCenterX,
                             RightOfCenterY)
distance_pp (CenterY, CenterX, BelowCenterY, BelowCenterX,
             Dist_WCS_Vertical)
distance_pp (CenterY, CenterX, RightOfCenterY, RightOfCenterX,
             Dist_WCS_Horizontal)
ScaleVertical := Dist_WCS_Vertical/Dist_ICS
ScaleHorizontal := Dist_WCS_Horizontal/Dist_ICS
ScaleForCenteredImage := (ScaleVertical+ScaleHorizontal)/2.0
* Determine the parameters for set_origin_pose such
* that the point given via get_mbutton will be in the center of the
* mapped image
DX := CenterX-ScaleForCenteredImage*WidthMappedImage/2.0
DY := CenterY-ScaleForCenteredImage*HeightMappedImage/2.0
DZ := 0
set_origin_pose (Pose, DX, DY, DZ, PoseForCenteredImage)
return ()
end


B.3 parameters_image_to_world_plane_entire


procedure parameters_image_to_world_plane_entire (Image: : CamParam, Pose,
                                                   WidthMappedImage,
                                                   HeightMappedImage:
                                                   ScaleForEntireImage,
                                                   PoseForEntireImage)
* Transform the image border into the WCS (scale = 1)
full_domain (Image, ImageFull)
get_domain (ImageFull, Domain)
gen_contour_region_xld (Domain, ImageBorder, 'border')
contour_to_world_plane_xld (ImageBorder, ImageBorderWCS, CamParam,
                            Pose, 1)
smallest_rectangle1_xld (ImageBorderWCS, MinY, MinX, MaxY, MaxX)
* Determine the scale of the mapping
ExtentX := MaxX-MinX
ExtentY := MaxY-MinY
ScaleX := ExtentX/WidthMappedImage
ScaleY := ExtentY/HeightMappedImage
ScaleForEntireImage := max([ScaleX,ScaleY])
* Shift the pose by the minimum X and Y coordinates
set_origin_pose (Pose, MinX, MinY, 0, PoseForEntireImage)
return ()
end


B.4 tilt_correction

procedure tilt_correction (DistanceImage, RegionDefiningReferencePlane:
                           DistanceImageCorrected: : )
* Reduce the given region, which defines the reference plane,
* to the domain of the distance image
get_domain (DistanceImage, Domain)
intersection (RegionDefiningReferencePlane, Domain,
              RegionDefiningReferencePlane)
* Determine the parameters of the reference plane
moments_gray_plane (RegionDefiningReferencePlane, DistanceImage, MRow,
                    MCol, Alpha, Beta, Mean)
* Generate a distance image of the reference plane
get_image_pointer1 (DistanceImage, _, Type, Width, Height)
area_center (RegionDefiningReferencePlane, _, Row, Column)
gen_image_surface_first_order (ReferencePlaneDistance, Type, Alpha,
                               Beta, Mean, Row, Column, Width, Height)
* Subtract the distance image of the reference plane
* from the distance image of the object
sub_image (DistanceImage, ReferencePlaneDistance,
           DistanceImageWithoutTilt, 1, 0)
* Determine the scale factor for the reduction of the distance values
CosGamma := 1.0/sqrt(Alpha*Alpha+Beta*Beta+1)
* Reduce the distance values
scale_image (DistanceImageWithoutTilt, DistanceImageCorrected,
             CosGamma, 0)
return ()
end

B.5 visualize_results_of_find_marks_and_pose

procedure visualize_results_of_find_marks_and_pose (Image: : WindowHandle,
                                                     RCoord, CCoord, Pose,
                                                     CamPar, CalTabFile: )
dev_set_window (WindowHandle)
dev_display (Image)
dev_set_color ('yellow')
gen_cross_contour_xld (Cross, RCoord, CCoord, 6, 0)
dev_display (Cross)
display_calplate_coordinate_system (CalTabFile, Pose, CamPar,
                                    WindowHandle)
return ()
end


B.6 display_calplate_coordinate_system


procedure display_calplate_coordinate_system (: : CalTabFile, Pose, CamPar,
                                               WindowHandle: )
caltab_points (CalTabFile, X, Y, Z)
* arrow should point to farthest marks
ArrowLength := abs(X[0])
display_3d_coordinate_system (Pose, CamPar, ArrowLength, WindowHandle,
                              'blue')
return ()
end

B.7 display_3d_coordinate_system

procedure display_3d_coordinate_system (: : Pose, CamPar, ArrowLength,
                                         WindowHandle, Color: )
pose_to_hom_mat3d (Pose, HomMat3D)
* store coordinates of the arrows in tuples
* sequence: origin, x-axis, y-axis, z-axis
ArrowsXCoords := [0,ArrowLength,0,0]
ArrowsYCoords := [0,0,ArrowLength,0]
ArrowsZCoords := [0,0,0,ArrowLength]
* transform arrow points into camera coordinates
affine_trans_point_3d (HomMat3D, ArrowsXCoords, ArrowsYCoords,
                       ArrowsZCoords, ArrowsXCoords_cam,
                       ArrowsYCoords_cam, ArrowsZCoords_cam)
* get the image coordinates
project_3d_point (ArrowsXCoords_cam, ArrowsYCoords_cam,
                  ArrowsZCoords_cam, CamPar, ArrowsRows, ArrowsCols)
* display the coordinate system
dev_set_color (Color)
gen_contour_polygon_xld (XAxis, [ArrowsRows[0], ArrowsRows[1]],
                         [ArrowsCols[0], ArrowsCols[1]])
dev_display (XAxis)
set_tposition (WindowHandle, ArrowsRows[1], ArrowsCols[1])
write_string (WindowHandle, 'x')
gen_contour_polygon_xld (YAxis, [ArrowsRows[0], ArrowsRows[2]],
                         [ArrowsCols[0], ArrowsCols[2]])
dev_display (YAxis)
set_tposition (WindowHandle, ArrowsRows[2], ArrowsCols[2])
write_string (WindowHandle, 'y')
gen_contour_polygon_xld (ZAxis, [ArrowsRows[0], ArrowsRows[3]],
                         [ArrowsCols[0], ArrowsCols[3]])
dev_display (ZAxis)
set_tposition (WindowHandle, ArrowsRows[3], ArrowsCols[3])
write_string (WindowHandle, 'z')
return ()
end


B.8 select_values_for_ith_image

procedure select_values_for_ith_image (: : NRow, NCol, NX, NY, NZ,
                                        NFinalPose, MRelPoses, i: Rows, Cols,
                                        X, Y, Z, CalplatePose, RobotPose)
Rows := NRow[i*49:i*49+48]
Cols := NCol[i*49:i*49+48]
X := NX[i*49:i*49+48]
Y := NY[i*49:i*49+48]
Z := NZ[i*49:i*49+48]
CalplatePose := NFinalPose[i*7:i*7+6]
RobotPose := MRelPoses[i*7:i*7+6]
return ()
end

B.9 calc_base_start_pose_movingcam

procedure calc_base_start_pose_movingcam (: : CalplatePose, CamStartPose,
                                           RobotPoseInverse: BaseStartPose)
* BaseStartPose = base_H_calplate = base_H_tool * tool_H_cam * cam_H_calplate
pose_to_hom_mat3d (CamStartPose, cam_H_tool)
hom_mat3d_invert (cam_H_tool, tool_H_cam)
pose_to_hom_mat3d (RobotPoseInverse, tool_H_base)
hom_mat3d_invert (tool_H_base, base_H_tool)
pose_to_hom_mat3d (CalplatePose, cam_H_calplate)
hom_mat3d_compose (tool_H_cam, cam_H_calplate, tool_H_calplate)
hom_mat3d_compose (base_H_tool, tool_H_calplate, base_H_calplate)
hom_mat3d_to_pose (base_H_calplate, BaseStartPose)
return ()
end

B.10 calc_cam_start_pose_movingcam

procedure calc_cam_start_pose_movingcam (: : CalplatePose, BaseStartPose,
                                          RobotPoseInverse: CamStartPose)
* CamStartPose = cam_H_tool = cam_H_calplate * calplate_H_base * base_H_tool
pose_to_hom_mat3d (BaseStartPose, base_H_calplate)
hom_mat3d_invert (base_H_calplate, calplate_H_base)
pose_to_hom_mat3d (RobotPoseInverse, tool_H_base)
hom_mat3d_invert (tool_H_base, base_H_tool)
pose_to_hom_mat3d (CalplatePose, cam_H_calplate)
hom_mat3d_compose (cam_H_calplate, calplate_H_base, cam_H_base)
hom_mat3d_compose (cam_H_base, base_H_tool, cam_H_tool)
hom_mat3d_to_pose (cam_H_tool, CamStartPose)
return ()
end


B.11 calc_calplate_pose_movingcam


procedure calc_calplate_pose_movingcam (: : BaseFinalPose, CamFinalPose,
                                         RobotPoseInverse: CalplatePose)
* CalplatePose = cam_H_calplate = cam_H_tool * tool_H_base * base_H_calplate
pose_to_hom_mat3d (BaseFinalPose, base_H_calplate)
pose_to_hom_mat3d (CamFinalPose, cam_H_tool)
pose_to_hom_mat3d (RobotPoseInverse, tool_H_base)
hom_mat3d_compose (cam_H_tool, tool_H_base, cam_H_base)
hom_mat3d_compose (cam_H_base, base_H_calplate, cam_H_calplate)
hom_mat3d_to_pose (cam_H_calplate, CalplatePose)
return ()
end

B.12 calc_base_start_pose_stationarycam

procedure calc_base_start_pose_movingcam (: : CalplatePose, CamStartPose,
                                           RobotPoseInverse: BaseStartPose)
* BaseStartPose = base_H_calplate = base_H_tool * tool_H_cam * cam_H_calplate
pose_to_hom_mat3d (CamStartPose, cam_H_tool)
hom_mat3d_invert (cam_H_tool, tool_H_cam)
pose_to_hom_mat3d (RobotPoseInverse, tool_H_base)
hom_mat3d_invert (tool_H_base, base_H_tool)
pose_to_hom_mat3d (CalplatePose, cam_H_calplate)
hom_mat3d_compose (tool_H_cam, cam_H_calplate, tool_H_calplate)
hom_mat3d_compose (base_H_tool, tool_H_calplate, base_H_calplate)
hom_mat3d_to_pose (base_H_calplate, BaseStartPose)
return ()
end

B.13 calc_cam_start_pose_stationarycam

procedure calc_cam_start_pose_movingcam (: : CalplatePose, BaseStartPose,
                                          RobotPoseInverse: CamStartPose)
* CamStartPose = cam_H_tool = cam_H_calplate * calplate_H_base * base_H_tool
pose_to_hom_mat3d (BaseStartPose, base_H_calplate)
hom_mat3d_invert (base_H_calplate, calplate_H_base)
pose_to_hom_mat3d (RobotPoseInverse, tool_H_base)
hom_mat3d_invert (tool_H_base, base_H_tool)
pose_to_hom_mat3d (CalplatePose, cam_H_calplate)
hom_mat3d_compose (cam_H_calplate, calplate_H_base, cam_H_base)
hom_mat3d_compose (cam_H_base, base_H_tool, cam_H_tool)
hom_mat3d_to_pose (cam_H_tool, CamStartPose)
return ()
end


B.14 calc_calplate_pose_stationarycam

procedure calc_calplate_pose_movingcam (: : BaseFinalPose, CamFinalPose,
                                         RobotPoseInverse: CalplatePose)
* CalplatePose = cam_H_calplate = cam_H_tool * tool_H_base * base_H_calplate
pose_to_hom_mat3d (BaseFinalPose, base_H_calplate)
pose_to_hom_mat3d (CamFinalPose, cam_H_tool)
pose_to_hom_mat3d (RobotPoseInverse, tool_H_base)
hom_mat3d_compose (cam_H_tool, tool_H_base, cam_H_base)
hom_mat3d_compose (cam_H_base, base_H_calplate, cam_H_calplate)
hom_mat3d_to_pose (cam_H_calplate, CalplatePose)
return ()
end

B.15 define_reference_coord_system

procedure define_reference_coord_system (: : ImageName, CamParam,
                                          CalplateFile, WindowHandle:
                                          PoseCamRef)
read_image (RefImage, ImageName)
dev_display (RefImage)
caltab_points (CalplateFile, X, Y, Z)
* parameter settings for find_caltab and find_marks_and_pose
SizeGauss := 3
MarkThresh := 100
MinDiamMarks := 5
StartThresh := 128
DeltaThresh := 3
MinThresh := 18
Alpha := 0.5
MinContLength := 15
MaxDiamMarks := 100
find_caltab (RefImage, Caltab, CalplateFile, SizeGauss, MarkThresh,
             MinDiamMarks)
find_marks_and_pose (RefImage, Caltab, CalplateFile, CamParam,
                     StartThresh, DeltaThresh, MinThresh, Alpha,
                     MinContLength, MaxDiamMarks, RCoord, CCoord,
                     StartPose)
camera_calibration (X, Y, Z, RCoord, CCoord, CamParam, StartPose,
                    'pose', CamParam, PoseCamRef, Errors)
display_3d_coordinate_system (PoseCamRef, CamParam, 0.01, WindowHandle,
                              'cyan')
return ()
end
