
CSC Visual Servoing Toolbox for Matlab (CSC-VS)
Libraries for simulation of visual-guided manipulators

Dr. Marco Antonio Pérez-Cisneros
Control Systems Centre
Institute of Science and Technology
University of Manchester, UK


Contents

1 CSC Visual Servoing Toolbox for Matlab  5
  1.1 Basic concepts in visual servoing (VS)  5
  1.2 Position-based visual servoing (PBVS)  7
      1.2.1 Point-based PBVS  9
      1.2.2 Pose-based PBVS  11
      1.2.3 Discussion  14
  1.3 Image-based visual servoing (IBVS)  15
      1.3.1 A more advanced IBVS scheme  20
      1.3.2 Modelling an IBVS experiment  23
  1.4 Visual servoing control and stability  31
  1.5 Summary  36




Chapter 1

CSC Visual Servoing Toolbox for Matlab

This reference manual develops the basic idea of controlling a robot using the image provided by a camera. Although many authors argue that this concept emerged intuitively from our human nature, it is clear that not only humans but many living beings perceive their environment through a great deal of visual information. The advantages are evident: visual sensing facilitates adaptation and other intelligent behaviour, which have eventually evolved into the well-developed systems to which humans can attribute their success.

The reference manual starts by clarifying the visual servoing concept in an introductory section. Then a short and concise summary of visual servoing concepts is presented. In the same section, some guidelines are given and a practical VS taxonomy is proposed to classify several contributions found in the literature. Some of the most relevant visual servoing architectures are developed in the following section. Special attention is devoted to those characteristics which distinguish each visual servoing school in practice. Simulation examples are presented using the CSC VS Toolbox created by the author to clarify the use of VS schemes.

1.1 Basic concepts in visual servoing (VS)

The use of robotic systems has remarkably contributed to increasing the speed and precision of automated tasks, mainly in industrial manufacturing processes. Generally, however, such robot systems require a detailed description of the workspace and the manipulated objects. This is not an issue when the required task employs only fully characterized objects within a completely known environment, as is the case with most industrial assembly lines. However, it has been widely discussed that there exists an inherent lack of sensory capabilities in modern robotic systems which makes them unable to cope with new challenges such as an unknown or changing workspace, undefined locations, calibration errors and so on.


In response to this challenge, visual servoing was born. It is said that the versatility of a given robot system can be greatly improved by using artificial vision techniques. As previously discussed, VS emerges naturally from our own human experience and from observing other living beings which are able to execute complicated tasks thanks to their sometimes primitive visual systems. Remember that through our sense of vision, humans are able to measure the environment and gather relevant data which, together with other reasoning processes, contributes to the ability to perform complicated tasks.

So far, when a given robot is required to execute different sorts of tasks, considerable human and economic effort must be spent on robot reprogramming, relocation, redesign of the workspace, re-characterization of the objects and so on, which evidently increases labour and production costs. It is hoped that, by equipping the robot with higher abilities to interact directly with the environment, such as visual analysis capabilities, more complex tasks could be performed effectively by the robot, reducing redesign times and saving money. Of course this also results in a more versatile robotic system with adaptation capacities and sometimes, as shown in this reference manual, learning abilities.

As discussed in the introduction, it is precisely on this idea that the whole research was born, because visual servoing is now a mature subject which currently hosts many different research lines and, in the author's view, the use of learning schemes still has much to offer to improve current and future developments in visual servoing schemes.

The more intuitive scheme of visual servoing can be described as follows: by means of an artificial vision algorithm, the object of interest can be detected and marked in the scene. Then, by using background computation, it is possible to calculate the movements required to drive the robotic system to the required location. If such a procedure is repeated for a sequence of required positions of the robot's end-effector with respect to the object of interest, then a sequence of required movements for the end-effector to reach the desired object can be calculated. This primitive scheme was initially known as "look-then-move". It embodies the first attempts to visually guide a robot. Later, as discussed in the next section, the inclusion of a feedback loop was proposed to increase the overall system's accuracy, but at a price: it also brings the drawbacks of classic feedback schemes, such as higher noise sensitivity and instability risks.

Although the general term "visual feedback" was used at the beginning of visually guided models, the more descriptive term "visual servoing" was later used instead. As discussed in the history review of VS in the next section, in the early stages highly specialized hardware and software were required to create a visual servoing system. Currently, after the impressive development in computation and graphic processing speed of the average personal computer, it is possible to design a system capable of driving a robotic arm within real-time constraints.


In many published reviews of visual servoing, such as [1] and [2], it is discussed that visual servoing was boosted by the arrival of more powerful desktop computing systems, which ease the design of VS systems beyond old constraints such as heavy scene analysis computation, slow feature extraction and slow calculation of the robot's link demands at a rate sufficient for servoing the manipulator.

This discipline requires the direct interaction of image processing, computer vision algorithms, real-time control, robot modelling, linear and non-linear control theory and digital signal processing, among others. The subject is therefore not an easy topic: it requires at least an average knowledge of each of these disciplines. Also, a core of real-time software for programming visual interfaces and real-time robot control is no less than an elemental tool in visual servoing.

Given the considerable interaction of several disciplines in VS, it has many common frontiers with other subjects with which it may seem to share task objectives, such as dynamic vision [3, 4] and active vision [5, 6, 7, 8]. However, notice that visual servoing aims to control a robot through artificial vision so as to manipulate the environment, in contrast with these other disciplines, in which the environment is rather observed, passively or actively, to extract specific information.

Now that the word control has emerged, it is fair to say that normally the nature of the regulation algorithms used in VS falls within the class of non-linear systems, given that they relate a 3D context to a multi-degree-of-freedom robotic system. The control subject has been addressed by many authors, especially by Corke in [9] and Chaumette in [10]. This subject is discussed later in this reference manual.

Last but not least, consider also the relevance of all the image processing algorithms acting in the VS system, such as object identification, feature marking, tracking and, often, reconstruction through artificial vision.

Let's now review some basic concepts which participate in a VS scheme. The aim of visual servoing is the use of visual information to control the pose (position and orientation) of the final robot actuator, or end-effector, with respect to a set of image features from the object of interest or the object's location itself. Thus it is a priority now to discuss how a required robot task can be defined to suit the analysis of the visually guided task.

1.2 Position-based visual servoing (PBVS)

Most robot manipulators include a Cartesian interface, given that 3D space can be understood intuitively and facilitates the design of robotic applications. Initially, visual servoing systems were designed in such a way that the object's position was estimated from the image with respect to the camera in Cartesian coordinates. This approach was called Position-based Visual Servoing (PBVS). This approach was also presented as a dynamic look-and-move system.


Figure 1.1 shows an illustrative representation of a PBVS system as presented by Weiss.

Figure 1.1: Position-based Visual Servoing block representation as introduced by Weiss (blocks: control law, inverse kinematics, joint controllers, power amplifiers, pose determination and feature extraction).

In PBVS, the geometric model of the target object is used in conjunction with visual features extracted from the image to estimate the pose with respect to the camera frame, and the control law is computed by reducing the error in pose space. In this way, the estimation problem involved in computing the object location can be studied separately from the problem of calculating the feedback signal required by the control algorithm.

Corke et al. [2] introduced this subject starting from a positioning task specified by a kinematic error function ε(x_e), with x_e = ^0T_e being the end-effector's pose and ε : τ → R^m. The task is fulfilled when the error is zero, i.e. ε(x_e) = 0. Considering a given pose x_e for which the task has been done, d represents the number of degrees of freedom which have been constrained by the task, such that d ≤ m, with m being the total number of DOF in the robot. As discussed by Espiau et al. in [11, 12], this constraint can thus be used to represent the kinematic error function through a virtual kinematic constraint between the end-effector and the target. This is further commented on later, when each servoing motion is presented. A point-to-point positioning task can therefore be simply defined as

  \varepsilon(x) = x_{et} - x_e(t)        (1.1)

with x_et representing the Cartesian target location and x_e(t) the trajectory. The simplest control law can be enforced by ensuring an exponentially decreasing error, given by the differential equation

  \dot{x}_e(t) = \kappa \cdot \varepsilon(x_e(t)) = \kappa \cdot (x_{et} - x_e(t))        (1.2)

Once the task is properly defined, visual data are applied to the regulator to reduce the kinematic error function to zero. At the controller output, the vector screw u_m ∈ R^m is generated and sent to the link controller. Recall that m in this case represents the number of degrees of freedom constrained by the task function.
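As a minimal illustration of Equations 1.1 and 1.2, the following Matlab sketch integrates the proportional law with a simple Euler step. The variable names and numerical values (x_target, x_e, kappa) are illustrative assumptions, not data taken from the toolbox.

% Sketch: exponential decay of the Cartesian positioning error (Eqs. 1.1-1.2).
% Assumed illustrative values; not taken from the CSC-VS toolbox itself.
x_target = [0.5; 0.2; 0.3];      % desired Cartesian location x_et (m)
x_e      = [0.0; 0.0; 0.0];      % current end-effector position x_e(t) (m)
kappa    = 2.0;                  % proportional gain
dt       = 0.01;                 % integration step (s)

for k = 1:500
    err  = x_target - x_e;       % kinematic error, Eq. 1.1
    xdot = kappa * err;          % control law, Eq. 1.2
    x_e  = x_e + xdot * dt;      % Euler integration of the end-effector motion
end
% After the loop, norm(x_target - x_e) has decayed exponentially towards zero.

Under this law the error decays as exp(-κt), which is the exponential convergence mentioned above.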


A more formal presentation of PBVS follows. It covers two classes of PBVS schemes: point-based (sometimes also called feature-based) and pose-based PBVS systems. In point-based PBVS, the visual sensor maps some of the object's features into a Cartesian positional vector, whereas pose-based PBVS methods use an estimation algorithm to define the object's pose with respect to the camera, passing this estimation directly to the control algorithm, which operates in pose space. Let's discuss each method further.

1.2.1 Point-based PBVS

Just for purposes of clarity, the discussion regarding PBVS begins by presenting one of the simplest schemes, whose task is positioning the robot's end-effector at a given target location. Although these systems are not developed further in this reference manual, they help us to understand some concepts and to introduce the notation.

The manipulator is sensed by registering one marker located on its end-effector. One camera is fixed in a location where occlusion is not likely to occur¹. This setup is shown in Figure 1.2, where the main feature ^eP is expressed relative to the base frame by ^0T_e · ^eP. Similarly, the camera location is defined with respect to the base frame as ^0T_c.

Figure 1.2: Simple example of an Eye-to-Hand PBVS with a fixed camera.

Recalling the VS taxonomy presented before, this is an ECL Eye-to-Hand position-based VS system, i.e. with a fixed camera. The kinematic error function may be simply defined in base coordinates as

  E_{pp}({}^{0}T_e;\; S, {}^{e}P) = {}^{0}T_e \cdot {}^{e}P - S        (1.3)

which is equivalent to expression 1.1 but from a different perspective. In this notation, the first value before the semicolon represents the variable to control. In PBVS this variable is always the end-effector position T_e. The other values in the function body represent other parameters. E_pp refers to a simple point-to-point positioning task restricted to the Cartesian task space τ ∈ R³. As introduced by Espiau et al. in [12], the task can be seen as equivalent to a rigid link that fully constrains the end-effector's pose with respect to the target.

¹ This condition is assumed; Figure 1.2 is just for illustration purposes.


Basically, the system obtains an estimate of the target location ^cŜ relative to the camera. This presupposes a previous calibration procedure to define the camera pose in base coordinates, ^0T̂_c, yielding

  {}^{0}\hat{S} = {}^{0}\hat{T}_c \cdot {}^{c}\hat{S}        (1.4)

where ^0Ŝ now represents the target Cartesian point in base coordinates. Thus, the control law for this problem can be formulated following Equation 1.3 as

  u_3 = -k\,E_{pp}({}^{0}\hat{T}_e;\; {}^{0}\hat{T}_c \cdot {}^{c}\hat{S},\; {}^{e}P) = -k\,({}^{0}\hat{T}_e \cdot {}^{e}P - {}^{0}\hat{T}_c \cdot {}^{c}\hat{S})        (1.5)

This is a proportional law taking the error function to zero, with k being the proportional feedback gain. Notice as well that, although ^0T_e is the kinematic transformation among the robot's links, it has been marked as an estimated value because of the chance of it exhibiting errors.

The control law in Equation 1.5 is marked as u_3 just to make clear that the resulting vector is a 3-component Cartesian screw, which is different from the 6-element screw vector in a fully characterized robot visual task belonging to τ ∈ SE(3).

Using the same notation, let's analyze the Eye-in-Hand problem, which belongs to the class of VS systems of interest in this reference manual. This new setup is presented graphically in Figure 1.3.

Figure 1.3: Simple example of an Eye-in-Hand PBVS with the camera attached to the end-effector.

In this case, the kinematic error function ^eE_pp can be expressed as

  {}^{e}E_{pp}({}^{0}T_e;\; {}^{0}S, {}^{e}P) = {}^{e}P - ({}^{0}T_e)^{-1} \cdot {}^{0}S = {}^{e}P - {}^{e}T_0 \cdot {}^{0}S        (1.6)

Again S is computed as ^cŜ = ^cT̂_t · ^tS from the camera estimation of the target point. This can be combined with other transformations coming from the robot kinematics and the camera calibration to generate

  {}^{0}\hat{S} = {}^{0}\hat{T}_e \cdot {}^{e}\hat{T}_c \cdot {}^{c}\hat{S}        (1.7)
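A minimal Matlab sketch of the fixed-camera point-positioning law of Equation 1.5 is given below, using 4x4 homogeneous transforms. The transforms, marker, target point and gain are illustrative placeholders rather than calibration data.

% Sketch of the Eye-to-Hand point-to-point PBVS law (Eq. 1.5).
% All transforms below are assumed illustrative values, not calibration results.
T0e_hat = [eye(3) [0.3; 0.1; 0.4]; 0 0 0 1];   % estimated base-to-end-effector transform ^0T_e
T0c_hat = [eye(3) [1.0; 0.0; 0.8]; 0 0 0 1];   % estimated base-to-camera transform ^0T_c
eP      = [0; 0; 0.05; 1];                     % marker on the end-effector, homogeneous ^eP
cS_hat  = [0.2; -0.1; 1.5; 1];                 % target point estimated in the camera frame ^cS
k       = 0.5;                                 % proportional gain

P_base = T0e_hat * eP;                         % marker expressed in base coordinates
S_base = T0c_hat * cS_hat;                     % target expressed in base coordinates (Eq. 1.4)
u3 = -k * (P_base(1:3) - S_base(1:3));         % 3-component Cartesian screw (Eq. 1.5)

The Eye-in-Hand counterpart derived next performs the same computation in the end-effector frame, which removes ^0T̂_e from the law.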


Recalling the error function from Equation 1.6, now computed with respect to the end-effector to account for the Eye-in-Hand setup, the control law emerges as

  u_3 = -k\,{}^{e}E_{pp}({}^{0}T_e;\; {}^{0}\hat{T}_e \cdot {}^{e}\hat{T}_c \cdot {}^{c}\hat{S},\; {}^{e}P) = -k\,({}^{e}P - {}^{e}\hat{T}_0 \cdot {}^{0}\hat{T}_e \cdot {}^{e}\hat{T}_c \cdot {}^{c}\hat{S})
      = -k\,{}^{e}E_{pp}({}^{0}T_e;\; {}^{e}\hat{T}_c \cdot {}^{c}\hat{S},\; {}^{e}P) = -k\,({}^{e}P - {}^{e}\hat{T}_c \cdot {}^{c}\hat{S})        (1.8)

which is of great importance because ^0T̂_e has dropped out of the control law, making the expression simpler and, moreover, showing that the positioning accuracy does not depend on the robot's kinematics [2]. However, notice that this is just a point-to-point positioning method. More complex VS motions such as point-to-line can be computed in a similar fashion, but they are not included here because they are outside our scope. A simple and readable exposition of this matter can be found in the well-known Corke tutorial [1].

1.2.2 Pose-based PBVS

So far, the positioning task has been defined using observable point features. When the visual servoing task involves a known object, it is possible to recover the pose of the object directly, driving the visual task in pose space.

Take a second look at Figure 1.3. Again, suppose that ^tS is a point defined in the object coordinate frame, and we know that through the Eye-in-Hand camera calibration it is possible to compute the transformation ^cT̂_t between the object frame and the camera's. The object pose can thus be expressed relative to the camera frame as ^cŜ = ^cT̂_t · ^tS. The desired end-effector pose relative to the object frame is ^tT_e*, instead of the target point used in the point-based PBVS introduced in the last section.

If, for a moment, the VS scheme shown in Figure 1.3 is supposed to be a "fixed-camera" system (such as the Eye-to-Hand setup presented in the last section, but relative to the end-effector), then the error function can be easily expressed as in Equation 1.8, considering the object's pose [1], as

  E_{rp}({}^{0}\hat{T}_e;\; {}^{t}T_{e^*}, {}^{0}\hat{T}_t) = {}^{e}T_{e^*} = ({}^{0}\hat{T}_e)^{-1} \cdot {}^{0}\hat{T}_t \cdot {}^{t}T_{e^*} = {}^{e}\hat{T}_0 \cdot {}^{0}\hat{T}_t \cdot {}^{t}T_{e^*}        (1.9)

Notice that the error function is expressed in pose space as the transformation ^eT_e* from the desired pose e* to the current one e. Given the fixed camera location ^0T_c, which is defined relative to the base frame and estimated as a result of the calibration procedure, ^0T̂_t is easily obtained from ^0T̂_t = ^0T_c · ^cŜ.

This last expression is the term which changes when we return to a non-fixed camera configuration, with the camera mounted on the robot's last actuator, i.e. an Eye-in-Hand setup. ^0T̂_t is now expressed in such a way as to consider the fact that the camera is moving and therefore the object pose is changing with respect to the camera frame.


Do not forget that the camera is firmly attached to the last actuator, with the transform ^eT_c being a constant matrix. Let's deduce the new error expression for the Eye-in-Hand setup from the fixed-camera scheme given in Equation 1.9. Adapting for the camera mounted on the end-effector and substituting ^0T̂_t by ^0T_e · ^eT_c · ^cT̂_t, it yields

  E_{rp}({}^{t}T_e;\; {}^{0}T_e, {}^{e}T_c, {}^{c}\hat{T}_t, {}^{t}T_{e^*}) = {}^{e}T_0 \cdot ({}^{0}T_e \cdot {}^{e}T_c \cdot {}^{c}\hat{T}_t) \cdot {}^{t}T_{e^*} = {}^{e}T_c \cdot {}^{c}\hat{T}_t \cdot {}^{t}T_{e^*}        (1.10)

Notice again that the error function E_rp (relative to pose) does not include the robot kinematic transformation ^0T_e. This accounts for the independence of the servoing task with respect to errors in the robot kinematic model.

This class of PBVS is sometimes known as 3D Position-based Visual Servoing because the model of the object is used to estimate the pose with respect to the camera frame, ^cT̂_t, which is later used in the algorithm. Notice as well that ^eT_c is a constant matrix derived from the camera calibration and ^tT_e* is the target relationship between the object and the robot's end-effector, i.e. the desired robot pose.

One key component of pose-based PBVS is therefore the estimation algorithm used to compute the relationship ^cT̂_t between the object frame and the camera's. A common and practical choice in many PBVS systems is the Dementhon algorithm for estimating the pose of an object. There are two versions of the algorithm which are commonly used in VS. This is discussed in chapter ??.

Let's present a simulated example of pose-based PBVS using some libraries from the visual servoing toolbox for Matlab written for this reference manual. The example and the camera representation are taken from the elegant work of Cervera presented in [13], but developed with our own simulations. Figure 1.4 shows the Simulink model of this example.

Figure 1.4: Schematic model in Simulink of a Pose-based, Position-based Visual Servoing task using Dementhon's POSIT algorithm (blocks include the image experimental model, camera model visualization, Dementhon pose estimation for the current and desired views, error computation, screw vector computation and the camera kinematics).
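Before walking through the simulation, a minimal Matlab sketch of the pose-based error of Equation 1.10 is shown below: it composes the error from homogeneous transforms and extracts a translation error and a rotation angle from it. The numerical transforms are illustrative assumptions, not values used by the Simulink model.

% Sketch of the pose-based error of Eq. 1.10 (Eye-in-Hand camera).
% Transforms below are assumed placeholders for illustration only.
a  = 0.3;                                       % assumed rotation of the object about z (rad)
Rz = [cos(a) -sin(a) 0; sin(a) cos(a) 0; 0 0 1];
eTc      = [eye(3) [0; 0; 0.05]; 0 0 0 1];      % constant hand-eye transform ^eT_c
cTt_hat  = [Rz [0.1; -0.05; 0.9]; 0 0 0 1];     % estimated object pose ^cT_t (e.g. from POSIT)
tTe_star = [eye(3) [0; 0; -0.5]; 0 0 0 1];      % desired end-effector pose w.r.t. the object ^tT_e*

E = eTc * cTt_hat * tTe_star;                   % pose error ^eT_e*, Eq. 1.10
t_err   = E(1:3, 4);                            % translational part of the error (m)
ang_err = acos((trace(E(1:3, 1:3)) - 1) / 2);   % rotation angle of the error (rad)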


Figure 1.5: Trajectory of the camera in the workspace, including the start and end locations as well as two intermediate positions (all axes in meters).

Basically, as shown in Figure 1.5, the servoing task aims to take an idealized camera of 512 × 512 pixels from a Cartesian position (in meters) p_0 = [2.15, -2.090] to a new location of interest p_f = [0.93, -0.870]. The attention is focused on one square object of 25 centimeters centered at the origin of the base frame and resting vertically on the x-z plane. The first two graphics in Figure 1.6 show the initial and final camera images corresponding to the first and last camera locations. The right-hand image shows the trajectory of the features in the image space. Observe that the orientation of the camera changes as the object is approached.

Figure 1.6: Left and central images: initial and final location of the features in the image. Right: trajectory of the features in the image (all axes in pixels).

The camera motion is simulated as a simple linear motion excited by the 6-element screw vector, with three translational and three rotational components, as the one in Equation ?? in chapter ??.


The controller is a simple proportional regulator with K_p = 0.125. The pose estimation algorithm used is Dementhon's POSIT (Pose from Orthography and Scaling with Iterations), reviewed in chapter ??. The threshold for the pose estimation is ζ = 0.001.

Analyzing the simulation results, it is easy to see that the motion is performed slowly, over 30 seconds, resulting in a nearly straight trajectory with a slight rotation about the z-axis of about -34.3 degrees. Notice in Figure 1.8 that the Cartesian trajectory results from the translational component z of the screw vector, whose velocity starts at 0.25 meters per second. Recall that this translation is applied only along the z-axis of the camera frame, which explains why the resulting trajectory is a straight line following the z-axis. The same occurs with the rotational components, which only rotate around the z-axis. This fact can be better observed in the upper graph of Figure 1.7, which presents a different perspective of the trajectory in Cartesian space. In the lower graph, the vector field of the camera frame is presented. Notice that the trajectory follows the direction of the z-axis. Its length actually reduces as the motion reaches the desired pose ^tT_e*, because the positional components of the screw vector go to zero. Moreover, the vector which initially points upwards in the lower graph corresponds to the angular velocity component around the z-axis, which explains why its length also reduces as the task is achieved.

1.2.3 Discussion

The fact that the whole task description can be formulated in terms of Cartesian coordinates is definitely the main advantage of PBVS schemes. It allows, for instance, the tracking of moving objects with the Cartesian motion predicted and estimated using Kalman filters, as reported in [14]. Also, the construction of trajectory interpolators based on PBVS is not difficult.

Notice, however, that the system performance depends directly on the reliability of the camera calibration and the pose estimation algorithm. The level of interaction changes from one application to another. For instance, if moderate accuracy is required, standard calibration techniques might provide sufficiently accurate performance. Conversely, calibration sensitivity is an important issue in systems with high accuracy requirements and moving cameras.

Many reported PBVS applications use different imaging models. The simple pin-hole camera model is the most popular, providing moderate accuracy, whereas other reported works use different models such as the affine camera model in [15].

As discussed by Corke in [1], pose-based PBVS seems to be the most generic approach despite the fact that a model of the object geometry is required to compute the pose. The time required to estimate the pose is usually considered the main disadvantage. However, this inconvenience has been considerably reduced by new, faster processing elements.


Figure 1.7: Upper graph: camera trajectory in PBVS seen from a different perspective. Lower graph: vector-field representation of the screw applied to the camera motion. All axes expressed in meters.

1.3 Image-based visual servoing (IBVS)

This section discusses the principles of Image-based Visual Servoing (IBVS) and aims to introduce several VS foundations, presenting at the same time some of the contributions of this reference manual, such as the simulation library for an IBVS scheme for the TQ MA2000 manipulator. These libraries are designed to interface naturally with the Robotics Toolbox for Matlab developed by Corke [16] and introduced in Chapter ??.

This VS class employs image feature parameters to calculate the error. No pose estimation algorithms are required because the servoing is performed directly in image space. A visual servoing task can be represented by an image error function e : F → R^m with m ≤ l, where l is the dimension of the feature space. The value of such an error function should be zero when the task has been completed. In the case that the object of interest is moving, the error function represents a tracking task.

Usually the desired feature vector is defined in image space either by using a camera geometry model, such as those introduced in chapter ??, or by the technique of "teaching by showing", which basically takes the manipulator to the desired location and registers the features of the object of interest.

IBVS has been used successfully in both fixed-camera and Eye-in-Hand setups, as well as in EOL and ECL schemes (see section ??). Although our discussion focuses mainly on Eye-in-Hand VS, some remarks regarding fixed-camera systems are also included.


Figure 1.8: Screw vector in Position-based Visual Servoing (upper graph: translational components v_x, v_y, v_z in m/s; lower graph: rotational components w_x, w_y, w_z in rad/s, both over 30 seconds).

Figure 1.9: Image-based Visual Servoing block representation as per Weiss (blocks: control law with the visual Jacobian, joint controllers, power amplifiers and feature extraction).

Figure 1.9 shows a very illustrative representation of IBVS as given by Weiss in [17]. Inside the control law block which drives the set of joint controllers resides the visual Jacobian. Similar to the kinematic Jacobian introduced in chapter ??, the image Jacobian relates differential changes in the image features to differential changes in the robot position. If r represents the end-effector's pose in a given representation of the task space F, then ṙ represents its velocity expressed by a screw vector. Likewise, let f represent a vector of feature parameters and ḟ their velocities. In a similar fashion to Equation ??, it is possible to write such a relationship as


  \dot{f} = J_v \cdot \dot{r}        (1.11)

with J_v ∈ R^{k×m} being the image Jacobian. Here k corresponds to the size of the feature parameter vector and m is the dimension of the task space F. Expression 1.12 shows a more complete representation of the visual Jacobian as presented in [1], which can be understood as the linear transformation from the tangent space of T at r to the tangent space of F at f:

  J_v = \left[ \frac{\partial v}{\partial r} \right] =
  \begin{bmatrix}
    \frac{\partial v_1(r)}{\partial r_1} & \cdots & \frac{\partial v_1(r)}{\partial r_m} \\
    \vdots & \ddots & \vdots \\
    \frac{\partial v_k(r)}{\partial r_1} & \cdots & \frac{\partial v_k(r)}{\partial r_m}
  \end{bmatrix}        (1.12)

Notice that the number of columns in the Jacobian can vary depending on the task. In the case of our experimental setup, m = 6 because the task is constrained by the number of DOF of the TQ MA2000 robot.

Let's follow the illustrative calculation of a simple point-based visual Jacobian presented in [2] and [1]. Basically, the velocity of an object point with respect to the camera can be computed using the differential motion as follows:

  {}^{c}\dot{p} = \omega(t) \times {}^{c}p + T        (1.13)

with ω(t) = [ω_x, ω_y, ω_z] and ^cp = [x, y, z]^T. If the equations of the pin-hole camera model are considered, it is possible to write ^cṗ in terms of the feature parameters u, v as follows:

  \dot{x} = z\,\omega_y - \frac{vz}{\lambda}\,\omega_z + T_x
  \dot{y} = \frac{uz}{\lambda}\,\omega_z - z\,\omega_x + T_y        (1.14)
  \dot{z} = \frac{z}{\lambda}(v\,\omega_x - u\,\omega_y) + T_z

Considering the vector of feature parameters F = [u, v], using the perspective projection in Equations ?? and ?? and solving for each derivative u̇ and v̇, it is possible to gather both expressions by isolating the components of the screw vector T and ω to form the matrix arrangement

  \begin{bmatrix} \dot{u} \\ \dot{v} \end{bmatrix} =
  \begin{bmatrix}
    \frac{\lambda}{z} & 0 & -\frac{u}{z} & -\frac{uv}{\lambda} & \frac{\lambda^2 + u^2}{\lambda} & -v \\
    0 & \frac{\lambda}{z} & -\frac{v}{z} & -\frac{\lambda^2 + v^2}{\lambda} & \frac{uv}{\lambda} & u
  \end{bmatrix}
  \begin{bmatrix} T_x & T_y & T_z & \omega_x & \omega_y & \omega_z \end{bmatrix}^T        (1.15)

The visual Jacobian links the changes in feature parameters with the changes in the manipulator pose. As discussed in [18], the complete Jacobian matrix is formed by stacking expression 1.15 as many times as the number of features registered in the image. This is to say that if our task is constrained by the manipulator's 6 DOF, then only three features are required to build a square 6x6 visual Jacobian matrix, because the stacking results in a matrix with six rows.
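The following Matlab sketch builds the 2x6 interaction matrix of Equation 1.15 for one point feature and stacks several of them as just described. The focal length, image coordinates and depths are illustrative assumptions, not toolbox data.

% Sketch: point-feature visual Jacobian (Eq. 1.15) and its stacking.
% lambda is the focal length, z the depth of the point, (u,v) its image coordinates.
lambda = 512;                          % assumed focal length expressed in pixels
feat   = [ 10  20;  -15  5;  30 -25 ]; % assumed (u,v) for three features (pixels)
depths = [ 1.2; 1.1; 1.3 ];            % assumed depth of each feature (m)

Jv = zeros(2*size(feat,1), 6);         % stacked Jacobian, two rows per feature
for i = 1:size(feat,1)
    u = feat(i,1);  v = feat(i,2);  z = depths(i);
    Ji = [ lambda/z, 0, -u/z, -u*v/lambda,  (lambda^2 + u^2)/lambda, -v;
           0, lambda/z, -v/z, -(lambda^2 + v^2)/lambda, u*v/lambda,   u ];
    Jv(2*i-1:2*i, :) = Ji;             % stack the 2x6 block for this feature
end
% With three features Jv is 6x6 and, away from singularities, can be inverted directly.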


Several other derivations of the visual Jacobian have been published in [19, 20]. Initially this matrix was named the feature sensitivity matrix in [17]. Other publications which also include its derivation have named it differently, such as the B matrix [21, 22] or the interaction matrix [12]. Hager [23] also describes how to construct a Jacobian for a stationary stereo system with several positioning tasks. Other approaches to calculating the visual Jacobian make use of different camera models, such as one extension of Hager's research [24] which changes from the classical pin-hole model to the projective camera model. An orthographic camera model is used by Scheering and Kersting [25], while Colombo et al. [15] use an affine camera model. A more particular model is used by Mitsuda et al. in [26]; it is based on the binocular human visual system, and the visual space is essentially covered by three angular parameters: elevation, vergence and azimuth.

The natural question of which features are more convenient to register in order to build a better Jacobian representation was addressed by Feddema in [27]. He describes an automated method to choose features so as to facilitate the controller operation and the image processing. Some properties such as controllability, observability and sensitivity are analyzed in the light of the convenience of choosing such features. A more detailed study of the controller sensitivity due to feature selection is presented by Hashimoto and Noritsugu in [28]. Also, Sharma and Hutchinson [29] published an analogue of the manipulability property of the classical robot Jacobian, transported to the visual case by analyzing the motion perceptibility of a visual Jacobian.

Let's discuss further the feature configuration on the target. In [27] a system to assign the object's features automatically is presented, though a CAD model of the object is needed. Another approach is to use the image plane directly to define feature locations through the sequence of images along the robot's path. Feddema et al. [30] use these data to interpolate trajectories. It is thus also possible to create a time-variant reference feature, as documented by Berry et al. in [31]. In the same line, Nelson and Khosla [32] use a CAD-based environmental model to generate such a reference feature in advance. It is possible to export these ideas to a projective modelling of the problem, as shown by Ruf and Horaud in [33]. Full-scale motion planning considering visual constraints was also studied by Sharma and Sutanto [34] and Fox and Hutchinson [35].

Solving the IBVS problem

IBVS is not an easy problem to solve. For control purposes, the inverse of the visual Jacobian is normally required, that is, to know the change in the end-effector pose as a result of changes in several feature points in the image. In a visual feedback control structure, the error signal is defined directly in terms of the image features, with the error function defined as e(f) = (f* − f), f* being the target feature vector. Thus a simple proportional control law based on the resolved-rate method can be built as follows


  \dot{x} = k \cdot J_v^{-1}(x) \cdot (f^* - f)        (1.16)

An exact inverse may exist when the Jacobian is a square matrix, as mentioned before, but this implies some challenges. Several cases should thus be considered, attending to the number of features and the dimension of the task space. Assuming that ḟ is obtained from the visual algorithm, VS applications typically require computing ṙ in Equation 1.11. In early approaches only a square Jacobian was chosen [17, 36], which means that the required number of features matches the number of DOF of the task. However, given that the Jacobian is defined locally, it can eventually develop singularities, especially once the robot is in motion. Jang and Bien [37] analyze singularities in the robot's path in advance with an off-line method. Usually the easiest option is to choose more features than the number of DOF in the task. This evidently produces a non-square Jacobian which, in order to be inverted, requires the pseudoinverse of the visual Jacobian, J_v^+. A discussion regarding suitable pseudoinverses can be found in [1]. A study of the effects of using a redundant number of features, including experimental and analytical issues, was presented by Hashimoto and Noritsugu in [28]. As detailed below, the opposite case, when there are fewer features than DOF in the task, is also interesting. Chaumette et al. [38] showed systematically that some eye-in-hand servoing tasks can be decomposed into suitable visual Jacobians by constraining some DOFs, which they called virtual linkages. By using the task function approach [11], the remaining DOF of the positioning task can be used to solve some secondary positioning task such as avoiding kinematic limits [39] or suppressing camera occlusions, which enhances the field of view of the camera [40].

As clearly discussed by Hager in [2], there are three cases that must be considered. Recall that k is the total number of feature parameters, which implies a total of k/2 features, because each feature vector is formed by two components of the form f = [u, v]. On the other hand, m is determined by the task space and, as said for our particular case in this reference manual, m = 6 because it represents the number of DOF of the robot, if full control is assumed.

With m = k (i.e. the number of feature parameters matches the task DOF), J_v is non-singular and hence J_v^{-1} exists. For this case, our problem is directly solved as ṙ = J_v^{-1} ḟ. In [41], Feddema combines this approach with an automated method to choose better features so as to minimize the condition number of J_v.

The servoing problem is a different story when k ≠ m, because the computation of J_v^{-1} must be accomplished by other means. Assuming that J_v is full rank with rank(J_v) = min(k, m), a Least-Squares (LS) solution can be proposed to minimize the norm ‖ḟ − J_v ṙ‖. Following the LS formulation, our servoing problem can be drawn as follows
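As a minimal Matlab sketch of this least-squares, resolved-rate update (assuming the standard Moore-Penrose pseudoinverse, referred to later in the text as expression 1.18), the loop below drives the feature error towards zero with a proportional gain. The constant Jacobian is only a local, linearized stand-in for the true image model, and all names and values are illustrative assumptions.

% Sketch: least-squares / resolved-rate IBVS update with a pseudoinverse.
f_star = [120; 80; 200; 80; 200; 160; 120; 160];   % desired feature vector (pixels)
f      = [100; 60; 180; 62; 181; 140; 101; 142];   % current feature vector (pixels)
Jv     = randn(8, 6);                  % stand-in for the 8x6 stacked visual Jacobian
k_gain = 0.3;                          % proportional gain of Eq. 1.16
dt     = 0.05;                         % control period (s)

for step = 1:100
    e    = f_star - f;                 % image-space error e(f) = f* - f
    rdot = k_gain * pinv(Jv) * e;      % LS solution via the Moore-Penrose pseudoinverse
    f    = f + Jv * rdot * dt;         % local linearized update of the features
end
% The error component lying in the range of Jv decays towards zero for a small enough gain.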


inconveniences arise from singularities and local minima in the visual computation, affecting the convergence and stability properties [10].

New IBVS techniques, sometimes called 3D IBVS², make better use of an improved geometry for the camera modelling. Moreover, advanced VS schemes explore further the information provided by the pixel values, which is also combined with improved methods for depth estimation [51, 45].

Initially the error function is again defined in terms of the feature vector f and the desired feature vector f*. As discussed before, IBVS aims to find the relationship between the velocity of the features in the image and the velocity screw ṙ of the camera attached to the robot, as specified in Equation 1.11.

An improved camera model is considered, as the one in chapter ??, with the focal distance expressed in pixel units, distortion coefficients³ and the notable assumption of non-perfectly-square pixels. The intrinsic camera parameters can thus be arranged in a convenient way inside the matrix A, as introduced in chapter ??:

  A = \begin{bmatrix}
        f k_u & -f k_u \cot\phi & u_0 \\
        0 & \frac{f k_v}{\sin\phi} & v_0 \\
        0 & 0 & 1
      \end{bmatrix}
    = \begin{bmatrix}
        \alpha_u & \alpha_{uv} & u_0 \\
        0 & \alpha_v & v_0 \\
        0 & 0 & 1
      \end{bmatrix}        (1.19)

with f being the focal distance, whereas α_u, α_uv, α_v, u_0 and v_0 are the camera intrinsic parameters. The visual Jacobian matrix is now a function of the so-called metric⁴ pixel coordinates of each feature s_i = [x_i, y_i]^T, the depth estimate Z_i and the camera matrix A of expression 1.19. As shown by Malis in [52], the visual Jacobian takes the form

  J_v(s_i, Z_i, A) =
  \begin{bmatrix} \alpha_u & \alpha_{uv} \\ 0 & \alpha_v \end{bmatrix} \cdot
  \begin{bmatrix}
    -\frac{1}{Z_i} & 0 & \frac{x_i}{Z_i} & x_i y_i & -(1 + x_i^2) & y_i \\
    0 & -\frac{1}{Z_i} & \frac{y_i}{Z_i} & 1 + y_i^2 & -x_i y_i & -x_i
  \end{bmatrix}        (1.20)

The Jacobian stacking is also required when several points are considered, yielding a Jacobian matrix of the form

  J_{vs}(s_i, Z_i, A) =
  \begin{bmatrix} J_{v1} \\ \vdots \\ J_{v\,k/2} \end{bmatrix}        (1.21)

with k/2 being the total number of features registered and k itself representing the total number of feature parameters.

² Though this name may create confusion with other VS schemes discussed before.
³ Depending on the precision requirements, sometimes the camera model might consider distortion.
⁴ Metric is the name used by this author only to remark that the values do not come from image pixels directly but from the metric value of each pixel defined by expression 1.22.


The metric values of the feature parameters, i.e. the x_i and y_i parameters, can be computed directly by

  \begin{bmatrix} x_i \\ y_i \end{bmatrix} =
  \begin{bmatrix} \frac{1}{\alpha_u} & -\frac{\alpha_{uv}}{\alpha_u \alpha_v} \\ 0 & \frac{1}{\alpha_v} \end{bmatrix} \cdot
  \begin{bmatrix} u_i - u_0 \\ v_i - v_0 \end{bmatrix}        (1.22)

One of the advantages of this new VS modelling is that, assuming a camera model without distortion, it is possible to process expression 1.22 to obtain a linear relationship between the pixel coordinates (u_i, v_i) and their corresponding metric coordinates (x_i, y_i), yielding

  \begin{bmatrix} u_i \\ v_i \end{bmatrix} =
  \begin{bmatrix} \alpha_u & \alpha_{uv} \\ 0 & \alpha_v \end{bmatrix} \cdot
  \begin{bmatrix} x_i \\ y_i \end{bmatrix} +
  \begin{bmatrix} u_0 \\ v_0 \end{bmatrix}        (1.23)

Recall that the simple pin-hole projection for each point can be expressed as

  \begin{bmatrix} x_i \\ y_i \end{bmatrix} = \frac{1}{Z_i} \begin{bmatrix} X_i \\ Y_i \end{bmatrix}        (1.24)

Thus expression 1.23 can be developed to obtain a relationship based on the depth parameter Z_i as follows

  \begin{bmatrix} u_i Z_i \\ v_i Z_i \\ Z_i \end{bmatrix} =
  \begin{bmatrix} \alpha_u & \alpha_{uv} & u_0 \\ 0 & \alpha_v & v_0 \\ 0 & 0 & 1 \end{bmatrix} p_i = A p_i        (1.25)

from which the new feature vector is composed as s_i = [u_i Z_i, v_i Z_i, Z_i]^T, yielding a camera transformation of the form s_i = A p_i, which can be used to express the relationship between velocities as

  \dot{s}_i = A \dot{p}_i        (1.26)

As in the simple IBVS control algorithm, we require the computation of the inverse visual Jacobian to calculate the required manipulator motion in response to changes in the image features. The required velocity screw is calculated under the task function approach [11] as

  v = -\lambda \cdot J_v^{+} \cdot e        (1.27)

with e being the error vector between the current feature vector f and the desired one f*. The pseudo-inverse as in expression 1.18 is used to calculate J_v^+. Now, considering the new modelling including the camera intrinsic matrix A, the error can be expressed as

  e = (f - f^*) =
  \begin{bmatrix} A(p_1 - p_1^*) \\ \vdots \\ A(p_k - p_k^*) \end{bmatrix}        (1.28)
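Pulling Equations 1.20, 1.21, 1.22 and 1.27 together, the Matlab sketch below computes one control step of this scheme for a set of registered pixels, taking the error simply as the pixel-level feature error (f − f*). The intrinsic parameters, pixel coordinates, depths and gain are illustrative assumptions rather than calibration results.

% One control step of the "3D IBVS" scheme (Eqs. 1.20-1.22 and 1.27).
% All numerical values below are assumed for illustration only.
au = 800; auv = 0; av = 780; u0 = 256; v0 = 256;   % assumed intrinsics (pixels)
uv      = [200 180; 300 185; 305 300; 195 295];    % current features (u,v) in pixels
uv_star = [220 200; 290 205; 292 280; 215 275];    % desired features (u,v) in pixels
Z       = [0.9; 0.9; 0.95; 0.95];                  % assumed depth of each feature (m)
gain    = 0.5;                                     % control gain lambda of Eq. 1.27

n     = size(uv, 1);
Ainv2 = [1/au, -auv/(au*av); 0, 1/av];             % 2x2 inverse used in Eq. 1.22
K2    = [au, auv; 0, av];                          % 2x2 intrinsic block of Eq. 1.20
Jvs = zeros(2*n, 6);
e   = zeros(2*n, 1);
for i = 1:n
    xy  = Ainv2 * (uv(i,:)'      - [u0; v0]);      % metric coordinates (x_i, y_i), Eq. 1.22
    xys = Ainv2 * (uv_star(i,:)' - [u0; v0]);      % metric coordinates of the target
    x = xy(1); y = xy(2); Zi = Z(i);
    L = [-1/Zi, 0, x/Zi, x*y, -(1 + x^2),  y;      % interaction part of Eq. 1.20
          0, -1/Zi, y/Zi, 1 + y^2, -x*y,  -x];
    Jvs(2*i-1:2*i, :) = K2 * L;                    % stacked Jacobian, Eq. 1.21
    e(2*i-1:2*i)      = K2 * (xy - xys);           % pixel-level feature error (f - f*)
end
v_screw = -gain * pinv(Jvs) * e;                   % camera velocity screw, Eq. 1.27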


This practically completes the presentation of the basic operational features of a typical IBVS application. It is thus a good point to develop a visual servoing task which allows us to discuss IBVS properties further. The next section describes the experimental setup and a detailed simulation of the IBVS problem following these equations.

1.3.2 Modelling an IBVS experiment

This section introduces the experimental setup used repeatedly in this reference manual. The experimental setup is described together with several objects which take part in our experiments. Most notably, one small replica train of controlled speed is used as a mobile target in the experiments.

The purpose is to build an Eye-in-Hand monocular Image-based visual servoing system. The camera is placed on the robot's end-effector and calibrated using the Bouguet [53] Camera Calibration Toolbox for Matlab presented in Chapter ??.

Several visual servoing tasks can be conducted on this experimental setup, such as positioning the robot's end-effector over immobile objects like a white square. Figure 1.10 shows an illustrative picture of the robot produced by the Matlab Robotics Toolbox [16], on which four features of a target object are also shown.

Figure 1.10: The end-effector of the TQ MA2000 Robot placed over an object marked by four features.

For clarity purposes, the camera is assumed to be attached exactly at the terminal point of the end-effector, as shown in Figure 1.10. The end-effector orientation keeps the camera Z-axis normal to the workspace, whereas the other two axes are oriented so as to keep the four features centered in the image.


Describing the robot's workspace

The TQ MA2000 Robot rests on a flat surface with the circular railway, on which the mini train runs, placed within the workspace just in front of the robot; the closest distance between them is approximately 0.10 meters. The mini train passes in front of the camera, which registers the four features in order to perform the tracking task. Figure 1.11 shows an elevated frontal view of the experimental setup with a section of the circular railway. The camera is not shown because, as explained before, its location is assumed to be at the terminal point of the end-effector's last link. Three small squares are also shown, only to mark the flat surface of the workspace.

Figure 1.11: Frontal and elevated picture of the TQ MA2000 Robot including the railway trajectory and three surface markers.

The TQ MA2000 robot kinematics is simulated following the parameters documented in Chapter ??. The train motion is simulated as a simple circular motion with constant angular velocity as follows

  \begin{bmatrix} X_{train} \\ Y_{train} \\ Z_{train} \end{bmatrix} =
  \begin{bmatrix} r\cos\omega t + c_x \\ r\sin\omega t + c_y \\ h \end{bmatrix}
  \qquad
  \begin{bmatrix} \dot{X}_{train} \\ \dot{Y}_{train} \\ \dot{Z}_{train} \end{bmatrix} =
  \begin{bmatrix} -r\omega\sin\omega t \\ r\omega\cos\omega t \\ 0 \end{bmatrix}        (1.29)

with r being the radius of the circle, (c_x, c_y) the center of the circle and h the height of the train. Notice that the train speed is variable, but an average speed of ω = 0.698 rad/sec was selected. Using this value the train is able to ride through an arc of 45 degrees in 2.25 seconds. This value was defined from several real measurements of the time required by the train to complete a full lap at different velocities.
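A minimal Matlab sketch of the circular target generator in Equation 1.29 is given below, using the radius, centre, height, phase and angular rate quoted in the text and in expression 1.30 further on; the script layout itself is illustrative.

% Sketch of the circular train-motion generator of Eq. 1.29 / Eq. 1.30.
% Parameter values are those quoted in the text.
r  = 0.44;            % railway radius (m)
cx = 0.54;  cy = 0.0; % centre of the railway circle (m)
h  = 0.04;            % height of the features on the train roof (m)
w  = -0.698;          % angular rate (rad/s); negative sign gives clockwise motion
phi = -3*pi/4;        % phase placing the train at the right corner of the workspace

t  = 0:0.01:2.25;                       % simulation time vector (s)
Xt = r*cos(w*t + phi) + cx;             % train position, Eq. 1.30
Yt = r*sin(w*t + phi) + cy;
Zt = h*ones(size(t));
Xdot = -r*w*sin(w*t + phi);             % train velocity, Eq. 1.29
Ydot =  r*w*cos(w*t + phi);

plot3(Xt, Yt, Zt); grid on; xlabel('X (m)'); ylabel('Y (m)'); zlabel('Z (m)');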


The height is also important because the observed features are located on the roof of the train. In this simulation, h = 0.04 meters. The complete parametric representation included in the software is therefore

  \begin{bmatrix} X_{train} \\ Y_{train} \\ Z_{train} \end{bmatrix} =
  \begin{bmatrix} 0.44\cos(-0.698t - \frac{3\pi}{4}) + 0.54 \\ 0.44\sin(-0.698t - \frac{3\pi}{4}) \\ 0.04 \end{bmatrix}        (1.30)

with r = 0.44 meters and the railway circle centered at (0.54, 0.0). The phase value of 3π/4 places the initial position of the train at the right corner of the robot's workspace. The negative sign in the expression produces a clockwise rotation.

Simulating the IBVS system

Figure 1.12 presents the schematic of our IBVS Simulink model. The generator for the train motion is included in the first block on the left, labelled "0Tdo".

Figure 1.12: The Image-based Visual Servoing Simulink diagram implementing the tracking task in the MA2000 Robot workspace (annotated "Image-based visual servo control, Tracking the Railway Circle, Marco Perez, CSC UMIST 2003"; blocks include the image experimental model, camera model visualization, features preprocessing, visual Jacobian computation and pseudoinverse, PI control law, rate-controlled robot axes and the TQ MA2000 robot kinematics).

The Simulink block marked "Image Experimental Model" contains the pin-hole camera model presented in expressions 1.23 to 1.25. The output of this block is fed into the visualization block to produce a virtual image window [13] which represents the image on the camera sensor. Notice that these two blocks would constitute the camera system in a real-time VS implementation. If the camera were not attached to the final point of the robot's end-effector, then a screw transformation would have to be computed. There is no need for such a conversion in this simulation, given that the camera is assumed to be attached at the final point of the end-effector and centered. However, a real implementation of the algorithm does require this transformation, as shown in the case studies in Chapter ??.


The next block in the schematic is labelled "Features Preprocessing". This block calculates the metric vector associated with each registered feature, as explained by Equation 1.22. This is a fundamental component of a 3D VS scheme.

Notice as well that the Depth value emerges from the "Image Experimental Model" block. It can be obtained easily, given that a CAD model of the object is required to simulate the image evolution in that block. A specific method would have to be employed to compute the Depth value in a real-time VS scheme.

A core component in this VS scheme (and in most classical IBVS schemes) is the Jacobian computation block, labelled "Visual Jacobian Computation and Pseudoinverse" in the schematic. This block computes the partial visual Jacobian for each feature point as in Equation 1.20 and stacks all of them into one matrix as in expression 1.21, forming the full Jacobian. Considering that four features are registered and that the number of DOF to be controlled is six, the dimension of the Jacobian matrix is (8 × 6). In order to invert such a matrix, a pseudoinverse algorithm is used; a standard algorithm based on the singular value decomposition is employed. At its output, this block provides the corresponding matrix J_v^+ of dimensions (6 × 8).

In the meantime, the feature error vector should be computed in order to be used by the control law. The corresponding block is labelled "PI control law". It receives the feature metric vector calculated by the "Features Preprocessing" block and the target feature metric vector, labelled s and s* respectively. The target feature vector f* is generated from the "Desired Model" block, which receives the object CAD model, the desired object location with respect to the base frame and the desired camera position with respect to the object. The output of this block should also be preprocessed to generate a compatible vector of metric values; a second copy of the "Preprocessing" block is then used.

The error is thus calculated within the block as shown by Equation 1.28. A PI control law is used, though some results from a purely proportional law are also shown. The gains used are K_p = -5.165 and K_i = -1.5. A wider discussion about the effect of using different kinds of control laws is presented in section 1.4 later in this chapter. This block outputs the screw vector of required velocities. This vector is sent directly to the robot controller after applying a simple proportional gain. The robot controller has a second internal feedback signal which is used in the kinematic control scheme. For a deeper discussion regarding the operation of the kinematic controller, refer to Chapter ??.

In order to complete the simulation of the camera motion, the homogeneous transform of the end-effector is fed back to the "Image Experimental Model" block at each simulation step, because this block is in charge of simulating the image evolution.
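The essence of the "PI control law" and "Visual Jacobian Computation and Pseudoinverse" blocks can be sketched in a few lines of Matlab, as below. The SVD-based pseudoinverse and the discrete integral term are written out explicitly, and all inputs are illustrative placeholders rather than signals taken from the Simulink model.

% Sketch of one sample of the PI control law driving the (8x6) stacked Jacobian.
% Jvs, s and s_star stand in for the signals produced by the preceding blocks.
Kp = -5.165;  Ki = -1.5;          % gains quoted in the text
dt = 0.01;                        % assumed sample time (s)
Jvs    = randn(8, 6);             % stand-in for the stacked visual Jacobian (Eq. 1.21)
s      = randn(8, 1);             % stand-in for the current metric feature vector
s_star = randn(8, 1);             % stand-in for the target metric feature vector
e_int  = zeros(8, 1);             % integral of the feature error (persistent between samples)

e     = s - s_star;               % feature error used by the PI law
e_int = e_int + e * dt;           % discrete integration of the error

[U, S, V] = svd(Jvs, 'econ');     % SVD-based pseudoinverse of the 8x6 Jacobian
Jv_plus = V * diag(1 ./ diag(S)) * U';      % (6x8), assumes full column rank

v_screw = Jv_plus * (Kp * e + Ki * e_int);  % 6-element velocity screw sent to the robot

In the actual Simulink model the integral state is held by the block between samples; here it is shown as a single update for clarity.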


Running the IBVS system

The simulation runs for 2.25 seconds, which is the time required by the train to complete the 45-degree arc in front of the robot.

A handy graphic representation of the train tracking task is presented in Figure 1.13, which shows the camera trajectory during the tracking motion. Notice the camera, represented by a tetrahedron, at the initial and final locations. The upper graph shows the lateral view while the lower graph represents the view from the robot's base, including the train motion. Following the methods in [13], the simulation shows a vector-field representation every 5 simulation steps; notice, however, that the apparent size of each vector can also be altered by the visual perspective of the figure. Also notice that the height of the camera position is kept constant at each step.

Figure 1.13: Camera trajectory during the motion in the tracking task. Upper graph: lateral view showing start and end locations. Lower graph: view from the robot's base including the train trajectory (all axes are expressed in meters).

Figure 1.14 presents the trajectory followed by the features in the image. The initial position is also marked. Notice that this figure contains the camera view (better known as the image), which explains why the axis dimensions are (320, 240).


The feature error can be better appreciated in the left graph of Figure 1.15. It is evident in this graph that the x-components of the four features are represented by one line while the y-components form the other. The sign difference comes from the fact that the x-axis increases towards the right-hand direction while the y-axis decreases from top to bottom.

Figure 1.14: Simulated camera image showing the trajectory of features in the image plane (u and v axes in pixels).

Figure 1.15: Left graph (a): error in the feature space (pixels versus time in seconds). Right graph (b): evolution of the condition number of the visual Jacobian during the tracking motion.

Figure 1.16 presents the evolution of the screw vector components which are sent directly to the robot controller. The upper graph exhibits the behavior of the translational components while the lower shows the rotational values. Notice that for this task the height of the end-effector with respect to the workspace surface is kept constant and therefore the translational component in Z is approximately zero.


Some insights into the stability of the visual system can be gained from analyzing the condition number of the visual Jacobian. By construction, given that the features are usually coplanar (as those in this problem), the visual Jacobian can show large condition numbers which seriously compromise the stability of the system. However, the condition number might eventually remain inside a window of values which can still guarantee stability. Figure 1.15 presents the condition number evolution in the simulation.

Figure 1.16: Screw vector elements during the tracking task. The upper graph shows the evolution of the translational components (m/s) while the lower presents the values of the rotational elements (rad/s).

In a second example, the simulation uses a much higher proportional gain of K_p = -15.165, keeping K_i = -1.5. As expected, this increases the performance of the visual tracker, though some important inconveniences are also found. First, Figure 1.17 shows the image-plane feature error, which exhibits a lower error. This can also be confirmed in the left graph of Figure 1.20, with a much lower error range on the vertical axis compared with the same graph in Figure 1.15. However, observing the 3D trajectory of the camera in Figure 1.18, irregular shapes in the vectorial field can be appreciated.


Analyzing the screw vector evolution in Figure 1.19, considerable oscillations can easily be identified. From control theory we know that increasing the gain in the control loop may improve the overall performance, but the risk of instability also increases. In our simulation, after increasing the proportional constant, the velocities reach the range of 0.2 meters per second for the translational components and about 2 radians per second for the rotational ones. Such values are practically impossible for the MA2000 robot, which exhibits slow motion performance and high backlash. Also, as explained in Chapter ??, there exist other mechanical and torque limitations. In the right graph of Figure 1.20, notice furthermore that the range of values taken by the condition number of the visual Jacobian has expanded. In this case it is even difficult to establish whether it oscillates around a given value, which seriously questions the stability of the system. The result of using an increased proportional gain is included here just to stress that there are formal conditions to satisfy, such as proper gains and a stable condition number, in order to assure acceptable stability of the system. This is further discussed in section 1.4.

Figure 1.17: Simulated camera image showing the trajectory of features in the image plane. A high proportional gain is used in the simulation (K_p = −15.165).

Finally, a third experiment eliminates the integral gain by setting K_p = −5.165 and K_i = 0. The behavior of the screw components is very similar to that of the first experiment. It exhibits oscillations of lower magnitude for the translational components but a considerable increase in the peak values of the rotational components, as shown in Figure 1.21. However, an interesting fact emerges from the right graph of Figure 1.22: the window of values taken by the condition number is located at lower values than in the first simulation, concentrating mainly between 1200 and 1400. Notice also that the error in the image feature space (left graph of Figure 1.22) is practically the same as in the first experiment.
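The gains of the three experiments act on the feature error through a proportional-integral law. The exact controller implemented in the toolbox is not reproduced here; the following MATLAB function is only a minimal sketch of the assumed structure of one PI servoing step, where f, f_star, Jv and dt are hypothetical names for the current features, the desired features, the current visual Jacobian and the sample time.

function [screw, e_int] = ibvs_pi_step(f, f_star, e_int, Jv, Kp, Ki, dt)
% One PI image-based servoing step (sketch; assumed interface, not the toolbox API).
% f, f_star : current and desired feature vectors (pixels)
% e_int     : running integral of the feature error
% Jv        : current visual (image) Jacobian
e     = f - f_star;                    % feature error
e_int = e_int + e*dt;                  % integral action
screw = pinv(Jv) * (Kp*e + Ki*e_int);  % camera velocity screw [vx vy vz wx wy wz]'
end

With K_p = −5.165 and K_i = −1.5 this corresponds to the gains of the first experiment; K_p = −15.165 reproduces the second one and K_i = 0 the third. The condition number of Jv, as plotted in Figures 1.15, 1.20 and 1.22, can be logged at each step simply with cond(Jv).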


Figure 1.18: Camera trajectory in the tracking task with a high proportional gain (K_p = −15.165). Upper graph: lateral view showing start and end locations. Lower graph: view from the robot's base including the train trajectory (all axes in meters).

1.4 Visual servoing control and stability

So far, the implementation of position-based and image-based VS schemes in an eye-in-hand setup has been reported to be satisfactory (see the literature review at the end of this chapter). Convergence to the desired position can be reached and the system can be robustly stable with respect to camera calibration errors, robot calibration errors and image measurement errors. This is possible thanks to the closed-loop feedback structures used, but some problems can appear when the robot has more than two or three DOFs, such as the 6-DOF MA2000 robot used as an example. It is therefore important to discuss some points regarding the stability and robustness of VS schemes. Notice that this discussion focuses on eye-in-hand camera systems controlled by image-based VS. A wider discussion of other camera-robot setups, including position-based schemes, can be found in the concise work of Chaumette [10].

A calibration-error-free context is assumed in the following, only to introduce each of the concepts.


Figure 1.19: Screw vector elements during the tracking task with a high proportional gain (K_p = −15.165). The upper graph shows the evolution of the translational components while the lower one presents the values of the rotational elements.

As widely discussed in section 1.3, the visual Jacobian matrix J_v is related to the feature vector f as in Equation 1.13 and can be used to design several control laws such as the one in expression 1.27. The camera velocity is then sent to the robot controller in terms of velocities for each of the robot's links, using the robot Jacobian to avoid joint limits and kinematic singularities as in Chapter ??. The control laws follow the general form

ṙ = f( J_v^+ (f − f*) )     (1.31)

with f being the controller function, which can be a proportional gain [12], a linear controller [54], or a more complex function regulating f towards f*, such as optimal control [22], non-linear control [55, 56], or heuristic methods such as those developed in this reference manual.

Noise and errors can be found in the camera calibration, in the image acquisition and, moreover, in the depth estimation. Therefore, the global asymptotic stability of the system is an important issue to evaluate in a VS design.
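As a concrete instance of expression 1.31 with a simple proportional controller, the following MATLAB sketch stacks the classical interaction matrix of n point features (the standard point-feature model used, for instance, in [12]) and computes the camera screw. It assumes normalized image coordinates x, y, depth estimates z and a positive gain lambda; it is an illustration of the general idea rather than the toolbox implementation.

function screw = ibvs_proportional_step(x, y, z, x_star, y_star, lambda)
% Proportional IBVS step for n point features (sketch, assumed interface).
% x, y           : current normalized image coordinates (n x 1)
% x_star, y_star : desired normalized image coordinates (n x 1)
% z              : depth estimate of each point (n x 1)
n  = numel(x);
Jv = zeros(2*n, 6);
for i = 1:n
    % Classical interaction matrix of a point feature (cf. [12]).
    Jv(2*i-1:2*i, :) = [ -1/z(i)   0        x(i)/z(i)  x(i)*y(i)   -(1+x(i)^2)  y(i);
                          0       -1/z(i)   y(i)/z(i)  1+y(i)^2    -x(i)*y(i)  -x(i) ];
end
e          = zeros(2*n, 1);            % feature error, interleaved (x1 y1 x2 y2 ...)
e(1:2:end) = x - x_star;
e(2:2:end) = y - y_star;
screw      = -lambda * pinv(Jv) * e;   % camera velocity screw [vx vy vz wx wy wz]'
end

The negative sign in the last line plays the role of the negative gains K_p and K_i used in the simulations above.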


Figure 1.20: Tracking with a high proportional gain (K_p = −15.165). Left graph (a): Error in the feature space. Right graph (b): Evolution of the condition number of the visual Jacobian during the tracking motion.

Early works in the field include Corke's [9] and Chaumette's [10]. From well-known results in Espiau's book [11], a sufficient condition to assure non-linear global stability is that

Ĵ_v^+ J_v( f(t), z(t) ) > 0,  ∀t     (1.32)

with Ĵ_v^+ a model, an approximation or an estimate of the pseudo-inverse of J_v. Indeed, the camera calibration errors, the noisy image measurements and the unknown depth z involved in the computation of Ĵ_v^+ imply the use of such an estimate, since the real value of J_v remains unknown. Ideally Ĵ_v^+ J_v ≅ I, which implies a perfectly decoupled system, as long as the product Ĵ_v^+ J_v stays away from a null matrix. Chaumette discusses in [10] that this condition is difficult to achieve in practice, but it is still possible to consider three options to define the visual Jacobian matrix.

• Estimation of the Visual Jacobian. In this method the visual Jacobian matrix is estimated without taking into account its analytical expression in Equation 1.12. Hence the Jacobian in expression 1.32 can be characterized as Ĵ_v^+ = Ĵ_v^+(t). Several methods have been proposed along this line [57, 58] and even neural networks have been used [59]. However, the main drawback of purely estimating the Jacobian is that it is difficult to assure the condition of expression 1.32. Moreover, the initial estimates, coarse by nature, may lead to unstable behavior, especially at the beginning, when the starting conditions are usually initialized at random. (A minimal sketch of one such estimation scheme, together with a numerical check of expression 1.32, is given after this list.)


Figure 1.21: Screw vector elements during the tracking task with no integral gain (K_i = 0). The upper graph shows the evolution of the translational components while the lower one presents the values of the rotational elements.

• Updating the Visual Jacobian. In this technique the visual Jacobian is updated at each step of the control law, incorporating the latest measurement of the visual features and the depth estimate ẑ(t). It seems optimal since ideally it satisfies Ĵ_v^+ J_v = I. The depth can be obtained from the methods discussed in Chapter ??, from knowledge of the object's 3D model [60], or from controlled camera motion as in [8] and [61]. The main drawback is that each image point is constrained to follow a straight trajectory towards its desired position. This may imply an inadequate camera motion, with the possibility of local minima and the nearing of task singularities [10].

• Fixing a Target Jacobian. In this method the Jacobian is constant. Its computation is performed in advance, during an off-line process, using an approximation of the features' depth at the target pose. Evidently, convergence is assured only in a small neighborhood around the desired position, and a decoupled behavior is therefore also achieved only in such a small region. Determining the size of such a neighborhood is difficult given the complexity of the involved symbolic calculations. An important drawback arises from the fact that some features may eventually leave the field of view, causing a failure, especially when the initial pose is far away from the desired one.
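As a rough illustration of the first option, the following MATLAB sketch performs a Broyden-type secant update of an estimated visual Jacobian after each motion and then evaluates the sufficient condition of expression 1.32 by checking that the symmetric part of Ĵ_v^+ J_v is positive definite. All names are assumptions; in particular, the true Jacobian Jv_true would normally be unknown and is passed here only so the condition can be monitored in simulation.

function [Jv_hat, stable] = broyden_jacobian_update(Jv_hat, df, dr, Jv_true)
% Secant (Broyden-type) update of the estimated visual Jacobian (sketch only).
% Jv_hat  : current estimate of the visual Jacobian (2n x 6)
% df      : observed change of the feature vector over the last step (2n x 1)
% dr      : camera displacement (screw * dt) applied over the last step (6 x 1)
% Jv_true : analytical Jacobian, available only in simulation, used to test (1.32)
alpha  = 0.5;                                          % update rate (assumed value)
Jv_hat = Jv_hat + alpha * ((df - Jv_hat*dr) * dr') / (dr'*dr);
% Sufficient condition (1.32): the product pinv(Jv_hat)*Jv_true must remain
% positive definite; here this is checked through its symmetric part.
M      = pinv(Jv_hat) * Jv_true;
stable = min(eig((M + M')/2)) > 0;
end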


Figure 1.22: Tracking with no integral gain (K_i = 0). Left graph (a): Error in the feature space. Right graph (b): Evolution of the condition number of the visual Jacobian during the tracking motion.

Although IBVS has exhibited a satisfactory tolerance to camera calibration errors and eye-in-hand calibration errors, other stability and convergence problems may also arise from the following.

• Task singularities. The analytical Jacobian J_v or the iteratively estimated Ĵ_v may become singular for different reasons. First, if the feature vector f is formed by three collinear points, or by points lying on a cylinder containing the camera optical center, the visual Jacobian becomes singular [62, 63]. The use of more than three points generally leads to a problem-free VS design. Note, however, that no matter how many feature points are registered in the image, the visual Jacobian J_v may still become singular during the visual servoing. It has been suggested that straight lines traced among feature points can be used as the main features, although this varies for each particular VS task. A simple but interesting example is studied in [10].

• Local minima. These are states in which the actuator velocity ṙ = 0 but the feature vector f ≠ f*, which means that the desired end-effector pose has not been achieved. In terms of matrix algebra, this case corresponds to the feature error vector falling into the kernel of the estimated visual Jacobian, which mathematically is

f − f* ∈ ker Ĵ_v^+     (1.33)


Some researchers have argued that if just three features are considered, then a square Jacobian of dimension (6 × 6) is obtained, which implies that ker Ĵ_v^+ = 0 because J_v is full rank (rank 6), and that therefore there are no local minima. However, it is now well known that three points in the same image can be seen from four different camera poses [64], which is equivalent to considering four local minima for which f = f* and ṙ = 0. Using four points theoretically assures a unique pose but, on the other hand, the visual Jacobian dimension then becomes (8 × 6), which again gives way to a kernel dimension of dim ker J_v^+ = 2 and hence to at least two local minima. Notice, however, that this does not assure that those local minima exist. On the contrary, it means that a corresponding camera pose must exist, i.e. they are physically coherent, unless they are unreachable image motions. The determination of a general result seems rather difficult given the complexity of the involved symbolic computations.

Analyzing the link between local minima and unreachable image motions, it is not difficult to see that ker Ĵ_v = ker Ĵ_v^+, which means that the kernel of the estimated Jacobian can be considered equal to the kernel of its pseudo-inverse. Observe that, so far, the visual Jacobian has been chosen to be updated at each algorithm step, Ĵ_v(f(t), ẑ(t)). But if the constant Jacobian is used, i.e. J_v(f*, z*) as in the third case, then evidently ker J_v^t ≠ ker Ĵ_v^+, where t stands for Target and the right-hand term represents the pseudo-inverse of the estimated visual Jacobian. This allows the system to avoid local minima as long as the motions are valid and realizable. Again, the conclusion is that for some applications the use of a constant visual Jacobian is recommended, whereas other visual tasks perform better with an iteratively updated visual Jacobian. A minimal numerical check for detecting such local minima is sketched at the end of this section.

Notice as well that the condition number of the visual Jacobian also plays an important role in the behavior, as mentioned in [65] and also in [66, 67]. Finally, supplementary sufficient conditions to ensure a correct modelling of an image-based task are that ker J_v^+ = ker J_v = 0, which also excludes the existence of task singularities [10].
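As a rough numerical check for the local-minimum situation described above, the following MATLAB sketch flags a state in which the commanded camera velocity is numerically zero while the feature error is not, i.e. the error lies in the kernel of the pseudo-inverse of the estimated Jacobian, cf. expression 1.33. The variable names and thresholds are assumptions for illustration only.

function is_local_min = detect_local_minimum(Jv_hat, f, f_star)
% Flags a local minimum: (numerically) zero commanded velocity together with a
% non-zero feature error, i.e. f - f_star falls into ker(pinv(Jv_hat)). Sketch only.
tol_v = 1e-6;                  % velocity threshold (assumed)
tol_e = 1e-3;                  % feature-error threshold, in pixels (assumed)
e = f - f_star;                % feature error
v = pinv(Jv_hat) * e;          % commanded camera screw (up to the gain)
is_local_min = (norm(v) < tol_v) && (norm(e) > tol_e);
end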


1.5 Summary

This reference manual has been entirely devoted to the discussion of visual servoing schemes. A concise summary of visual servoing concepts is presented at the beginning of the manual. Then a practical VS taxonomy is introduced to classify several VS contributions found in the literature. Position-Based Visual Servoing (PBVS) is explained by means of one simulation, and Image-Based Visual Servoing (IBVS) is presented with three simulation examples. The CSC Robot Visual Servoing Toolbox is used to develop all the examples. The experimental setup is also introduced and simulated. The manual then discusses the control and stability of visually controlled robotic systems, with special emphasis on those factors which determine the real-time use of these schemes in manipulators. Finally, the last section offers some contributions and an extensive list of references.




Bibliography

[1] S. Hutchinson, G.D. Hager, and P.I. Corke. A tutorial on visual servo control. IEEE Transactions on Robotics and Automation, 12(5):651–670, October 1996.
[2] G.D. Hager, S. Hutchinson, and P. Corke. TT3: Tutorial on visual servo control, workshop notes. IEEE International Conference on Robotics and Automation, 1996.
[3] J.K. Aggarwal and N. Nandhakumar. On the computation of motion from sequences of images, a review. Proceedings of the IEEE, 76:917–935, August 1988.
[4] G. Adiv. Inherent ambiguities in recovering 3D motion and structure from a noisy flow field. IEEE Transactions on Pattern Analysis and Machine Intelligence, 11:477–489, May 1989.
[5] B. Espiau and P. Rives. Closed-loop recursive estimation of 3D features for a mobile vision system. Proceedings of the 1987 IEEE International Conference on Robotics and Automation, pages 1436–1443, 1987.
[6] R. Bajcsy. Active perception. Proceedings of the IEEE, 76:996–1005, August 1988.
[7] G. Sandini and M. Tistarelli. Active tracking strategy for monocular depth inference over multiple frames. IEEE Transactions on Pattern Analysis and Machine Intelligence, 12:13–27, January 1990.
[8] F. Chaumette, S. Boukir, P. Bouthemy, and D. Juvin. Structure from controlled motion. IEEE Transactions on Pattern Analysis and Machine Intelligence, 18(5):492–504, May 1996.
[9] P.I. Corke and M.C. Good. Dynamic effects in visual closed-loop systems. IEEE Transactions on Robotics and Automation, 12(5):671–683, October 1996.


[10] F. Chaumette. Potential problems of stability and convergence in image-based and position-based visual servoing. In The Confluence of Vision and Control, D. Kriegman, G. Hager, and A. Morse (Editors), 237:66–78, Springer-Verlag, 1998.
[11] C. Samson, M. Le Borgne, and B. Espiau. Robot Control, a Task Function Approach. Oxford Engineering Series, 22. Oxford Science Publications, 1990. ISBN 0198538057.
[12] B. Espiau, F. Chaumette, and P. Rives. A new approach to visual servoing in robotics. IEEE Transactions on Robotics and Automation, 8(3):313–326, June 1992.
[13] E. Cervera. Laboratory worknotes. EURON 2002 Summer School on Visual Servoing, 2002.
[14] W.J. Wilson. Visual servo control of robots using Kalman filter estimates of robot pose relative to work-pieces. In Visual Servoing, K. Hashimoto (Editor), World Scientific, pages 71–104, 1993. ISBN 9810246064.
[15] C. Colombo, B. Allotta, and P. Dario. Affine visual servoing: A framework for relative positioning with a robot. Proceedings of the IEEE International Conference on Robotics and Automation, pages 464–471, 1995.
[16] P.I. Corke. A robotics toolbox for MATLAB. IEEE Robotics and Automation Magazine, 3(1):24–32, March 1996.
[17] C. Sanderson and L.E. Weiss. Image-based visual servo control using relational graph error signals. Proceedings of the IEEE, 1(1):1074–1077, 1980.
[18] J.T. Feddema, C.S.G. Lee, and O.R. Mitchell. Feature-based visual servoing of robotic systems. In Visual Servoing, K. Hashimoto (Editor), World Scientific, 1993.
[19] J. Aloimonos and D.P. Tsakiris. On the mathematics of visual tracking. Image and Vision Computing, 9:235–251, 1991.
[20] R.M. Haralick and L.G. Shapiro. Computer and Robot Vision. Addison-Wesley, 1993.
[21] N.P. Papanikolopoulos and P.K. Khosla. Adaptive robot visual tracking: Theory and experiments. IEEE Transactions on Automatic Control, 38(3):429–445, 1993.
[22] N.P. Papanikolopoulos, P.K. Khosla, and T. Kanade. Visual tracking of a moving target by a camera mounted on a robot: a combination of vision and control. IEEE Transactions on Robotics and Automation, 9(1):14–35, 1993.


[23] G.D. Hager, G. Grunwald, and G. Hirzinger. Feature-based visual servoing and its application to telerobotics. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, pages 164–171, 1994.
[24] G. Hager and Z. Dodds. A projective framework for constructing accurate hand-eye systems. Proceedings of the Workshop on New Trends in Image-based Robot Servoing, pages 71–82, 1997. In association with IROS 1997.
[25] C. Scheering and B. Kersting. Uncalibrated hand-eye coordination with a redundant camera system. In Proceedings of the IEEE International Conference on Robotics and Automation, ICRA 1998, pages 2953–2958, May 1998.
[26] T. Mitsuda, N. Maru, K. Fujikawa, and F. Miyazaki. Visual servoing based on the use of binocular visual space. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 1996, pages 1104–1111, 1996.
[27] J.T. Feddema, C.S.G. Lee, and O.R. Mitchell. Weighted selection of image features for resolved rate visual feedback control. IEEE Transactions on Robotics and Automation, 7(1):31–47, February 1991.
[28] K. Hashimoto and T. Noritsugu. Performance and sensitivity in visual servoing. In Proceedings of the IEEE International Conference on Robotics and Automation, ICRA 1998, pages 2321–2326, 1998.
[29] R. Sharma and S. Hutchinson. On the observability of robot motion under active camera control. In Proceedings of the IEEE International Conference on Robotics and Automation, ICRA 1994, pages 162–167, 1994.
[30] J.T. Feddema and O.R. Mitchell. Vision-guided servoing with feature-based trajectory generation. IEEE Transactions on Robotics and Automation, 5(5):691–700, October 1989.
[31] F. Berry, P. Martinet, and J. Gallice. Trajectory generation by visual servoing. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 1997, pages 1066–1072, September 1997.
[32] B.J. Nelson and P.K. Khosla. An extendable framework for expectation-based visual servoing using environmental models. Proceedings of the IEEE International Conference on Robotics and Automation, pages 184–189, 1995.
[33] A. Ruf and R. Horaud. Visual trajectories from uncalibrated stereo. In Proceedings of the Workshop on New Trends in Image-based Visual Servoing, pages 83–92, 1997. In association with IROS 1997.


[34] R. Sharma and H. Sutanto. A framework for robot motion planning with sensor constraints. IEEE Transactions on Robotics and Automation, 13(1):61–73, February 1997.
[35] A. Fox and S. Hutchinson. Exploiting visual constraints in the synthesis of uncertainty-tolerant motion plans. IEEE Transactions on Robotics and Automation, 11(1):56–71, June 1995.
[36] L.E. Weiss, A.C. Sanderson, and C.P. Newman. Dynamic sensor-based control of robots with visual feedback. IEEE Transactions on Robotics and Automation, 3(5):404–417, October 1987.
[37] W. Jang and Z. Bien. Feature-based visual servoing of an eye-in-hand robot with improved tracking performance. In Proceedings of the IEEE International Conference on Robotics and Automation, pages 2254–2260, 1991.
[38] F. Chaumette and N. Hollinghurst. Visually guided grasping in unstructured environments. Robotics and Autonomous Systems, 19:337–346, 1997.
[39] E. Marchand, F. Chaumette, and A. Rizzi. Using the task function approach to avoid robot joint limits and kinematic singularities in visual servoing. Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 1996, pages 1083–1090, October 1996.
[40] E. Marchand and G. Hager. Dynamic sensor planning in visual servoing. Proceedings of the IEEE International Conference on Robotics and Automation, pages 1988–1993, May 1998.
[41] J. Feddema and O. Mitchell. Vision-guided servoing with feature-based trajectory generation. IEEE Transactions on Robotics and Automation, 5:691–700, 1989.
[42] A. Castaño and S. Hutchinson. Visual compliance: a task-directed visual servo control. IEEE Transactions on Robotics and Automation, 10(3):334–342, June 1994.
[43] P.Y. Oh and P.K. Allen. Design of a partitioned visual feedback controller. In Proceedings of the IEEE International Conference on Robotics and Automation, pages 1360–1365, May 1998.
[44] J.Y. Zheng, T. Sakai, and N. Abe. Guiding robot motion using zooming and focusing. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 1996, pages 1076–1082, 1996.
[45] E. Malis, F. Chaumette, and S. Boudet. 2D-1/2 visual servoing. IEEE Transactions on Robotics and Automation, 15(2):238–250, April 1999.


[46] G.D. Hager. A modular system for positioning using feedback from stereo vision. IEEE Transactions on Robotics and Automation, 13(4):582–595, 1997.
[47] R. Horaud, F. Dornaika, and B. Espiau. Visually guided object grasping. IEEE Transactions on Robotics and Automation, 14(4):525–532, August 1998.
[48] K. Hosoda, K. Ishida, and M. Asada. Utilizing affine description for adaptive visual servoing. In Proceedings of the Workshop on New Trends in Image-based Visual Servoing, pages 13–20, 1997. In association with IROS 1997.
[49] M. Jägersand, O. Fuentes, and R. Nelson. Acquiring visual-motor models for precision manipulation with robot hands. Proceedings of the 4th European Conference on Computer Vision, ECCV'96, pages 603–612, 1996.
[50] J.A. Piepmeier, G.V. McMurray, and H. Lipkin. A dynamic quasi-Newton method for uncalibrated visual servoing. In Proceedings of the IEEE International Conference on Robotics and Automation, ICRA 1999, pages 1595–1600, 1999.
[51] E. Cervera and P. Martinet. Combining pixel and depth information in image-based visual servoing. ICAR 1999, Tokyo, Japan, pages 445–450, 1999.
[52] F. Malis. Contributions à la Modélisation et à la Commande en Asservissement Visuel. PhD thesis, Université de Rennes, France, November 1998.
[53] J.Y. Bouguet. Camera Calibration Toolbox for Matlab. Open Source Computer Vision Library, INTEL.
[54] K. Hashimoto and T. Noritsugu. Visual servoing with linearized observer. In Proceedings of the IEEE International Conference on Robotics and Automation, pages 263–268, 1999.
[55] K. Hashimoto and H. Kimura. Dynamic visual servoing with nonlinear model-based control. In Proceedings of the 12th World Congress of IFAC, 9:405–408, July 1993. Sydney, Australia.
[56] K. Hashimoto and H. Kimura. Visual servoing with nonlinear observer. In Proceedings of the IEEE International Conference on Robotics and Automation, pages 484–489, 1995.
[57] K. Hosoda and M. Asada. Versatile visual servoing without knowledge of the true Jacobian. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 1994, pages 186–193, September 1994. Munich, Germany.


[58] M. Jägersand, O. Fuentes, and R. Nelson. Experimental evaluation of uncalibrated visual servoing for precision manipulation. In Proceedings of the IEEE International Conference on Robotics and Automation, 3:2874–2880, April 1997. Albuquerque, New Mexico.
[59] I.H. Suh and T.W. Kim. Visual servoing of robot manipulators by fuzzy membership function based neural networks. In Visual Servoing, K. Hashimoto (Editor), World Scientific, pages 285–315, 1993.
[60] D. Dementhon and L. Davis. Model-based object pose in 25 lines of code. International Journal of Computer Vision, 15(1):123–141, June 1995.
[61] C. Smith and N. Papanikolopoulos. Computation of shape through controlled active exploration. Proceedings of the International Conference on Robotics and Automation, 3:2516–2521, May 1994. San Diego, California.
[62] F. Chaumette, P. Rives, and B. Espiau. Classification and realization of different vision-based tasks. In Visual Servoing, K. Hashimoto (Editor), World Scientific, pages 199–228, 1993.
[63] N. Papanikolopoulos. Selection of features and evaluation of visual measurements during robotic visual servoing tasks. Journal of Intelligent and Robotic Systems, 13:279–304, 1995.
[64] R. Horaud. New methods for matching 3D objects with single perspective view. IEEE Transactions on Pattern Analysis and Machine Intelligence, 9(3):401–412, May 1987.
[65] J. Feddema, C. Lee, and O. Mitchell. Automatic selection of image features for visual servoing of a robot manipulator. In Proceedings of the IEEE International Conference on Robotics and Automation, 2:832–837, May 1989. Scottsdale, Arizona.
[66] B. Nelson and P. Khosla. The resolvability ellipsoid for visual servoing. Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition, pages 829–832, June 1994. Seattle, Washington.
[67] R. Sharma and S. Hutchinson. Optimizing hand/eye configuration for visual servo systems. In Proceedings of the IEEE International Conference on Robotics and Automation, ICRA 1994, pages 172–177, May 1995. Nagoya, Japan.


