3D surface tracking and approximation using Gabor filters

Jesper Juul Henriksen

March 28, 2007


5 Open problems
5.1 Symbolic representation
6 Conclusion
A Timetable


Abstract

This report covers the theory covered and the work done under the FORK part of the masters project at Syddansk Universitet, in the pursuit of a way to estimate the surface topology of objects under robot-controlled motion, based on motion and vision information.


1.3 Proposal of solution

The coordinates of the surface points will be tracked over time, in order to get a better statistical basis for determining the coordinates of the points. Either a Kalman filter or a particle filter will be used for point tracking, as they have shown themselves to be very effective at tracking systems where little knowledge of the system to be tracked is available at the start of tracking. This solution to the surface approximation problem depends on a list of tools, both theory and software, that will be covered in the following sections.


2 Theory covered so far

In this section different candidate theories will be described. The theories that have been discarded for the project will be covered very briefly, while the candidates that hold potential will be covered in greater detail.

2.1 Stereo vision

In order to digitize a 3D surface, at least two different views of the surface are necessary. Using the knowledge of the projection matrices of the cameras, it is possible to generate a depth estimate by triangulation. The process of triangulation is well understood and fairly straightforward. The accuracy of the triangulation is higher if the angle between the rays is increased; this means that ideally the cameras generating the stereo images should be very far apart. The problem is finding the same point on the surface in both images. This is known as the correspondence problem [1].

2.2 Correspondence problem

There are several issues that make the problem of finding points in the stereo images that correspond to the same point on the surface of an object difficult.

• It is likely that the surface looks different from different angles; this is typically caused by the fact that views from different angles are subject to different lighting conditions. Shiny objects, for example, can look very different from different angles because of the highlight effect on the surface. This problem can be a limiting factor on the distance the stereo cameras can be placed from each other, and thereby on the achievable depth accuracy.

• The surface itself can also obstruct the view of the correspondence point; this is the case, for example, for a cup, where the handle of the cup at certain viewing angles is in front of the cup itself.

• There may not be a unique match between image points. Often there are a number of candidate matches from one stereo image to the other; this is often the case for surfaces that have weak surface texture, like a sheet of white paper.

The correspondence problem is one of the main problems in stereo vision, and there are no foolproof ways around the problem for unassisted vision.


By assisting the vision process using structured light [2], however, the correspondence problem can be greatly reduced. The structured light method uses a technique where the surface to be digitized is lit by a light source emitting light in a known pattern. This can be achieved by using a projector emitting, for example, a checker pattern onto the surface. This method is most effective when the emitted light is the only light source, and this is often not the case for real environments, and certainly not for outdoor use.

2.3 Dense stereo

Dense stereo is a relatively simple way to do stereo vision; in its simplest form it tries to match every pixel in one of the stereo images to every pixel in the other picture. This simple form is very inefficient, and is never used in practice. Matches are typically sought on what are called epipolar lines in the pictures. The use of this technique requires that the cameras are mounted side by side, since it assumes that a point on the surface will be at the same height in both stereo images.

2.4 Sparse stereo

Another way to generate stereo data is sparse stereo. This approach tries to minimize the number of pixels that have to be checked for correspondences, and thereby the correspondence problem, by only looking at parts of the images that are likely to be recognizable from one image to another. One such method is looking at the perceived edges [3] or lines [4] in the images. The main difference between dense and sparse stereo is that in sparse stereo, comparison is not done on a pixel-by-pixel basis, but on groups of pixels defining a feature. An edge of an object in the image would for example be defined by at least two pixels.
The fact that more pixels are involved reduces the correspondence problem, since the comparison is based on more data.

2.5 Gabor filters

Another way to find good correspondence candidates is to use different image processing techniques on the stereo image pair. Gabor filtering is one such way to process the stereo images. Gabor filters are rotation-sensitive local frequency detectors. Gabor filtering is done by convolution of an image with a set of Gabor kernels designed to detect certain frequencies and certain orientations.


The Gabor filter bank is often composed of kernels detecting 4 different wavelengths/frequencies¹ at 8 different orientations from 0° to 180°. The reason the orientations only cover a half circle is that the absolute response would be the same, and the phase would just be rotated by 180°. The response to each filter in the filter bank is stored and used as part of the basis for comparison.

One of the problems of Gabor filtering is that it needs much processing to generate the filter responses. For a jet of 4 wavelengths over 8 orientations, 32 filter responses have to be calculated. The process fortunately scales nicely when split up and run in parallel, as each response can be calculated independently of the others, and the process in itself, like any convolution, also scales well.

Comparison is done by comparing a vector composed of the filter responses of the filter bank at a certain pixel (a jet); by doing comparison based on the whole jet, the likelihood of wrong correspondence matching is reduced.

2.5.1 Gabor kernels

The Gabor filter is composed of a complex sinusoidal carrier and a 2D Gaussian envelope. Figure 1 left shows the real part of the carrier; Figure 1 right shows the same carrier under the Gaussian envelope.

Figure 1: Real sinusoidal carrier and the same carrier under the Gaussian envelope. The black parts of both figures represent values of -1 and the white areas represent values of 1.

Convolution of an image with the kernel gives a response that is proportional to how well the local feature in the image matches the kernel. This property of convolution is general for signal processing. When this is done for

¹ Throughout this paper the notation wavelength (in pixels) will be used instead of rad/pixel, even when talking about frequencies, because Gabor kernels are designed by wavelengths.


both the real and imaginary kernels, the frequency response and phase can be found by calculating the amplitude and argument of the complex response, whose real and imaginary parts r and i come from the real and imaginary kernels respectively:

Amplitude = √(r² + i²)
Argument = tan⁻¹(i/r)

Figure 2: Picture and the corresponding magnitude and phase response of Gabor filtering with parameters λ = 4, σ = 2 and θ = π/2.

Typically only the magnitude of the response is needed, but the phase can be used to further distinguish between responses: the phase changes rapidly with location and therefore makes the comparison more precise, since two neighboring points in the image can have the same absolute response but typically very different phase.

As can be seen in the middle picture of Figure 2, the absolute response shows where the image has local horizontal changes, and the rightmost picture of Figure 2 shows in what direction the change is happening, the phase.

The Gaussian envelope limits the cutoff effect at the edges of the kernel, which is due to the fact that the kernel can only be represented by a finite number of values, which for computational reasons is kept fairly small. The fact that the Gauss curve goes towards 0 relatively fast makes it ideal as a window function.

If larger Gaussian envelopes are used, a greater number of wavelengths are significant in the kernel, thereby improving the frequency and orientation resolution, but lowering the spatial certainty; this relation is a property of the Gauss window function. The relationship between the spatial resolution and frequency resolution is fixed: "A signal's specificity simultaneous in time and frequency is fundamentally limited by a lower bound on the product of its bandwidth and duration (analogous to indeterminacy relations of quantum mechanics)." [6, page 6]


(Δx)(Δω) ≥ 1/(4π)

Since the Gabor filter is a spatial band pass filter, the ratio σ/λ defines the bandwidth of the filter. The bandwidth b in octaves in relation to σ/λ is as follows [12, Section 2, formula 3]:

b = log₂( (σ/λ · π + √(ln2 / 2)) / (σ/λ · π − √(ln2 / 2)) )  ⇔  σ/λ = (1/π) · √(ln2 / 2) · (2^b + 1)/(2^b − 1)   (1)

The problem with reduced resolution arises from the fact that two responses close to each other add as shown in Figure 3, where the two Gauss-shaped responses merge to form a response that makes the individual responses indistinguishable from each other.

Figure 3: Gauss curves with σ = 1 placed at distances 4, 3, 2 and 1 from each other.

Ideally σ should be designed in such a way that a local feature whose frequency / orientation lies halfway between two orientations in the jet gives an equal response in both orientations, that is, half of what the response would be if the feature was perfectly aligned with the jet. That way the job of finding out which jet coordinate the feature belongs to is much easier, and the best coverage of the Gabor space is assured.

The Gabor filter is implemented as a set of two filter kernels, one detecting the real part and the other detecting the imaginary part. The simple formulas for generating the real and imaginary Gabor kernels that do not compensate for the DC component are:

real: g(x, y; λ, θ, σ, γ) = e^(−(x′² + γ²y′²)/(2σ²)) cos(2π x′/λ)   (2)

imaginary: g(x, y; λ, θ, σ, γ) = e^(−(x′² + γ²y′²)/(2σ²)) sin(2π x′/λ)   (3)
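As a concrete illustration, formulas (1) to (3) can be sketched in a few lines of numpy. The function names, the kernel size, the default aspect ratio γ = 1, and the interpretation of x′, y′ as the coordinates rotated by θ are assumptions of this sketch, not part of the project's implementation.

```python
import numpy as np

def gabor_kernels(size, lam, theta, sigma, gamma=1.0):
    """Real and imaginary Gabor kernels per formulas (2) and (3)."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    # x', y': coordinates rotated by the orientation theta (assumed).
    xp = x * np.cos(theta) + y * np.sin(theta)
    yp = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xp**2 + gamma**2 * yp**2) / (2 * sigma**2))
    real = envelope * np.cos(2 * np.pi * xp / lam)
    imag = envelope * np.sin(2 * np.pi * xp / lam)
    return real, imag

def sigma_over_lambda(b):
    """Sigma/lambda ratio for a bandwidth of b octaves, formula (1)."""
    return (1.0 / np.pi) * np.sqrt(np.log(2) / 2) * (2**b + 1) / (2**b - 1)
```

For a bandwidth of one octave this gives the familiar ratio σ/λ ≈ 0.56; the magnitude and phase of a response then follow from the real and imaginary convolutions as in the amplitude/argument formulas above.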


This change in Gabor space has been described in [8, pages 16 to 26], and is highlighted here.

The problem is: given a jet j representing the response of a certain point on the surface, how does this jet change as the surface undergoes transformation? The first part is the transformation from 3D surface point to 2D image coordinate (a possible solution is mentioned in Section 4.2), the other is the perceived 2D transformation in the image. The following concerns the influence of 2D transformations of features in the image on jets.

Given an image transformation A, how does this transform the jet j before into the jet j′ after the transformation? The goal is to find the transformation matrix C(A) that transforms the jet j into j′ when the image feature undergoes the transformation A, as seen in [8, formula 2.11, page 20]:

j′ = C(A) j   (4)

The transformation matrix C(A) is a linear approximation that gets more precise with the resolution of the jet space (the coordinates of the jet). The transformation matrix can be calculated by finding the resolving factors of every complex Gabor filter kernel ψ_k(x) into every other kernel ψ_k′(x), using the formula [8, formula 2.12, page 21]:

⟨k|k′⟩ = ∫ ψ*_k(x) ψ_k′(x) dx   (5)

and, for the filters representing the transformed Gabor, based on the knowledge of the transformation A:

⟨K(A⁻¹)|k′⟩ = ∫ ψ*_k(A⁻¹x) ψ_k′(x) dx   (6)

In this way the jet transformation matrix can be constructed. This is done by solving the following system for the values of C(A) [8, second half, page 20]:

C(A) ·
⎡ ⟨k_1|k_1⟩ ⋯ ⟨k_1|k_N⟩ ⎤
⎢ ⟨k_2|k_1⟩ ⋯ ⟨k_2|k_N⟩ ⎥
⎢     ⋮      ⋱      ⋮     ⎥
⎣ ⟨k_N|k_1⟩ ⋯ ⟨k_N|k_N⟩ ⎦
=
⎡ ⟨K_1(A⁻¹)|k_1⟩ ⋯ ⟨K_1(A⁻¹)|k_N⟩ ⎤
⎢ ⟨K_2(A⁻¹)|k_1⟩ ⋯ ⟨K_2(A⁻¹)|k_N⟩ ⎥
⎢        ⋮          ⋱         ⋮        ⎥
⎣ ⟨K_N(A⁻¹)|k_1⟩ ⋯ ⟨K_N(A⁻¹)|k_N⟩ ⎦   (7)
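Numerically, the overlap integrals of formulas (5) and (6) can be discretized on a pixel grid and the system (7) solved directly. Everything below (grid extent, σ, function names) is an assumed sketch, not the implementation from [8]:

```python
import numpy as np

def psi(x, y, lam, theta, sigma=2.0):
    """Complex Gabor kernel evaluated on coordinate grids x, y."""
    xp = x * np.cos(theta) + y * np.sin(theta)
    yp = -x * np.sin(theta) + y * np.cos(theta)
    return np.exp(-(xp**2 + yp**2) / (2 * sigma**2)) * np.exp(2j * np.pi * xp / lam)

def jet_transform_matrix(params, A, extent=10.0, n=101):
    """Build C(A) from discretized overlap integrals, formulas (5)-(7).

    params: list of (lambda, theta) pairs defining the jet's kernels.
    A: 2x2 image transformation matrix."""
    xs = np.linspace(-extent, extent, n)
    x, y = np.meshgrid(xs, xs)
    Ainv = np.linalg.inv(A)
    # Coordinates transformed by A^-1 for the left-hand kernels of (6).
    xt = Ainv[0, 0] * x + Ainv[0, 1] * y
    yt = Ainv[1, 0] * x + Ainv[1, 1] * y
    K = [psi(x, y, lam, th) for lam, th in params]            # psi_k(x)
    KA = [psi(xt, yt, lam, th) for lam, th in params]         # psi_k(A^-1 x)
    M = np.array([[np.vdot(ki, kj) for kj in K] for ki in K])      # <k_i|k_j>
    MA = np.array([[np.vdot(kai, kj) for kj in K] for kai in KA])  # <K_i(A^-1)|k_j>
    return MA @ np.linalg.inv(M)   # C(A) such that C(A) M = MA
```

A sanity check on the construction: for the identity transformation A = I, the two overlap matrices coincide and C(A) reduces to the identity.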


Here C(A) is the matrix of unknowns:

C(A) =
⎡ c_11 ⋯ c_N1 ⎤
⎢  ⋮   ⋱   ⋮  ⎥
⎣ c_1N ⋯ c_NN ⎦

whose entries are the components of the transformation matrix C(A).

2.6 Texons/Motons

Texons and Motons [9] are a high-level description of image features that have shown themselves to be very robust for tracking of moving texture patches in image sequences. By grouping several filter responses to a filter bank containing Gabor filters and Log-Gauss filters, a higher level of abstraction is reached. This higher level of description reduces the basis of comparison in an intelligent way, while at the same time making the description more flexible, making it possible to recognize texture patches that would otherwise not be recognized.

Although the use of this method for solving the correspondence problem initially showed promise, the overhead of this method would make the system very difficult to implement. Therefore the method was dropped, although a lot of time was spent looking into it.

2.7 Kalman filters

The Kalman filter [13], [14] is an adaptive tracking filter that is very efficient at tracking systems where a model of the system exists.
In order to use the Kalman filter, it must be assumed that the state space equations of the system to be tracked are known, that the system is linear, and that any noise influencing the input and sensing of the system state is Gaussian with zero mean.

State space equations are mathematical descriptions of how the system reacts under the influence of input over time, and describe how the system moves from one state to another between each time step.

The system state in this problem is the positions of features on the surface. The transformation from timestep to timestep is the RBM (rigid body motion) the surface undergoes.

The system typically needs to be formulated as linear state space equations in matrix form as in Formula 8, where X_j is the new state/positions, generated by multiplying the old state/positions X_{j−1}


by the transition matrix A representing the RBM. The term BU_j is the input vector affecting the system states, and W_j is a vector describing the input noise. The input noise is the difference between the actual RBM the robot makes the surface undergo and what is expected.

The output of the system can be formulated as in Formula 9, where H is the output reading matrix that describes what parts of the internal state can be read. The vector V_j describes the noise of the readings, which in this problem is an uncertainty based on the perception of the surface through the cameras. This way of deriving the next state and output based on previous states and input is typical for definitions of dynamic systems.

X_j = A X_{j−1} + B U_j + W_j   (8)

Z_j = H X_j + V_j   (9)

A graphical overview of the system can be seen in Figure 4, where the Kalman filter is the bottom part of the figure, and can be thought of as a system simulator running in parallel with the actual system (the top part). By continually adjusting the simulator/tracker based on the error between the simulator and the actual system (the residual), the Kalman filter gets better and better at simulating the system, and thereby better and better at tracking the points on the surface.

Figure 4: A graphical overview of the Kalman filter.

The Kalman filter tries to represent the state of the system (the positions of the feature points before and after the RBM) as a Gaussian distribution. In the case of tracking a surface feature, the Kalman filter represents the position as the position where it is most likely that the actual feature is, based on knowledge of the system so far, and a Gaussian distribution that describes


the uncertainty of this estimate. If the uncertainty of that position is great, the Gauss distribution is very wide. Based on this estimate of the system, the Kalman filter makes an estimate of what the system state will be after the RBM occurs. This estimate is compared with the actual system state after the RBM, and the error is used to tune the estimated state of the system.

2.8 Particle filters

Particle filters [15, section 4, page 8] work in much the same way Kalman filters do, in that they simulate the system, while at the same time adjusting the simulation based on the perceived error between the simulation and the actual system. The main difference lies in the way particle filters represent the uncertainty of the state and the estimates.

In Kalman filters the hypothesis is represented by a best guess and a Gaussian distribution showing how likely the guess is to be correct. In particle filtering this distribution is represented by a group of candidate guesses (particles) that each contain a hypothesis of the state of the system.

Each particle is measured against the actual system state, and is assigned a value based on how well the individual particle matches the actual state of the system. The prediction of the RBM's effect on the system is then done for the particles, and new "child" particles are created. These "child" particles are created from particle candidates with a probability that is proportional to how well the "parent" particle matched the actual system state. In this way particles that are unlikely to describe the system very well will most likely not be carried over to the new generation, while particles that match the system very well probably will be carried over. In this way the estimates of the system will continually get better.
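The two trackers described in Sections 2.7 and 2.8 can be sketched minimally as follows. All dimensions, noise parameters and function names are assumptions for illustration; the real system would use the RBM transition matrix A of Formula 8 and the camera reading matrix H of Formula 9.

```python
import numpy as np

def kalman_step(x, P, z, u, A, B, H, Q, R):
    """One Kalman cycle for the model X_j = A X_{j-1} + B U_j + W_j,
    Z_j = H X_j + V_j; Q and R are the covariances of W and V."""
    # Predict: propagate the state and its covariance through the model.
    x_pred = A @ x + B @ u
    P_pred = A @ P @ A.T + Q
    # Update: correct the prediction with the residual z - H x_pred.
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)
    x_new = x_pred + K @ (z - H @ x_pred)
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new

def particle_filter_step(particles, z, rng, motion_std=0.1, meas_std=0.3):
    """One weight-then-resample cycle for an assumed 1D state."""
    # Prediction: apply the (here: identity) motion model plus noise.
    particles = particles + rng.normal(0.0, motion_std, len(particles))
    # Weighting: score each hypothesis against the measurement z.
    w = np.exp(-0.5 * ((particles - z) / meas_std) ** 2)
    w /= w.sum()
    # Resampling: children are drawn in proportion to parent weights.
    idx = rng.choice(len(particles), size=len(particles), p=w)
    return particles[idx]
```

The Kalman step carries the Gaussian hypothesis explicitly as (x, P); the particle step carries the same information implicitly as the spread of the particle cloud.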


3 Work done so far

In this section the tests performed to learn more about the theory will be covered, along with tests and descriptions of the software tools that have been implemented so far.

The implemented code makes use of the existing MoInS code base [17], in a way that should allow seamless use of the implemented functionality with other parts of the MoInS project. The existing code base is written in C++, and this is a significant factor in the speed at which new functions are implemented, due to the lack of previous experience with C++.

3.1 Tests of Gabor filter

Since the Gabor filter was not known at the start of the project, tests have been conducted in order to learn the effects of the different parameters of the Gabor filter. This is essential since the Gabor filter will be the basis of the feature detection scheme.

3.1.1 Angle response test

In this section the tests of the Gabor filter's ability to detect features at different angles will be covered.

Figure 5: The magnitude of the response of Gabor filtering of a pattern with a kernel of the same wavelength as the pattern, using σ values 12 and 2, and kernel sizes of 40 and 10 pixels.
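The angle response test can be reproduced in miniature with numpy. The pattern, kernel size and σ below are assumptions chosen for a fast sketch, not the exact values behind Figure 5:

```python
import numpy as np

def gabor_response(img, lam, theta, sigma):
    """Magnitude of the Gabor response at the image centre
    (the kernel is sized to match the image)."""
    half = img.shape[0] // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    xp = x * np.cos(theta) + y * np.sin(theta)
    env = np.exp(-(x**2 + y**2) / (2 * sigma**2))
    real = np.sum(img * env * np.cos(2 * np.pi * xp / lam))
    imag = np.sum(img * env * np.sin(2 * np.pi * xp / lam))
    return np.hypot(real, imag)

# Vertical stripes of wavelength 8, i.e. a feature at orientation 0.
y, x = np.mgrid[-20:21, -20:21].astype(float)
pattern = np.cos(2 * np.pi * x / 8.0)
# Sweep the kernel orientation from 0 to pi and record the magnitudes.
sweep = [gabor_response(pattern, 8.0, t, 4.0)
         for t in np.linspace(0.0, np.pi, 9)]
```

As in Figure 5, the response peaks where the kernel orientation matches the pattern (0 and π, which are equivalent) and drops sharply in between.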


Higher values of sigma make the precision of the orientation detection better, as can be seen in Figure 5, where the response is very tight around 0 and π for high values of sigma. When using large values of sigma, it is necessary to have a large kernel; otherwise the cutoff effect of the window function is noticeable, because the Gauss function is not close to 0 at the cutoff point. This effect can be seen in the image where a large sigma of 12 has been chosen but a fairly small kernel size of 10 pixels, which generates a ripple effect in the response.

This effect is a well documented consequence of improperly dimensioned window functions in signal processing [10, Section 9.1]. The shape of the frequency response for the smaller sigma, shown in the last two pictures, is not affected in nearly the same way, since the Gauss function is much closer to 0 at the cutoff point.

The fact that larger kernels make the filtering execution time greater sets a natural limit for how large kernels, and thereby sigma values, are usable without the ripple effect.

3.1.2 Frequency response test

One of the needed features of the Gabor filter is its ability to separate a texture into its frequency components. This frequency decomposition is, like the decomposition of texture into orientations, also subject to sigma-dependent merging of responses. Figure 7 shows the Gabor filter responses to a wavelength sweep from 4 to 12 pixels of the pattern shown in Figure 6, which contains two components with wavelengths 7 and 10. For lower sigma values, the ability to distinguish the responses from each other is diminished.
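The frequency response test can likewise be sketched. The sweep step and kernel extent are assumptions; σ = 12 matches the best-separated curve in Figure 7:

```python
import numpy as np

def gabor_magnitude(img, lam, theta, sigma):
    """Gabor response magnitude at the image centre (kernel sized to img)."""
    half = img.shape[0] // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    xp = x * np.cos(theta) + y * np.sin(theta)
    env = np.exp(-(x**2 + y**2) / (2 * sigma**2))
    real = np.sum(img * env * np.cos(2 * np.pi * xp / lam))
    imag = np.sum(img * env * np.sin(2 * np.pi * xp / lam))
    return np.hypot(real, imag)

# Pattern with two horizontal-frequency components, wavelengths 7 and 10,
# analogous to Figure 6.
y, x = np.mgrid[-30:31, -30:31].astype(float)
pattern = np.cos(2 * np.pi * x / 7.0) + np.cos(2 * np.pi * x / 10.0)
# Wavelength sweep from 4 to 12 pixels in steps of 0.5.
lams = np.arange(4.0, 12.5, 0.5)
sweep = [gabor_magnitude(pattern, lam, 0.0, 12.0) for lam in lams]
```

With a large sigma the two components produce two distinct peaks in the sweep, separated by a clear dip between wavelengths 7 and 10.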


Figure 6: Pattern containing two sinusoids of wavelengths 7 and 10 of equal amplitude.

Figure 7: Responses to a wavelength sweep from 4 to 12 pixels of the pattern shown in Figure 6, for sigma values 12, 8, 4 and 2.

3.1.3 Influence of sigma in spatial domain

In this section the tests of the influence of the sigma parameter on the Gabor filter's response in the spatial domain are covered.


Figure 8: The spatial response of Gabor filtering of a box image with a wavelength of 2 pixels, at horizontal orientation, and sigma values of 1, 2 and 4 respectively.

As has been shown in the previous sections, a higher value of sigma improves the filter's ability to detect both the frequency and orientation of a feature. This improved resolution comes at a price though: the resolution in the spatial domain is reduced. This effect can be observed as a blurring of the spatial response. Figure 8 shows this blurring effect in the spatial domain.

This effect is not a problem as long as features are well separated, but when features are close, the responses merge together to form a higher response that would be detected as a single response. Figure 9 shows this effect. The feature shown in the top image is filtered by a Gabor filter with different sigma values. As the sigma is increased, the responses are blurred, until they merge.


Figure 9: The spatial response of Gabor filtering of a line image with a wavelength of 2 pixels, at horizontal orientation, and sigma values of 1, 2 and 4 respectively.


3.2 KJet

In order to work efficiently with the Gabor filtering of the surface images, the data structure KJet has been constructed for containing and manipulating the filter responses. In this section a short overview of the implemented code will be given, together with tests of the functionality.

3.2.1 Jet representation

The jets are stored internally in the KJet class as an array of images that groups responses of the same wavelength and sequentially goes through the response orientations. An example of a filter bank can be seen in Figure 10.

A jet is a vector of values taken at the same pixel coordinate through all the response images in the array. One can think of the array as a stack of images; the jet would then be the vector of values that lie directly above each other. In this way a jet can be accessed by picking the same pixel coordinate from all response images in the image array. This is not the most efficient way to access the jet, since a given jet cannot be accessed directly, but this representation makes a lot of the image manipulations that are done on the image array much more efficient, since they work on the whole response images and not on the individual jets.

Figure 10: Overview of the real part of the filter bank of the KJet. Kernels with larger wavelengths have lower amplitude, in order for the magnitude of the responses to be comparable across wavelengths.

3.2.2 Jet overview

Visualization is a great aid when comparing results of image processing; therefore a line of functions has been implemented for displaying the image array in an intuitive way. Figure 11 shows the input image, and the resulting overview is shown in Figure 12.

The jet consists of the Gabor responses over 4 wavelengths from 32 to 4


pixels (downwards), over 8 orientations from 0 to 157.5 degrees (across). The response images are ordered so that they correspond to the kernel positions in Figure 10.

Figure 11: Input image for jet creation; it contains a range of frequencies over a range of orientations.

Figure 12: The resulting overview of the jet images, showing the magnitude of the Gabor responses for the 4 wavelengths and 8 orientations.

Another implemented overview function generates an image array based on either the average or maximum response of a given (orientation, wavelength) coordinate. Figure 13 shows the steps of generating this overview. This function is useful for data inspection by users because it reduces the data tremendously.

The function is used in Figure 14, which shows how the maximum response of the different orientations changes as the texture is rotated clockwise by 0, 45, and 90 degrees. From Figure 14 it can be seen that there is a "bleeding" effect to neighboring wavelength and orientation coordinates. Ideally the response should only be noticeable in one square (the one that perfectly matches the wavelength and orientation). If


sufficiently large values of σ are used, this "bleeding" effect is reduced, but aliasing problems arise; see Section 3.2.4.

3.2.3 Jet difference

One of the sub-goals of the project is to recognize a feature in the image, even if the feature has moved. To this end, a function that finds the best match in the image to a jet representing the feature is necessary. Comparison between two jets J and J′ is done using the formula described in [7, formula 5, page 5]:

Similarity(J, J′) = Σ_n j_n j′_n / √( Σ_n j_n² · Σ_n j′_n² )   (10)

The comparison can be thought of as the dot product of the two jet vectors over the product of their norms, and returns a double that represents the similarity between the two jets. Using the similarity function, it is then only a matter of going over all generated jets and choosing the jet that matches best. This is a fairly time consuming process, and is a motivator for another jet representation; see Section 5.1.

Figure 13: The process of generating the jet overview: from image, to Gabor responses, to overview of the average response.
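Formula (10) and the exhaustive best-match search over the stack-of-images representation of Section 3.2.1 can be sketched as follows. The array shapes and function names are assumptions, not the KJet API:

```python
import numpy as np

def similarity(J, Jp):
    """Normalized jet similarity of formula (10): the dot product of
    the two jets over the product of their norms (assumes non-zero jets)."""
    return float(np.dot(J, Jp) / np.sqrt(np.dot(J, J) * np.dot(Jp, Jp)))

def best_match(stack, J):
    """Exhaustive search for the pixel whose jet best matches J.
    stack has shape (n_responses, h, w); the jet at pixel (x, y) is the
    vector stack[:, y, x] through the image stack."""
    n, h, w = stack.shape
    flat = stack.reshape(n, -1)                      # one jet per column
    scores = (J @ flat) / (np.linalg.norm(J) * np.linalg.norm(flat, axis=0))
    y, x = divmod(int(np.argmax(scores)), w)
    return x, y
```

By the Cauchy-Schwarz inequality the similarity is at most 1, reached exactly when one jet is a positive multiple of the other, so a planted jet is always its own best match.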


3.2.4 Jet changes under image transformation

Tests have been conducted in order to get a feeling for how the jets change under transformation of the features they represent. Figure 14 shows the effect of rotation of a simple texture; the intensity of the squares shows the maximum response of all jets in the image at the given jet coordinate.

Figure 14: To the left is a texture composed of two main patterns, where the horizontal pattern has half the wavelength of the vertical. The resulting KJet representation of 4 frequencies times 8 orientations is shown in the overview to the right. The overview shows the changes as the texture is rotated clockwise through 0, 45 and 90 degrees. The sigma value is 1.0.

The same process can be done for a natural texture. Figure 15 shows a texture rotated counterclockwise through the angles 0 to 60 degrees in steps of 15 degrees. In the left overview the bleeding effect is less pronounced, since the sigma value is set to 2.0, which makes the filter more precise.

Certain responses fade in and out as the texture is rotated. This effect is caused by texture frequency components that are not perfectly aligned with the 8 orientations in the filter. The rotation of the texture is done in steps of 15 degrees, but the Gabor filter resolution is 180/8 = 22.5°, and therefore features that are not aligned with the jet orientations cause an aliasing effect across the rotation of the texture. This effect is not shown in Figure 14, as the rotation of that texture is perfectly aligned with the jet orientations. This problem is present in all digital sampling of systems [10, section 1.3].


The aliasing effect is less pronounced when using smaller values of sigma, as a small sigma essentially blurs the response and thereby "bleeds" into nearby frequencies and orientations in the jet. This can be seen in the right overviews. The drawback of blurring the responses is that the individual responses are less distinct, so a compromise has to be made when choosing the sigma value.


Figure 15: To the left is a natural texture composed of several frequencies. The two KJet overviews to the right show the changes as the texture is rotated through 0 to 60 degrees in steps of 15 degrees, with sigma values of 2.0 (left overview) and 0.5 (right overview).

3.2.5 Jet to jet decomposition

It is essential for the success of the project that features on the surface can be tracked even though the surface undergoes transformations. It is therefore necessary to have a method for transforming a jet representing a known feature in a way that makes it possible to locate the same feature after the transformation. At this point the transformation factor integral of formula 5 on page 9 has been implemented, along with a function for generating the leftmost matrix of Equation 7 on page 9.

Progress is at this point stuck by the fact that the existing code base is unable to handle matrix operations, such as multiplication and solving of matrix equations, for advanced datatypes like complex numbers. Solving this problem has been made a high priority, as it will be beneficial for everyone using the MoInS codebase that this functionality is available.
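The missing functionality, multiplication and solving of matrix equations over complex numbers, is standard in numerical libraries. A minimal numpy sketch (illustrative only, not the MoInS codebase, and with arbitrary example matrices):

```python
import numpy as np

# A complex-valued linear system A x = b, of the kind that appears in the
# jet decomposition step.
A = np.array([[2 + 1j, 1 - 1j],
              [0 + 1j, 3 + 0j]])
b = np.array([1 + 0j, 2 - 1j])

x = np.linalg.solve(A, b)     # solve the complex linear system
product = A @ A.conj().T      # complex matrix multiplication (Hermitian product)
```

The solution can be verified by substituting x back into the system: A @ x reproduces b up to floating point precision.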


4 Tools needed for project

There is still a list of tools needed to do this project, both theory and implementation wise. This section touches briefly upon the tools that have been identified as necessary for the project.

4.1 2D to 3D conversion

In order to convert 2D coordinates in the stereo images to 3D coordinates, a method for intersection of 3D lines has to be used. As long as the correspondence problem is solved, the 2D to 3D conversion should be a fairly simple problem to solve, since it is only a matter of solving simple linear equations. This is already implemented in the existing codebase, as part of the feature detection software.

4.2 3D to 2D conversion

All transformations of the surfaces to be tracked happen in 3D, so it is necessary to have a method for transforming hypotheses of local 3D texture patches into 2D, for comparison with the actual perceived stereo images. A simple projection might be sufficient, but more likely a method that converts 3D surface coordinates to 2D image coordinates would be better [16, section 3.1, page 96, formula 5].

4.3 Tracking mechanism

In order to generate an estimation of the surface topology, it is necessary to be able to track the feature points as they move in 3D.
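The line intersection mentioned in Section 4.1 can be sketched as follows. Since noisy viewing rays rarely meet exactly, this sketch returns the midpoint of the shortest segment between the two rays, a common choice but an assumption here, not necessarily the method in the existing codebase:

```python
import numpy as np

def triangulate(o1, d1, o2, d2):
    """Midpoint of the shortest segment between the rays o1 + s*d1 and
    o2 + t*d2, used as the 3D point estimate for a stereo correspondence."""
    o1, d1, o2, d2 = (np.asarray(v, dtype=float) for v in (o1, d1, o2, d2))
    a, b, c = d1 @ d1, d1 @ d2, d2 @ d2
    r = o2 - o1
    denom = a * c - b * b            # zero only for parallel rays
    # Closest-approach parameters from the 2x2 normal equations.
    s = (c * (d1 @ r) - b * (d2 @ r)) / denom
    t = (b * (d1 @ r) - a * (d2 @ r)) / denom
    return 0.5 * ((o1 + s * d1) + (o2 + t * d2))
```

For rays that do intersect exactly, the midpoint coincides with the intersection point.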
This requires another list of tools.

4.3.1 Implementing Kalman filter

In order to track points on the 3D surface, a tracking filter is to be implemented. The Kalman filter is a relatively simple tracking filter, and can with small modifications be used as a basis for the more complex particle filter. The theory of the Kalman filter and particle filter has been covered briefly in Sections 2.7 on page 10 and 2.8 on page 12 respectively.

4.3.2 Hypothesis generator

Both the Kalman filter and the particle filter use a technique of prediction and verification of predictions in order to track input. This dictates the need


for a way to generate predictions/hypotheses of where a given feature will be after the next timestep, and what it will look like when it has undergone a certain transformation. The change of appearance of a feature has briefly been covered in Section 2.5.3 on page 8.

4.3.3 RBM model

The hypothesis generator uses the knowledge accumulated so far, together with the known transformation of the surface. This known transformation is based on the RBM (rigid body motion) of the surface, and is therefore under the domain of robotics. At this point the knowledge of the RBM of robots is limited, so investigation in this field is necessary.

4.3.4 Particle filter

Once the Kalman filter is implemented, another level of complexity can be added by using the implemented Kalman filter as a basis for a more advanced particle filter. The necessity of the particle filter depends on how well the Kalman filter handles the tracking task. This again depends on how well the assumption of Gauss distributed errors holds true.

4.4 Surface generation

Once the 3D surface points have been tracked over a sufficiently long time, and their relative positions have been determined to an acceptable level, the collection of tracked points needs to be combined in a way that can represent the surface that is spanned between them. For surfaces without occlusion, this should not be a very difficult task.

One representation of the surface could be a mesh spanned by the tracked points. In order to generate this mesh, triangles have to be spanned between the tracked points in a way that fills out the surface between the points. One can think of spanning a graph between the point nodes.
Knowledge of how to efficiently span this graph is needed, and therefore falls under the domain of graph theory.

Another, simpler method is to augment a quadratic mesh of a given number of nodes by the known surface points. This could be done by interpolating the mesh points between the known surface points, in essence "wrapping" the mesh around the known points.

The creation of the surface approximation, in its simplest form, is not a complex task, and should therefore be fairly easy to solve.
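The predict/correct cycle of the Kalman filter described in Sections 4.3.1 and 4.3.2 can be sketched in numpy as follows. The model here is a hypothetical 1D constant-velocity point; the matrices F, H, Q and R are illustrative assumptions, not values from the project:

```python
import numpy as np

def kalman_step(x, P, z, F, H, Q, R):
    """One predict/correct cycle of a linear Kalman filter.
    x: state estimate, P: state covariance, z: new measurement."""
    # Predict: the hypothesis of where the feature will be next.
    x_pred = F @ x
    P_pred = F @ P @ F.T + Q
    # Correct: verify the hypothesis against the measurement.
    y = z - H @ x_pred                       # innovation
    S = H @ P_pred @ H.T + R                 # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)      # Kalman gain
    x_new = x_pred + K @ y
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new

# Example: track a 1D point moving with constant velocity 1.0 (dt = 1).
F = np.array([[1.0, 1.0], [0.0, 1.0]])   # state transition for (pos, vel)
H = np.array([[1.0, 0.0]])               # only position is measured
Q = 0.01 * np.eye(2)                     # process noise (assumed)
R = np.array([[0.1]])                    # measurement noise (assumed)
x, P = np.zeros(2), np.eye(2)
for k in range(1, 20):
    x, P = kalman_step(x, P, np.array([float(k)]), F, H, Q, R)
```

After a handful of steps the velocity estimate settles near 1.0 even though only positions are measured, which is exactly the behaviour the hypothesis generator relies on.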


5 Open problems

5.1 Symbolic representation

The current representation of the features, as a jet of Gabor responses, is both time and space consuming to create and store, because the dimension of the jet vector is the number of frequencies times the number of orientations used for jet decomposition. This makes any search in the jet representation for the best matching jet very time consuming.

It might be beneficial to find another representation, based on the existing jet representation, that compresses the jets in such a way that the search problem is reduced. One such representation could be to only use a list of coordinates that have a response over a certain threshold. Low responses are more likely to be either "bleeding" from a nearby response or a very indistinguishable feature; in both cases information is removed, and the hope is that, because of the insignificance of the responses, this removal of information does not matter. This method of information compression is similar to the wavelet compression method [11, 8.5.3, page 486].
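The thresholded list representation proposed above could look like the following sketch (function names hypothetical; the similarity measure is formula (10) from Section 3.2.3, evaluated on the compressed jets):

```python
import numpy as np

def compress_jet(jet, threshold):
    """Keep only the (index, response) pairs whose response exceeds the
    threshold, discarding bleeding and indistinct responses."""
    return [(i, float(v))
            for i, v in enumerate(np.asarray(jet, dtype=float))
            if v > threshold]

def sparse_similarity(a, b, size):
    """Formula (10) evaluated on two compressed jets of original length `size`."""
    ja, jb = np.zeros(size), np.zeros(size)
    for i, v in a:
        ja[i] = v
    for i, v in b:
        jb[i] = v
    norm = np.sqrt((ja ** 2).sum() * (jb ** 2).sum())
    return float(ja @ jb / norm) if norm > 0 else 0.0
```

Since typically only a few responses survive the threshold, storage and comparison cost scale with the number of significant responses rather than with frequencies times orientations.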


6 Conclusion

The task of generating a surface approximation using tracking of texture points is not a trivial one. A proposal of a solution has been formulated, and a list of the tools needed to complete the task, both implemented and yet to come, has been covered in this report.

A few candidate technologies for correspondence problem reduction, 2D to 3D conversion and tracking of texture points have briefly been covered. Gabor filters have shown themselves to be a solid tool for handling the 2D to 3D correspondence problem, and will therefore be the basis for further development.

Kalman filters are conceptually the simplest tracking filters, and will be the basis for developing the 3D tracking system. The Kalman filter is able to function as the basis for particle filtering with a few modifications. It might therefore be beneficial to explore the particle filtering method, in case Kalman filtering turns out not to be sufficient.

Jesper Juul Henriksen


A Timetable

This section contains a timetable, showing the estimated time needed to complete the different sub tasks yet to be completed.

Figure 16: Timetable for the masters project


References

[1] Abhijit S. Ogale and Yiannis Aloimonos: Stereo correspondence with slanted surfaces: critical implications of horizontal slant. Center for Automation Research, University of Maryland.

[2] Martin Kampel: Shape from Structured Light. Short web tutorial, http://www.prip.tuwien.ac.at/Research/3DVision/struct.html. Last modified: Tuesday, 12-Sep-2000 13:58:02 CEST.

[3] Norbert Krüger and Florentin Wörgötter: Multi-modal Primitives as functional Models of Hyper-columns and their use for contextual Integration. Published in the proceedings of the Brain, Vision and Artificial Intelligence symposium 2005.

[4] Norbert Krüger, Marcus Ackermann and Gerald Sommer: Accumulation of Object Representations utilizing Interaction of Robot Action and Perception. Published in Knowledge Based Systems 15:111-118, 2002.

[5] Javier R. Movellan: Tutorial on Gabor Filters. http://mplab.ucsd.edu/tutorials/pdfs/gabor.pdf.

[6] Michael Lindenbaum and Roman Sandler: Gabor Filter Analysis for Texture Segmentation. Technical Report CIS-2005-05, 2005, Technion - Computer Science Department.

[7] Laurenz Wiskott, Jean-Marc Fellous, Norbert Krüger and Christoph von der Malsburg: Face Recognition by Elastic Bunch Graph Matching. Internal Report 96-08, ISSN 0943-2752, Institut für Neuroinformatik, Ruhr-Universität Bochum.

[8] Thomas Maurer: Erkennung gedrehter Gesichter: von Einzelbildern zu Sequenzen [Recognition of rotated faces: from single images to sequences]. Ruhr-Universität Bochum, December 1998.

[9] Song-Chun Zhu, Cheng-En Guo, Yizhou Wang and Zijian Xu: What are Textons? International Journal of Computer Vision 62(1/2):121-143, 2005.

[10] Sophocles J. Orfanidis: Introduction To Signal Processing. ISBN 0-13-209172-0.

[11] Rafael C. Gonzalez and Richard E. Woods: Digital Image Processing, Second edition. ISBN 0-13-094650-8.


[12] P. Kruizinga, N. Petkov and S.E. Grigorescu: Comparison of texture features based on Gabor filters. Institute of Mathematics and Computing Science, University of Groningen. Published in Proceedings of the 10th International Conference on Image Analysis and Processing.

[13] Greg Welch and Gary Bishop: An Introduction to the Kalman Filter. Department of Computer Science, University of North Carolina at Chapel Hill. Published at SIGGRAPH 2001.

[14] Erik Cuevas, Daniel Zaldivar and Raul Rojas: Kalman filter for vision tracking. Technical Report B 05-12, published 10 August 2005.

[15] Sebastian Thrun, Michael Montemerlo, Daphne Koller, Ben Wegbreit, Juan Nieto and Eduardo Nebot: FastSLAM: An Efficient Solution to the Simultaneous Localization And Mapping Problem with Unknown Data Association. Computer Science Department, Stanford University; Australian Centre for Field Robotics, The University of Sydney, Australia.

[16] Corentin Massot and Jeanny Hérault: Recovering the Shape from Texture Using Lognormal Filters. Laboratory of Images and Signals, Grenoble, France.

[17] http://www.covig.imi.aau.dk/
