3D surface tracking and approximation using Gabor filters

Jesper Juul Henriksen

March 28, 2007


5 Open problems
5.1 Symbolic representation
6 Conclusion
A Timetable


Abstract

This report covers the theory covered and the work done under the FORK part of the masters project at Syddansk Universitet, in the pursuit of a way to estimate the surface topology of objects under robot-controlled motion, based on motion and vision information.


1.3 Proposal of solution

The coordinates of the surface points will be tracked over time, in order to get a better statistical basis for determining the coordinates of the points. Either a Kalman filter or a particle filter will be used for point tracking, as they have shown themselves to be very effective at tracking systems where little knowledge of the system to be tracked is available at the start of tracking. This solution to the surface approximation problem depends on a list of tools, both theory and software, that will be covered in the following sections.


2 Theory covered so far

In this section different candidate theories will be described. The theories that have been discarded for the project will be covered very briefly, while the candidates that hold potential will be covered in greater detail.

2.1 Stereo vision

In order to digitize a 3D surface, at least two different views of the surface are necessary. Using the knowledge of the projection matrices of the cameras, it is possible to generate a depth estimate by triangulation. The process of triangulation is well understood and fairly straightforward. The accuracy of the triangulation is higher if the angle between the rays is increased; this means that ideally the cameras generating the stereo images should be very far apart. The problem is finding the same point on the surface in both images. This is known as the correspondence problem [1].

2.2 Correspondence problem

There are several issues that make the problem of finding points in the stereo images that correspond to the same point on the surface of an object difficult.

• It is likely that the surface looks different from different angles; this is typically caused by the fact that views from different angles are subject to different lighting conditions. Shiny objects, for example, can look very different from different angles because of the highlight effect on the surface. This problem can be a limiting factor on the distance the stereo cameras can be placed from each other, and thereby on the achievable depth accuracy.

• The surface itself can also obstruct the view of the correspondence point; this is the case, for example, for a cup, where the handle of the cup at certain viewing angles is in front of the cup itself.

• There may not be a unique match between image points. Often there are a number of candidate matches from one stereo image to the other; this is often the case for surfaces that have weak surface texture, like a sheet of white paper.

The correspondence problem is one of the main problems in stereo vision, and there are no foolproof ways around the problem for unassisted vision.


By assisting the vision process using structured light [2], however, the correspondence problem can be greatly reduced. The structured light method uses a technique where the surface to be digitized is lit by a light source emitting light in a known pattern. This can be achieved by using a projector emitting, for example, a checker pattern onto the surface. This method is most effective when the emitted light is the only light source, and this is often not the case for real environments, and certainly not for outdoor use.

2.3 Dense stereo

Dense stereo is a relatively simple way to do stereo vision; in its simplest form it tries to match every pixel in one of the stereo images to every pixel in the other picture. This simple form is very inefficient, and is never used in practice. Matches are typically sought on what are called epipolar lines in the pictures. The use of this technique requires that the cameras are mounted side by side, since it assumes that a point on the surface will be at the same height in both stereo images.

2.4 Sparse stereo

Another way to generate stereo data is sparse stereo. This approach tries to minimize the number of pixels that have to be checked for correspondences, and thereby the correspondence problem, by only looking at parts of the images that are likely to be recognizable from one image to another. One such method is looking at the perceived edges [3] or lines [4] in the images. The main difference between dense and sparse stereo is that in sparse stereo, comparison is not done on a pixel-by-pixel basis, but on groups of pixels defining a feature. An edge of an object in the image would for example be defined by at least two pixels.
The fact that more pixels are involved reduces the correspondence problem, since the comparison is based on more data.

2.5 Gabor filters

Another way to find good correspondence candidates is to use different image processing techniques on the stereo image pair. Gabor filtering is one such way to process the stereo images. Gabor filters are rotation-sensitive local frequency detectors. Gabor filtering is done by convolution of an image with a set of Gabor kernels designed to detect certain frequencies and certain orientations.


The Gabor filter bank is often composed of kernels detecting 4 different wavelengths/frequencies¹ at 8 different orientations from 0° to 180°. The reason the orientations only cover a half circle is that the absolute response would be the same, and the phase would just be rotated by 180°. The response to each filter in the filter bank is stored and used as part of the basis for comparison.

One of the problems of Gabor filtering is that it needs much processing to generate the filter responses. For a jet of 4 wavelengths over 8 orientations, 32 filter responses have to be calculated. The process fortunately scales nicely when split up and run in parallel, as each response can be calculated independently of the others, and the process in itself, like any convolution, also scales well.

Comparison is done by comparing a vector composed of the filter responses of the filter bank at a certain pixel (a jet); by doing comparison based on the whole jet, the likelihood of wrong correspondence matching is reduced.

2.5.1 Gabor kernels

The Gabor filter is composed of a complex sinusoidal carrier and a 2D Gaussian envelope. Figure 1 left shows the real part of the carrier; Figure 1 right shows the same carrier under the Gaussian envelope.

Figure 1: Real sinusoidal carrier and the same carrier under the Gaussian envelope. The black parts of both figures represent values of -1 and the white areas represent values of 1.

Convolution of an image with the kernel gives a response that is proportional to how well the local feature in the image matches the kernel. This property of convolution is general for signal processing. When this is done for

¹ Throughout this paper the notation wavelength (in pixels) will be used instead of rad/pixel, even when talking about frequencies, because Gabor kernels are designed by wavelengths.


both the real and imaginary kernels, the frequency response and phase can be found by calculating the amplitude and argument of the complex response, whose real and imaginary parts r and i come from the real and imaginary kernels respectively:

Amplitude = √(r² + i²)
Argument = tan⁻¹(i/r)

Figure 2: Picture and the corresponding magnitude and phase response of Gabor filtering with parameters λ = 4, σ = 2 and θ = π/2.

Typically only the magnitude of the response is needed, but the phase can be used to further distinguish between responses: the phase changes rapidly with location and therefore makes the comparison more precise, since two neighboring points in the image can have the same absolute response but typically very different phase.

As can be seen in the middle picture of Figure 2, the absolute response shows where the image has local horizontal changes, and the rightmost picture of Figure 2 shows in what direction the change is happening, the phase.

The Gaussian envelope limits the cutoff effect at the edges of the kernel, which is due to the fact that the kernel can only be represented by a finite number of values, which for computational reasons is kept fairly small. The fact that the Gauss curve goes towards 0 relatively fast makes it ideal as a window function.

If larger Gaussian envelopes are used, a greater number of wavelengths are significant in the kernel, thereby improving the frequency and orientation resolution, but lowering the spatial certainty; this relation is a property of the Gauss window function. The relationship between the spatial resolution and frequency resolution is fixed: "A signal's specificity simultaneous in time and frequency is fundamentally limited by a lower bound on the product of its bandwidth and duration (analogous to indeterminacy relations of quantum mechanics)." [6, page 6]


(Δx)(Δω) ≥ 1/(4π)

Since the Gabor filter is a spatial band pass filter, the ratio σ/λ defines the bandwidth of the filter. The bandwidth b in octaves in relation to σ/λ is as follows [12, Section 2, formula 3]:

b = log₂( (σ/λ · π + √(ln2 / 2)) / (σ/λ · π − √(ln2 / 2)) )  ⇔  σ/λ = (1/π) · √(ln2 / 2) · (2^b + 1)/(2^b − 1)   (1)

The problem with reduced resolution arises from the fact that two responses close to each other add as shown in Figure 3, where the two Gauss-shaped responses merge to form a response that makes the individual responses indistinguishable from each other.

Figure 3: Gauss curves with σ = 1 placed at distances 4, 3, 2 and 1 from each other.

Ideally σ should be designed in such a way that a local feature whose frequency / orientation lies halfway between two orientations in the jet gives an equal response in both orientations, that is, half of what the response would be if the feature was perfectly aligned with the jet. That way the job of finding out which jet coordinate the feature belongs to is much easier, and the best coverage of the Gabor space is assured.

The Gabor filter is implemented as a set of two filter kernels, one detecting the real part and the other detecting the imaginary part. The simple formulas for generating the real and imaginary Gabor kernels that do not compensate for the DC component are:

real: g(x, y; λ, θ, σ, γ) = e^(−(x′² + γ²y′²)/(2σ²)) cos(2π x′/λ)   (2)

imaginary: g(x, y; λ, θ, σ, γ) = e^(−(x′² + γ²y′²)/(2σ²)) sin(2π x′/λ)   (3)
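As a concrete illustration, formulas (1) to (3) can be sketched in a few lines of numpy. The function names, the kernel size, the default aspect ratio γ = 1, and the interpretation of x′, y′ as the coordinates rotated by θ are assumptions of this sketch, not part of the project's implementation.

```python
import numpy as np

def gabor_kernels(size, lam, theta, sigma, gamma=1.0):
    """Real and imaginary Gabor kernels per formulas (2) and (3)."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    # x', y': coordinates rotated by the orientation theta (assumed).
    xp = x * np.cos(theta) + y * np.sin(theta)
    yp = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xp**2 + gamma**2 * yp**2) / (2 * sigma**2))
    real = envelope * np.cos(2 * np.pi * xp / lam)
    imag = envelope * np.sin(2 * np.pi * xp / lam)
    return real, imag

def sigma_over_lambda(b):
    """Sigma/lambda ratio for a bandwidth of b octaves, formula (1)."""
    return (1.0 / np.pi) * np.sqrt(np.log(2) / 2) * (2**b + 1) / (2**b - 1)
```

For a bandwidth of one octave this gives the familiar ratio σ/λ ≈ 0.56; the magnitude and phase of a response then follow from the real and imaginary convolutions as in the amplitude/argument formulas above.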


This change in Gabor space has been described in [8, pages 16 to 26], and is highlighted here.

The problem is: given a jet j representing the response of a certain point on the surface, how does this jet change as the surface undergoes transformation? The first part is the transformation from 3D surface point to 2D image coordinate (a possible solution is mentioned in Section 4.2), the other is the perceived 2D transformation in the image. The following concerns the influence of 2D transformations of features in the image on jets.

Given an image transformation A, how does this transform the jet j before into the jet j′ after the transformation? The goal is to find the transformation matrix C(A) that transforms the jet j into j′ when the image feature undergoes the transformation A, as seen in [8, formula 2.11, page 20]:

j′ = C(A) j   (4)

The transformation matrix C(A) is a linear approximation that gets more precise with the resolution of the jet space (the coordinates of the jet). The transformation matrix can be calculated by finding the resolving factors of every complex Gabor filter kernel ψ_k(x) into every other kernel ψ_k′(x), using the formula [8, formula 2.12, page 21]:

⟨k|k′⟩ = ∫ ψ*_k(x) ψ_k′(x) dx   (5)

and, for the filters representing the transformed Gabor, based on the knowledge of the transformation A:

⟨K(A⁻¹)|k′⟩ = ∫ ψ*_k(A⁻¹x) ψ_k′(x) dx   (6)

In this way the jet transformation matrix can be constructed. This is done by solving the following system for the values of C(A) [8, second half, page 20]:

C(A) ·
⎡ ⟨k_1|k_1⟩ ⋯ ⟨k_1|k_N⟩ ⎤
⎢ ⟨k_2|k_1⟩ ⋯ ⟨k_2|k_N⟩ ⎥
⎢     ⋮      ⋱      ⋮     ⎥
⎣ ⟨k_N|k_1⟩ ⋯ ⟨k_N|k_N⟩ ⎦
=
⎡ ⟨K_1(A⁻¹)|k_1⟩ ⋯ ⟨K_1(A⁻¹)|k_N⟩ ⎤
⎢ ⟨K_2(A⁻¹)|k_1⟩ ⋯ ⟨K_2(A⁻¹)|k_N⟩ ⎥
⎢        ⋮          ⋱         ⋮        ⎥
⎣ ⟨K_N(A⁻¹)|k_1⟩ ⋯ ⟨K_N(A⁻¹)|k_N⟩ ⎦   (7)
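Numerically, the overlap integrals of formulas (5) and (6) can be discretized on a pixel grid and the system (7) solved directly. Everything below (grid extent, σ, function names) is an assumed sketch, not the implementation from [8]:

```python
import numpy as np

def psi(x, y, lam, theta, sigma=2.0):
    """Complex Gabor kernel evaluated on coordinate grids x, y."""
    xp = x * np.cos(theta) + y * np.sin(theta)
    yp = -x * np.sin(theta) + y * np.cos(theta)
    return np.exp(-(xp**2 + yp**2) / (2 * sigma**2)) * np.exp(2j * np.pi * xp / lam)

def jet_transform_matrix(params, A, extent=10.0, n=101):
    """Build C(A) from discretized overlap integrals, formulas (5)-(7).

    params: list of (lambda, theta) pairs defining the jet's kernels.
    A: 2x2 image transformation matrix."""
    xs = np.linspace(-extent, extent, n)
    x, y = np.meshgrid(xs, xs)
    Ainv = np.linalg.inv(A)
    # Coordinates transformed by A^-1 for the left-hand kernels of (6).
    xt = Ainv[0, 0] * x + Ainv[0, 1] * y
    yt = Ainv[1, 0] * x + Ainv[1, 1] * y
    K = [psi(x, y, lam, th) for lam, th in params]            # psi_k(x)
    KA = [psi(xt, yt, lam, th) for lam, th in params]         # psi_k(A^-1 x)
    M = np.array([[np.vdot(ki, kj) for kj in K] for ki in K])      # <k_i|k_j>
    MA = np.array([[np.vdot(kai, kj) for kj in K] for kai in KA])  # <K_i(A^-1)|k_j>
    return MA @ np.linalg.inv(M)   # C(A) such that C(A) M = MA
```

A sanity check on the construction: for the identity transformation A = I, the two overlap matrices coincide and C(A) reduces to the identity.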


Here C(A) is the matrix of unknowns:

C(A) =
⎡ c_11 ⋯ c_N1 ⎤
⎢  ⋮   ⋱   ⋮  ⎥
⎣ c_1N ⋯ c_NN ⎦

whose entries are the components of the transformation matrix C(A).

2.6 Texons/Motons

Texons and Motons [9] are a high-level description of image features that have shown themselves to be very robust for tracking of moving texture patches in image sequences. By grouping several filter responses to a filter bank containing Gabor filters and Log-Gauss filters, a higher level of abstraction is reached. This higher level of description reduces the basis of comparison in an intelligent way, while at the same time making the description more flexible, making it possible to recognize texture patches that would otherwise not be recognized.

Although the use of this method for solving the correspondence problem initially showed promise, the overhead of this method would make the system very difficult to implement. Therefore the method was dropped, although a lot of time was spent looking into it.

2.7 Kalman filters

The Kalman filter [13], [14] is an adaptive tracking filter that is very efficient at tracking systems where a model of the system exists.
In order to use the Kalman filter, it must be assumed that the state space equations of the system to be tracked are known, that the system is linear, and that any noise influencing the input and sensing of the system state is Gaussian with zero mean.

State space equations are mathematical descriptions of how the system reacts under the influence of input over time, and describe how the system moves from one state to another between each time step.

The system state in this problem is the positions of features on the surface. The transformation from timestep to timestep is the RBM (rigid body motion) the surface undergoes.

The system typically needs to be formulated as linear state space equations in matrix form as in Formula 8, where X_j is the new state/positions, generated by multiplying the old state/positions X_{j−1}


by the transition matrix A representing the RBM. The term BU_j is the input vector affecting the system states, and W_j is a vector describing the input noise. The input noise is the difference between the actual RBM the robot makes the surface undergo and what is expected.

The output of the system can be formulated as in Formula 9, where H is the output reading matrix that describes what parts of the internal state can be read. The vector V_j describes the noise of the readings, which in this problem is an uncertainty based on the perception of the surface through the cameras. This way of deriving the next state and output based on previous states and input is typical for definitions of dynamic systems.

X_j = A X_{j−1} + B U_j + W_j   (8)

Z_j = H X_j + V_j   (9)

A graphical overview of the system can be seen in Figure 4, where the Kalman filter is the bottom part of the figure, and can be thought of as a system simulator running in parallel with the actual system (the top part). By continually adjusting the simulator/tracker based on the error between the simulator and the actual system (the residual), the Kalman filter gets better and better at simulating the system, and thereby better and better at tracking the points on the surface.

Figure 4: A graphical overview of the Kalman filter.

The Kalman filter tries to represent the state of the system (the positions of the feature points before and after the RBM) as a Gaussian distribution. In the case of tracking a surface feature, the Kalman filter represents the position as the position where it is most likely that the actual feature is, based on knowledge of the system so far, and a Gaussian distribution that describes


the uncertainty of this estimate. If the uncertainty of that position is great, the Gauss distribution is very wide. Based on this estimate of the system, the Kalman filter makes an estimate of what the system state will be after the RBM occurs. This estimate is compared with the actual system state after the RBM, and the error is used to tune the estimated state of the system.

2.8 Particle filters

Particle filters [15, section 4, page 8] work in much the same way Kalman filters do, in that they simulate the system, while at the same time adjusting the simulation based on the perceived error between the simulation and the actual system. The main difference lies in the way particle filters represent the uncertainty of the state and the estimates.

In Kalman filters the hypothesis is represented by a best guess and a Gaussian distribution showing how likely the guess is to be correct. In particle filtering this distribution is represented by a group of candidate guesses (particles) that each contain a hypothesis of the state of the system.

Each particle is measured against the actual system state, and is assigned a value based on how well the individual particle matches the actual state of the system. The prediction of the RBM's effect on the system is then done for the particles, and new "child" particles are created. These "child" particles are created from particle candidates with a probability that is proportional to how well the "parent" particle matched the actual system state. In this way particles that are unlikely to describe the system very well will most likely not be carried over to the new generation, while particles that match the system very well probably will be carried over. In this way the estimates of the system will continually get better.
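The two trackers described in Sections 2.7 and 2.8 can be sketched minimally as follows. All dimensions, noise parameters and function names are assumptions for illustration; the real system would use the RBM transition matrix A of Formula 8 and the camera reading matrix H of Formula 9.

```python
import numpy as np

def kalman_step(x, P, z, u, A, B, H, Q, R):
    """One Kalman cycle for the model X_j = A X_{j-1} + B U_j + W_j,
    Z_j = H X_j + V_j; Q and R are the covariances of W and V."""
    # Predict: propagate the state and its covariance through the model.
    x_pred = A @ x + B @ u
    P_pred = A @ P @ A.T + Q
    # Update: correct the prediction with the residual z - H x_pred.
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)
    x_new = x_pred + K @ (z - H @ x_pred)
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new

def particle_filter_step(particles, z, rng, motion_std=0.1, meas_std=0.3):
    """One weight-then-resample cycle for an assumed 1D state."""
    # Prediction: apply the (here: identity) motion model plus noise.
    particles = particles + rng.normal(0.0, motion_std, len(particles))
    # Weighting: score each hypothesis against the measurement z.
    w = np.exp(-0.5 * ((particles - z) / meas_std) ** 2)
    w /= w.sum()
    # Resampling: children are drawn in proportion to parent weights.
    idx = rng.choice(len(particles), size=len(particles), p=w)
    return particles[idx]
```

The Kalman step carries the Gaussian hypothesis explicitly as (x, P); the particle step carries the same information implicitly as the spread of the particle cloud.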


3 Work done so far

In this section the tests performed to learn more about the theory will be covered, along with tests and descriptions of the software tools that have been implemented so far.

The implemented code makes use of the existing MoInS code base [17], in a way that should allow seamless use of the implemented functionality with other parts of the MoInS project. The existing code base is written in C++, and this is a significant factor in the speed at which new functions are implemented, due to the lack of previous experience with C++.

3.1 Tests of Gabor filter

Since the Gabor filter was not known at the start of the project, tests have been conducted in order to learn the effects of the different parameters of the Gabor filter. This is essential since the Gabor filter will be the basis of the feature detection scheme.

3.1.1 Angle response test

In this section the tests of the Gabor filter's ability to detect features at different angles will be covered.

Figure 5: The magnitude of the response of Gabor filtering of a pattern with a kernel of the same wavelength as the pattern, using σ values 12 and 2, and kernel sizes of 40 and 10 pixels.
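The angle response test can be reproduced in miniature with numpy. The pattern, kernel size and σ below are assumptions chosen for a fast sketch, not the exact values behind Figure 5:

```python
import numpy as np

def gabor_response(img, lam, theta, sigma):
    """Magnitude of the Gabor response at the image centre
    (the kernel is sized to match the image)."""
    half = img.shape[0] // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    xp = x * np.cos(theta) + y * np.sin(theta)
    env = np.exp(-(x**2 + y**2) / (2 * sigma**2))
    real = np.sum(img * env * np.cos(2 * np.pi * xp / lam))
    imag = np.sum(img * env * np.sin(2 * np.pi * xp / lam))
    return np.hypot(real, imag)

# Vertical stripes of wavelength 8, i.e. a feature at orientation 0.
y, x = np.mgrid[-20:21, -20:21].astype(float)
pattern = np.cos(2 * np.pi * x / 8.0)
# Sweep the kernel orientation from 0 to pi and record the magnitudes.
sweep = [gabor_response(pattern, 8.0, t, 4.0)
         for t in np.linspace(0.0, np.pi, 9)]
```

As in Figure 5, the response peaks where the kernel orientation matches the pattern (0 and π, which are equivalent) and drops sharply in between.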


Higher values of sigma make the precision of the orientation detection better, as can be seen in Figure 5, where the response is very tight around 0 and π for high values of sigma. When using large values of sigma, it is necessary to have a large kernel; otherwise the cutoff effect of the window function is noticeable, because the Gauss function is not close to 0 at the cutoff point. This effect can be seen in the image where a large sigma of 12 has been chosen but a fairly small kernel size of 10 pixels, which generates a ripple effect in the response.

This effect is a well documented consequence of improperly dimensioned window functions in signal processing [10, Section 9.1]. The shape of the frequency response for the smaller sigma, shown in the last two pictures, is not affected in nearly the same way, since the Gauss function is much closer to 0 at the cutoff point.

The fact that larger kernels make the filtering execution time greater sets a natural limit for how large kernels, and thereby sigma values, are usable without the ripple effect.

3.1.2 Frequency response test

One of the needed features of the Gabor filter is its ability to separate a texture into its frequency components. This frequency decomposition is, like the decomposition of texture into orientations, also subject to sigma-dependent merging of responses. Figure 7 shows the Gabor filter responses to a wavelength sweep from 4 to 12 pixels of the pattern shown in Figure 6, which contains two components with wavelengths 7 and 10. For lower sigma values, the ability to distinguish the responses from each other is diminished.
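The frequency response test can likewise be sketched. The sweep step and kernel extent are assumptions; σ = 12 matches the best-separated curve in Figure 7:

```python
import numpy as np

def gabor_magnitude(img, lam, theta, sigma):
    """Gabor response magnitude at the image centre (kernel sized to img)."""
    half = img.shape[0] // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    xp = x * np.cos(theta) + y * np.sin(theta)
    env = np.exp(-(x**2 + y**2) / (2 * sigma**2))
    real = np.sum(img * env * np.cos(2 * np.pi * xp / lam))
    imag = np.sum(img * env * np.sin(2 * np.pi * xp / lam))
    return np.hypot(real, imag)

# Pattern with two horizontal-frequency components, wavelengths 7 and 10,
# analogous to Figure 6.
y, x = np.mgrid[-30:31, -30:31].astype(float)
pattern = np.cos(2 * np.pi * x / 7.0) + np.cos(2 * np.pi * x / 10.0)
# Wavelength sweep from 4 to 12 pixels in steps of 0.5.
lams = np.arange(4.0, 12.5, 0.5)
sweep = [gabor_magnitude(pattern, lam, 0.0, 12.0) for lam in lams]
```

With a large sigma the two components produce two distinct peaks in the sweep, separated by a clear dip between wavelengths 7 and 10.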


Figure 6: Pattern containing two sinusoids of wavelengths 7 and 10 of equal amplitude.

Figure 7: Responses to a wavelength sweep from 4 to 12 pixels of the pattern shown in Figure 6, for sigma values 12, 8, 4 and 2.

3.1.3 Influence of sigma in spatial domain

In this section the tests of the influence of the sigma parameter on the Gabor filter's response in the spatial domain are covered.


Figure 8: The spatial response of Gabor filtering of a box image with a wavelength of 2 pixels, at horizontal orientation, and sigma values of 1, 2 and 4 respectively.

As has been shown in the previous sections, a higher value of sigma improves the filter's ability to detect both the frequency and orientation of a feature. This improved resolution comes at a price though: the resolution in the spatial domain is reduced. This effect can be observed as a blurring of the spatial response. Figure 8 shows this blurring effect in the spatial domain.

This effect is not a problem as long as features are well separated, but when features are close, the responses merge together to form a higher response that would be detected as a single response. Figure 9 shows this effect. The feature shown in the top image is filtered by a Gabor filter with different sigma values. As the sigma is increased, the responses are blurred, until they merge.


Figure 9: The spatial response of Gabor filtering of a line image with a wavelength of 2 pixels, at horizontal orientation, and sigma values of 1, 2 and 4 respectively.


3.2 KJet

In order to work efficiently with the Gabor filtering of the surface images, the data structure KJet has been constructed for containing and manipulating the filter responses. In this section a short overview of the implemented code will be given, together with tests of the functionality.

3.2.1 Jet representation

The jets are stored internally in the KJet class as an array of images that groups responses of the same wavelength and sequentially goes through the response orientations. An example of a filter bank can be seen in Figure 10.

A jet is a vector of values taken at the same pixel coordinate through all the response images in the array. One can think of the array as a stack of images; the jet would then be the vector of values that lie directly above each other. In this way a jet can be accessed by picking the same pixel coordinate from all response images in the image array. This is not the most efficient way to access the jet, since a given jet cannot be accessed directly, but this representation makes a lot of the image manipulations that are done on the image array much more efficient, since they work on the whole response images and not on the individual jets.

Figure 10: Overview of the real part of the filter bank of the KJet. Kernels with larger wavelengths have lower amplitude, in order for the magnitude of the responses to be comparable across wavelengths.

3.2.2 Jet overview

Visualization is a great aid when comparing results of image processing; therefore a line of functions has been implemented for displaying the image array in an intuitive way. Figure 11 shows the input image, and the resulting overview is shown in Figure 12.

The jet consists of the Gabor responses over 4 wavelengths from 32 to 4


pixels (downwards), over 8 orientations from 0 to 157.5 degrees (across). The response images are ordered so that they correspond to the kernel positions in Figure 10.

Figure 11: Input image for jet creation; it contains a range of frequencies over a range of orientations.

Figure 12: The resulting overview of the jet images, showing the magnitude of the Gabor responses for the 4 wavelengths and 8 orientations.

Another implemented overview function generates an image array based on either the average or maximum response of a given (orientation, wavelength) coordinate. Figure 13 shows the steps of generating this overview. This function is useful for data inspection by users because it reduces the data tremendously.

The function is used in Figure 14, which shows how the maximum response of the different orientations changes as the texture is rotated clockwise by 0, 45, and 90 degrees. From Figure 14 it can be seen that there is a "bleeding" effect to neighboring wavelength and orientation coordinates. Ideally the response should only be noticeable in one square (the one that perfectly matches the wavelength and orientation). If


sufficiently large values of σ are used, this "bleeding" effect is reduced, but aliasing problems arise; see Section 3.2.4.

3.2.3 Jet difference

One of the sub-goals of the project is to recognize a feature in the image, even if the feature has moved. To this end, a function that finds the best match in the image to a jet representing the feature is necessary. Comparison between two jets J and J′ is done using the formula described in [7, formula 5, page 5]:

Similarity(J, J′) = Σ_n j_n j′_n / √( Σ_n j_n² · Σ_n j′_n² )   (10)

The comparison can be thought of as the dot product of the two jet vectors over the product of their norms, and returns a double that represents the similarity between the two jets. Using the similarity function, it is then only a matter of going over all generated jets and choosing the jet that matches best. This is a fairly time consuming process, and is a motivator for another jet representation; see Section 5.1.

Figure 13: The process of generating the jet overview: from image, to Gabor responses, to overview of the average response.
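Formula (10) and the exhaustive best-match search over the stack-of-images representation of Section 3.2.1 can be sketched as follows. The array shapes and function names are assumptions, not the KJet API:

```python
import numpy as np

def similarity(J, Jp):
    """Normalized jet similarity of formula (10): the dot product of
    the two jets over the product of their norms (assumes non-zero jets)."""
    return float(np.dot(J, Jp) / np.sqrt(np.dot(J, J) * np.dot(Jp, Jp)))

def best_match(stack, J):
    """Exhaustive search for the pixel whose jet best matches J.
    stack has shape (n_responses, h, w); the jet at pixel (x, y) is the
    vector stack[:, y, x] through the image stack."""
    n, h, w = stack.shape
    flat = stack.reshape(n, -1)                      # one jet per column
    scores = (J @ flat) / (np.linalg.norm(J) * np.linalg.norm(flat, axis=0))
    y, x = divmod(int(np.argmax(scores)), w)
    return x, y
```

By the Cauchy-Schwarz inequality the similarity is at most 1, reached exactly when one jet is a positive multiple of the other, so a planted jet is always its own best match.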


3.2.4 Jet changes under image transformation

Tests have been conducted in order to get a feeling for how the jets change under transformation of the features they represent. Figure 14 shows the effect of rotation of a simple texture; the intensity of the squares shows the maximum response of all jets in the image at the given jet coordinate.

Figure 14: To the left is a texture composed of two main patterns, where the horizontal pattern has half the wavelength of the vertical. The resulting KJet representation of 4 frequencies times 8 orientations is shown in the overview to the right. The overview shows the changes as the texture is rotated clockwise through 0, 45 and 90 degrees. The sigma value is 1.0.

The same process can be done for a natural texture. Figure 15 shows a texture rotated counterclockwise through the angles 0 to 60 degrees in steps of 15 degrees. In the left overview the bleeding effect is less pronounced, since the sigma value is set to 2.0, which makes the filter more precise.

Certain responses fade in and out as the texture is rotated. This effect is caused by texture frequency components that are not perfectly aligned with the 8 orientations in the filter. The rotation of the texture is done in steps of 15 degrees, but the Gabor filter resolution is 180/8 = 22.5°, and therefore features that are not aligned with the jet orientations cause an aliasing effect across the rotation of the texture. This effect is not shown in Figure 14, as the rotation of that texture is perfectly aligned with the jet orientations. This problem is present in all digital sampling of systems [10, section 1.3].


The aliasing effect is less pronounced when using smaller values of sigma, as a small sigma essentially blurs the response and thereby "bleeds" into nearby frequencies and orientations in the jet. This can be seen in the right overviews. The drawback of blurring the responses is that the individual responses are less distinct, so a compromise has to be made when choosing the sigma value.


Figure 15: To the left is a natural texture composed of several frequencies. The two KJet overviews to the right show the changes as the texture is rotated through 0 to 60 degrees in steps of 15 degrees, with sigma values of 2.0 (left overview) and 0.5 (right overview).

3.2.5 Jet to jet decomposition

It is essential for the success of the project that features on the surface can be tracked even though the surface undergoes transformations. It is therefore necessary to have a method for transforming a jet representing a known feature in a way that makes it possible to locate the same feature after the transformation. At this point the transformation factor integral of formula 5 on page 9 has been implemented, along with a function for generating the leftmost matrix of Equation 7 on page 9.

Progress is at this point stuck by the fact that the existing code base is unable to handle matrix operations, such as multiplication and solving of matrix equations, for advanced datatypes like complex numbers. Solving this problem has been made a high priority, as it will be beneficial for everyone using the MoInS codebase that this functionality is available.
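The missing functionality, multiplication and solving of matrix equations over complex numbers, is standard in numerical libraries. A minimal numpy sketch (illustrative only, not the MoInS codebase, and with arbitrary example matrices):

```python
import numpy as np

# A complex-valued linear system A x = b, of the kind that appears in the
# jet decomposition step.
A = np.array([[2 + 1j, 1 - 1j],
              [0 + 1j, 3 + 0j]])
b = np.array([1 + 0j, 2 - 1j])

x = np.linalg.solve(A, b)     # solve the complex linear system
product = A @ A.conj().T      # complex matrix multiplication (Hermitian product)
```

The solution can be verified by substituting x back into the system: A @ x reproduces b up to floating point precision.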


4 Tools needed for project

There is still a list of tools needed to do this project, both theory and implementation wise. This section touches briefly upon the tools that have been identified as necessary for the project.

4.1 2D to 3D conversion

In order to convert 2D coordinates in the stereo images to 3D coordinates, a method for intersection of 3D lines has to be used. As long as the correspondence problem is solved, the 2D to 3D conversion should be a fairly simple problem to solve, since it is only a matter of solving simple linear equations. This is already implemented in the existing codebase, as part of the feature detection software.

4.2 3D to 2D conversion

All transformations of the surfaces to be tracked happen in 3D, so it is necessary to have a method for transforming hypotheses of local 3D texture patches into 2D, for comparison with the actual perceived stereo images. A simple projection might be sufficient, but more likely a method that converts 3D surface coordinates to 2D image coordinates would be better [16, section 3.1, page 96, formula 5].

4.3 Tracking mechanism

In order to generate an estimation of the surface topology, it is necessary to be able to track the feature points as they move in 3D.
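The line intersection mentioned in Section 4.1 can be sketched as follows. Since noisy viewing rays rarely meet exactly, this sketch returns the midpoint of the shortest segment between the two rays, a common choice but an assumption here, not necessarily the method in the existing codebase:

```python
import numpy as np

def triangulate(o1, d1, o2, d2):
    """Midpoint of the shortest segment between the rays o1 + s*d1 and
    o2 + t*d2, used as the 3D point estimate for a stereo correspondence."""
    o1, d1, o2, d2 = (np.asarray(v, dtype=float) for v in (o1, d1, o2, d2))
    a, b, c = d1 @ d1, d1 @ d2, d2 @ d2
    r = o2 - o1
    denom = a * c - b * b            # zero only for parallel rays
    # Closest-approach parameters from the 2x2 normal equations.
    s = (c * (d1 @ r) - b * (d2 @ r)) / denom
    t = (b * (d1 @ r) - a * (d2 @ r)) / denom
    return 0.5 * ((o1 + s * d1) + (o2 + t * d2))
```

For rays that do intersect exactly, the midpoint coincides with the intersection point.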
This requires another list of tools.

4.3.1 Implementing Kalman filter

In order to track points on the 3D surface, a tracking filter is to be implemented. The Kalman filter is a relatively simple tracking filter, and can with small modifications be used as a basis for the more complex particle filter. The theory of the Kalman filter and particle filter has been covered briefly in Sections 2.7 on page 10 and 2.8 on page 12 respectively.

4.3.2 Hypothesis generator

Both the Kalman filter and the particle filter use a technique of prediction and verification of predictions in order to track input. This dictates the need


for a way to generate predictions/hypotheses of where a given feature will be after the next timestep, and what it will look like when it has undergone a certain transformation. The change of appearance of a feature has briefly been covered in Section 2.5.3 on page 8.

4.3.3 RBM model

The hypothesis generator uses the knowledge accumulated so far, together with the known transformation of the surface. This known transformation is based on the RBM (rigid body motion) of the surface, and is therefore under the domain of robotics. At this point the knowledge of the RBM of robots is limited, so investigation in this field is necessary.

4.3.4 Particle filter

Once the Kalman filter is implemented, another level of complexity can be added by using the implemented Kalman filter as a basis for a more advanced particle filter. The necessity of the particle filter depends on how well the Kalman filter handles the tracking task. This again depends on how well the assumption of Gauss distributed errors holds true.

4.4 Surface generation

Once the 3D surface points have been tracked over a sufficiently long time, and their relative positions have been determined to an acceptable level, the collection of tracked points needs to be combined in a way that can represent the surface that is spanned between them. For surfaces without occlusion, this should not be a very difficult task.

One representation of the surface could be a mesh spanned by the tracked points. In order to generate this mesh, triangles have to be spanned between the tracked points in a way that fills out the surface between the points. One can think of spanning a graph between the point nodes.
Knowledge of how to efficiently span this graph is needed, and therefore falls under the domain of graph theory.

Another, simpler method is to augment a quadratic mesh of a given number of nodes by the known surface points. This could be done by interpolating the mesh points between the known surface points, in essence "wrapping" the mesh around the known points.

The creation of the surface approximation, in its simplest form, is not a complex task, and should therefore be fairly easy to solve.
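The predict/correct cycle of the Kalman filter described in Sections 4.3.1 and 4.3.2 can be sketched in numpy as follows. The model here is a hypothetical 1D constant-velocity point; the matrices F, H, Q and R are illustrative assumptions, not values from the project:

```python
import numpy as np

def kalman_step(x, P, z, F, H, Q, R):
    """One predict/correct cycle of a linear Kalman filter.
    x: state estimate, P: state covariance, z: new measurement."""
    # Predict: the hypothesis of where the feature will be next.
    x_pred = F @ x
    P_pred = F @ P @ F.T + Q
    # Correct: verify the hypothesis against the measurement.
    y = z - H @ x_pred                       # innovation
    S = H @ P_pred @ H.T + R                 # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)      # Kalman gain
    x_new = x_pred + K @ y
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new

# Example: track a 1D point moving with constant velocity 1.0 (dt = 1).
F = np.array([[1.0, 1.0], [0.0, 1.0]])   # state transition for (pos, vel)
H = np.array([[1.0, 0.0]])               # only position is measured
Q = 0.01 * np.eye(2)                     # process noise (assumed)
R = np.array([[0.1]])                    # measurement noise (assumed)
x, P = np.zeros(2), np.eye(2)
for k in range(1, 20):
    x, P = kalman_step(x, P, np.array([float(k)]), F, H, Q, R)
```

After a handful of steps the velocity estimate settles near 1.0 even though only positions are measured, which is exactly the behaviour the hypothesis generator relies on.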


5 Open problems

5.1 Symbolic representation

The current representation of the features, as a jet of Gabor responses, is both time and space consuming to create and store, because the dimension of the jet vector is the number of frequencies times the number of orientations used for jet decomposition. This makes any search in the jet representation for the best matching jet very time consuming.

It might be beneficial to find another representation, based on the existing jet representation, that compresses the jets in such a way that the search problem is reduced. One such representation could be to only use a list of coordinates that have a response over a certain threshold. Low responses are more likely to be either "bleeding" from a nearby response or a very indistinguishable feature; in both cases information is removed, and the hope is that, because of the insignificance of the responses, this removal of information does not matter. This method of information compression is similar to the wavelet compression method [11, 8.5.3, page 486].
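The thresholded list representation proposed above could look like the following sketch (function names hypothetical; the similarity measure is formula (10) from Section 3.2.3, evaluated on the compressed jets):

```python
import numpy as np

def compress_jet(jet, threshold):
    """Keep only the (index, response) pairs whose response exceeds the
    threshold, discarding bleeding and indistinct responses."""
    return [(i, float(v))
            for i, v in enumerate(np.asarray(jet, dtype=float))
            if v > threshold]

def sparse_similarity(a, b, size):
    """Formula (10) evaluated on two compressed jets of original length `size`."""
    ja, jb = np.zeros(size), np.zeros(size)
    for i, v in a:
        ja[i] = v
    for i, v in b:
        jb[i] = v
    norm = np.sqrt((ja ** 2).sum() * (jb ** 2).sum())
    return float(ja @ jb / norm) if norm > 0 else 0.0
```

Since typically only a few responses survive the threshold, storage and comparison cost scale with the number of significant responses rather than with frequencies times orientations.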


6 Conclusion

The task of generating a surface approximation using tracking of texture points is not a trivial one. A proposal of a solution has been formulated, and a list of the tools needed to complete the task, both implemented and yet to come, has been covered in this report.

A few candidate technologies for correspondence problem reduction, 2D to 3D conversion and tracking of texture points have briefly been covered. Gabor filters have shown themselves to be a solid tool for handling the 2D to 3D correspondence problem, and will therefore be the basis for further development.

Kalman filters are conceptually the simplest tracking filters, and will be the basis for developing the 3D tracking system. The Kalman filter is able to function as the basis for particle filtering with a few modifications. It might therefore be beneficial to explore the particle filtering method, in case Kalman filtering turns out not to be sufficient.

Jesper Juul Henriksen


A Timetable

This section contains a timetable, showing the estimated time needed to complete the different sub tasks yet to be completed.

Figure 16: Timetable for the masters project


References

[1] Abhijit S. Ogale and Yiannis Aloimonos: Stereo correspondence with slanted surfaces: critical implications of horizontal slant. Center for Automation Research, University of Maryland.

[2] Martin Kampel: Shape from Structured Light. Short web tutorial, http://www.prip.tuwien.ac.at/Research/3DVision/struct.html. Last modified: Tuesday, 12-Sep-2000 13:58:02 CEST.

[3] Norbert Krüger and Florentin Wörgötter: Multi-modal Primitives as functional Models of Hyper-columns and their use for contextual Integration. Published in the proceedings of the Brain, Vision and Artificial Intelligence symposium 2005.

[4] Norbert Krüger, Marcus Ackermann and Gerald Sommer: Accumulation of Object Representations utilizing Interaction of Robot Action and Perception. Published in Knowledge Based Systems 15:111-118, 2002.

[5] Javier R. Movellan: Tutorial on Gabor Filters. http://mplab.ucsd.edu/tutorials/pdfs/gabor.pdf.

[6] Michael Lindenbaum and Roman Sandler: Gabor Filter Analysis for Texture Segmentation. Technical Report CIS-2005-05, 2005, Technion - Computer Science Department.

[7] Laurenz Wiskott, Jean-Marc Fellous, Norbert Krüger and Christoph von der Malsburg: Face Recognition by Elastic Bunch Graph Matching. Internal Report 96-08, ISSN 0943-2752, Institut für Neuroinformatik, Ruhr-Universität Bochum.

[8] Thomas Maurer: Erkennung gedrehter Gesichter: von Einzelbildern zu Sequenzen [Recognition of rotated faces: from single images to sequences]. Ruhr-Universität Bochum, December 1998.

[9] Song-Chun Zhu, Cheng-En Guo, Yizhou Wang and Zijian Xu: What are Textons? International Journal of Computer Vision 62(1/2):121-143, 2005.

[10] Sophocles J. Orfanidis: Introduction To Signal Processing. ISBN 0-13-209172-0.

[11] Rafael C. Gonzalez and Richard E. Woods: Digital Image Processing, Second edition. ISBN 0-13-094650-8.


[12] P. Kruizinga, N. Petkov and S.E. Grigorescu: Comparison of texture features based on Gabor filters. Institute of Mathematics and Computing Science, University of Groningen. Published in Proceedings of the 10th International Conference on Image Analysis and Processing.

[13] Greg Welch and Gary Bishop: An Introduction to the Kalman Filter. Department of Computer Science, University of North Carolina at Chapel Hill. Published at SIGGRAPH 2001.

[14] Erik Cuevas, Daniel Zaldivar and Raul Rojas: Kalman filter for vision tracking. Technical Report B 05-12, published 10 August 2005.

[15] Sebastian Thrun, Michael Montemerlo, Daphne Koller, Ben Wegbreit, Juan Nieto and Eduardo Nebot: FastSLAM: An Efficient Solution to the Simultaneous Localization And Mapping Problem with Unknown Data Association. Computer Science Department, Stanford University; Australian Centre for Field Robotics, The University of Sydney, Australia.

[16] Corentin Massot and Jeanny Hérault: Recovering the Shape from Texture Using Lognormal Filters. Laboratory of Images and Signals, Grenoble, France.

[17] http://www.covig.imi.aau.dk/
