View Synthesis of Symmetric Objects using Examples

Visesh Chari    P. J. Narayanan    C. V. Jawahar
Centre for Visual Information Technology
International Institute of Information Technology
{visesh@research.,pjn@,jawahar@}iiit.ac.in

Abstract

Traditional image based rendering methods synthesize new views from a few input views of an object, from positions defined by the input views. In this paper, we extend this approach using information from views of other "similar" objects. We pose the view synthesis problem as an image completion problem using a suitable representation of the scene. We specifically show how a database of images of symmetric objects can be used to generate new views of the input object that are impossible with other image based rendering methods. This is done by carefully sampling the Plenoptic function of objects into a mosaic so that information about their symmetry in 3D is preserved. Algorithms based on texture transfer are further used to refine the solution. Results show the usefulness of our approach.

1. Introduction

Image Based Rendering (IBR) has been a widely researched area in recent years [13]. Since typical applications of IBR include virtual reality, animations, etc., generating high quality images is of prime importance. Accomplishing this task is difficult because it requires complete knowledge of the geometry and photometry of the object. Additionally, producing high quality results seems to require elaborate camera setups [8], or produces visible holes in the novel view because of occlusion [13]. In this paper, we present a method to compute novel views of nearly symmetric objects from positions impossible for current methods: views that require information not available in the input sequence. This is done by transferring semantic information from a database of "similar" objects, for which it is affordable to capture multiple views with precision.

Figure 1: Traditional IBR vs. our approach. While traditional IBR uses information only from input sequences, we transfer semantic information from similar scenes too.

IBR methods belonging to three basic categories have been proposed earlier. Techniques based on stereo reconstruction [13] and volumetric techniques [7] rely on accurate and complete knowledge of the geometry of the scene for rendering purposes. In contrast to these geometry based approaches, methods based on sampling the Plenoptic function [8] aim at capturing all rays required for rendering views within a specified 3D volume. A hybrid approach, where geometry is implicitly used to generate images at novel viewpoints, is view-dependent geometry [5].

These techniques, however, only use information available in the input image/depth map sequence, thereby limiting the gamut of possible reconstructions. This bottleneck may be overcome by augmenting the current scene using "similar" scenes that might be available to us. The advantages of such an approach are threefold. First, the availability of more information increases the gamut of possible novel views. Second, only the database of "similar" scenes needs to be carefully captured. Third, the presence of a database provides several alternative solutions in the absence of knowledge about the scene's actual content (Figure 7). Effectively, where other IBR approaches describe algorithms for view interpolation, we describe an algorithm for view extrapolation: the generation of plausible views given an input image sequence.

We demonstrate our approach using a representation that is a hybrid of Plenoptic modeling and stereo reconstruction based methods [11]. Section 2 discusses the problem and our approach, and Section 3 shows how the representation of [11] allows us to transfer information like symmetry and texture across objects. Implementation details are presented in Sections 3.1 through 3.3, and results in Section 4. Finally, we enumerate future directions of research and conclude in Section 5.

Figure 2: Symmetry extraction. Top row: original image; difference between the original and flipped images; optical flow between the original and its flipped version, computed using a variation of [9]. Bottom row: a frontal image formed using symmetry information from different faces.

2. Problem Definition

Given a collection of n 2D images I_{1...n} = {I_1, ..., I_n} and their corresponding depth maps D_{1...n} = {D_1, ..., D_n} of an object, we wish to infer a new view V whose information cannot be fully supplied by the input images and depth maps. Additional information, in the form of knowledge from similar scenes, is available to us; we denote it by S_{1...m} = {S_1, ..., S_m}, where m is the number of available scenes.

Since much of the information available in the input sequence is redundant, it is advantageous to have a representative mosaic of the scene. We use a Multiple Centre of Projection (MCOP) image [11] to represent the geometry of the symmetric object, since this representation easily captures semantic information like the symmetry of a scene (Section 3.2), while allowing texture transfer approaches to augment or replace information in any part of the scene [4]. A limited number of images of a scene produces an incomplete MCOP image, and producing novel views of the scene now amounts to completing this image. We use a nearest neighbor approach to find MCOP images in the database that are similar to the input sequence, and then transfer semantic information from them to complete the input MCOP image.

3. Representations

We use methods from the Plenoptic modeling literature to represent the entire scene using a single image. Such images have been given many names, one of them being Multiple Centre of Projection (MCOP) images [11]; they are the collection of light rays and range maps computed using a slit camera and a laser range finder moving along a pre-defined path through the scene.
The 3D model to be rendered is given by the equation [11]

$$\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \delta_{i,j} \begin{bmatrix} U_{ix} & V_{ix} & O_{ix} \\ U_{iy} & V_{iy} & O_{iy} \\ U_{iz} & V_{iz} & O_{iz} \end{bmatrix} \begin{pmatrix} i \\ j \\ 1 \end{pmatrix} + \begin{pmatrix} C_{ix} \\ C_{iy} \\ C_{iz} \end{pmatrix} \qquad (1)$$

where, for each pixel (i, j), C_i represents the centre of projection, O_i a vector from C_i to the image plane origin, U_i the horizontal axis of the projection plane, and V_i the vertical axis. Each pixel's depth is represented by \delta_{i,j}.

3.1. Semantic Matching

In order to transfer semantic information, we first pick an MCOP image from the database that is similar, in a suitable feature space, to the incomplete input MCOP image. First, a feature vector containing gray value and gradient information is constructed for each MCOP image (texture and depth). To ensure that color similarity is not given undue importance, a weighted cost function of the feature distance between the input and database images is minimized:

$$\min_i \sum_s \sum_{\Omega} W \cdot \left( F_s^{\mathrm{input}} - F_s^{i} \right)^2 \qquad (2)$$

where s denotes the scale, \Omega the region of interest, W the weight matrix, and F the feature vector for each pixel of the image.
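As an illustration of how this matching can be implemented, the following is a minimal sketch of equation (2) in Python with NumPy and OpenCV. The three-channel feature vector, the per-channel weights, the number of scales, and the assumption that all MCOP images share one resolution are simplifications of ours, not details prescribed by the paper; the restriction to a region of interest \Omega is likewise omitted.

```python
import cv2
import numpy as np

def features(texture, depth):
    """Per-pixel features: gray value plus gradient magnitudes of the
    texture and depth channels of an MCOP image."""
    gray = cv2.cvtColor(texture, cv2.COLOR_BGR2GRAY).astype(np.float32)
    gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0)
    gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1)
    dx = cv2.Sobel(depth.astype(np.float32), cv2.CV_32F, 1, 0)
    dy = cv2.Sobel(depth.astype(np.float32), cv2.CV_32F, 0, 1)
    return np.stack([gray, np.hypot(gx, gy), np.hypot(dx, dy)], axis=-1)

def match_cost(f_a, f_b, weights, scales=3):
    """Weighted multi-scale feature distance of equation (2); `weights`
    down-weights the raw gray channel so color similarity is not rewarded."""
    cost = 0.0
    for _ in range(scales):
        cost += float(np.sum(weights * (f_a - f_b) ** 2))
        f_a, f_b = cv2.pyrDown(f_a), cv2.pyrDown(f_b)
    return cost

def nearest_mcop(input_tex, input_depth, database):
    """Return the (texture, depth) database entry minimizing the cost;
    all images are assumed to share the same resolution."""
    weights = np.array([0.1, 1.0, 1.0], dtype=np.float32)  # gray, grads
    f_in = features(input_tex, input_depth)
    return min(database, key=lambda td: match_cost(f_in, features(*td), weights))
```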

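The MCOP geometry of equation (1) itself is straightforward to evaluate once the per-column camera parameters are stored alongside the image. The sketch below back-projects every pixel into a 3D point cloud; storing C, U, V and O as one (W, 3) array each is our own layout choice, since [11] does not prescribe one.

```python
import numpy as np

def backproject_mcop(depth, C, U, V, O):
    """Back-project every pixel of an MCOP image into 3D via equation (1).

    depth      -- (H, W) array of per-pixel depths (delta_ij)
    C, U, V, O -- (W, 3) arrays: per-column centre of projection,
                  horizontal and vertical axes of the projection plane,
                  and vector from C to the image plane origin.  One set
                  per column, since each slit column has its own camera.
    Returns an (H, W, 3) array of 3D points.
    """
    H, W = depth.shape
    i = np.arange(W, dtype=np.float32)  # column index i
    j = np.arange(H, dtype=np.float32)  # row index j
    # The matrix product in equation (1) expands to i*U_i + j*V_i + O_i.
    rays = (i[None, :, None] * U[None, :, :] +
            j[:, None, None] * V[None, :, :] +
            O[None, :, :])                      # shape (H, W, 3)
    return depth[..., None] * rays + C[None, :, :]
```

Rendering a novel view then amounts to projecting this point cloud through a conventional camera model.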

3.2. Symmetry

Any symmetry in a 3D object is captured by its corresponding Plenoptic function. Since MCOP images capture a sample of the Plenoptic function, by carefully designing the path of the camera used to capture the desired MCOP image, we can preserve the symmetry property in the image.

Real world objects, however, are never perfectly symmetric (Figure 2). This asymmetry sometimes differs across objects of the same class (e.g., faces). By registering the different parts of the object with each other, this slight deviation from symmetry can be captured accurately. We register an MCOP image with its mirror image using a variation of the optical flow algorithm of [9] presented in [12]. The only modification we add is to weight each of the channels according to our need. In essence, we find the flow that minimizes the following functional:

$$Flow(u, v) = \int_{\Omega} \left( W \cdot \Psi\!\left( I^{[k]}(\mathbf{x} + \mathbf{w}) - I^{[k]}(\mathbf{x}) \right) \right)^2 + \Psi\!\left( |\nabla u|^2 + |\nabla v|^2 \right) d\mathbf{x} \qquad (3)$$

Once this registration information is available, images of the whole object may be generated from its partial views. In the absence of this information, it may be borrowed from other objects of the same class, as shown in Figure 2.

3.3. Texture Transfer

Figure 3: Symmetry transfer. (a) The input to symmetry based hole filling. (b) The database image used to transfer symmetry. (c) The generated result.

MCOP images generated from only a few input images will contain holes (Figure 3a). Symmetry based methods can be used to fill some of these holes. In some cases, however, not enough information is available for symmetry based methods to produce novel views without holes (Figure 3c). Thus, methods based on texture transfer from other images have to be employed.

Given the closest MCOP image(s) to the input as per equation (2), we transfer appropriate texture from the database image and use the method of [10] to blend the textures from the different images into the current one [6] (Figure 3b). Unlike [6], pixels already belonging to the MCOP image are left unmodified by setting them as hard constraints. To ensure consistency, all the texture transferred to fill a hole is taken from one image and blended using a Poisson solver [2].

Figure 4: Texture transfer. Result of grafting a nose onto the result in Figure 3c.

4. Results

Results for symmetry and texture synthesis are shown in Figure 6. Here, the nose of the subject was removed from the depth map before capturing its various views into an MCOP image. Figure 3a shows the MCOP image after symmetry information was transferred from the database. The remaining texture gap was then filled using the subject in Figure 4. Novel views of the subject are then rendered (Figures 6a, 6b, 6c). Additional results may be found in [1].

An advantage of such augmentation of information is the variety of modifications that become available to the user. Figure 5 shows one such result. We call this "symmetrizing" an object, in the sense that the inherent asymmetry present in the object is removed. This is done by computing the optical flow between an MCOP image and its mirror image and warping the MCOP image with the flow. The warped MCOP image is then merged with its intact mirror image to produce a "symmetric" version of the person.
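A minimal sketch of this symmetrizing step is given below. OpenCV's Farnebäck flow is used purely as a stand-in for the weighted variational flow of equation (3), the input is assumed to be an 8-bit single-channel MCOP texture, and the merge is a plain average; all three are simplifications of ours.

```python
import cv2
import numpy as np

def symmetrize(img):
    """Symmetrize a roughly bilaterally symmetric object: compute the
    dense flow from the mirror image to the image, warp the image with
    it, and merge with the intact mirror (cf. Figure 5).  Farnebäck
    flow stands in for the weighted variational flow of equation (3);
    `img` is assumed to be an 8-bit single-channel MCOP texture."""
    mirror = img[:, ::-1].copy()
    flow = cv2.calcOpticalFlowFarneback(mirror, img, None,
                                        0.5, 4, 21, 5, 7, 1.5, 0)
    h, w = img.shape
    xs, ys = np.meshgrid(np.arange(w, dtype=np.float32),
                         np.arange(h, dtype=np.float32))
    # dst(y, x) = img(y + fy, x + fx): registers img to the mirror's frame
    warped = cv2.remap(img, xs + flow[..., 0], ys + flow[..., 1],
                       cv2.INTER_LINEAR)
    return ((warped.astype(np.float32) + mirror.astype(np.float32)) / 2
            ).astype(np.uint8)
```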


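For the texture-transfer step of Section 3.3, gradient-domain blending with the existing MCOP pixels as hard constraints can be approximated in a few lines. Here cv2.seamlessClone stands in for the Poisson solver of [2, 10], and the mask handling is our own simplification.

```python
import cv2
import numpy as np

def fill_hole_from_database(mcop, db_tex, hole_mask):
    """Fill a hole in an incomplete MCOP texture with texture from a
    single matched database image (Section 3.3).  Pixels outside
    `hole_mask` act as hard constraints because only the masked region
    is modified; the hole is assumed not to touch the image border.

    mcop, db_tex -- (H, W, 3) uint8 MCOP textures of equal size
    hole_mask    -- (H, W) uint8 mask, 255 inside the hole
    """
    ys, xs = np.nonzero(hole_mask)
    # paste the patch back at the hole's own location in the MCOP image
    centre = (int((xs.min() + xs.max()) / 2), int((ys.min() + ys.max()) / 2))
    return cv2.seamlessClone(db_tex, mcop, hole_mask, centre, cv2.NORMAL_CLONE)
```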
Figure 5: (a) Original image. (b) Symmetrized result. Notice the change in the right half of the face.

Figure 6: Symmetry and texture transfer. Novel views of the scene generated in Figure 4.

Finally, any missing information has several alternate sources to be borrowed from. Figure 7 shows variations of the same object produced with information taken from the first five nearest neighbors. As can be seen, different sources not only produce changes in texture information, but also produce changes in the geometry of the object. All the results in this paper are on the USF Human ID 3D Face Database [3].

Figure 7: Variations of the nose, mouth and ears of a person produced by transferring information from the database.

5. Conclusions

We presented a novel framework to extend view synthesis to situations where enough information for rendering the entire scene is not given as input. Semantic information is taken from images of similar objects or scenes available in a database to synthesize views that are impossible for traditional methods. We believe this paper is an important first step towards incorporating semantic information from a known database of images into view synthesis.

References

[1] http://research.iiit.ac.in/∼visesh/myweb/Publications/ibr.pdf
[2] A. Agrawal, R. Raskar, and R. Chellappa. What is the range of surface reconstructions from a gradient field? European Conference on Computer Vision (ECCV), 2006.
[3] V. Blanz and T. Vetter. A morphable model for the synthesis of 3D faces. In SIGGRAPH '99: Proceedings of the 26th Annual Conference on Computer Graphics and Interactive Techniques, pages 187–194, New York, NY, USA, 1999. ACM Press/Addison-Wesley Publishing Co.
[4] A. Efros and W. T. Freeman. Image quilting for texture synthesis. ACM SIGGRAPH, pages 341–346, 2001.
[5] A. W. Fitzgibbon, Y. Wexler, and A. Zisserman. Image-based rendering using image-based priors. International Conference on Computer Vision (ICCV), 2003.
[6] J. Hays and A. A. Efros. Scene completion using millions of photographs. In SIGGRAPH '07: ACM SIGGRAPH 2007 Papers, page 4, New York, NY, USA, 2007. ACM.
[7] K. N. Kutulakos and S. M. Seitz. A theory of shape by space carving. Int. J. Comput. Vision, 38(3):199–218, 2000.
[8] M. Levoy and P. Hanrahan. Light field rendering. In SIGGRAPH '96: Proceedings of the 23rd Annual Conference on Computer Graphics and Interactive Techniques, pages 31–42, New York, NY, USA, 1996. ACM.
[9] N. Papenberg, A. Bruhn, T. Brox, S. Didas, and J. Weickert. Highly accurate optic flow computation with theoretically justified warping. International Journal of Computer Vision, 67(2):141–158, 2006.
[10] P. Pérez, M. Gangnet, and A. Blake. Poisson image editing. In SIGGRAPH '03: ACM SIGGRAPH 2003 Papers, pages 313–318, New York, NY, USA, 2003. ACM.
[11] P. Rademacher and G. Bishop. Multiple-center-of-projection images. In SIGGRAPH '98: Proceedings of the 25th Annual Conference on Computer Graphics and Interactive Techniques, pages 199–206, New York, NY, USA, 1998. ACM.
[12] P. Sand and S. Teller. Particle video. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2006.
[13] D. Scharstein. View Synthesis Using Stereo Vision. Volume 1583 of Lecture Notes in Computer Science (LNCS), Springer-Verlag New York, Inc., 1999.
