GPU Acceleration of Structure-from-Motion Pipeline

Presenter: Liu Xin

The SfM system and its time-complexity analysis

GPU-accelerated solution

Feature detection

Feature matching

Bundler

Completed work and open problems

Future work

Bundler

• Feature detection

• Feature matching

• Bundle Adjustment

• SBA

PMVS

• Matching

• Expansion

• Filtering

PSR and texture generation

• Poisson Surface Reconstruction

• Texture generation

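Input: an image sequence

Output: recovered camera intrinsics (f, k1, k2), estimated camera poses (R, t), and a sparse set of 3D scene points (position, color, and the images in which each point is visible)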
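To make the roles of f, k1, k2 and (R, t) concrete, the sketch below projects a 3D point with the pinhole model plus two radial distortion terms, following the camera convention documented for Bundler (the camera looks down the negative z-axis). The function name `project` and the sample values are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def project(X, R, t, f, k1, k2):
    """Project world point X to image coordinates (origin at the image center)."""
    P = R @ X + t                      # world -> camera coordinates
    p = -P[:2] / P[2]                  # perspective division (camera looks down -z)
    r2 = float(p @ p)                  # squared radius of the normalized point
    distortion = 1.0 + k1 * r2 + k2 * r2 * r2   # radial distortion factor
    return f * distortion * p          # scale by the focal length

# Hypothetical camera at the origin with identity rotation.
R = np.eye(3)
t = np.zeros(3)
pixel = project(np.array([1.0, 0.0, -2.0]), R, t, f=100.0, k1=0.0, k2=0.0)
print(pixel)
```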

Feature detection

• SIFT feature detection

• SIFT descriptors

Feature matching

• ANN

• Pairwise matching

Bundle Adjustment

• SBA (Sparse Bundle Adjustment)

Photo tourism: Exploring photo collections in 3D (ACM Transactions on Graphics 2006) 

Noah Snavely, Steven M. Seitz, Richard Szeliski.

Data                 Luohou Temple   Luoyang Sports Center   Dailuoding   Fayu Temple   Yingxian Wooden Pagoda
Number of Pic.       105             237                     240          290           349
Total Running Time   5.5             16                      35           38            3.2 days

Bundler Detecting

Feature Detecting 

Find the feature points of every image: O(n)

Feature Matching

Match every pair of images, i.e. about n^2/2 matches

Time complexity: O(n^2)

BA

O(mn(m+2n)^2) (m: features, n: images)

O(n^7)

Efficient Bundle Adjustment with Virtual Key Frames: A Hierarchical Approach to Multi- 

Frame Structure from Motion (CVPR 1999) 

H.-Y Shum, Q.Ke….

SBA 

Observation: pic\捕获1.PNG pic\捕获2.PNG

Time complexity: O(n^4)

Expected to be the most time-consuming part

PMVS 

Scene Reconstruction and Visualization from Internet Photo Collections 

Keith N. Snavely

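Conclusion

The most time-consuming parts are Match and BA.

Goals

Accelerate the SfM system

Use the GPU to accelerate the Match and BA modules

Preserve the reconstruction results

How to evaluate? Accuracy, Completeness (Coverage), Run Time (Size, Compactness) (Snavely)

Experimental evaluation? (accuracy of camera localization)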

CPU: 8 

Memory: 32GB 

GPU 

C1060: 240 cores, 4GB 

Quadro FX 570: 30 cores, 512 MB

Compare to (Cloudless Day) 

4 CPU, 4 GPU, 48GB memory 

10 threads, one thread for each CPU and GPU 

Building Rome on a Cloudless Day 

Jan-Michael Frahm…

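Goal

Accelerate the Matching and BA modules by parallelizing them

Approach

At the higher level: reduce the time complexity of the modules.

Skeletal Graphs (Canonical View)

Out of Core BA

At the lower level:

Speed up pairwise feature matching. (GPU)

Speed up SBA. (GPU)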

Detecting (down/all features)

Matching (down features)

Skeletal Graph

Detailed Matching (all features)

Out of Core SBA

SBA

Step 1 : Detecting 

GPU Sift Detecting 

All Features: about 30,000

Down Features: about 5,000

Step 2 : Matching 

GPU Matching. 

Match all image pairs, O(n^2).

Uses the down features.
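A minimal sketch of the pairwise descriptor matching in Step 2, assuming brute-force nearest-neighbor search with a distance-ratio test. The squared distances are computed through one matrix product, which is the part that maps naturally onto a GPU. The function name `match_descriptors` and the ratio threshold of 0.6 are assumptions for illustration, not values from the slides.

```python
import numpy as np

def match_descriptors(a, b, ratio=0.6):
    """Return (index_in_a, index_in_b) pairs that pass the ratio test."""
    # Squared Euclidean distances via |a|^2 + |b|^2 - 2 a.b (one matrix product).
    d2 = (a * a).sum(axis=1)[:, None] + (b * b).sum(axis=1)[None, :] - 2.0 * (a @ b.T)
    d2 = np.maximum(d2, 0.0)                    # clamp tiny negatives from round-off
    order = np.argsort(d2, axis=1)[:, :2]       # two nearest neighbors per row
    rows = np.arange(len(a))
    best = d2[rows, order[:, 0]]
    second = d2[rows, order[:, 1]]
    keep = best < (ratio ** 2) * second         # ratio test on squared distances
    return [(int(i), int(order[i, 0])) for i in np.nonzero(keep)[0]]

# Synthetic check: image B holds the same descriptors as A, shuffled and slightly perturbed.
rng = np.random.default_rng(0)
desc_a = rng.random((50, 128))
perm = rng.permutation(50)
desc_b = desc_a[perm] + 1e-3 * rng.standard_normal((50, 128))
matches = match_descriptors(desc_a, desc_b)
print(len(matches))
```

With about 5,000 down features per image the distance matrix has 25 million entries, so it stays small enough to compute in a single pass.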

Step 3 : Skeletal Graph 

Canonical View

Step 4 : Detailed Matching 

GPU Matching. 

Pairwise matching along the Skeletal Graph.

Uses all features.

Step 5 : Out of Core BA 

GPU SBA. 

Out of core: apply SBA at two levels.

Iterate.

Step 6 : SBA for whole Graph 

GPU SBA

Shape Descriptor 

HOG ? 

Gist features (cloudless day) 

Appearance Descriptor 

Feature Descriptors (SIFT) 

Combining efficient object localization and image classification (ICCV 09) 

Hedi Harzallah, Frederic Jurie, Cordelia Schmid

Other representations

Feature Descriptors + Visual Words + Verification (in a day)

Gist Features (small code) + Visual Words (cloudless day)

SfM system analysis

Not millions of pictures, but thousands of pictures.

High-resolution images.

Down Features 

Down Sample/Random/Pyramid …
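Two of the listed ways to obtain down features can be sketched as follows: a uniform random subset, and a pyramid-style choice that keeps the features detected at the largest scales. Both helper names and the synthetic scale values are assumptions for illustration; the slides do not say which strategy was adopted.

```python
import numpy as np

def down_random(num_features, k, seed=0):
    """Pick k feature indices uniformly at random, without replacement."""
    rng = np.random.default_rng(seed)
    k = min(k, num_features)
    return np.sort(rng.choice(num_features, size=k, replace=False))

def down_by_scale(scales, k):
    """Keep the k features with the largest detection scale (coarse pyramid levels)."""
    order = np.argsort(scales)[::-1]    # indices sorted by descending scale
    return np.sort(order[:k])

# Synthetic scales for roughly 30,000 features, reduced to 5,000 down features.
scales = np.random.default_rng(1).uniform(1.0, 32.0, size=30000)
rand_idx = down_random(len(scales), 5000)
scale_idx = down_by_scale(scales, 5000)
print(len(rand_idx), len(scale_idx))
```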

Feature detection on high-resolution images

SiftGPU (CUDA)

Performance

Detection method      Average detection time
GPU Tesla C1060       1.795 s
GPU Quadro FM580      2.806 s
Tesla + Quadro        1.4-1.6 s
1 CPU                 80-100 s
8 CPU                 8-9 s
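The timings above translate into speedup ranges as in the short calculation below. The numbers are the ones listed in the table; the helper `speedup_range` is only an illustrative name.

```python
gpu_times = {"Tesla C1060": 1.795, "Quadro FM580": 2.806}       # seconds per image
cpu_times = {"1 CPU": (80.0, 100.0), "8 CPU": (8.0, 9.0)}       # (low, high) seconds

def speedup_range(cpu_range, gpu_time):
    """Return the (low, high) speedup of one GPU timing over a CPU timing range."""
    low, high = cpu_range
    return low / gpu_time, high / gpu_time

for cpu_name, cpu_range in cpu_times.items():
    for gpu_name, gpu_time in gpu_times.items():
        low, high = speedup_range(cpu_range, gpu_time)
        print(f"{gpu_name} vs {cpu_name}: {low:.1f}x to {high:.1f}x")
```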

GPU Surf Descriptor (64) 

Matching Time; 4? 

SBA Time; >2 

Feature Descriptor 

All Features (30000) 

Down Features (5000) 

Down Sample/Random/Constraints/Pyramid…. 

Thousands 

Total Time

Using Down Features 

Pairwise matching

Time complexity: O(n^2)

GPU Matching Time 

4000*4000 0.07s 

Surf Descriptor 

Multi GPU 

Total Time 1.5h (400)
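The total can be checked with a short calculation, assuming that 0.07 s is the cost of matching one image pair on a single GPU and that 400 is the number of images. Both readings are interpretations of the slide, and the linear scaling over several GPUs is an idealization.

```python
def num_pairs(n):
    """Number of unordered image pairs among n images."""
    return n * (n - 1) // 2

def matching_hours(n, seconds_per_pair, num_gpus=1):
    """Estimated wall-clock hours for exhaustive pairwise matching."""
    return num_pairs(n) * seconds_per_pair / num_gpus / 3600.0

print(num_pairs(400), round(matching_hours(400, 0.07), 2))
```

Under these assumptions the estimate is consistent with the reported total of about 1.5 h.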

Goals:

Canonical View Set

Submap Sets

Input

An undirected graph

Nodes are images; edges are the matching results

Top-down

Connected Dominating Set

Greedy (maximum spanning tree)

Graph Cut

K-means

Bottom-up

Clustering (CMVS)

Agglomerative Clustering
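One way to read the greedy maximum-spanning-tree option: build a maximum spanning tree of the match graph, weighted by the number of matches, and take its non-leaf nodes as the canonical views. For a connected graph with at least three nodes, the internal nodes of any spanning tree form a connected dominating set. The sketch below uses Kruskal's algorithm; the edge weights and function names are assumptions for illustration, not the presenter's implementation.

```python
def maximum_spanning_tree(num_nodes, edges):
    """Kruskal's algorithm on (i, j, weight) edges, heaviest edge first."""
    parent = list(range(num_nodes))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for i, j, w in sorted(edges, key=lambda e: -e[2]):
        ri, rj = find(i), find(j)
        if ri != rj:                 # the edge joins two components: keep it
            parent[ri] = rj
            tree.append((i, j, w))
    return tree

def internal_nodes(num_nodes, tree):
    """Nodes of degree > 1 in the tree, i.e. everything except the leaves."""
    degree = [0] * num_nodes
    for i, j, _ in tree:
        degree[i] += 1
        degree[j] += 1
    return [n for n in range(num_nodes) if degree[n] > 1]

# Hypothetical match graph: (image_i, image_j, number_of_matches).
edges = [(0, 1, 100), (1, 2, 90), (2, 3, 80), (3, 4, 70), (0, 2, 10), (1, 3, 5)]
tree = maximum_spanning_tree(5, edges)
canonical = internal_nodes(5, tree)
print(tree, canonical)
```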

Skeletal graphs for efficient structure from motion (CVPR 08). Snavely.

Spectral Partitioning for Structure from Motion (ICCV 03). Drew Steedly.

Towards Internet-scale Multi-view Stereo (CVPR 10). Yasutaka Furukawa.

Structure and Motion Pipeline on a hierarchical Cluster tree (). Michela Farenzena.

……

Algorithm 

Step 1: Selecting Canonical View Set 

Step 2: Submap Set 

Step 3: Repeat 1 2 

Time complexity analysis

Minutes
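Step 2 of the algorithm can be sketched as assigning every non-canonical image to the canonical view it shares the most matches with. This assignment rule is an assumption made for illustration; the slides do not specify how submaps are formed, and the graph below is hypothetical.

```python
def assign_submaps(edges, canonical):
    """Group non-canonical images under their strongest canonical neighbor."""
    canonical = set(canonical)
    best = {}                                   # image -> (canonical view, weight)
    for i, j, w in edges:
        for img, view in ((i, j), (j, i)):
            if img not in canonical and view in canonical:
                if img not in best or w > best[img][1]:
                    best[img] = (view, w)
    submaps = {c: [] for c in sorted(canonical)}
    for img, (view, _) in sorted(best.items()):
        submaps[view].append(img)
    return submaps

# Hypothetical match graph: (image_i, image_j, number_of_matches).
edges = [(0, 1, 100), (1, 2, 90), (2, 3, 80), (3, 4, 70), (0, 2, 10), (1, 3, 5)]
submaps = assign_submaps(edges, canonical=[1, 2, 3])
print(submaps)
```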

Uses all features.

Pairwise matching along the edges of the Skeletal Graph.

Verification: iterate with the previous step.

Time complexity: O(n)

GPU Matching Time 

GPU KD-Tree 

KD Tree 

Total Time 2-3h
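For reference, a minimal CPU sketch of exact nearest-neighbor search with a KD-tree, the structure named above. It uses low-dimensional synthetic points; for 128-dimensional SIFT descriptors an approximate search such as ANN is the usual choice. The dictionary-based node layout and function names are assumptions for illustration and do not describe the GPU implementation.

```python
import numpy as np

def build(points, idx=None, depth=0):
    """Recursively build a KD-tree over row indices of `points`."""
    if idx is None:
        idx = np.arange(len(points))
    if len(idx) == 0:
        return None
    axis = depth % points.shape[1]
    order = idx[np.argsort(points[idx, axis])]   # sort this subset along the split axis
    mid = len(order) // 2
    return {"i": int(order[mid]), "axis": axis,
            "left": build(points, order[:mid], depth + 1),
            "right": build(points, order[mid + 1:], depth + 1)}

def nearest(node, points, q, best=None):
    """Return (squared_distance, index) of the point nearest to query q."""
    if node is None:
        return best
    p = points[node["i"]]
    d = float(np.sum((p - q) ** 2))
    if best is None or d < best[0]:
        best = (d, node["i"])
    diff = q[node["axis"]] - p[node["axis"]]
    near, far = (node["left"], node["right"]) if diff < 0 else (node["right"], node["left"])
    best = nearest(near, points, q, best)
    if diff * diff < best[0]:                    # the far side can still hold a closer point
        best = nearest(far, points, q, best)
    return best

rng = np.random.default_rng(0)
pts = rng.random((200, 8))
tree = build(pts)
query = rng.random(8)
print(nearest(tree, pts, query))
```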

Step 1: Canonical View Set SBA 

How are new images added?

Step 2: Submap Sets SBA

Fix the Canonical View parameters and run SBA;

Run SBA on the whole Set;

Step 3: 

Repeat Step 1 2 until convergence 

Out-of-Core Bundle Adjustment for Large-Scale 3D Reconstruction (ICCV 07) 

Kai Ni, Drew Steedly, and Frank Dellaert
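The alternation between the two levels can be illustrated on a toy problem: 1D positions linked by relative measurements, where the canonical nodes and the remaining nodes are optimized in turn while the other group is held fixed. This is a heavily simplified stand-in for bundle adjustment (linear, one-dimensional, noise-free); the node split, the measurements, and the function name are all assumptions for illustration.

```python
import numpy as np

def optimize_subset(x, edges, free):
    """Least-squares update of the `free` nodes; all other nodes stay fixed."""
    col = {node: k for k, node in enumerate(free)}
    rows, rhs = [], []
    for i, j, z in edges:                 # each edge measures z = x[j] - x[i]
        if i not in col and j not in col:
            continue
        row, b = np.zeros(len(free)), z
        if j in col:
            row[col[j]] += 1.0
        else:
            b -= x[j]                     # fixed x[j] moves to the right-hand side
        if i in col:
            row[col[i]] -= 1.0
        else:
            b += x[i]                     # fixed x[i] moves to the right-hand side
        rows.append(row)
        rhs.append(b)
    sol = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)[0]
    out = x.copy()
    for node, k in col.items():
        out[node] = sol[k]
    return out

truth = np.arange(6.0)
pairs = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (0, 2), (2, 4)]
edges = [(i, j, truth[j] - truth[i]) for i, j in pairs]
canonical_free = [2, 4]                   # node 0 is also canonical but fixed as the origin
others = [1, 3, 5]

x = np.zeros(6)
for sweep in range(100):
    x = optimize_subset(x, edges, canonical_free)   # Step 1: canonical level
    x = optimize_subset(x, edges, others)           # Step 2: submap level
print(np.round(x, 6))
```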

GPU SBA 

ECCV2010, no code; 

Run SBA on the whole Graph

Time complexity analysis

SBA can be run on each Set in parallel

O(n^3)??

Estimated time: 4-5 h??

Practical Time Bundle Adjustment for 3D Reconstruction on the GPU (ECCV 2010) 

Siddharth Choudhary, Shubham Gupta, and P J Narayanan

Sift Detector O(n) 

Sift Feature Detector on GPU 

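Adapted from SiftGPU to handle high-resolution images.

Feature Match

Adapted from SiftGPU to handle high-resolution images.

GPU KD-Tree Traversal (to be improved)

Pthread + GPU

Multi-GPU control class

GPU Surf Descriptor (rewrite in progress)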
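The thread-per-GPU control pattern can be sketched with one worker thread per device pulling image pairs from a shared queue. Python threads stand in for pthreads here, and `work_fn` is a hypothetical placeholder for the GPU matching call; none of these names come from the actual code.

```python
import queue
import threading

def run_on_devices(tasks, device_ids, work_fn):
    """Run work_fn(device_id, task) for every task, one worker thread per device."""
    pending = queue.Queue()
    for task in tasks:
        pending.put(task)
    results = {}
    lock = threading.Lock()

    def worker(device_id):
        while True:
            try:
                task = pending.get_nowait()
            except queue.Empty:
                return                        # no work left for this device
            out = work_fn(device_id, task)
            with lock:                        # protect the shared result table
                results[task] = (device_id, out)

    threads = [threading.Thread(target=worker, args=(d,)) for d in device_ids]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return results

# Hypothetical usage: match all pairs of 5 images on two devices.
pairs = [(i, j) for i in range(5) for j in range(i + 1, 5)]
results = run_on_devices(pairs, device_ids=[0, 1], work_fn=lambda dev, pair: pair[0] + pair[1])
print(len(results))
```

A shared queue balances load automatically when devices differ in speed, as with the Tesla and Quadro cards listed earlier.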

SBA on GPU 

ECCV2010, no code; 

Float or Double 

Iterate in Float first, then iterate in Double

Fermi GPU

GPU acceleration

SBA internal functions
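The float-then-double idea can be illustrated on a tiny Levenberg-Marquardt fit: run the iterations in single precision until they stall, then continue from that estimate in double precision. The model y = a * exp(b * x), the damping schedule, and the function name are assumptions chosen to keep the sketch short; they do not reproduce SBA itself.

```python
import numpy as np

def lm_fit(params, x, y, dtype, iters=30):
    """Levenberg-Marquardt for y = a * exp(b * x), computed entirely in `dtype`."""
    p = np.asarray(params, dtype=dtype)
    x, y = x.astype(dtype), y.astype(dtype)
    lam = 1e-3                                   # damping factor (plain Python float)

    def residual(q):
        return q[0] * np.exp(q[1] * x) - y

    r = residual(p)
    cost = float(r @ r)
    for _ in range(iters):
        e = np.exp(p[1] * x)
        J = np.stack([e, p[0] * x * e], axis=1)  # Jacobian with respect to (a, b)
        H = J.T @ J
        step = np.linalg.solve(H + lam * np.diag(np.diag(H)), -(J.T @ r))
        trial = p + step
        r_trial = residual(trial)
        c = float(r_trial @ r_trial)
        if c < cost:                             # accept the step, relax the damping
            p, r, cost = trial, r_trial, c
            lam = max(lam / 10.0, 1e-7)
        else:                                    # reject the step, increase the damping
            lam = min(lam * 10.0, 1e6)
    return p, cost

x = np.linspace(0.0, 1.0, 50)
y = 2.0 * np.exp(-1.5 * x)
p32, cost32 = lm_fit([1.0, 0.0], x, y, np.float32)   # cheap single-precision pass
p64, cost64 = lm_fit(p32, x, y, np.float64)          # refinement in double precision
print(p32, p64)
```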

KD-Tree Traversal on GPU

Each block searches for one point

Surf Descriptor

Experimental evaluation

Accuracy of camera localization?

Parallelism between CPU and GPU

Feature Detecting 

Visual Words

Skeletal Graph 

Connectivity of the Canonical View set

Size of the Submap Sets

Meaning and evaluation of the Canonical Views

Scene Summarization for Online Image Collections.

Ian Simon, Noah Snavely…

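Further parallelization

Connect three GPU-equipped machines and distribute the tasks among them

MPI + Pthread + CUDA

Open question

GPU acceleration of PMVS

Once Bundler has been parallelized, PMVS may become the speed bottleneck.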
