ppt_learning_Stereo_Vision

Original source

Stereo Vision:Algorithms and Applications

Author:Stefano Mattoccia

Lab:University of Bologna

Contents: 1.Introduction 2.Overview 3.Matching algorithms 4.Computational optimization 5.Hardware implementation 6.Applications

1.Intro

Target: stereo vision aims to recover depth information from two or more cameras

Binocular vision systems
Dense stereo algorithms
Stereo vision applications

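Definition: Epipolar constraint

In image R, P and Q project onto the same point; the nearer point occludes the farther one.
In image T, P and Q project to two distinct points p and q. The line of sight $PO_R$ (red in the slide) projects onto the line through p and q (green) in the image plane $\Pi_T$; this is the epipolar constraint.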

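Definition: Disparity

Disparity is the difference between the x coordinates of two corresponding points; its relation to depth follows from similar triangles.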

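Definition: Horopter

The depth range the system can perceive is constrained by the disparity range $[d_{min},d_{max}]$.
Disparity values are usually discretized; a finer discretization can be obtained with sub-pixel methods.
With 5 discrete disparity values, for example, the range is $[d_{min},d_{min}+4]$.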
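The link between the disparity range and the perceivable depth range can be sketched numerically. This is a minimal illustration that assumes the standard-form relation $Z = b f / d$ from the triangulation step; the function name and the example values (baseline in metres, focal length in pixels) are hypothetical.

```python
def horopter_depth_range(baseline, focal_px, d_min, d_max):
    """Return (z_near, z_far) covered by the disparity range [d_min, d_max].

    Assumes a rectified pair, so that Z = baseline * focal_px / d.
    """
    if not 0 < d_min <= d_max:
        raise ValueError("need 0 < d_min <= d_max")
    z_near = baseline * focal_px / d_max  # largest disparity -> closest point
    z_far = baseline * focal_px / d_min   # smallest disparity -> farthest point
    return z_near, z_far


# Hypothetical rig: 10 cm baseline, 500 px focal length, 5 disparity levels.
z_near, z_far = horopter_depth_range(0.1, 500.0, d_min=10, d_max=14)
print(z_near, z_far)
```

Shifting $d_{min}$ moves the whole horopter nearer or farther, while widening the range costs more disparity hypotheses per pixel.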

2.Overview of stereo vision system

2.1 Calibration(offline)

target: find the parameters of the camera system

Intrinsic parameters of the two cameras: focal length, image center, lens distortion
Extrinsic parameters: R and T, which align the two cameras
Methods: acquire 10+ stereo pairs of a known pattern, most typically a checkerboard
Available in OpenCV and Matlab [1]; for detailed calibration methods see [2,3,4]

2.2 Rectification

target: bring the stereo pair into standard form
steps: a) remove lens distortion
b) transform the stereo pair into standard form

2.3 Stereo Correspondence

target: find homologous points in the stereo pair and generate the disparity map

2.4 Triangulation

target: compute the 3D position of each correspondence

$Z=\frac{b \cdot f}{d}$, $X=Z\frac{x_R}{f}$, $Y=Z\frac{y_R}{f}$

where $b$ is the baseline, $f$ the focal length and $d$ the disparity
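The triangulation formulas translate directly into code. The sketch below assumes that $x_R$ and $y_R$ are measured relative to the image center and that the focal length is expressed in pixels; the function name and the sample numbers are illustrative only.

```python
def triangulate(x_r, y_r, d, baseline, focal_px):
    """Back-project pixel (x_r, y_r) with disparity d to a 3D point (X, Y, Z).

    x_r and y_r are assumed to be offsets from the image center.
    """
    if d <= 0:
        raise ValueError("disparity must be positive")
    z = baseline * focal_px / d
    x = z * x_r / focal_px
    y = z * y_r / focal_px
    return x, y, z


# Hypothetical values: 20 cm baseline, 800 px focal length, 20 px disparity.
point = triangulate(100, -50, 20, baseline=0.2, focal_px=800.0)
print(point)
```

Because $Z$ is inversely proportional to $d$, a one-pixel disparity error matters far more for distant points than for close ones.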

Relevant:

Datasets:stereo sequences
Including

calibration parameters
original sequences
rectified sequences
disparity maps

Middlebury stereo evaluation
Provides a framework and datasets for benchmarking new methods against established baselines

Challenges to overcome: common pitfalls that make stereo correspondence hard include photometric distortions and noise, specular surfaces, foreshortening, perspective distortions, uniform/ambiguous regions, repetitive patterns, transparent objects, occlusions and discontinuities

3.The correspondence problem

According to [5], most stereo algorithms follow these steps:

1.Matching cost computation
2.Cost aggregation
3.Disparity computation/optimization
4.Disparity refinement
Local: 1->2->3 (using a winner-takes-all (WTA) strategy)
Global and semi-global: 1(->2)->3

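3.1 Preprocessing (0)

Typical methods: Laplacian of Gaussian (LoG) filtering [6], subtraction of the neighbourhood mean [7], Bilateral Filtering [8]

Improving on naive pixel-by-pixel matching:

Local: aggregate pixel costs over a support window to increase the signal-to-noise ratio (SNR)
Global: minimize an energy function built on pixel-based matching costs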

$$E(d)=E_{data}(d) + E_{smooth}(d)$$
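To make the two terms concrete, here is a minimal sketch that evaluates $E(d)$ on a single scanline. The notes do not define the terms, so two assumptions are made: $E_{data}$ is the sum of pixel-wise matching costs, and $E_{smooth}$ is a linear penalty `lam * |d_x - d_{x+1}|` between neighbouring pixels. The names `energy`, `cost` and `lam` are hypothetical.

```python
def energy(disparities, cost, lam=1.0):
    """Evaluate E(d) = E_data(d) + E_smooth(d) for one scanline.

    cost[x][d] is the matching cost of assigning disparity d to pixel x.
    The smoothness term lam * |d_x - d_{x+1}| is an assumed, common choice.
    """
    e_data = sum(cost[x][d] for x, d in enumerate(disparities))
    e_smooth = lam * sum(abs(a - b) for a, b in zip(disparities, disparities[1:]))
    return e_data + e_smooth


# Three pixels, two disparity hypotheses each.
cost = [[0, 5], [4, 1], [0, 5]]
print(energy([0, 1, 0], cost, lam=1.0))  # follows the data term closely
print(energy([0, 0, 0], cost, lam=1.0))  # perfectly smooth assignment
```

With a small `lam` the jagged assignment wins; increasing `lam` makes the smooth assignment cheaper, which is the trade-off global methods tune.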

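3.2 Matching cost computation (1)

3.2.1 Pixel-based matching costs

Absolute differences (AD)
$e(x,y,d) = |I_R(x,y)-I_T(x+d,y)|$
Squared differences (SD)
$e(x,y,d) = (I_R(x,y)-I_T(x+d,y))^2$
Robust measures
Limit the influence of outliers, e.g. Truncated Absolute Differences (TAD)
$e(x,y,d) = \min \{ |I_R(x,y)-I_T(x+d,y)|,T \}$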
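The three pixel-based costs can be written as one-liners. In this sketch images are plain lists of rows indexed as `img[y][x]`, which is an assumption made for illustration; the function names are hypothetical.

```python
def ad(ref, tgt, x, y, d):
    """Absolute difference between I_R(x, y) and I_T(x + d, y)."""
    return abs(ref[y][x] - tgt[y][x + d])


def sd(ref, tgt, x, y, d):
    """Squared difference between I_R(x, y) and I_T(x + d, y)."""
    return (ref[y][x] - tgt[y][x + d]) ** 2


def tad(ref, tgt, x, y, d, t):
    """Absolute difference truncated at t, which caps the effect of outliers."""
    return min(ad(ref, tgt, x, y, d), t)


ref = [[10, 50, 200]]
tgt = [[0, 10, 60]]
print(ad(ref, tgt, 1, 0, 1), sd(ref, tgt, 1, 0, 1))
print(ad(ref, tgt, 2, 0, 0), tad(ref, tgt, 2, 0, 0, t=20))
```

The last line shows the point of truncation: a large mismatch contributes at most `t` to any later aggregation.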
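3.2.2 Area-based matching costs
Sum of Absolute Differences (SAD)
$C(x,y,d) = \sum_{(i,j)\in S(x,y)}|I_R(i,j)-I_T(i+d,j)|$
Sum of Squared Differences (SSD)
$C(x,y,d) = \sum_{(i,j)\in S(x,y)}(I_R(i,j)-I_T(i+d,j))^2$
Sum of Truncated Absolute Differences (STAD)
$C(x,y,d) = \sum_{(i,j)\in S(x,y)}\min\{|I_R(i,j)-I_T(i+d,j)|,T\}$
where $S(x,y)$ is the support window centred on $(x,y)$
Other measures: Normalized Cross Correlation [9], Zero mean Normalized Cross Correlation [10], Gradient based MF [11], Non parametric [12,13], Mutual Information [14]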
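As an illustration of an area-based cost, the following sketch computes SAD over a square window. It assumes images stored as lists of rows (`img[y][x]`) and leaves border handling to the caller; the `radius` parameter and the synthetic test pattern are hypothetical.

```python
def sad(ref, tgt, x, y, d, radius=1):
    """Sum of absolute differences over a (2*radius+1)^2 window centred on (x, y).

    The caller must keep the window, shifted by d, inside both images.
    """
    total = 0
    for j in range(y - radius, y + radius + 1):
        for i in range(x - radius, x + radius + 1):
            total += abs(ref[j][i] - tgt[j][i + d])
    return total


# Synthetic pair: the target is the reference shifted right by 2 pixels.
ref = [[(3 * x * x + 5 * y) % 17 for x in range(8)] for y in range(5)]
tgt = [[0, 0] + row[:-2] for row in ref]
print(sad(ref, tgt, 2, 2, 2), sad(ref, tgt, 2, 2, 0))
```

The cost drops to zero at the true shift and is positive at the wrong one, which is what a later disparity-selection step relies on.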

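3.3 Cost aggregation (2)

Fixed window (FW):
Drawbacks: 1. implicitly assumes frontal-parallel surfaces 2. ignores depth discontinuities 3. cannot handle uniform regions 4. cannot handle repetitive patterns
Advantages: simple, practical, real-time capable, small memory footprint, low power consumption in hardware
Optimizations: 1. Integral Images (II, 1984) 2. Box-Filtering (BF, 1981) 3. Single Instruction Multiple Data (SIMD)

II versus BF: both need four operations per point. II can handle windows of different shapes, but it needs more memory and is prone to overflow because the accumulated sums grow with the image size (the notes give $S^2$ for an image of size $S$). For BF variants with other window shapes, see [15]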
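The "four operations per point" property of integral images can be seen in a small sketch. It pads the table with one row and one column of zeros, a common convention assumed here; the function names are hypothetical. Python integers do not overflow, so the overflow concern mentioned above applies to fixed-width implementations, not to this example.

```python
def integral_image(img):
    """Build ii where ii[y][x] is the sum of img over rows < y and columns < x."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def box_sum(ii, x0, y0, x1, y1):
    """Sum over the inclusive rectangle [x0..x1] x [y0..y1] using four lookups."""
    return ii[y1 + 1][x1 + 1] - ii[y0][x1 + 1] - ii[y1 + 1][x0] + ii[y0][x0]


img = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
ii = integral_image(img)
print(box_sum(ii, 0, 0, 2, 2), box_sum(ii, 1, 1, 2, 2))
```

Applied to a per-disparity cost slice instead of `img`, `box_sum` gives the fixed-window aggregate for any window size at constant cost per pixel.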

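3.3.1 A concrete example (LIVE DEMO) [16][17]

- Grayscale images
  - Preprocessing: mean subtraction
  - Matching cost: absolute differences
  - Cost aggregation: fixed window (FW)
  - Disparity selection: winner-takes-all (WTA)
  - Outlier detection
  - Uniform regions discarded
  - Optimizations: BF+SIMD
  - Sub-pixel interpolation to 1/16 of a pixel
  - Runs in real time on a standard PC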
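The core of such a pipeline (absolute differences, fixed-window aggregation, WTA selection) can be sketched for one image row. This is not the implementation of [16][17]: preprocessing, outlier handling, the BF+SIMD optimizations and sub-pixel interpolation are all left out, and the function name, parameters and synthetic images are hypothetical.

```python
def wta_disparity_row(ref, tgt, y, d_max, radius=1):
    """Disparity for each pixel of row y: fixed-window SAD plus winner-takes-all.

    Pixels whose window would leave the image for some d in [0, d_max] get None.
    """
    w = len(ref[0])
    out = [None] * w
    for x in range(radius, w - radius - d_max):
        best_d, best_cost = 0, None
        for d in range(d_max + 1):
            cost = sum(abs(ref[j][i] - tgt[j][i + d])
                       for j in range(y - radius, y + radius + 1)
                       for i in range(x - radius, x + radius + 1))
            if best_cost is None or cost < best_cost:  # keep the lowest cost
                best_d, best_cost = d, cost
        out[x] = best_d
    return out


# Synthetic pair: the target is the reference shifted right by 2 pixels.
ref = [[x * x + 10 * y for x in range(12)] for y in range(5)]
tgt = [[0, 0] + row[:-2] for row in ref]
row_disp = wta_disparity_row(ref, tgt, y=2, d_max=4)
print(row_disp)
```

The brute-force window sum is recomputed for every pixel and disparity here; box filtering or integral images remove that redundancy.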

3.3.2 Some better-performing approaches
Shiftable Windows [18], Multiple Window [19], Variable Window [20], Segmentation based (assumes disparity varies smoothly within each segment) [21], Bilateral Filtering [22], Adaptive Weights [23], Segment Support [24], Fast Aggregation [25], Fast Bilateral Stereo (FBS) [26], Locally Consistent (LC) stereo [27]

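3.4 Disparity computation/optimization (3)

Goal: find the disparity assignment that minimizes the energy function
The problem is NP-hard, so it is solved with approximate methods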

Graph Cuts[28]
Belief Propagation[29]
Cooperative optimization[30]
For a comparison of energy minimization methods, see the PAMI article [31]
Dynamic Programming(DP) [5]
Scanline Optimization(SO) [32]
SO + Support Aggregation [33]
Enforcing Local Consistency of disparity in SO/DP [34]
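Restricting the energy to a single scanline makes exact minimization tractable with dynamic programming. The sketch below is a generic Viterbi-style scanline optimization, not the specific algorithm of any cited paper; the linear smoothness penalty `lam * |d_x - d_{x+1}|` and all names are assumptions.

```python
def scanline_optimization(cost, lam=1.0):
    """Minimise sum_x cost[x][d_x] + lam * sum_x |d_x - d_{x+1}| on one scanline."""
    n, m = len(cost), len(cost[0])
    total = [list(cost[0])]  # total[x][d]: best energy of a path ending at (x, d)
    back = []                # back[x - 1][d]: best predecessor disparity
    for x in range(1, n):
        row, ptr = [], []
        for d in range(m):
            prev = min(range(m), key=lambda p: total[-1][p] + lam * abs(p - d))
            row.append(cost[x][d] + total[-1][prev] + lam * abs(prev - d))
            ptr.append(prev)
        total.append(row)
        back.append(ptr)
    d = min(range(m), key=lambda k: total[-1][k])
    path = [d]
    for ptr in reversed(back):  # walk the back-pointers from right to left
        d = ptr[d]
        path.append(d)
    return path[::-1]


cost = [[0, 5], [4, 1], [0, 5]]
print(scanline_optimization(cost, lam=1.0))
print(scanline_optimization(cost, lam=3.0))
```

Each scanline is solved independently, which is why plain DP/SO results tend to show horizontal streaks; the SO-based variants listed above address that in different ways.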

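3.5 Disparity refinement (4)

The output of a matching algorithm contains outliers that must be identified and corrected
Disparities are computed at discrete pixel resolution, so their precision also needs to be improved
3.5.1 Sub-pixel interpolation
Interpolate the matching costs around the best integer disparity to locate the minimum; [35] proposes a floating-point-free method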
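One common way to interpolate is to fit a parabola through the best cost and its two neighbours and take the vertex. The sketch below shows that choice; it uses floating point and is not the floating-point-free method of [35]. The function name is hypothetical.

```python
def subpixel_disparity(costs, d):
    """Refine integer disparity d with a parabola through costs[d-1], costs[d], costs[d+1]."""
    if d <= 0 or d >= len(costs) - 1:
        return float(d)  # a neighbour is missing: keep the integer value
    c_prev, c_best, c_next = costs[d - 1], costs[d], costs[d + 1]
    denom = c_prev - 2 * c_best + c_next
    if denom <= 0:
        return float(d)  # flat or non-convex neighbourhood: no reliable vertex
    return d + 0.5 * (c_prev - c_next) / denom


# Costs sampled from (k - 2.25)^2, so the true minimum lies at 2.25.
costs = [(k - 2.25) ** 2 for k in range(5)]
refined = subpixel_disparity(costs, 2)
print(refined)
```

Quantizing the fractional part to a fixed number of steps (for example 1/16 pixel) keeps the result in integer arithmetic friendly form.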
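3.5.2 Image filtering
Median filtering, morphological operators, BF
3.5.3 Bidirectional Matching (BM)
Used to detect outliers [36]: a match is accepted when the left-to-right and right-to-left disparities differ by less than T, typically T = 1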
$$|d_{LR}(x,y) - d_{RL}(x+d_{LR}(x,y),y)|<T$$
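The consistency test above maps to a short mask computation. The sketch assumes both disparity maps store non-negative magnitudes as lists of rows and follows the indexing of the formula, $x + d_{LR}(x,y)$; matches that point outside the image are rejected. The function name is hypothetical.

```python
def left_right_check(d_lr, d_rl, t=1):
    """Mask that is True where |d_LR(x, y) - d_RL(x + d_LR(x, y), y)| < t."""
    mask = []
    for row_lr, row_rl in zip(d_lr, d_rl):
        w = len(row_lr)
        out = []
        for x, d in enumerate(row_lr):
            xt = x + d  # matching column in the other view
            out.append(0 <= xt < w and abs(d - row_rl[xt]) < t)
        mask.append(out)
    return mask


d_lr = [[2, 2, 0, 0]]
d_rl = [[0, 0, 2, 1]]
print(left_right_check(d_lr, d_rl, t=1))
print(left_right_check(d_lr, d_rl, t=2))
```

Pixels that fail the check are typically occlusions or mismatches, and are candidates for replacement in later refinement steps.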
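3.5.4 Single Matching Phase (SMP) [37]
3.5.5 Segmentation-based methods
Segmentation-based outlier identification and replacement
Based on two assumptions: 1. disparity varies smoothly within each segment 2. each segment can be approximated by a plane
Each segment is modelled as a plane in 3D, satisfying $d(x,y)=ax+by+\gamma$
The plane can be fitted robustly with RANSAC [38] or Histogram Voting [30]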
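A minimal RANSAC sketch for the plane model of one segment follows. It is a generic illustration of [38] applied to `(x, y, d)` samples, not the procedure of any cited stereo paper; the names, the tolerance, the iteration count and the synthetic data are hypothetical, and histogram voting is not shown.

```python
import random


def plane_from_points(p1, p2, p3):
    """Exact plane d = a*x + b*y + c through three (x, y, d) samples, or None."""
    (x1, y1, d1), (x2, y2, d2), (x3, y3, d3) = p1, p2, p3
    det = (x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)
    if det == 0:
        return None  # the three image points are collinear
    a = ((d2 - d1) * (y3 - y1) - (d3 - d1) * (y2 - y1)) / det
    b = ((x2 - x1) * (d3 - d1) - (x3 - x1) * (d2 - d1)) / det
    return a, b, d1 - a * x1 - b * y1


def ransac_plane(points, iterations=200, tol=0.5, seed=0):
    """Return (model, inliers) for the plane supported by the most points."""
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(iterations):
        model = plane_from_points(*rng.sample(points, 3))
        if model is None:
            continue
        a, b, c = model
        inliers = [p for p in points if abs(a * p[0] + b * p[1] + c - p[2]) <= tol]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
    return best_model, best_inliers


# Segment lying on d = 0.5x + 0.25y + 3, plus three gross outliers.
points = [(x, y, 0.5 * x + 0.25 * y + 3) for x in range(5) for y in range(5)]
points += [(0, 0, 40.0), (2, 3, -20.0), (4, 1, 60.0)]
model, inliers = ransac_plane(points)
print(model, len(inliers))
```

Once the plane is known, disparities flagged as outliers inside the segment can be replaced by the plane's value at that pixel.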
3.5.6 Accurate Localization of borders and occlusions[39]

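4.Applications

3D tracking: object counting, trajectory monitoring, security
Scanning, 2D-to-3D conversion, augmented reality

Please read the references listed below carefully, using the summary in the original slides as a guide. The slides were last updated in 2012, so look up state-of-the-art methods published after 2010 separately, for example in CVPR and PAMI or on the Middlebury evaluation website.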

[1] Jean-Yves Bouguet , Camera Calibration Toolbox for Matlab
[2] E. Trucco, A. Verri, Introductory Techniques for 3-D Computer Vision, Prentice Hall, 1998
[3] R.I.Hartley, A. Zisserman, Multiple View Geometry in Computer Vision, Cambridge University Press, 2000
[4] G. Bradski, A. Kaehler, Learning OpenCV, O’Reilly, 2008
[5] D. Scharstein and R. Szeliski, A taxonomy and evaluation of dense two-frame stereo correspondence algorithms
Int. Jour. Computer Vision, 47(1/2/3):7–42, 2002
[6] T. Kanade, H. Kato, S. Kimura, A. Yoshida, and K. Oda, Development of a Video-Rate Stereo Machine
International Robotics and Systems Conference (IROS ‘95), Human Robot Interaction and Cooperative Robots, 1995
[7] O. Faugeras, B. Hotz, H. Mathieu, T. Viéville, Z. Zhang, P. Fua, E. Théron, L. Moll, G. Berry,
Real-time correlation-based stereo: Algorithm. Implementation and Applications, INRIA TR n. 2013, 1993
[8] A. Ansar, A. Castano, L. Matthies, Enhanced real-time stereo using bilateral filtering
IEEE Conference on Computer Vision and Pattern Recognition 2004
[9] S. Mattoccia, F. Tombari, L. Di Stefano, Fast full-search equivalent template matching by Enhanced Bounded
Correlation, IEEE Transactions on Image Processing, 17(4), pp 528-538, April 2008
[10] L. Di Stefano, S. Mattoccia, F. Tombari, ZNCC-based template matching using Bounded Partial Correlation
Pattern Recognition Letters, 26(14), pp 2129-2134, October 2005
[11] F. Tombari, L. Di Stefano, S. Mattoccia, A. Galanti, Performance evaluation of robust matching measures
3rd International Conference on Computer Vision Theory and Applications (VISAPP 2008)
[12] R. Zabih, J. Woodfill, Non-parametric Local Transforms for Computing Visual Correspondence, ECCV 1994
[13] D. N. Bhat, S. K. Nayar, Ordinal measures for visual correspondence, CVPR 1996
[14] H. Hirschmüller. Stereo vision in structured environments by consistent semi-global matching.
CVPR 2006, PAMI 30(2):328-341, 2008
[15] Changming Sun, Recursive Algorithms for Diamond, Hexagon and General Polygonal Shaped Window Operations
Pattern Recognition Letters, 27(6):556-566, April 2006
[16] L. Di Stefano, M. Marchionni, S. Mattoccia, A fast area-based stereo matching algorithm, Image and Vision Computing,
22(12), pp 983-1005, October 2004
[17] L. Di Stefano, M. Marchionni, S. Mattoccia, A PC-based real-time stereo vision system, Machine Graphics & Vision,
13(3), pp. 197-220, January 2004
[18] D. Scharstein and R. Szeliski, A taxonomy and evaluation of dense two-frame stereo correspondence algorithms
Int. Jour. Computer Vision, 47(1/2/3):7–42, 2002
[19] H. Hirschmuller, P. Innocent, and J. Garibaldi, Real-time correlation-based stereo vision with reduced border errors
Int. Journ. of Computer Vision, 47:1–3, 2002
[20] O. Veksler. Fast variable window for stereo correspondence using integral images, In Proc. Conf. on Computer Vision
and Pattern Recognition (CVPR 2003), pages 556–561, 2003
[21] M. Gerrits and P. Bekaert. Local Stereo Matching with Segmentation-based Outlier Rejection
In Proc. Canadian Conf. on Computer and Robot Vision (CRV 2006), pages 66-66, 2006
[22] C. Tomasi and R. Manduchi. Bilateral filtering for gray and color images. In ICCV98, pages 839–846, 1998
[23] K. Yoon and I. Kweon. Adaptive support-weight approach for correspondence search IEEE PAMI, 28(4):650–656, 2006
[24] F. Tombari, S. Mattoccia, L. Di Stefano, Segmentation-based adaptive support for accurate stereo correspondence IEEE Pacific-Rim Symposium on Image and Video Technology (PSIVT 2007)
[25] F. Tombari, S. Mattoccia, L. Di Stefano, E. Addimanda, Near real-time stereo based on effective cost aggregation International Conference on Pattern Recognition (ICPR 2008)
[26] S. Mattoccia, S. Giardino,A. Gambini, Accurate and efficient cost aggregation strategy for stereo correspondence
based on approximated joint bilateral filtering, Asian Conference on Computer Vision (ACCV2009)
[27] S. Mattoccia, A locally global approach to stereo correspondence, 3D Digital Imaging and Modeling (3DIM2009)
[28] V. Kolmogorov and R. Zabih, Computing visual correspondence with occlusions using graph cuts, ICCV 2001
[29] A. Klaus, M. Sormann and K. Karner, Segment-based stereo matching using belief propagation and a self-adapting
dissimilarity measure, ICPR 2006
[30] Z. Wang and Z. Zheng, A region based stereo matching algorithm using cooperative optimization, CVPR 2008
[31] R.Szeliski, R. Zabih, D. Scharstein, O. Veksler, V. Kolmogorov, A. Agarwala, M. Tappen, C. Rother, A Comparative
Study of Energy Minimization Methods for Markov Random Fields with Smoothness-Based Priors, IEEE Transactions on
Pattern Analysis and Machine Intelligence, 30, 6, June 2008, pp 1068-1080
[32] H. Hirschmüller. Stereo vision in structured environments by consistent semi-global matching.
CVPR 2006, PAMI 30(2):328-341, 2008
[33] S. Mattoccia, F. Tombari, and L. Di Stefano, Stereo vision enabling precise border localization within a scanline
optimization framework, ACCV 2007
[34] S. Mattoccia, Improving the accuracy of fast dense stereo correspondence algorithms by enforcing local consistency of disparity fields, 3DPVT2010
[35] L. Di Stefano, S. Mattoccia, Real-time stereo within the VIDET project Real-Time Imaging, 8(5), pp. 439-453, Oct. 2002
[36] P. Fua, Combining stereo and monocular information to compute dense depth maps that preserve depth discontinuities 12th. Int. Joint Conf. on Artificial Intelligence, pp 1292–1298, 1993
[37] L. Di Stefano, M. Marchionni, S. Mattoccia, A fast area-based stereo matching algorithm, Image and Vision Computing,
22(12), pp 983-1005, October 2004
[38] M. A. Fischler and R. C. Bolles, Random Sample Consensus: A Paradigm for Model Fitting with Applications to Image
Analysis and Automated Cartography, Comm. of the ACM 24: 381–395, June 1981
[39] S. Mattoccia, F. Tombari, and L. Di Stefano, Stereo vision enabling precise border localization within a scanline optimization framework, ACCV 2007