VisualComputing_1

Notes from my advisor's computer vision course, written up as a review before the final exam.

Introduction

What is vision?

  • Perception: the integration of image data with prior knowledge in the brain
  • A field that acquires, processes, analyzes, and understands visual data

Computer Vision vs. Human Vision

  • ill-posed problems
  • mathematical models
  • discrete vs. continuous
  • local vs. global optimization

What kinds of topics?

Related subfields:
Low level: image denoising, deblurring, super-resolution, photo-sketch synthesis, texture synthesis, optical flow, image matching

Middle level: image segmentation, motion capture, visual tracking, 3D reconstruction

High level: object detection, image understanding, video understanding

Concrete application problems:

Related problems: medical imaging, optical character recognition (OCR), face detection, smile detection, vision-based biometrics, shape capture, autonomous driving

Why is image restoration challenging?

  • Real noise is much more complex than additive white Gaussian noise
  • Blur is non-uniform and hard to estimate accurately
  • The space of local image structures is huge, so the inverse problem is highly ill-posed

Sparse Representation and Dictionary Learning for Restoration

Consider the linear system $Ax=b$. If $A$ is square and full rank, $x = A^{-1}b$. If $A$ is a tall matrix (over-determined), there is in general no exact solution, so we approximate by minimizing $||Ax-b||_2^2$. If $A$ is a fat matrix (under-determined), there are infinitely many solutions in general, so some constraint must be imposed to pick one.
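A quick numerical illustration of the three cases (a minimal NumPy sketch; the matrices are random stand-ins):

```python
import numpy as np

rng = np.random.default_rng(0)

# Square, full-rank A: unique solution x = A^{-1} b.
A = rng.standard_normal((4, 4))
b = rng.standard_normal(4)
x_exact = np.linalg.solve(A, b)

# Tall A (over-determined): no exact solution in general,
# so minimize ||Ax - b||_2^2 (least squares).
A_tall = rng.standard_normal((8, 4))
b_tall = rng.standard_normal(8)
x_ls, *_ = np.linalg.lstsq(A_tall, b_tall, rcond=None)

# Fat A (under-determined): infinitely many solutions; the
# pseudo-inverse picks the minimum L2-norm one. Sparsity is
# another common constraint, leading to the L0/L1 problems below.
A_fat = rng.standard_normal((4, 8))
b_fat = rng.standard_normal(4)
x_min_norm = np.linalg.pinv(A_fat) @ b_fat
```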

Suppose the discrepancy between the estimate and the observation is measured under the L0, L1, L2, or some other norm: L0 gives a non-convex optimization problem; L1 is the tightest convex relaxation of L0 and promotes sparse solutions; L2 has a closed-form but dense solution. The shapes of the norm balls, as the contour lines of an optimization problem approach them, are shown below.
Figure: norm balls of the L0, L1, and L2 norms.
Although L1 can approximate L0, it sometimes yields non-sparse solutions. It has been proven mathematically that if the matrix satisfies the Restricted Isometry Property (RIP), replacing L0 with L1 is guaranteed to recover the sparse solution. Intuitively, RIP means that multiplying any small subset of the columns of $A$ by an arbitrary sparse vector $x$ roughly preserves its length, so the result stays within a thin ring-shaped neighborhood, as illustrated below.
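For reference, the standard formal statement: $A$ satisfies the RIP of order $k$ with constant $\delta_k \in (0,1)$ if, for every $k$-sparse vector $x$,

$$(1-\delta_k)||x||_2^2 \le ||Ax||_2^2 \le (1+\delta_k)||x||_2^2$$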
Figure: an intuitive illustration of RIP.

The Image Restoration Problem

Modeling

$y=Hx+v$, where $H$ is the observation (degradation) matrix and $v$ is noise.
Keys to solving this ill-posed problem:

  • Modeling the degradation process
  • Good prior knowledge about the clean image
  • A good objective function for minimization

Here $H$ is the identity matrix for denoising; a blurring matrix for deblurring; a compound of blurring and downsampling matrices for super-resolution; and an indicator matrix marking the damaged pixels for inpainting.
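As a concrete sketch of these operators (NumPy, on a 1-D signal for simplicity; the blur kernel and mask below are illustrative assumptions):

```python
import numpy as np

n = 8
x = np.arange(1.0, n + 1)           # "clean" signal

H_denoise = np.eye(n)               # denoising: H = I

# deblurring: H is a (circular) blurring matrix, each row a blur kernel
kernel = np.array([0.25, 0.5, 0.25])
H_blur = np.zeros((n, n))
for i in range(n):
    for j, k in enumerate(kernel):
        H_blur[i, (i + j - 1) % n] = k

# super-resolution: downsampling composed with blurring
D = np.eye(n)[::2]                  # keep every other row
H_sr = D @ H_blur

# inpainting: indicator matrix selecting the surviving pixels
mask = np.array([1, 1, 0, 1, 0, 0, 1, 1], dtype=bool)
H_inpaint = np.eye(n)[mask]

v = 0.1 * np.random.default_rng(0).standard_normal(H_sr.shape[0])
y = H_sr @ x + v                    # degraded observation y = Hx + v
```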

Methodology

Filter-based methods: Gaussian low-pass filtering, PDE-based anisotropic diffusion, bilateral filtering, non-local means filtering (moving from local to non-local filtering improves performance greatly)

Transform-based methods: Fourier (global, orthogonal), wavelets (local, small support), ridgelets (more redundant), dictionary learning (over-complete)

  • Represent $x$ over a dictionary $D$, enforcing the coefficient vector to be sparse (robust)
  • Objective model: $min_\alpha ||HD\alpha-y||_2^2+\lambda ||\alpha||_1$

The basic procedure

  1. Partition the degraded image into overlapped patches (e.g., $8\times 8$)
  2. For each patch, solve the nonlinear L1-norm sparse coding problem:
    $\hat{\alpha} = argmin_\alpha ||HD\alpha-y||_2^2+\lambda||\alpha||_1$
  3. Reconstruct each patch by $\hat{x}=D\hat{\alpha}$
  4. Put the reconstructed patches back and average the overlapped pixels
  5. In practice, steps 1-4 can be iterated for several rounds (a minimal code sketch of the whole procedure follows this list)
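Below is a minimal NumPy sketch of steps 1-4 for the denoising case ($H=I$); the dictionary D is assumed to be given (e.g., pre-learned), and the inner L1 solver is the proximal-gradient (ISTA) method described later in these notes:

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def sparse_code_ista(D, y, lam, n_iter=100):
    """Solve min_a ||D a - y||_2^2 + lam*||a||_1 by proximal gradient (ISTA)."""
    L = 2.0 * np.linalg.norm(D, 2) ** 2    # Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = 2.0 * D.T @ (D @ a - y)
        a = soft_threshold(a - grad / L, lam / L)
    return a

def restore(y_img, D, lam=0.1, patch=8):
    """Patch-wise sparse-coding restoration, denoising case (H = I).
    y_img: float 2-D array; D: dictionary of shape (patch*patch, n_atoms)."""
    h, w = y_img.shape
    out = np.zeros(y_img.shape)
    weight = np.zeros(y_img.shape)
    for i in range(h - patch + 1):                 # 1. overlapping patches
        for j in range(w - patch + 1):
            p = y_img[i:i+patch, j:j+patch].ravel()
            a = sparse_code_ista(D, p, lam)        # 2. sparse coding
            rec = (D @ a).reshape(patch, patch)    # 3. x_hat = D a
            out[i:i+patch, j:j+patch] += rec       # 4. put back and ...
            weight[i:i+patch, j:j+patch] += 1.0
    return out / weight                            # ... average overlaps
```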

why sparse?

  • Neuroscience
  • Bayesian
  • Compressive Sensing

How to solve?

L0: greedy search (Matching Pursuit, Orthogonal Matching Pursuit)

  • MP: greedily select the atom most correlated with the current residual
  • OMP: additionally orthogonalize: re-use the information of all previously selected atoms to form a new projection at each step (see the sketch after this list)
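A minimal NumPy sketch of OMP (assuming unit-norm atoms; the atom-selection line is the MP step, and the joint least-squares re-fit is what makes it orthogonal):

```python
import numpy as np

def omp(D, y, k):
    """Orthogonal Matching Pursuit: greedy solver for y ~ D a with ||a||_0 <= k.
    Assumes the columns (atoms) of D have unit norm."""
    residual = y.copy()
    support = []
    a = np.zeros(D.shape[1])
    for _ in range(k):
        # MP step: pick the atom most correlated with the current residual
        idx = int(np.argmax(np.abs(D.T @ residual)))
        support.append(idx)
        # Orthogonal step: jointly re-fit ALL selected atoms by least squares,
        # which is exactly what distinguishes OMP from plain MP
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    a[support] = coef
    return a
```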
L1:

  • Linear programming
  • Iteratively reweighted least squares: approximates the L1 norm by cleverly reweighted L2 problems
  • Proximal gradient descent: soft-thresholding, which has an analytic solution (the inner solver used in the restoration sketch above)
  • Augmented Lagrangian methods (Alternating Direction Method of Multipliers, ADMM)

ADMM

The Lagrangian

Move the constraints into the objective function with Lagrange multipliers.
E.g., $min\ f(x) \ s.t.\ Ax=b$
Lagrangian form: $L(x,\lambda)=f(x)+\lambda^T (Ax-b)$
For problems with inequality constraints, the KKT conditions apply: $h(x)=0,\ \lambda \ge 0,\ \lambda g(x)=0,\ g(x) \le 0$, where $h$ denotes the equality constraints, $g$ the inequality constraints, and $\lambda$ the inequality multipliers.

KKT conditions: for the problem $L(x,\lambda) = f(x)+\lambda g(x)$,
the following must hold:

  • $\nabla_x L(x,\lambda)=0$ (stationarity)
  • $\lambda \ge 0$ (dual feasibility)
  • $\lambda g(x)=0$ (complementary slackness)
  • $g(x) \le 0$ (primal feasibility)
  • $\nabla_{xx}^2 L(x,\lambda)$ is PSD (second-order condition)
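A small worked example: minimize $x^2$ subject to $x \ge 1$, i.e. $g(x)=1-x \le 0$. Then

$$L(x,\lambda)=x^2+\lambda(1-x),\quad \nabla_x L = 2x-\lambda = 0,\quad \lambda(1-x)=0$$

If $\lambda = 0$, stationarity gives $x=0$, which violates $g(x) \le 0$; so $x=1$ with $\lambda = 2 \ge 0$, and $\nabla_{xx}^2 L = 2$ is PSD. Hence $x^* = 1$.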

The dual problem

For the primal problem with Lagrangian $L(x,\lambda)=f(x)+\lambda^T(Ax-b)$,
the dual function is $g(\lambda)= inf_x L(x,\lambda) = -f^*(-A^T\lambda)-b^T\lambda$, where $inf$ denotes the infimum and $f^*$ the convex conjugate.
Dual problem: $max\ g(\lambda)$
Dual ascent: $x^{k+1} = argmin_x L(x,\lambda^k)$

Multiplier update: $\lambda^{k+1}=\lambda^k +\alpha^k(Ax^{k+1}-b)$
Dual decomposition: split the objective into subfunctions $f(x)=\sum_{i=1}^N f_i(x_i)$, so that the $x$-update decouples across the $x_i$. A sketch of dual ascent follows.
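A minimal NumPy sketch of dual ascent on the toy problem $min\ \frac{1}{2}||x||_2^2\ s.t.\ Ax=b$ (the step size and iteration count are hand-picked assumptions for this toy problem):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 6))
b = rng.standard_normal(3)

lam = np.zeros(3)      # dual variable (multiplier)
alpha = 0.05           # step size
for _ in range(5000):
    # x-update: argmin_x (1/2)||x||^2 + lam^T (Ax - b)  =>  x = -A^T lam
    x = -A.T @ lam
    # multiplier update: gradient ascent on the dual function
    lam = lam + alpha * (A @ x - b)

print(np.allclose(A @ x, b, atol=1e-4))  # x -> minimum-norm solution of Ax = b
```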

Augmented Lagrangian: to make dual ascent more robust, add a quadratic penalty term:
$$L_\rho(x,\lambda)=f(x)+\lambda^T(Ax-b)+(\rho/2)||Ax-b||_2^2$$

ADMM

ADMM combines the decomposability of dual ascent with the good convergence properties of the method of multipliers.
Optimization problem: $$ min\ f(x)+g(z) \ \ s.t. \ Ax+Bz=c $$
Augmented Lagrangian form: $L_\rho(x,z,\lambda)=f(x)+g(z)+\lambda^T(Ax+Bz-c)+(\rho/2)||Ax+Bz-c||_2^2$

Iterations:

  • $x^{k+1} = argmin_x L_\rho (x,z^k,\lambda^k)$
  • $z^{k+1} = argmin_z L_\rho (x^{k+1},z,\lambda^k)$
  • $\lambda^{k+1} = \lambda^k + \rho(Ax^{k+1}+Bz^{k+1}-c)$

Here $\rho >0$. Stopping criterion: the dual residual falls below a small tolerance $\epsilon$.
Convergence: reaching high accuracy can take many iterations, but ADMM can be combined with other algorithms to produce high precision quickly.
For convex problems, the KKT conditions guarantee that the dual problem attains the same solution; non-convex problems may exhibit a duality gap. A lasso example is sketched below.
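A minimal NumPy sketch of ADMM applied to the lasso problem $min_x\ \frac{1}{2}||Ax-b||_2^2+\gamma||x||_1$, split as $f(x)+g(z)$ with the constraint $x-z=0$ (the values of $\rho$ and $\gamma$ below are illustrative):

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def admm_lasso(A, b, gamma, rho=1.0, n_iter=200):
    """ADMM for min_x (1/2)||Ax - b||_2^2 + gamma*||x||_1,
    written as f(x) + g(z) s.t. x - z = 0 (scaled dual u = lambda/rho)."""
    n = A.shape[1]
    AtA_rhoI = A.T @ A + rho * np.eye(n)   # reused by every x-update
    Atb = A.T @ b
    x, z, u = np.zeros(n), np.zeros(n), np.zeros(n)
    for _ in range(n_iter):
        x = np.linalg.solve(AtA_rhoI, Atb + rho * (z - u))  # x-update: ridge problem
        z = soft_threshold(x + u, gamma / rho)              # z-update: prox of L1
        u = u + x - z                                       # dual (multiplier) update
    return z

# usage: recover a sparse vector from a few noisy measurements
rng = np.random.default_rng(0)
A = rng.standard_normal((30, 100))
x_true = np.zeros(100)
x_true[[5, 37, 80]] = [1.0, -2.0, 1.5]
b = A @ x_true + 0.01 * rng.standard_normal(30)
x_hat = admm_lasso(A, b, gamma=0.5)
```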