Notes for my advisor's computer vision course, summarized before the final exam.
Introduction
What is vision?
- Perception as the integration of image data and prior knowledge in the brain
- A field that acquires, processes, analyzes, and understands visual data
Computer Vision vs. Human Vision
- ill-posed problems
- mathematical models
- discrete vs. continuous
- local vs. global optimization
What topics does the field cover?
Low level: image denoising, deblurring, super-resolution, photo-sketch synthesis, texture synthesis, optical flow, image matching
Middle level: image segmentation, motion capture, visual tracking, 3D reconstruction
High level: object detection, image understanding, video understanding
Related problems: medical imaging, optical character recognition (OCR), face detection, smile detection, vision-based biometrics, shape capture, autonomous driving
Why is image restoration challenging?
- Real noise is far more complex than additive white Gaussian noise
- Blur is non-uniform and hard to estimate accurately
- The space of local image structures is huge, so the inverse problem is highly ill-posed
Sparse Representation and Dictionary Learning for Restoration
Linear system $Ax=b$: if $A$ is square and full rank, $x = A^{-1}b$; if $A$ is tall (over-determined), there is no exact solution in general, so approximate by minimizing $||Ax-b||_2^2$; if $A$ is fat (under-determined), there are infinitely many solutions in general, so some constraint must be imposed to select one.
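A minimal numpy sketch of these regimes (my own illustration; `lstsq` gives the least-squares fit, `pinv` selects the minimum-L2-norm solution among the infinitely many):

```python
import numpy as np

rng = np.random.default_rng(0)

# Over-determined (tall) system: more equations than unknowns.
A_tall = rng.standard_normal((10, 3))
b_tall = rng.standard_normal(10)
x_ls, *_ = np.linalg.lstsq(A_tall, b_tall, rcond=None)  # minimizes ||Ax - b||_2^2

# Under-determined (fat) system: infinitely many exact solutions;
# pinv imposes an extra constraint by picking the minimum-L2-norm one.
A_fat = rng.standard_normal((3, 10))
b_fat = rng.standard_normal(3)
x_min_norm = np.linalg.pinv(A_fat) @ b_fat

print(np.linalg.norm(A_fat @ x_min_norm - b_fat))  # ~0: an exact fit
```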
Suppose the distance between the estimate and the observation is measured in the L0, L1, L2, or another norm: L0 leads to non-convex optimization; L1 is the tightest convex relaxation of L0 and yields sparse solutions; L2 has a closed-form but dense solution. Geometrically, the solution lies where the contour lines of the objective first touch the norm ball; the L1 ball's corners sit on the coordinate axes, which is why the touching point tends to be sparse (norm-ball figure omitted).
Although L1 can approximate L0, it sometimes yields non-sparse solutions. It has been proved mathematically that if $A$ satisfies the Restricted Isometry Property (RIP), approximating L0 with L1 is guaranteed to recover the sparse solution. Intuitively, RIP means any small subset of columns of $A$ acts as a near-isometry: multiplying a sparse vector $x$ by $A$ keeps the norm of the result within a ring-shaped neighborhood of $||x||_2$ (figure omitted).
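For reference, the standard statement of RIP (the usual compressive-sensing definition, with the conventional constant $\delta_k$): $A$ satisfies RIP of order $k$ with $\delta_k \in (0,1)$ if for every $k$-sparse vector $x$,

$$ (1-\delta_k)||x||_2^2 \le ||Ax||_2^2 \le (1+\delta_k)||x||_2^2 $$

This inequality is exactly the "ring-shaped neighborhood": $||Ax||_2$ stays within a narrow band around $||x||_2$.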
The Image Restoration Problem
Modeling
$y=Hx+v$, where $H$ is the observation (degradation) matrix and $v$ is noise
Keys to solving ill-posed problems:
- Modeling the degradation process
- Good prior knowledge about the clean image
- Good objective function for minimization
The form of $H$ depends on the task: for denoising, $H$ is the identity matrix; for deblurring, a blurring matrix; for super-resolution, the composition of a blurring matrix and a downsampling matrix; for inpainting, an indicator (mask) matrix of the damaged pixels.
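A toy sketch of what $H$ looks like for two of these tasks on a vectorized image (my own illustration; all variable names hypothetical):

```python
import numpy as np

n = 16  # a tiny 4x4 image, vectorized into R^16

H_denoise = np.eye(n)  # denoising: H is the identity

# Inpainting: H keeps only the observed (undamaged) pixels.
rng = np.random.default_rng(1)
observed = np.sort(rng.choice(n, size=10, replace=False))
H_inpaint = np.eye(n)[observed]   # 10x16 indicator matrix of observed rows

x = np.arange(n, dtype=float)     # stand-in for the clean image
y = H_inpaint @ x                 # measurements: damaged pixels are simply missing
```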
Methodology
Filter-based methods: Gaussian low-pass, PDE-based anisotropic diffusion, bilateral filtering, non-local means filtering (moving from local to non-local filters improves performance greatly)
Transform-based methods: Fourier (global, orthogonal), wavelet (local, small support), ridgelet (more redundant), dictionary learning (over-complete)
- Represent $x$ over a dictionary $D$, enforcing the coefficient vector $\alpha$ to be sparse (robust)
- Objective model: $min_\alpha ||HD\alpha-y||_2^2+\lambda ||\alpha||_1$
The basic procedure
1. Partition the degraded image into overlapped patches (e.g., 8×8)
2. For each patch, solve the nonlinear L1-norm sparse coding problem: $\hat{\alpha} = argmin_\alpha ||HD\alpha-y||_2^2+\lambda||\alpha||_1$
3. Reconstruct each patch by $\hat{x}=D\hat{\alpha}$
4. Put the reconstructed patches back and average the overlapped pixels
In practice, steps 1–4 can be iterated for several rounds; a sketch of the full loop follows.
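A minimal sketch of this loop for the denoising case ($H=I$), assuming a pre-learned dictionary `D` with unit-norm atoms (e.g., from K-SVD); the inner solver is the proximal-gradient (ISTA) method discussed later in these notes, and all function names are my own:

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of t * ||.||_1.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def sparse_code(D, y, lam, n_iter=100):
    # ISTA for min_a (1/2)||D a - y||^2 + lam ||a||_1 (the 1/2 only rescales lam).
    L = np.linalg.norm(D, 2) ** 2          # Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        a = soft_threshold(a - (D.T @ (D @ a - y)) / L, lam / L)
    return a

def restore(img, D, lam, patch=8):
    # Overlapping-patch sparse-coding restoration (stride 1, denoising case).
    H, W = img.shape
    out = np.zeros_like(img, dtype=float)
    weight = np.zeros_like(img, dtype=float)
    for i in range(H - patch + 1):
        for j in range(W - patch + 1):
            y = img[i:i+patch, j:j+patch].reshape(-1)
            a = sparse_code(D, y, lam)
            out[i:i+patch, j:j+patch] += (D @ a).reshape(patch, patch)
            weight[i:i+patch, j:j+patch] += 1.0  # count overlaps for averaging
    return out / weight
```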
Why sparse?
- Neuroscience
- Bayesian
- Compressive Sensing
How to solve?
L0: greedy search (Matching Pursuit, Orthogonal Matching Pursuit)
- MP: greedily pick the atom most correlated with the current residual
- OMP: additionally orthogonalize, re-fitting the coefficients over all atoms selected so far so that the new residual is orthogonal to their span (sketched below)
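A compact numpy sketch of OMP under these definitions (function name and signature are my own; atoms assumed unit-norm):

```python
import numpy as np

def omp(D, y, k):
    """Orthogonal Matching Pursuit: greedy approximation of the L0 problem.

    D: dictionary with unit-norm columns (atoms); y: signal; k: sparsity level.
    """
    residual = y.copy()
    support = []
    for _ in range(k):
        # MP step: pick the atom most correlated with the current residual.
        j = int(np.argmax(np.abs(D.T @ residual)))
        support.append(j)
        # OMP step: re-fit coefficients over ALL chosen atoms (orthogonal
        # projection), so the residual is orthogonal to their span.
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    x = np.zeros(D.shape[1])
    x[support] = coef
    return x
```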
L1:
- Linear programming
- Iteratively reweighted least squares: cleverly re-weighted L2 problems that approximate L1
- Proximal gradient descent: the proximal step is soft-thresholding, which has an analytic solution (formalized below)
- Augmented Lagrangian methods (Alternating Direction Method of Multipliers, ADMM)
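For the model above, writing $A=HD$ and using the convention $\frac{1}{2}||A\alpha-y||_2^2+\lambda||\alpha||_1$ (the $\frac{1}{2}$ only rescales $\lambda$; this normalization is mine), the proximal operator and the proximal gradient (ISTA) update are

$$ \mathrm{prox}_{t||\cdot||_1}(v) = \mathrm{sign}(v)\,\max(|v|-t,\ 0), \qquad \alpha^{k+1} = \mathrm{prox}_{(\lambda/L)||\cdot||_1}\big(\alpha^k - \tfrac{1}{L}A^T(A\alpha^k-y)\big), $$

where $L = ||A||_2^2$ is a Lipschitz constant of the gradient of the data term.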
ADMM
Lagrangian formulation
Move the constraints into the objective function via Lagrange multipliers.
e.g., $min\ f(x) \ s.t.\ Ax=b$
Lagrangian form: $L(x,\lambda)=f(x)+\lambda^T(Ax-b)$
For problems with inequality constraints, the KKT conditions apply: $h(x)=0$, $\lambda \ge 0$, $\lambda\, g(x)=0$, $g(x) \le 0$, where $h$ collects the equality constraints, $g$ the inequality constraints, and $\lambda$ is the inequality multiplier.
KKT conditions: for the Lagrangian $L(x,\lambda) = f(x)+\lambda g(x)$,
they require
- $\nabla_x L(x,\lambda)=0$ (stationarity)
- $\lambda \ge 0$
- $\lambda\, g(x)=0$ (complementary slackness)
- $g(x) \le 0$ (primal feasibility)
- $\nabla_{xx}^2 L(x,\lambda)$ is PSD (second-order condition)
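A tiny worked example (my own illustration, not from the slides): minimize $f(x)=x^2$ subject to $g(x)=1-x \le 0$:

$$ L(x,\lambda)=x^2+\lambda(1-x), \quad \nabla_x L = 2x-\lambda=0, \quad \lambda \ge 0, \quad \lambda(1-x)=0, \quad 1-x \le 0 $$

If $\lambda=0$ then $x=0$, which violates $1-x \le 0$; so the constraint must be active, giving $x=1$, $\lambda=2$, and all conditions hold at the optimum $x^*=1$.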
Dual problem
For the Lagrangian $L(x,\lambda)=f(x)+\lambda^T(Ax-b)$,
the dual function is $g(\lambda)= \inf_x L(x,\lambda) = -f^*(-A^T\lambda)-b^T\lambda$, where $\inf$ denotes the infimum and $f^*$ is the convex conjugate of $f$.
Dual problem: $max_\lambda\ g(\lambda)$
Dual ascent: $x^{k+1} = argmin_x L(x,\lambda^k)$
Multiplier update: $\lambda^{k+1}=\lambda^k +\alpha^k(Ax^{k+1}-b)$, with step size $\alpha^k$ (a toy example follows)
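A tiny runnable illustration of dual ascent (my own toy problem, chosen so the $x$-update has a closed form):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 8))
b = rng.standard_normal(3)
c = rng.standard_normal(8)

# Dual ascent for: min (1/2)||x - c||^2  s.t.  Ax = b.
# Here argmin_x L(x, lam) = c - A^T lam in closed form.
lam = np.zeros(3)
alpha = 1.0 / np.linalg.norm(A, 2) ** 2   # safe step size for the dual gradient
for _ in range(1000):
    x = c - A.T @ lam                     # x-update: minimize the Lagrangian
    lam = lam + alpha * (A @ x - b)       # multiplier update
print(np.linalg.norm(A @ x - b))          # -> ~0 at convergence
```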
Dual decomposition: split the objective into subfunctions, $f(x)=\sum_{i=1}^N f_i(x_i)$, so the $x$-update decouples across the blocks $x_i$
Augmented Lagrangian: to make dual ascent more robust, add a quadratic penalty term:
$$L_\rho(x,\lambda)=f(x)+\lambda^T(Ax-b)+(\rho/2)||Ax-b||_2^2$$
ADMM
ADMM is designed to combine the decomposability of dual ascent with the superior convergence properties of the method of multipliers.
Optimization problem: $$ min\ f(x)+g(z) \quad s.t.\quad Ax+Bz=c $$
Augmented Lagrangian form: $ L_\rho(x,z,\lambda)=f(x)+g(z)+\lambda^T(Ax+Bz-c)+(\rho/2)||Ax+Bz-c||_2^2 $
Iteration:
- $ x^{k+1} = argmin_x L_\rho (x,z^k,\lambda^k) $
- $ z^{k+1} = argmin_z L_\rho (x^{k+1},z,\lambda^k) $
- $ \lambda^{k+1} = \lambda^k + \rho(Ax^{k+1}+Bz^{k+1}-c) $
with $\rho >0$. Stopping criterion: the primal and dual residuals fall below a small tolerance $\epsilon$ (a lasso example follows).
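A minimal sketch of ADMM for the lasso, using the standard splitting $f(x)=\frac{1}{2}||Ax-b||_2^2$, $g(z)=\lambda||z||_1$ with constraint $x-z=0$, written with the scaled dual variable $u=\lambda/\rho$ (all names my own):

```python
import numpy as np

def admm_lasso(A, b, lam, rho=1.0, n_iter=200):
    """ADMM for min (1/2)||Ax-b||_2^2 + lam*||z||_1  s.t.  x - z = 0."""
    n = A.shape[1]
    x, z, u = np.zeros(n), np.zeros(n), np.zeros(n)
    M = A.T @ A + rho * np.eye(n)     # x-update is a ridge-regression solve
    Atb = A.T @ b
    for _ in range(n_iter):
        x = np.linalg.solve(M, Atb + rho * (z - u))                      # x-update
        z = np.sign(x + u) * np.maximum(np.abs(x + u) - lam / rho, 0.0)  # soft-threshold
        u = u + x - z                                                    # multiplier update
    return z
```

With $B=-I$ and $c=0$, the update $u \leftarrow u + x - z$ is exactly the multiplier step $\lambda^{k+1}=\lambda^k+\rho(Ax^{k+1}+Bz^{k+1}-c)$ from the general form above, divided by $\rho$.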
Convergence speed: reaching high accuracy may take many iterations, but ADMM can be combined with other algorithms to produce high-accuracy solutions quickly.
For convex problems, the KKT conditions guarantee that the dual problem attains the same optimal value (strong duality); non-convex problems can have a duality gap.