VisualComputing_2

The previous section covered an introduction to CV and Sparse Representation: the concept, applications, and challenges of CV; the formulation, methods, and steps of Sparse Representation; and, of course, why Sparse Representation can work. This section covers Dictionary Learning and representative works.

Dictionary Learning

Introduction

Before sparse coding, we need to learn an over-complete dictionary so that the coding vectors are sparse. Dictionaries fall into two families: analytical and learned.
Analytical dictionaries include DCT bases, wavelets, curvelets, etc.; dictionaries learned from natural images use methods such as K-SVD, coordinate descent, and online dictionary learning.

Why learn the dictionary?

  • Over-complete learned dictionaries often work better than analytical ones
  • More adaptive to the specific task/data
  • Less strict constraints on the mathematical properties of the bases
  • More flexible for modeling data
  • Tend to produce sparser solutions

L0: K-SVD

For L0-sparse dictionary learning, we can solve approximately with the K-SVD method, which can be viewed as a generalization of K-means.
The dictionary learning problem can be modeled as:
$$\min_{D,A}\|Y-DA\|_F^2 \quad \text{s.t.} \quad \|a_i\|_0 \le T_0 \ \ \forall i$$ where $i$ ranges over the columns of the coefficient matrix $A$ and $T_0$ is the sparsity level.
(Figure: K-SVD)
As the figure shows, dictionary learning proceeds as follows:

  • First, sparse coding: solve for the coefficients with Matching Pursuit; then update the dictionary with the K-SVD step.
  • Decompose $DA$ into a sum of $K$ rank-1 slices, $DA=\sum_{i=1}^K d_i a_i^T$; this sum is the dictionary in action. Peel off the $k$-th slice and look for a new $d$, $a$ to update that entry.
  • Finally, keep only the signals whose coefficient for atom $k$ is nonzero (collected by the selection matrix $\Omega$), take the SVD of the resulting error matrix, and set $d$ to the first column of $U$ and $a^T$ to the first row of $\Sigma V^T$ (i.e., $a = \sigma_1 v_1$).

To summarize the ideas behind K-SVD: the $K$ rank-1 slices make the learned dictionary over-complete; each update touches only the $k$-th entry, i.e., a single dictionary atom (one column of the fat matrix $D$); an SVD of the error "hole" left after peeling, $E_k = U \Sigma V^T$, gives the new $d$, $a$ from the components with the largest energy, which is the best rank-1 approximation of that error; restricting the update to the signals with nonzero coefficients helps preserve the sparsity of the existing representation. A minimal implementation is sketched below.
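
A minimal numpy sketch of the loop just described, assuming unit-norm atoms; `omp` and `ksvd` are my own simplified routines, not a reference implementation:

```python
import numpy as np

def omp(D, y, T0):
    """Greedy (Orthogonal) Matching Pursuit: pick at most T0 atoms of D for y."""
    residual, support = y.copy(), []
    for _ in range(T0):
        k = np.argmax(np.abs(D.T @ residual))   # atom most correlated with residual
        if k in support:
            break
        support.append(k)
        # least-squares fit on the chosen support
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    a = np.zeros(D.shape[1])
    a[support] = coef
    return a

def ksvd(Y, n_atoms, T0, n_iter=10):
    """Minimal K-SVD: alternate sparse coding (OMP) and rank-1 atom updates."""
    rng = np.random.default_rng(0)
    D = rng.standard_normal((Y.shape[0], n_atoms))
    D /= np.linalg.norm(D, axis=0)              # unit-norm atoms
    for _ in range(n_iter):
        A = np.column_stack([omp(D, y, T0) for y in Y.T])   # sparse coding
        for k in range(n_atoms):                # update one atom at a time
            omega = np.nonzero(A[k])[0]         # signals that actually use atom k
            if omega.size == 0:
                continue
            # error "hole" without atom k, restricted to those signals
            E_k = Y[:, omega] - D @ A[:, omega] + np.outer(D[:, k], A[k, omega])
            U, s, Vt = np.linalg.svd(E_k, full_matrices=False)
            D[:, k] = U[:, 0]                   # d_k = first column of U
            A[k, omega] = s[0] * Vt[0]          # a_k = sigma_1 * first row of V^T
    return D, A
```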

For the L1 formulation, D and A can be learned alternately: updating D with A fixed is a quadratic program; updating A with D fixed is a LASSO optimization (solvable, e.g., with ADMM). A sketch of this alternation follows.
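
A minimal sketch of the alternation, using ISTA for the LASSO step instead of ADMM (my choice, for brevity):

```python
import numpy as np

def soft(x, t):
    """Soft-thresholding, the proximal operator of the L1 norm."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def l1_dictionary_learning(Y, n_atoms, lam=0.1, n_outer=20, n_ista=50):
    """Alternate an A-step (LASSO via ISTA) and a D-step (closed-form quadratic)."""
    rng = np.random.default_rng(0)
    D = rng.standard_normal((Y.shape[0], n_atoms))
    D /= np.linalg.norm(D, axis=0)
    A = np.zeros((n_atoms, Y.shape[1]))
    for _ in range(n_outer):
        # --- A-step: LASSO solved by ISTA ---
        L = np.linalg.norm(D, 2) ** 2            # Lipschitz constant of the gradient
        for _ in range(n_ista):
            A = soft(A - (D.T @ (D @ A - Y)) / L, lam / L)
        # --- D-step: quadratic problem, closed form ---
        G = A @ A.T + 1e-6 * np.eye(n_atoms)     # small ridge for stability
        D = Y @ A.T @ np.linalg.inv(G)
        D /= np.maximum(np.linalg.norm(D, axis=0), 1e-12)   # keep unit-norm atoms
    return D, A
```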

Representative Works

Online learning: when new samples arrive, update the dictionary incrementally on top of the current one; the key questions are the update strategy and its convergence. A sketch of one online step follows.
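
A sketch of a single online update in the spirit of Mairal et al.'s online dictionary learning, assuming we keep running sufficient statistics `Astat` and `Bstat` (names are mine):

```python
import numpy as np

def soft(x, t):
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def online_dl_step(D, Astat, Bstat, y, lam=0.1, n_ista=30):
    """One online step: sparse-code the new sample, accumulate sufficient
    statistics, then refresh atoms by block coordinate descent."""
    L = np.linalg.norm(D, 2) ** 2
    a = np.zeros(D.shape[1])
    for _ in range(n_ista):                       # LASSO code of the new sample
        a = soft(a - D.T @ (D @ a - y) / L, lam / L)
    Astat += np.outer(a, a)                       # running sum of a a^T
    Bstat += np.outer(y, a)                       # running sum of y a^T
    for j in range(D.shape[1]):                   # block coordinate descent on D
        if Astat[j, j] < 1e-12:
            continue
        u = D[:, j] + (Bstat[:, j] - D @ Astat[:, j]) / Astat[j, j]
        D[:, j] = u / max(np.linalg.norm(u), 1.0) # project onto the unit ball
    return D, Astat, Bstat, a
```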

Multi-scale dictionary learning: because complexity increases exponentially with signal dimension, small patch sizes are usually used; a multi-scale approach can adaptively fuse dictionary codes from different scales.
(Figure: Multi-Scale)

Double Sparsity: run dictionary learning a second time on higher-level sparse features or large patches, i.e., a sparse representation of the sparse codes (or of a higher-dimensional encoding); a sketch of the model structure follows the figure.
(Figure: Double Sparsity)
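
A small sketch of the double-sparsity dictionary structure, assuming a fixed DCT base dictionary; all names and sizes are illustrative:

```python
import numpy as np
from scipy.fft import dct

# Double-sparsity model: the effective dictionary is D = Phi @ Z, a fixed
# analytic base dictionary Phi (here: orthonormal DCT) times a sparse,
# learned atom-representation matrix Z.
n, n_base, n_atoms, atom_sparsity = 64, 64, 128, 6
Phi = dct(np.eye(n), norm="ortho", axis=0)        # base dictionary (DCT matrix)
rng = np.random.default_rng(0)
Z = np.zeros((n_base, n_atoms))
for k in range(n_atoms):                          # each atom mixes few base atoms
    idx = rng.choice(n_base, atom_sparsity, replace=False)
    Z[idx, k] = rng.standard_normal(atom_sparsity)
D = Phi @ Z                                       # adaptive yet structured dictionary
D /= np.linalg.norm(D, axis=0)
```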

Restoration Methods

Filtering-based methods: isotropic methods, anisotropic methods.
Transformation methods: the motivation is to find a new representation in which signal and noise can be better separated, e.g., the wavelet transform; a thresholding sketch follows.
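
A brief wavelet soft-thresholding sketch using PyWavelets, with the universal threshold as one common choice:

```python
import numpy as np
import pywt  # PyWavelets

def wavelet_denoise(img, sigma, wavelet="db8", level=3):
    """Transform-domain denoising: move to a wavelet basis where the signal
    is sparse, soft-threshold the detail coefficients, then invert."""
    coeffs = pywt.wavedec2(img, wavelet, level=level)
    thresh = sigma * np.sqrt(2 * np.log(img.size))     # universal threshold
    denoised = [coeffs[0]]                             # keep the approximation band
    for detail_bands in coeffs[1:]:
        denoised.append(tuple(pywt.threshold(c, thresh, mode="soft")
                              for c in detail_bands))
    return pywt.waverec2(denoised, wavelet)
```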

K-SVD denoising

Basic idea: 1. train an over-complete dictionary; 2. use the trained dictionary to denoise patches of the noisy image; 3. use the denoised patches to reconstruct the image (a pipeline sketch follows the figure).
(Figure: Modeling)
Limitations: 1. solving the sparse coding problem is not efficient enough; 2. the L0 norm is not a good choice.
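
A pipeline sketch of the three steps, substituting scikit-learn's `MiniBatchDictionaryLearning` for K-SVD itself (a stand-in, not the original algorithm):

```python
import numpy as np
from sklearn.feature_extraction.image import (extract_patches_2d,
                                              reconstruct_from_patches_2d)
from sklearn.decomposition import MiniBatchDictionaryLearning

def ksvd_style_denoise(noisy, patch_size=8, n_atoms=128, n_nonzero=5):
    """Patch-based denoising in the K-SVD spirit: train on noisy patches,
    sparse-code each patch, rebuild the image from the approximations."""
    patches = extract_patches_2d(noisy, (patch_size, patch_size))
    X = patches.reshape(len(patches), -1)
    mean = X.mean(axis=1, keepdims=True)
    X = X - mean                                   # remove each patch's DC component
    # 1. train an over-complete dictionary on the noisy patches
    dl = MiniBatchDictionaryLearning(n_components=n_atoms,
                                     transform_algorithm="omp",
                                     transform_n_nonzero_coefs=n_nonzero)
    codes = dl.fit(X).transform(X)
    # 2. the sparse approximation of every patch is its denoised version
    X_hat = codes @ dl.components_ + mean
    # 3. reconstruct the image by averaging overlapping patches
    patches_hat = X_hat.reshape(patches.shape)
    return reconstruct_from_patches_2d(patches_hat, noisy.shape)
```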

BM3D

(Figure: BM3D)
BM3D is arguably the most classic denoising algorithm in the field; it combines the two most important priors, nonlocal self-similarity and sparsity, achieves very good results, and runs at moderate speed.

Steps: first, find a group of similar patches via non-local matching and stack them into a 3D array; apply a 3D transform and threshold the coefficients (collaborative filtering, which acts as sparse denoising); in a second stage, combine the resulting estimate with the original stack and apply Wiener filtering; finally, aggregate the denoised patches to reconstruct the image. A simplified first-stage sketch follows.
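
A heavily simplified sketch of the first (hard-thresholding) stage only, with a 3D DCT as the group transform; parameters are illustrative, and real BM3D adds the second Wiener-filtering stage:

```python
import numpy as np
from scipy.fft import dctn, idctn

def bm3d_hard_threshold_stage(img, p=8, search=16, n_similar=16, thresh=0.1):
    """Block matching + 3D transform + hard thresholding + aggregation.
    Borders that do not fit a full patch grid are ignored for simplicity."""
    H, W = img.shape
    out, weight = np.zeros_like(img), np.zeros_like(img)
    for i in range(0, H - p + 1, p):
        for j in range(0, W - p + 1, p):
            ref = img[i:i+p, j:j+p]
            # block matching: closest patches within a local search window
            cands = []
            for di in range(max(0, i - search), min(H - p, i + search) + 1, 2):
                for dj in range(max(0, j - search), min(W - p, j + search) + 1, 2):
                    patch = img[di:di+p, dj:dj+p]
                    cands.append((np.sum((patch - ref) ** 2), di, dj))
            cands.sort(key=lambda t: t[0])
            group = np.stack([img[di:di+p, dj:dj+p]
                              for _, di, dj in cands[:n_similar]])
            # collaborative filtering: 3D DCT + hard threshold
            spec = dctn(group, norm="ortho")
            spec[np.abs(spec) < thresh] = 0.0
            filt = idctn(spec, norm="ortho")
            # aggregate the filtered patches back into the image
            for (_, di, dj), q in zip(cands[:n_similar], filt):
                out[di:di+p, dj:dj+p] += q
                weight[di:di+p, dj:dj+p] += 1.0
    return out / np.maximum(weight, 1e-12)
```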

Pros and cons: 1. effectively exploits nonlocal similarity and sparsity; 2. collaborative filtering in a fixed DWT (wavelet) domain cannot describe complex image structures.

LSSC

(Figure: Group Sparsity)
Unlike plain L1 sparsity, Mairal proposed constraining the coefficient matrix with a group-sparse $\ell_{1,2}$ norm, so that similar patches share the same sparse code under the dictionary, i.e., the group keeps a common sparsity pattern. Looking closely at the $\ell_{1,2}$ norm, $j$ indexes rows and $i$ indexes columns, which pushes the nonzero coefficients into the same rows; the toy example below illustrates this.
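
A toy computation of the $\ell_{1,2}$ norm showing why shared rows are cheaper:

```python
import numpy as np

def l12_norm(A):
    """Grouped L1,2 norm as in LSSC: L2 within each row (across the group's
    patches), then L1 across rows, so nonzeros concentrate in shared rows."""
    return np.sum(np.linalg.norm(A, axis=1))

# Two coefficient matrices with the same number of nonzeros:
A_shared  = np.array([[1., 1., 1.],   # every patch uses the same atom
                      [0., 0., 0.]])
A_scatter = np.array([[1., 0., 1.],   # atoms scattered across rows
                      [0., 1., 0.]])
print(l12_norm(A_shared), l12_norm(A_scatter))  # shared rows score lower
```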

The overall pipeline is similar to BM3D, except that collaborative filtering and thresholding are replaced by group-sparse dictionary learning and sparse coding for denoising.

Adaptive Sparse Domain Selection: a large dictionary makes sparse coding very time-consuming, yet a large dictionary is necessary for describing local image structures; this method speeds things up by selecting a sub-dictionary from the large one.

Piece-wise Linear Estimation (PLE), motivations (a sketch of the estimator follows the list):

  • Sparse representation assumes a Laplacian prior on the coefficients, which leads to a nonlinear sparse coding estimator
  • Use a Mixture of Gaussians to approximate the Laplacian
  • Select the single most appropriate Gaussian prior for reconstruction
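
A sketch of the PLE model-selection step for plain denoising (degradation operator taken as the identity for brevity; the interface is hypothetical):

```python
import numpy as np

def ple_denoise_patch(y, gaussians, sigma2):
    """Each Gaussian prior yields a *linear* Wiener-style estimate; pick the
    model with the highest evidence, hence a piece-wise linear estimator."""
    best_ll, best_x = -np.inf, None
    n = y.size
    for mu, cov in gaussians:                   # one (mean, covariance) per model
        C = cov + sigma2 * np.eye(n)            # covariance of y under this model
        r = y - mu
        x_hat = mu + cov @ np.linalg.solve(C, r)          # linear MAP estimate
        sign, logdet = np.linalg.slogdet(C)
        ll = -0.5 * (r @ np.linalg.solve(C, r) + logdet)  # log evidence (up to const)
        if ll > best_ll:
            best_ll, best_x = ll, x_hat
    return best_x
```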

Coupled Dictionary Learning

Motivations:

  • Use coupled dictionaries to model the relationship between a degraded image and its corresponding original image
  • Build the correspondence in the sparse domain (same code but different dictionaries)
(Figure: SRSR)

Semi-coupled Dictionary Learning: relaxes the relationship between the two dictionaries; instead of sharing one code, the sparse codes are related by a pre-learned mapping. An inference sketch covering both variants follows.
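
A joint inference sketch for both variants, with hypothetical names: coupled DL reuses the low-resolution code directly, while semi-coupled DL first maps it through the pre-learned `W`:

```python
import numpy as np

def soft(x, t):
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def coupled_sr_patch(y_low, D_low, D_high, W=None, lam=0.1, n_ista=100):
    """Code the degraded patch under D_low, then decode under D_high.
    W=None gives the coupled case (shared code); passing a mapping W
    gives the semi-coupled case."""
    L = np.linalg.norm(D_low, 2) ** 2
    a = np.zeros(D_low.shape[1])
    for _ in range(n_ista):                     # LASSO code of the LR patch
        a = soft(a - D_low.T @ (D_low @ a - y_low) / L, lam / L)
    a_high = a if W is None else W @ a          # semi-coupled: map the code
    return D_high @ a_high                      # reconstruct the HR patch
```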

Keep sharing, and support original work.