linear decoder

Deep learning：二十二(linear decoder练习)

　　前言：

　　本节是练习Linear decoder的应用，关于Linear decoder的相关知识介绍请参考： Deep learning ：十七 (Linear Decoders ， Convolution 和 Pooling) ，实验步骤参考 Exercise: Implement deep networks for digit classification 。本次实验是用linear decoder的sparse autoencoder来训练出stl-10数据库图片的patch特征。并且这次的训练权值是针对rgb图像块的。

　　基础知识：

　　PCA Whitening是保证数据各维度的方差为1，而ZCA Whitening是保证数据各维度的方差相等即可，不一定要唯一。并且这两种whitening的一般用途也不一样，PCA Whitening主要用于降维且去相关性，而ZCA Whitening主要用于去相关性，且尽量保持原数据。

　　 Matlab的一些知识：

　　函数句柄的好处就是把一个函数作为参数传入到本函数中，在该函数内部可以利用该函数进行各种运算得出最后需要的结果，比如说函数中要用到各种求导求积分的方法，如果是传入该函数经过各种运算后的值的话，那么在调用该函数前就需要不少代码，这样比较累赘，所以采用函数句柄后这些代码直接放在了函数内部，每调用一次无需在函数外面实现那么多的东西。

　　Matlab中保存各种数据时可以采用save函数，并将其保持为.mat格式的，这样在matlab的current folder中看到的是.mat格式的文件，但是直接在文件夹下看，它是不直接显示后缀的，且显示的是Microsoft Access Table Shortcut，也就是.mat的简称。

　　关于实验的一些说明：

　　在Ng的教程和实验中，它的输入样本矩阵是每一列代表一个样本的，列数为样本的总个数。

　　matlab中矩阵64*10w大小肯定是可以的。

　　在本次实验中，ZCA Whitening是针对patches进行的，且patches的均值化是对每一维进行的（感觉这种均值化比较靠谱，前面有文章是进行对patch中一个样本求均值，感觉那样很不靠谱，不过那是在natural image中做的，因为natural image每一维的统计特性都一样，所以可以那样均值化，但还是感觉不太靠谱）。因为使用的是ZCA whitening，所以新的向量并没有进行降维，只是去了相关性和让每一维的方差都相等而已。另外，由此可见，在进行数据Whitening时并不需要对原始的大图片进行whitening，而是你用什么数据输入网络去训练就对什么数据进行whitening，而这里，是用的小patches来训练的，所以应该对小patches进行whitening。

　　关于本次实验的一些数据和变量分配如下：

　　总共需训练的样本矩阵大小为192*100000。因为输入训练的一个patch大小为8*8的，所以网络的输入层节点数为192（=8*8*3，因为是3通道的，每一列按照rgb的顺序排列），另外本次试验的隐含层个数为400，权值惩罚系数为0.003，稀疏性惩罚系数为5，稀疏性体现在3.5%的隐含层节点被激发。ZCA白化时分母加上0.1的值防止出现大的数值。

　　用的是Linear decoder，所以最后的输出层的激发函数为1，即输出和输入相等。这样在问题内部的计算量变小了点。

　　程序中最后需要把学习到的网络权值给显示出来，不过这个显示的内容已经包括了whitening部分了，所以是whitening和sparse autoencoder的组合。程序中显示用的是displayColorNetwork( (W*ZCAWhite)');

　　这里为什么要用(W*ZCAWhite)'呢？首先，使用W*ZCAWhite是因为每个样本x输入网络，其输出等价于W*ZCAWhite*x；另外，由于W*ZCAWhite的每一行才是一个隐含节点的变换值,而displayColorNetwork函数是把每一列显示一个小图像块的，所以需要对其转置。

　　实验结果：

　　原始图片截图：

　　ZCA Whitening后截图;

　　学习到的400个特征显示如下：

　　实验主要部分代码：

%% CS294A/ CS294W Linear Decoder Exercise

 %   Instructions
 %  ------------
% 
%  This  file  contains code that helps you get started  on   the
 %  linear decoder exericse. For this exercise, you will only need  to   modify
 %  the code  in  sparseAutoencoderLinearCost.m. You will  not  need  to   modify
 %  any code  in  this  file  .

 %%======================================================================
%% STEP  0  : Initialization
 %  Here we initialize some parameters used  for   the exercise.

imageChannels  =  3 ;     % number  of  channels (rgb, so  3  )

patchDim    =  8 ;          %  patch dimension
numPatches  =  100000 ;   % number  of   patches

visibleSize  = patchDim * patchDim * imageChannels;  % number  of   input   units   
outputSize   = visibleSize;   % number  of   output   units  
hiddenSize   =  400 ;           % number  of  hidden  units  % 中间的隐含层还变多了

sparsityParam  =  0.035 ; % desired average activation  of  the hidden  units  .
lambda  = 3e- 3 ;         %  weight decay parameter       
beta  =  5 ;              % weight  of   sparsity penalty term       

epsilon  =  0.1 ;           % epsilon  for   ZCA whitening

 %%======================================================================
%% STEP  1 : Create  and  modify sparseAutoencoderLinearCost.m  to   use   a linear decoder,
 %           and   check gradients
 %   You should copy sparseAutoencoderCost.m from your earlier exercise 
 %   and  rename it  to   sparseAutoencoderLinearCost.m. 
 %  Then you need  to  rename the  function  from sparseAutoencoderCost  to 
%  sparseAutoencoderLinearCost,  and   modify it so that the sparse autoencoder
 %  uses a linear decoder instead. Once that  is   done, you should check 
 % your gradients  to   verify that they are correct.

 %  NOTE : Modify sparseAutoencoderCost first!

% To speed up gradient checking, we will  use  a reduced network  and   some
 %  dummy patches

debugHiddenSize  =  5  ;
debugvisibleSize  =  8  ;
patches  = rand([ 8   10 ]);%随机产生10个样本，每个样本为一个8维的列向量，元素值为0~ 1  
theta  =  initializeParameters(debugHiddenSize, debugvisibleSize); 

[cost, grad]  =  sparseAutoencoderLinearCost(theta, debugvisibleSize, debugHiddenSize, ...
                                           lambda, sparsityParam, beta, ...
                                           patches);

 %  Check gradients
numGrad  =  computeNumericalGradient( @(x) sparseAutoencoderLinearCost(x, debugvisibleSize, debugHiddenSize, ...
                                                  lambda, sparsityParam, beta, ...
                                                  patches), theta);

 % Use this  to  visually compare the gradients  side  by  side  
disp([numGrad cost]); 

diff  = norm(numGrad-grad)/norm(numGrad+ grad);
 % Should be small. In our implementation, these values are usually less than 1e- 9  .
disp(diff); 

  assert (diff < 1e- 9 ,  '  Difference too large. Check your gradient computation again  '  );

 %  NOTE : Once your gradients check  out , you should run step  0  again  to 
%        reinitialize the parameters
 % }

 %%======================================================================
%% STEP  2 : Learn features  on   small patches
 %  In this step, you will  use  your sparse autoencoder (which  now   uses a 
 %  linear decoder)  to  learn features  on   small patches sampled from related
 %   images.

 %%  STEP 2a: Load patches
 %  In this step, we load 100k patches sampled from the STL10 dataset  and 
%  visualize them. Note that these patches have been scaled  to  [ 0 , 1  ]

load stlSampledPatches.mat

displayColorNetwork(patches(:,   1 : 100  ));

 %%  STEP 2b: Apply preprocessing
 %  In this sub-step, we preprocess the sampled patches,  in   particular, 
 %   ZCA whitening them. 
 % 
%  In a later exercise  on  convolution  and  pooling, you will need  to   replicate 
 %  exactly the preprocessing steps you apply  to   these patches before 
 %  using the autoencoder  to  learn features  on   them. Hence, we will save the
 %  ZCA whitening  and  mean image matrices together  with   the learned features
 %  later  on  .

 % Subtract mean patch (hence zeroing the mean  of   the patches)
meanPatch  = mean(patches,  2 );  % 注意这里减掉的是每一维属性的均值，为什么会和其它的不同呢？
patches  = bsxfun(@minus, patches, meanPatch);% 每一维都均值化

 %  Apply ZCA whitening
sigma  = patches * patches '   / numPatches; 
[u, s, v] =  svd(sigma);
ZCAWhite  = u * diag( 1  ./ sqrt(diag(s) + epsilon)) * u '  ;%求出ZCAWhitening矩阵 
patches = ZCAWhite *  patches;
figure
displayColorNetwork(patches(:,   1 : 100  ));

 %%  STEP 2c: Learn features
 %  You will  now   use  your sparse autoencoder ( with  linear decoder)  to   learn
 %  features  on  the preprocessed patches. This should take around  45   minutes.

theta  =  initializeParameters(hiddenSize, visibleSize);

 % Use minFunc  to  minimize the  function  
addpath minFunc / 

options  =  struct;
options.Method  =  '  lbfgs  '  ; 
options.maxIter  =  400  ;
options.display  =  '  on  '  ;

[optTheta, cost]  =  minFunc( @(p) sparseAutoencoderLinearCost(p, ...
                                   visibleSize, hiddenSize, ...
                                   lambda, sparsityParam, ...
                                   beta, patches), ...
                              theta, options); % 注意它的参数

 % Save the learned features  and  the preprocessing matrices  for   use   in  
% the later exercise  on  convolution  and   pooling
fprintf(  '  Saving learned features and preprocessing matrices...\n  '  );                          
save(  '  STL10Features.mat  ' ,  '  optTheta  ' ,  '  ZCAWhite  ' ,  '  meanPatch  '  );
fprintf(  '  Saved\n  '  );

 %%  STEP 2d: Visualize learned features

W  = reshape(optTheta( 1 :visibleSize *  hiddenSize), hiddenSize, visibleSize);
b  = optTheta( 2 *hiddenSize*visibleSize+ 1 : 2 *hiddenSize*visibleSize+ hiddenSize);
figure;
 %这里为什么要用(W*ZCAWhite) '  呢？首先，使用W*ZCAWhite是因为每个样本x输入网络， 
%其输出等价于W*ZCAWhite*x；另外，由于W* ZCAWhite的每一行才是一个隐含节点的变换值
 % 而displayColorNetwork函数是把每一列显示一个小图像块的，所以需要对其转置。
displayColorNetwork( (W *ZCAWhite) '  );

　　参考资料：

Deep learning ：十七 (Linear Decoders ， Convolution 和 Pooling)

Exercise: Implement deep networks for digit classification

作者：tornadomeet 出处：http://www.cnblogs.com/tornadomeet 欢迎转载或分享，但请务必声明文章出处。

分类: 机器学习

标签: 机器学习

作者： Leo_wl
　　　　
出处： http://www.cnblogs.com/Leo_wl/
　　　　
本文版权归作者和博客园共有，欢迎转载，但未经作者同意必须保留此段声明，且在文章页面明显位置给出原文连接，否则保留追究法律责任的权利。
版权信息

查看更多关于linear decoder的详细内容...

声明：本文来自网络，不代表【好得很程序员自学网】立场，转载请注明出处：http://haodehen.cn/did46321

更新时间：2022-09-24 阅读：52次

linear decoder

linear decoder

Deep learning：二十二(linear decoder练习)

Deep learning：二十二(linear decoder练习)

作者：tornadomeet 出处：http://www.cnblogs.com/tornadomeet 欢迎转载或分享，但请务必声明文章出处。

分类: 机器学习 标签: 机器学习

分类: 机器学习

标签: 机器学习