Why convert the image to grayscale?
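Each pixel of a color image is a triple (R, G, B). Converting to grayscale simplifies the image matrix and speeds up computation. A grayscale pixel takes one of only 256 values, whereas a 24-bit RGB pixel takes one of more than 16 million (256^3 = 16,777,216). Once pixels are combined with their neighbors, or gradients are computed, the amount of computation becomes very large, so the image is first reduced to a single grayscale channel.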
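The reduction from three channels to one can be sketched in plain C++ without OpenCV. This is only an illustration: the function name `rgb_to_gray` and the interleaved R, G, B buffer layout are assumptions made for the example, and the BT.601 luma weights (0.299, 0.587, 0.114) are one common choice, not something the text above specifies.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Convert an interleaved 8-bit RGB buffer (R, G, B, R, G, B, ...) to one
// gray value per pixel. The BT.601 weights are an assumption of this sketch.
std::vector<std::uint8_t> rgb_to_gray(const std::vector<std::uint8_t>& rgb) {
    assert(rgb.size() % 3 == 0);  // three channels per pixel
    std::vector<std::uint8_t> gray;
    gray.reserve(rgb.size() / 3);
    for (std::size_t i = 0; i < rgb.size(); i += 3) {
        const double y = 0.299 * rgb[i] + 0.587 * rgb[i + 1] + 0.114 * rgb[i + 2];
        gray.push_back(static_cast<std::uint8_t>(std::lround(y)));
    }
    return gray;
}
```

The output buffer is one third the size of the input, which is the saving described above.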
Background: OpenCV 2.4
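Face recognition means matching a face to be identified against a face in a face database (similar to fingerprint recognition); the goal is to identify the person. Starting with OpenCV 2.4, a new class, FaceRecognizer, was added for face recognition.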
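The original LBP operator is defined on a 3×3 window. The center pixel of the window serves as the threshold, and the gray values of the 8 neighboring pixels are compared with it: a neighbor whose value is greater than or equal to the center value is marked 1, otherwise 0. The 8 comparisons in the 3×3 neighborhood thus produce an 8-bit binary number (usually converted to a decimal number, the LBP code, with 2^8 = 256 possible values). This is the LBP value of the window's center pixel, and it reflects the texture of that region. See the figure for an example.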
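A minimal sketch of the basic 3×3 operator, in plain C++. The type alias `Window`, the function name `lbp3x3`, and the bit order (clockwise from the top-left neighbor, first neighbor as the most significant bit) are assumptions made for this example; the description above does not fix a bit order, and any consistent order gives an equally valid code.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using Window = std::array<std::array<int, 3>, 3>;  // gray values, row-major

// Basic LBP code of the center pixel of a 3x3 window.
// Assumed bit order: clockwise from the top-left neighbor, first neighbor
// becomes the most significant bit.
std::uint8_t lbp3x3(const Window& w) {
    static const int order[8][2] = {{0, 0}, {0, 1}, {0, 2}, {1, 2},
                                    {2, 2}, {2, 1}, {2, 0}, {1, 0}};
    const int center = w[1][1];
    unsigned code = 0;
    for (const auto& rc : order) {
        // A neighbor >= center contributes a 1 bit, otherwise a 0 bit.
        code = (code << 1) | (w[rc[0]][rc[1]] >= center ? 1u : 0u);
    }
    assert(code <= 255u);  // eight comparisons give at most eight bits
    return static_cast<std::uint8_t>(code);
}
```

For the window with rows (6, 5, 2), (7, 6, 1), (9, 8, 7), the comparisons against the center value 6 give the bits 1, 0, 0, 0, 1, 1, 1, 1 in this order, which is binary 10001111, or 143 in decimal.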
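An LBP operator with P sampling points in a circular region of radius R produces 2^P patterns. That is too many dimensions, so the operator is optimized further using uniform patterns.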
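A pattern is uniform when its bits, read as a circle, change between 0 and 1 at most twice. The uniform pattern class contains P(P-1)+2 patterns, and all remaining patterns are grouped into a single mixed class. For 8 sampling points in a 3×3 neighborhood, the number of binary patterns drops from the original 256 to 59 (8×7+2 = 58 uniform patterns plus 1 mixed class), which gives the feature vector fewer dimensions.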
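The count P(P-1)+2 can be checked by brute force. The following sketch is plain C++; the function names `circular_transitions`, `is_uniform`, and `count_uniform` are hypothetical names for this example and are not part of OpenCV.

```cpp
#include <cassert>

// Number of 0/1 changes when the lowest P bits of `pattern` are read as a circle.
int circular_transitions(unsigned pattern, int P) {
    assert(P >= 1 && P < 32);
    int transitions = 0;
    for (int i = 0; i < P; ++i) {
        const unsigned a = (pattern >> i) & 1u;
        const unsigned b = (pattern >> ((i + 1) % P)) & 1u;
        if (a != b) ++transitions;
    }
    return transitions;
}

// A pattern is uniform when it has at most two circular transitions.
bool is_uniform(unsigned pattern, int P) {
    return circular_transitions(pattern, P) <= 2;
}

// Count the uniform patterns among all 2^P patterns.
int count_uniform(int P) {
    int count = 0;
    for (unsigned p = 0; p < (1u << P); ++p) {
        if (is_uniform(p, P)) ++count;
    }
    return count;
}
```

For P = 8, `count_uniform(8)` should agree with 8×7+2 = 58; adding the single mixed class gives the 59 histogram bins mentioned above.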
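The improved LBP operator: the basic operator was extended from the 3×3 neighborhood to a neighborhood of arbitrary size, and the square neighborhood was replaced by a circular one. The improved operator allows any number of sampling points on a circle of radius R, which yields an LBP operator with P sampling points in a circular region of radius R. OpenCV uses exactly this circular LBP operator.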
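To make the circular neighborhood concrete, the sketch below computes where the P sampling points lie relative to the center pixel. It uses the same angle convention as the `elbp_` source quoted later in this article (x = R·cos(2πn/P), y = -R·sin(2πn/P)); the struct `Offset` and the function name `circle_offsets` are assumptions made for this example.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Offset of one sampling point relative to the center pixel.
struct Offset {
    double dx;  // horizontal offset (columns)
    double dy;  // vertical offset (rows)
};

// Offsets of `neighbors` points evenly spaced on a circle of radius `radius`.
std::vector<Offset> circle_offsets(int radius, int neighbors) {
    assert(radius > 0 && neighbors > 0);
    const double pi = std::acos(-1.0);
    std::vector<Offset> out;
    out.reserve(neighbors);
    for (int n = 0; n < neighbors; ++n) {
        const double angle = 2.0 * pi * n / neighbors;
        out.push_back({radius * std::cos(angle), -radius * std::sin(angle)});
    }
    return out;
}
```

With radius 1 and 8 neighbors, the points at odd indices fall at roughly (±0.707, ±0.707), which is between pixel centers. This is why the circular operator needs interpolation to read a gray value at a sampling point.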
Principle
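The LBP operator yields one LBP "code" at every pixel, so applying it to a whole image produces an LBP image. The image is then divided into several sub-regions, the LBP feature is extracted at every pixel of each sub-region, and a histogram of the LBP codes is built for each sub-region. Each sub-region is thus described by one histogram, and the whole image by a set of histograms. The benefit is that errors caused by imperfect image alignment are reduced within a certain range. Partitioning also makes it possible to weight sub-regions differently: for example, the central regions can be given a larger weight than the border regions, meaning that the center matters more when images are matched.
For example, an image of 100×100 pixels is divided into 10×10 = 100 sub-regions of 10×10 pixels each. The LBP feature is extracted at every pixel of every sub-region and a histogram is built per sub-region, so the image has 10×10 sub-regions and therefore 10×10 histograms, which together describe the image. A similarity measure can then be applied to these histograms to judge how similar two images are.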
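The partition-and-histogram step can be sketched as follows in plain C++. This is a simplified illustration, not OpenCV's implementation: the type alias `Image` and the function name `spatial_histogram_sketch` are hypothetical, the LBP image is assumed to be already computed, and each cell histogram is normalized by the number of pixels in the cell.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Image = std::vector<std::vector<int>>;  // row-major LBP codes

// Split the LBP image into grid_x * grid_y cells, build one normalized
// histogram per cell, and concatenate them into a single feature vector.
std::vector<float> spatial_histogram_sketch(const Image& lbp, int num_patterns,
                                            int grid_x, int grid_y) {
    assert(!lbp.empty() && !lbp[0].empty());
    assert(num_patterns > 0 && grid_x > 0 && grid_y > 0);
    const int cell_h = static_cast<int>(lbp.size()) / grid_y;
    const int cell_w = static_cast<int>(lbp[0].size()) / grid_x;
    assert(cell_h > 0 && cell_w > 0);

    std::vector<float> feature(
        static_cast<std::size_t>(grid_x) * grid_y * num_patterns, 0.0f);
    std::size_t cell = 0;
    for (int gy = 0; gy < grid_y; ++gy) {
        for (int gx = 0; gx < grid_x; ++gx, ++cell) {
            float* hist = &feature[cell * num_patterns];
            // Count how often each LBP code occurs inside this cell.
            for (int r = gy * cell_h; r < (gy + 1) * cell_h; ++r) {
                for (int c = gx * cell_w; c < (gx + 1) * cell_w; ++c) {
                    const int code = lbp[r][c];
                    assert(code >= 0 && code < num_patterns);
                    hist[code] += 1.0f;
                }
            }
            // Normalize so the bins of each cell sum to 1.
            for (int k = 0; k < num_patterns; ++k) {
                hist[k] /= static_cast<float>(cell_h * cell_w);
            }
        }
    }
    return feature;
}
```

The feature length is grid_x × grid_y × num_patterns. For the 100×100 example above with a 10×10 grid and 256 patterns, that is 100 histograms of 256 bins each.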
Key source code analysis
1 **FaceRecognizer**
CV_EXPORTS_W Ptr<FaceRecognizer> createLBPHFaceRecognizer(int radius=1, int neighbors=8,int grid_x=8, int grid_y=8, double threshold = DBL_MAX);
Ptr<FaceRecognizer> createLBPHFaceRecognizer(int radius, int neighbors,int grid_x, int grid_y, double threshold)
{
return new LBPH(radius, neighbors, grid_x, grid_y, threshold);
}
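As the code shows, LBPH uses the circular LBP operator. By default the radius of the circle is 1,
the number of sampling points P is 8, and the image is split into 8 cells along both x and y, giving 8*8=64 cells.
The last parameter is the threshold: a match is reported only when the distance between the query image
and an image in the database is below this value (a smaller distance means greater similarity).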
2 **train**
void LBPH::train(InputArrayOfArrays _in_src, InputArray _in_labels, bool preserveData) {
if(_in_src.kind() != _InputArray::STD_VECTOR_MAT && _in_src.kind() != _InputArray::STD_VECTOR_VECTOR) {
string error_message = "The images are expected as InputArray::STD_VECTOR_MAT (a std::vector<Mat>) or _InputArray::STD_VECTOR_VECTOR (a std::vector< vector<...> >).";
CV_Error(CV_StsBadArg, error_message);
}
if(_in_src.total() == 0) {
string error_message = format("Empty training data was given. You'll need more than one sample to learn a model.");
CV_Error(CV_StsUnsupportedFormat, error_message);
} else if(_in_labels.getMat().type() != CV_32SC1) {
string error_message = format("Labels must be given as integer (CV_32SC1). Expected %d, but was %d.", CV_32SC1, _in_labels.type());
CV_Error(CV_StsUnsupportedFormat, error_message);
}
// get the vector of matrices
vector<Mat> src;
_in_src.getMatVector(src);
// get the label matrix
Mat labels = _in_labels.getMat();
// check if data is well- aligned
if(labels.total() != src.size()) {
string error_message = format("The number of samples (src) must equal the number of labels (labels). Was len(samples)=%d, len(labels)=%d.", src.size(), _labels.total());
CV_Error(CV_StsBadArg, error_message);
}
// if this model should be trained without preserving old data, delete old model data
if(!preserveData) {
_labels.release();
_histograms.clear();
}
// append labels to _labels matrix
for(size_t labelIdx = 0; labelIdx < labels.total(); labelIdx++) {
_labels.push_back(labels.at<int>((int)labelIdx));
}
// store the spatial histograms of the original data
for(size_t sampleIdx = 0; sampleIdx < src.size(); sampleIdx++) {
// calculate lbp image
Mat lbp_image = elbp(src[sampleIdx], _radius, _neighbors);
// get spatial histogram from this lbp image
Mat p = spatial_histogram(
lbp_image, /* lbp_image */
static_cast<int>(std::pow(2.0, static_cast<double>(_neighbors))), /* number of possible patterns */
_grid_x, /* grid size x */
_grid_y, /* grid size y */
true);
// add to templates
_histograms.push_back(p);
}
}
3 **elbp and spatial_histogram**
template <typename _Tp> static
inline void elbp_(InputArray _src, OutputArray _dst, int radius, int neighbors) {
//get matrices
Mat src = _src.getMat();
// allocate memory for result
_dst.create(src.rows-2*radius, src.cols-2*radius, CV_32SC1);
Mat dst = _dst.getMat();
// zero
dst.setTo(0);
for(int n=0; n<neighbors; n++) {
// sample points
float x = static_cast<float>(radius * cos(2.0*CV_PI*n/static_cast<float>(neighbors)));
float y = static_cast<float>(-radius * sin(2.0*CV_PI*n/static_cast<float>(neighbors)));
// relative indices
int fx = static_cast<int>(floor(x));
int fy = static_cast<int>(floor(y));
int cx = static_cast<int>(ceil(x));
int cy = static_cast<int>(ceil(y));
// fractional part
float ty = y - fy;
float tx = x - fx;
// set interpolation weights
float w1 = (1 - tx) * (1 - ty);
float w2 = tx * (1 - ty);
float w3 = (1 - tx) * ty;
float w4 = tx * ty;
// iterate through your data
for(int i=radius; i < src.rows-radius;i++) {
for(int j=radius;j < src.cols-radius;j++) {
// calculate interpolated value
float t = static_cast<float>(w1*src.at<_Tp>(i+fy,j+fx) + w2*src.at<_Tp>(i+fy,j+cx) + w3*src.at<_Tp>(i+cy,j+fx) + w4*src.at<_Tp>(i+cy,j+cx));
// floating point precision, so check some machine-dependent epsilon
dst.at<int>(i-radius,j-radius) += ((t > src.at<_Tp>(i,j)) || (std::abs(t-src.at<_Tp>(i,j)) < std::numeric_limits<float>::epsilon())) << n;
}
}
}
}
static void elbp(InputArray src, OutputArray dst, int radius, int neighbors)
{
int type = src.type();
switch (type) {
case CV_8SC1: elbp_<char>(src,dst, radius, neighbors); break;
case CV_8UC1: elbp_<unsigned char>(src, dst, radius, neighbors); break;
case CV_16SC1: elbp_<short>(src,dst, radius, neighbors); break;
case CV_16UC1: elbp_<unsigned short>(src,dst, radius, neighbors); break;
case CV_32SC1: elbp_<int>(src,dst, radius, neighbors); break;
case CV_32FC1: elbp_<float>(src,dst, radius, neighbors); break;
case CV_64FC1: elbp_<double>(src,dst, radius, neighbors); break;
default:
string error_msg = format("Using Original Local Binary Patterns for feature extraction only works on single-channel images (given %d). Please pass the image data as a grayscale image!", type);
CV_Error(CV_StsNotImplemented, error_msg);
break;
}
}
static Mat
histc_(const Mat& src, int minVal=0, int maxVal=255, bool normed=false)
{
Mat result;
// Establish the number of bins.
int histSize = maxVal-minVal+1;
// Set the ranges.
float range[] = { static_cast<float>(minVal), static_cast<float>(maxVal+1) };
const float* histRange = { range };
// calc histogram
calcHist(&src, 1, 0, Mat(), result, 1, &histSize, &histRange, true, false);
// normalize
if(normed) {
result /= (int)src.total();
}
return result.reshape(1,1);
}
static Mat histc(InputArray _src, int minVal, int maxVal, bool normed)
{
Mat src = _src.getMat();
switch (src.type()) {
case CV_8SC1:
return histc_(Mat_<float>(src), minVal, maxVal, normed);
break;
case CV_8UC1:
return histc_(src, minVal, maxVal, normed);
break;
case CV_16SC1:
return histc_(Mat_<float>(src), minVal, maxVal, normed);
break;
case CV_16UC1:
return histc_(src, minVal, maxVal, normed);
break;
case CV_32SC1:
return histc_(Mat_<float>(src), minVal, maxVal, normed);
break;
case CV_32FC1:
return histc_(src, minVal, maxVal, normed);
break;
default:
CV_Error(CV_StsUnmatchedFormats, "This type is not implemented yet."); break;
}
return Mat();
}
static Mat spatial_histogram(InputArray _src, int numPatterns,
int grid_x, int grid_y, bool /*normed*/)
{
Mat src = _src.getMat();
// calculate LBP patch size
int width = src.cols/grid_x;
int height = src.rows/grid_y;
// allocate memory for the spatial histogram
Mat result = Mat::zeros(grid_x * grid_y, numPatterns, CV_32FC1);
// return matrix with zeros if no data was given
if(src.empty())
return result.reshape(1,1);
// initial result_row
int resultRowIdx = 0;
// iterate through grid
for(int i = 0; i < grid_y; i++) {
for(int j = 0; j < grid_x; j++) {
Mat src_cell = Mat(src, Range(i*height,(i+1)*height), Range(j*width,(j+1)*width));
Mat cell_hist = histc(src_cell, 0, (numPatterns-1), true);
// copy to the result matrix
Mat result_row = result.row(resultRowIdx);
cell_hist.reshape(1,1).convertTo(result_row, CV_32FC1);
// increase row count in result matrix
resultRowIdx++;
}
}
// return result as reshaped feature vector
return result.reshape(1,1);
}
//------------------------------------------------------------------------------
// wrapper to cv::elbp (extended local binary patterns)
//------------------------------------------------------------------------------
static Mat elbp(InputArray src, int radius, int neighbors) {
Mat dst;
elbp(src, dst, radius, neighbors);
return dst;
}
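Note that when the values of the 8 sampling points are computed at each image position,
each sample value is the weighted average of the four pixels surrounding the sampling point,
that is, a bilinear interpolation (see the weights w1 to w4 and the inner loop in elbp_ above).
This reduces the influence of noisy pixels on the LBP value.
The spatial_histogram function reshapes the final per-cell histograms into a single row, which makes the similarity computation during recognition convenient.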
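The weighted average described above can be isolated into a small function. The sketch below mirrors the floor/ceil indices and the weights w1 to w4 of `elbp_`, but it is written in plain C++ on a nested vector; the type alias `GrayImage` and the function name `sample_bilinear` are assumptions made for this example.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

using GrayImage = std::vector<std::vector<double>>;  // row-major gray values

// Gray value at the fractional position (row + dy, col + dx), computed as the
// weighted average of the four surrounding pixels.
double sample_bilinear(const GrayImage& img, int row, int col,
                       double dx, double dy) {
    // Integer offsets of the surrounding pixels.
    const int fx = static_cast<int>(std::floor(dx));
    const int fy = static_cast<int>(std::floor(dy));
    const int cx = static_cast<int>(std::ceil(dx));
    const int cy = static_cast<int>(std::ceil(dy));
    assert(row + fy >= 0 && row + cy < static_cast<int>(img.size()));
    assert(col + fx >= 0 && col + cx < static_cast<int>(img[0].size()));
    // Fractional parts decide how much each pixel contributes.
    const double tx = dx - fx;
    const double ty = dy - fy;
    const double w1 = (1 - tx) * (1 - ty);
    const double w2 = tx * (1 - ty);
    const double w3 = (1 - tx) * ty;
    const double w4 = tx * ty;
    return w1 * img[row + fy][col + fx] + w2 * img[row + fy][col + cx] +
           w3 * img[row + cy][col + fx] + w4 * img[row + cy][col + cx];
}
```

When the offset lands exactly on a pixel, the fractional parts are zero and the function returns that pixel's value unchanged. Halfway between four pixels, each contributes a quarter.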
4 **predict**
void LBPH::predict(InputArray _src, int &minClass, double &minDist) const {
if(_histograms.empty()) {
// throw error if no data (or simply return -1?)
string error_message = "This LBPH model is not computed yet. Did you call the train method?";
CV_Error(CV_StsBadArg, error_message);
}
Mat src = _src.getMat();
// get the spatial histogram from input image
Mat lbp_image = elbp(src, _radius, _neighbors);
Mat query = spatial_histogram(
lbp_image, /* lbp_image */
static_cast<int>(std::pow(2.0, static_cast<double>(_neighbors))), /* number of possible patterns */
_grid_x, /* grid size x */
_grid_y, /* grid size y */
true /* normed histograms */);
// find 1-nearest neighbor
minDist = DBL_MAX;
minClass = -1;
for(size_t sampleIdx = 0; sampleIdx < _histograms.size(); sampleIdx++) {
double dist = compareHist(_histograms[sampleIdx], query, CV_COMP_CHISQR);
if((dist < minDist) && (dist < _threshold)) {
minDist = dist;
minClass = _labels.at<int>((int) sampleIdx);
}
}
}
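In this function, the code from the _src.getMat() call through the spatial_histogram call computes query, the spatial histogram of the image _src to be predicted.
The for loop then compares query with every histogram in the database array _histograms
(the comparison method is CV_COMP_CHISQR, the chi-square distance)
and takes the entry with the smallest distance as the final result.
This part also shows the role of the threshold passed when the LBPH object is created:
if no distance is below threshold, recognition fails and minClass stays -1.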
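The nearest-neighbor search with a threshold can be sketched in plain C++ as below. The chi-square function follows the formula documented for CV_COMP_CHISQR, d = Σ (H1 - H2)² / H1, skipping bins where H1 is zero; treat that as an assumption about the library rather than a copy of its code. The names `Histogram`, `Prediction`, `chi_square`, and `predict_sketch` are hypothetical.

```cpp
#include <cassert>
#include <cfloat>
#include <cmath>
#include <cstddef>
#include <vector>

using Histogram = std::vector<float>;

// Chi-square distance, assumed form: sum of (h1 - h2)^2 / h1 over the bins
// where h1 is nonzero.
double chi_square(const Histogram& h1, const Histogram& h2) {
    assert(h1.size() == h2.size());
    double d = 0.0;
    for (std::size_t i = 0; i < h1.size(); ++i) {
        const double diff = static_cast<double>(h1[i]) - h2[i];
        const double denom = h1[i];
        if (std::fabs(denom) > DBL_EPSILON) d += diff * diff / denom;
    }
    return d;
}

struct Prediction {
    int label;        // -1 when nothing is closer than the threshold
    double distance;  // DBL_MAX when nothing matched
};

// 1-nearest-neighbor search over the stored histograms.
Prediction predict_sketch(const std::vector<Histogram>& gallery,
                          const std::vector<int>& labels,
                          const Histogram& query,
                          double threshold = DBL_MAX) {
    assert(gallery.size() == labels.size());
    Prediction best{-1, DBL_MAX};
    for (std::size_t i = 0; i < gallery.size(); ++i) {
        const double dist = chi_square(gallery[i], query);
        if (dist < best.distance && dist < threshold) {
            best = Prediction{labels[i], dist};
        }
    }
    return best;
}
```

With the default threshold of DBL_MAX, some stored label is returned whenever a finite distance exists. A caller who wants unknown faces rejected has to pass a smaller threshold, tuned on their own data.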