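In "PaddleOCR C++ Study Notes (2)" I tried segmenting the image, but none of the attempts made a noticeable difference. This post therefore approaches the problem from the OCR side: every recognized string is returned together with its bounding rectangle and drawn on the image, which makes it easy to judge how well recognition worked.
Results
The image above shows the result. Both the detected positions and the recognized strings are drawn on the original image. Knowing where each string was found is far more useful than getting back one long concatenated string.
The results also show that recognition still has room for improvement. Of the test images, this one, which went through a perspective transform, was recognized best: every digit was found.
The following two images were also perspective-transformed:
In the image above, text detection cropped only part of the digit 1, so it was recognized as T. Some text boxes also overlap: the box for 23 was detected once, and then a box covering 234678 was detected as well, from which only 278 was recognized.
In the image above, the detected boxes were recognized fairly well. Apart from the digit 5 being read as 5.0, everything that was recognized is correct. However, the digits 10, 13, 15, 11 and 3 were not detected at all.
This suggests that really good results would require training a custom model; this OCR model is probably better suited to ordinary text.
The real focus of this post, though, is wrapping PaddleOCR in a dynamic library so that an external call returns a list of strings together with their positions. Now for the main part.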
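Code implementation
微卡智享
Changes to the PaddleOCR dynamic library
01
Define the struct
To return a list of results, the dynamic library first needs a struct named OCRTextRect. I defined it in ocr_export.h, a header I added myself.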
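struct OCRTextRect {
public:
char* OCRText; // recognized text
int ptx, pty; // top-left corner of the Rect
int width, height; // width and height of the Rect
// Initialize to nullptr: a string literal cannot initialize a non-const char* in standard C++
OCRTextRect() : OCRText(nullptr), ptx(0), pty(0), width(0), height(0)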
{
}
};
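The struct holds the returned string as a char*, the X and Y coordinates of the rectangle's top-left corner, and the rectangle's width and height.
One point deserves emphasis: why use a struct? Do not pass STL types across a dynamic library boundary, because this easily leads to memory reallocation problems. STL classes are templates, and templates are instantiated at compile time, so the same STL code gets a separate implementation in each dynamic library. Duplicated functions alone would be harmless, but some STL containers keep static variables. Multiple implementations mean multiple copies of those statics, so some calls behave differently between modules, and the program eventually crashes on an invalid memory operation.
For that reason, STL types such as std::vector and std::string must not appear in the exported interface.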
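To make this concrete, here is a minimal sketch of keeping std::string on the library side and handing out only plain data. The names TextRectC, fill_text_rect and free_text_rect are hypothetical and not part of PaddleOCR; the sketch assumes that the module which allocates the string also provides the function that frees it.

```cpp
#include <cstring>
#include <string>

// Hypothetical plain-data struct: only a raw pointer and ints cross the boundary.
struct TextRectC {
    char* text;              // heap-allocated, NUL-terminated copy
    int x, y, width, height;
};

// Library-side helper: std::string never leaves the module, only its copy does.
inline void fill_text_rect(TextRectC& out, const std::string& text,
                           int x, int y, int width, int height) {
    out.text = new char[text.size() + 1];                  // +1 for the terminator
    std::memcpy(out.text, text.c_str(), text.size() + 1);  // c_str() includes '\0'
    out.x = x;
    out.y = y;
    out.width = width;
    out.height = height;
}

// Provided by the same module, so the buffer is released by the runtime
// that allocated it (assumption of this sketch).
inline void free_text_rect(TextRectC& r) {
    delete[] r.text;
    r.text = nullptr;
}
```

The caller reads text as an ordinary C string and hands the struct back to free_text_rect when it is done with it.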
02
Add an exported function
I added an exported function named PaddleOCRTextRect.
Its implementation in ocr_export.cpp:
DLLEXPORT int PaddleOCRTextRect(cv::Mat& img, OCRTextRect* resptr)
{
std::vector<std::pair<std::string, cv::Rect>> str_res;
std::string tmpstr;
if (!img.data) {
return 0;
}
PaddleOCR::OCRConfig config = readOCRConfig();
// Print the config parameters
config.PrintConfigInfo();
// Detect text in the image
PaddleOCR::DBDetector det(config.det_model_dir, config.use_gpu, config.gpu_id,
config.gpu_mem, config.cpu_math_library_num_threads,
config.use_mkldnn, config.max_side_len, config.det_db_thresh,
config.det_db_box_thresh, config.det_db_unclip_ratio,
config.use_polygon_score, config.visualize,
config.use_tensorrt, config.use_fp16);
PaddleOCR::Classifier* cls = nullptr;
if (config.use_angle_cls == true) {
cls = new PaddleOCR::Classifier(config.cls_model_dir, config.use_gpu, config.gpu_id,
config.gpu_mem, config.cpu_math_library_num_threads,
config.use_mkldnn, config.cls_thresh,
config.use_tensorrt, config.use_fp16);
}
PaddleOCR::CRNNRecognizer rec(config.rec_model_dir, config.use_gpu, config.gpu_id,
config.gpu_mem, config.cpu_math_library_num_threads,
config.use_mkldnn, config.char_list_file,
config.use_tensorrt, config.use_fp16);
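// Detect the text boxes
std::vector<std::vector<std::vector<int>>> boxes;
det.Run(img, boxes);
// Run OCR recognition
str_res = rec.RunOCR(boxes, img, cls);
// The classifier was allocated with new above, so release it (deleting nullptr is safe)
delete cls;
try
{
// resptr must point to a caller-allocated array with room for every result;
// this function cannot check its capacity
for (size_t i = 0; i < str_res.size(); i++) {
const std::string& text = str_res[i].first;
// +1 for the terminating '\0', which std::string::copy does not write
char* reschar = new char[text.length() + 1];
text.copy(reschar, std::string::npos);
reschar[text.length()] = '\0';
resptr[i].OCRText = reschar;
resptr[i].ptx = str_res[i].second.x;
resptr[i].pty = str_res[i].second.y;
resptr[i].width = str_res[i].second.width;
resptr[i].height = str_res[i].second.height;
//std::cout << "cout:" << str_res[i].first << std::endl;
}
}
catch (const std::exception& ex)
{
std::cout << ex.what() << std::endl;
}
return static_cast<int>(str_res.size());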
}
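The returned int is the number of entries actually written, which the caller can use to know how much of the array is valid. The OCRTextRect pointer passed in must point to an array that the caller allocated beforehand, so that array will usually be larger than needed; the return value tells the caller how many rectangles were actually recognized. Note that the function does not know the array's size, so the caller must make it large enough for every result.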
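Because the exported function cannot see how large the caller's array is, a common variant passes the capacity explicitly. The following is only a sketch under the assumption that changing the signature is acceptable; copy_boxes and BoxC are hypothetical names, not part of the code in this post.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical plain-data box, standing in for OCRTextRect without the text.
struct BoxC {
    int x, y, width, height;
};

// Copies at most `capacity` results into the caller's array and returns the
// total number of results, so the caller can detect truncation.
inline int copy_boxes(const std::vector<BoxC>& results, BoxC* out, int capacity) {
    const int total = static_cast<int>(results.size());
    const int n = std::min(total, std::max(capacity, 0));
    for (int i = 0; i < n; ++i) {
        out[i] = results[i];
    }
    return total;
}
```

If the return value is larger than the capacity that was passed in, the caller knows that some results were dropped and can retry with a bigger array.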
03
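Changes to ocr_rec.cpp
As mentioned in the previous post, ocr_rec.cpp contains the recognition code, which is entered through the RunOCR function. Its GetRotateCropImage function takes the corner points of each detected box and crops that region out of the image for OCR.
Leaving the original GetRotateCropImage untouched, we add an overload of GetRotateCropImage with an extra cv::Rect parameter that receives the cropping rectangle.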
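The rectangle reported through the new parameter is the axis-aligned bounding box of the four corner points. A stand-alone sketch of that step, without OpenCV, might look like this; RectI and bounding_rect are hypothetical names used only for illustration, and the box is assumed to be a non-empty list of {x, y} points.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for cv::Rect.
struct RectI {
    int x, y, width, height;
};

// box holds the {x, y} corner points of a detected text box (must not be empty).
inline RectI bounding_rect(const std::vector<std::vector<int>>& box) {
    int left = box[0][0], right = box[0][0];
    int top = box[0][1], bottom = box[0][1];
    for (const auto& p : box) {
        left = std::min(left, p[0]);
        right = std::max(right, p[0]);
        top = std::min(top, p[1]);
        bottom = std::max(bottom, p[1]);
    }
    // Same convention as in the post: origin at (left, top), size from the extremes.
    return RectI{left, top, right - left, bottom - top};
}
```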
cv::Mat CRNNRecognizer::GetRotateCropImage(const cv::Mat& srcimage, std::vector<std::vector<int>> box, cv::Rect& rect)
{
cv::Mat image;
srcimage.copyTo(image);
std::vector<std::vector<int>> points = box;
int x_collect[4] = { box[0][0], box[1][0], box[2][0], box[3][0] };
int y_collect[4] = { box[0][1], box[1][1], box[2][1], box[3][1] };
int left = int(*std::min_element(x_collect, x_collect + 4));
int right = int(*std::max_element(x_collect, x_collect + 4));
int top = int(*std::min_element(y_collect, y_collect + 4));
int bottom = int(*std::max_element(y_collect, y_collect + 4));
cv::Mat img_crop;
// Report the axis-aligned bounding rectangle back to the caller
rect = cv::Rect(left, top, right - left, bottom - top);
image(rect).copyTo(img_crop);
for (int i = 0; i < points.size(); i++) {
points[i][0] -= left;
points[i][1] -= top;
}
int img_crop_width = int(sqrt(pow(points[0][0] - points[1][0], 2) +
pow(points[0][1] - points[1][1], 2)));
int img_crop_height = int(sqrt(pow(points[0][0] - points[3][0], 2) +
pow(points[0][1] - points[3][1], 2)));
cv::Point2f pts_std[4];
pts_std[0] = cv::Point2f(0., 0.);
pts_std[1] = cv::Point2f(img_crop_width, 0.);
pts_std[2] = cv::Point2f(img_crop_width, img_crop_height);
pts_std[3] = cv::Point2f(0.f, img_crop_height);
cv::Point2f pointsf[4];
pointsf[0] = cv::Point2f(points[0][0], points[0][1]);
pointsf[1] = cv::Point2f(points[1][0], points[1][1]);
pointsf[2] = cv::Point2f(points[2][0], points[2][1]);
pointsf[3] = cv::Point2f(points[3][0], points[3][1]);
cv::Mat M = cv::getPerspectiveTransform(pointsf, pts_std);
cv::Mat dst_img;
cv::warpPerspective(img_crop, dst_img, M,
cv::Size(img_crop_width, img_crop_height),
cv::BORDER_REPLICATE);
if (float(dst_img.rows) >= float(dst_img.cols) * 1.5) {
cv::Mat srcCopy = cv::Mat(dst_img.rows, dst_img.cols, dst_img.depth());
cv::transpose(dst_img, srcCopy);
cv::flip(srcCopy, srcCopy, 0);
return srcCopy;
}
else {
return dst_img;
}
}
Likewise, the original RunOCR returned void. We add an overload that returns std::vector<std::pair<std::string, cv::Rect>>, whose contents are later copied into the struct array.
std::vector<std::pair<std::string, cv::Rect>> CRNNRecognizer::RunOCR(std::vector<std::vector<std::vector<int>>> boxes, cv::Mat& img, Classifier* cls)
{
cv::Mat srcimg;
img.copyTo(srcimg);
cv::Mat crop_img;
cv::Mat resize_img;
std::cout << "The predicted text is :" << std::endl;
int index = 0;
std::vector<std::pair<std::string, cv::Rect>> vtsresstr;
std::vector<std::string> str_res;
cv::Rect tmprect;
for (int i = 0; i < boxes.size(); i++) {
crop_img = GetRotateCropImage(srcimg, boxes[i], tmprect);
if (cls != nullptr) {
crop_img = cls->Run(crop_img);
}
float wh_ratio = float(crop_img.cols) / float(crop_img.rows);
this->resize_op_.Run(crop_img, resize_img, wh_ratio, this->use_tensorrt_);
this->normalize_op_.Run(&resize_img, this->mean_, this->scale_,
this->is_scale_);
std::vector<float> input(1 * 3 * resize_img.rows * resize_img.cols, 0.0f);
this->permute_op_.Run(&resize_img, input.data());
// Inference.
auto input_names = this->predictor_->GetInputNames();
auto input_t = this->predictor_->GetInputHandle(input_names[0]);
input_t->Reshape({ 1, 3, resize_img.rows, resize_img.cols });
input_t->CopyFromCpu(input.data());
this->predictor_->Run();
std::vector<float> predict_batch;
auto output_names = this->predictor_->GetOutputNames();
auto output_t = this->predictor_->GetOutputHandle(output_names[0]);
auto predict_shape = output_t->shape();
int out_num = std::accumulate(predict_shape.begin(), predict_shape.end(), 1,
std::multiplies<int>());
predict_batch.resize(out_num);
output_t->CopyToCpu(predict_batch.data());
// CTC decode
int argmax_idx;
int last_index = 0;
float score = 0.f;
int count = 0;
float max_value = 0.0f;
for (int n = 0; n < predict_shape[1]; n++) {
argmax_idx =
int(Utility::argmax(&predict_batch[n * predict_shape[2]],
&predict_batch[(n + 1) * predict_shape[2]]));
max_value =
float(*std::max_element(&predict_batch[n * predict_shape[2]],
&predict_batch[(n + 1) * predict_shape[2]]));
if (argmax_idx > 0 && (!(n > 0 && argmax_idx == last_index))) {
score += max_value;
count += 1;
str_res.push_back(label_list_[argmax_idx]);
}
last_index = argmax_idx;
}
score /= count;
cv::String tmpstr;
//for (int i = 0; i < str_res.size(); i++) {
// tmpstr = str_res[i];
// std::cout << tmpstr;
//}
// str_res accumulates across boxes, so append only the labels decoded for this box
for (int i = index; i < str_res.size(); i++) {
tmpstr += str_res[i];
}
std::cout << tmpstr;
index = str_res.size();
std::cout << "\tscore: " << score << std::endl;
std::pair<std::string, cv::Rect> tmppair;
tmppair.first = tmpstr;
tmppair.second = tmprect;
vtsresstr.push_back(tmppair);
}
return vtsresstr;
}
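The decoding loop inside RunOCR is a greedy CTC decode: take the argmax at every time step, drop the blank class, and collapse consecutive repeats. The same logic in isolation can be sketched as follows. ctc_greedy_decode is a hypothetical helper, not a PaddleOCR function, and it assumes that class 0 is the blank and that the scores are laid out row by row (one row per time step).

```cpp
#include <cstddef>
#include <string>
#include <vector>

// probs: steps x classes scores in row-major order; labels[c] is the text of class c.
// Assumption of this sketch: class 0 is the CTC blank.
inline std::string ctc_greedy_decode(const std::vector<float>& probs,
                                     int steps, int classes,
                                     const std::vector<std::string>& labels) {
    std::string text;
    int last = 0;
    for (int t = 0; t < steps; ++t) {
        const float* row = probs.data() + static_cast<std::size_t>(t) * classes;
        int best = 0;
        for (int c = 1; c < classes; ++c) {
            if (row[c] > row[best]) best = c;
        }
        // Keep a label only if it is not blank and not a repeat of the previous step.
        if (best > 0 && !(t > 0 && best == last)) {
            text += labels[best];
        }
        last = best;
    }
    return text;
}
```

A blank between two identical labels separates them, which is how a repeated character such as "22" can still be produced.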
With this in place, the exported PaddleOCRTextRect function can fill in the OCRTextRect struct array.
Changes to the calling program
01
Define the struct
As in the dynamic library, the calling program must first define the OCRTextRect struct.
02
Add the calling functions
Add typedefs for the functions exported by the dynamic library, and write a wrapper method that calls them.
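Such a wrapper has to release the library on every exit path, including early returns and exceptions. One way to get this automatically is a std::unique_ptr with a custom deleter. The sketch below uses a stand-in handle type so that it compiles on any platform; FakeLibrary, open_library, close_library and call_with_library are hypothetical placeholders, and on Windows the deleter would call FreeLibrary on the handle returned by LoadLibrary.

```cpp
#include <memory>
#include <string>

// Stand-in for a loaded library handle (assumption for illustration only).
struct FakeLibrary {
    std::string name;
};

static int g_open_count = 0;  // number of handles currently open, for the demo

inline FakeLibrary* open_library(const std::string& name) {
    if (name.empty()) return nullptr;  // mimics a failed load
    ++g_open_count;
    return new FakeLibrary{name};
}

inline void close_library(FakeLibrary* lib) {
    --g_open_count;
    delete lib;
}

// Deleter invoked by unique_ptr whenever a non-null handle goes out of scope.
struct LibraryCloser {
    void operator()(FakeLibrary* lib) const { close_library(lib); }
};
using LibraryHandle = std::unique_ptr<FakeLibrary, LibraryCloser>;

inline std::string call_with_library(const std::string& name, bool fail) {
    LibraryHandle lib(open_library(name));
    if (!lib) return "load failed";           // nothing to release
    if (fail) return "Error: lookup failed";  // handle is released automatically
    return "OK";
}
```

With this pattern there is no cleanup call to forget or to place after a return statement by mistake.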
03
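Other changes
Two more functions are added to convert the returned OCRTextRect struct array into a vector. Entries are inserted in top-to-bottom, left-to-right order, so I also wrote a binary search that finds the insertion position.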
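As an alternative to a hand-written search, a sorted insert can also be expressed with std::upper_bound. The sketch below is not the implementation used in this post: it groups boxes into rows by the vertical center of each rectangle, and row_height is an assumed tuning parameter. RectI, TextBox, row_of and insert_sorted are hypothetical names.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for cv::Rect.
struct RectI {
    int x, y, width, height;
};
using TextBox = std::pair<std::string, RectI>;

// Row index derived from the vertical center of the rectangle.
inline int row_of(const RectI& r, int row_height) {
    return (r.y + r.height / 2) / row_height;
}

// Inserts item so that boxes stays ordered top to bottom, then left to right.
inline void insert_sorted(std::vector<TextBox>& boxes, TextBox item, int row_height) {
    auto before = [row_height](const TextBox& a, const TextBox& b) {
        const int ra = row_of(a.second, row_height);
        const int rb = row_of(b.second, row_height);
        return ra != rb ? ra < rb : a.second.x < b.second.x;
    };
    auto pos = std::upper_bound(boxes.begin(), boxes.end(), item, before);
    boxes.insert(pos, std::move(item));
}
```

Quantizing to rows keeps the comparison a strict weak ordering, which std::upper_bound requires; comparing raw overlaps between neighboring boxes would not guarantee that.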
The complete PaddleOCRAPI
PaddleOCRApi.h
#pragma once
//Load and unload the DLL through the Windows API
#include <Windows.h>
#include <opencv2/opencv.hpp>
#include <iostream>
#include <string>
#include <locale>
#include <codecvt>
#include "..\..\Utils\CvUtils.h"
struct OCRTextRect {
public:
char* OCRText; // recognized text
int ptx, pty; // top-left corner of the Rect
int width, height; // width and height of the Rect
OCRTextRect() {
OCRText = nullptr;
ptx = 0;
pty = 0;
width = 0;
height = 0;
}
};
class PaddleOcrApi
{
private:
typedef char*(*DllFun)(cv::Mat&);
typedef int (*DllFunOCRTextRect)(cv::Mat&, OCRTextRect*);
//Binary search
static int binarySearch(std::vector<std::pair<std::string, cv::Rect>>& vtsrect, const OCRTextRect rect);
public:
static std::string GetPaddleOCRText(cv::Mat& src);
static std::string GetPaddleOCRTextRect(cv::Mat& src, std::vector<std::pair<std::string, cv::Rect>>& vtsocr);
//Sort the OCRTextRect array and convert it to a vector
static std::vector<std::pair<std::string, cv::Rect>> SortRectPair(const OCRTextRect* vtsrect, const int count);
//Get the image after a perspective transform
static cv::Mat GetPerspectiveMat(cv::Mat& src, int iterations = 1);
//Split the number sliding-puzzle image
static std::vector<cv::Mat> GetNumMat(cv::Mat& src);
// If the string is UTF-8 encoded, use:
static std::string wstr2utf8str(const std::wstring& str);
static std::wstring utf8str2wstr(const std::string& str);
// If the string uses an encoding other than UTF-8, use:
static std::string wstr2str(const std::wstring& str, const std::string& locale);
static std::wstring str2wstr(const std::string& str, const std::string& locale);
};
PaddleOCRAPI.cpp
#include "PaddleOcrApi.h"
//Binary search for the index at which the current rect should be inserted
int PaddleOcrApi::binarySearch(std::vector<std::pair<std::string, cv::Rect>>& vtsrect, const OCRTextRect rect)
{
int left = 0;
int right = static_cast<int>(vtsrect.size()) - 1;
std::pair<std::string, cv::Rect> lastitem("", cv::Rect());
while (left <= right) {
int mid = left + (right - left) / 2;
//Element at the midpoint
std::pair<std::string, cv::Rect> item(vtsrect[mid].first,vtsrect[mid].second);
//Stop if this element equals the one examined last
if (item.first == lastitem.first && item.second.x == lastitem.second.x
&& item.second.y == lastitem.second.y) {
return mid;
}
else if (rect.pty + rect.height > item.second.y + item.second.height / 2) {
lastitem.first = item.first;
lastitem.second = item.second;
left = mid + 1;
}
else if (rect.ptx < item.second.x) {
lastitem.first = item.first;
lastitem.second = item.second;
right = mid - 1;
}
else if (rect.ptx >= item.second.x) {
lastitem.first = item.first;
lastitem.second = item.second;
left = mid + 1;
}
}
//Once the range is exhausted, left is the insertion index
//(returning the initial 0 here would insert every item at the front)
return left;
}
std::string PaddleOcrApi::GetPaddleOCRText(cv::Mat& src)
{
std::string resstr;
DllFun funName;
HINSTANCE hdll;
try
{
hdll = LoadLibrary(L"PaddleOCRExport.dll");
if (hdll == NULL)
{
// The library was never loaded, so there is nothing to free
resstr = "加载不到PaddleOCRExport.dll动态库!";
return resstr;
}
funName = (DllFun)GetProcAddress(hdll, "PaddleOCRText");
if (funName == NULL)
{
resstr = "找不到PaddleOCRText函数!";
FreeLibrary(hdll);
return resstr;
}
resstr = funName(src);
// Convert the UTF-8 string to a wstring
std::wstring wtxt = utf8str2wstr(resstr);
// Then convert the wstring to a GBK string
resstr = wstr2str(wtxt, "Chinese");
FreeLibrary(hdll);
}
catch (const std::exception& ex)
{
resstr = ex.what();
// Free the library before returning; a call placed after return never runs
FreeLibrary(hdll);
return "Error:" + resstr;
}
return resstr;
}
std::string PaddleOcrApi::GetPaddleOCRTextRect(cv::Mat& src, std::vector<std::pair<std::string, cv::Rect>>& vtsocr)
{
std::string resstr;
DllFunOCRTextRect funName;
HINSTANCE hdll;
try
{
hdll = LoadLibrary(L"PaddleOCRExport.dll");
if (hdll == NULL)
{
// The library was never loaded, so there is nothing to free
resstr = "加载不到PaddleOCRExport.dll动态库!";
return resstr;
}
funName = (DllFunOCRTextRect)GetProcAddress(hdll, "PaddleOCRTextRect");
if (funName == NULL)
{
resstr = "找不到PaddleOCRTextRect函数!";
FreeLibrary(hdll);
return resstr;
}
// Fixed capacity: the DLL writes one entry per text box without a bounds check,
// so an image with more than 100 text boxes would overflow this array
OCRTextRect vts[100];
int count = funName(src, vts);
std::cout << "size:" << std::to_string(count) << std::endl;
for (int i = 0; i < count; i++) {
std::cout << vts[i].OCRText<< std::endl;
std::cout << "Rect:x=" << std::to_string(vts[i].ptx);
std::cout << " y=" << std::to_string(vts[i].pty);
std::cout << " width=" << std::to_string(vts[i].width);
std::cout << " height=" << std::to_string(vts[i].height) << std::endl;
OCRTextRect tmprect = vts[i];
// Convert the UTF-8 string to a wstring
std::wstring wtxt = utf8str2wstr(tmprect.OCRText);
// Then convert the wstring to a GBK string
std::string tmpstr = wstr2str(wtxt, "Chinese");
// Find the insertion index by binary search and insert into vtsocr
int index = binarySearch(vtsocr, vts[i]);
vtsocr.insert(vtsocr.begin() + index, std::pair<std::string, cv::Rect>(tmpstr,
cv::Rect(tmprect.ptx, tmprect.pty, tmprect.width, tmprect.height)));
}
resstr = "OK";
FreeLibrary(hdll);
}
catch (const std::exception& ex)
{
resstr = ex.what();
// Free the library before returning; a call placed after return never runs
FreeLibrary(hdll);
return "Error:" + resstr;
}
return resstr;
}
//Sort the OCRTextRect array (not implemented yet; returns an empty vector)
std::vector<std::pair<std::string, cv::Rect>> PaddleOcrApi::SortRectPair(const OCRTextRect* vtsrect, const int count)
{
std::vector<std::pair<std::string, cv::Rect>> resvts;
return std::vector<std::pair<std::string, cv::Rect>>();
}
cv::Mat PaddleOcrApi::GetPerspectiveMat(cv::Mat& src, int iterations)
{
cv::Mat tmpsrc, cannysrc, resultMat;
src.copyTo(tmpsrc);
//Gaussian blur
cv::GaussianBlur(tmpsrc, tmpsrc, cv::Size(5, 5), 0.5, 0.5);
int srcArea = tmpsrc.size().area();
float maxArea = 0;
int maxAreaidx = -1;
std::vector<cv::Mat> channels;
cv::Mat B_src, G_src, R_src, dstmat;
cv::split(tmpsrc, channels);
int minthreshold = 120, maxthreshold = 200;
//Run Canny on the B channel
//Compute the thresholds with Otsu's method
CvUtils::GetMatMinMaxThreshold(channels[0], minthreshold, maxthreshold, 1);
std::cout << "OTSUmin:" << minthreshold << " OTSUmax:" << maxthreshold << std::endl;
//Canny edge detection
cv::Canny(channels[0], B_src, minthreshold, maxthreshold);
//Compute the thresholds with Otsu's method
CvUtils::GetMatMinMaxThreshold(channels[1], minthreshold, maxthreshold, 1);
std::cout << "OTSUmin:" << minthreshold << " OTSUmax:" << maxthreshold << std::endl;
//Canny edge detection
Canny(channels[1], G_src, minthreshold, maxthreshold);
//Compute the thresholds with Otsu's method
CvUtils::GetMatMinMaxThreshold(channels[2], minthreshold, maxthreshold, 1);
std::cout << "OTSUmin:" << minthreshold << " OTSUmax:" << maxthreshold << std::endl;
//Canny edge detection
Canny(channels[2], R_src, minthreshold, maxthreshold);
bitwise_or(B_src, G_src, dstmat);
bitwise_or(R_src, dstmat, dstmat);
//CvUtils::SetShowWindow(dstmat, "dstmat", 700, 20);
//imshow("dstmat", dstmat);
std::vector<std::vector<cv::Point>> contours;
std::vector<cv::Vec4i> hierarchy;
findContours(dstmat, contours, hierarchy, cv::RETR_TREE, cv::CHAIN_APPROX_SIMPLE);
cv::Mat dstcontour = cv::Mat::zeros(cannysrc.size(), CV_8SC3);
cv::Mat tmpcontour;
dstcontour.copyTo(tmpcontour);
//Array of fitted polygons
std::vector<std::vector<cv::Point>> vtshulls(contours.size());
for (int i = 0; i < contours.size(); i++) {
//Check the contour shape and ignore anything that is not a quadrilateral
double lensval = 0.01 * arcLength(contours[i], true);
std::vector<cv::Point> convexhull;
approxPolyDP(cv::Mat(contours[i]), convexhull, lensval, true);
//Store the fitted polygon in the array
vtshulls[i] = convexhull;
//Skip anything that is not a quadrilateral
if (convexhull.size() != 4) continue;
//Compute the minimum-area rotated rectangle
cv::RotatedRect rRect = minAreaRect(contours[i]);
//Track the largest of the minimum-area rotated rectangles
if (rRect.size.height == 0) continue;
if (rRect.size.area() > maxArea && rRect.size.area() > srcArea * 0.1
&& !CvUtils::CheckRectBorder(src, rRect)) {
maxArea = rRect.size.area();
maxAreaidx = i;
}
}
//Process the largest contour that meets the criteria
if (maxAreaidx >= 0) {
std::cout << "iterations:" << iterations << " maxAreaidx:" << maxAreaidx << std::endl;
//Get the minimum-area rotated rectangle
cv::RotatedRect rRect = minAreaRect(contours[maxAreaidx]);
cv::Point2f vertices[4];
//Reorder the rectangle corners: top-left, top-right, bottom-right, bottom-left
CvUtils::SortRotatedRectPoints(vertices, rRect);
std::cout << "Rect:" << vertices[0] << vertices[1] << vertices[2] << vertices[3] << std::endl;
//Draw lines through the four points
for (int k = 0; k < 4; k++) {
line(dstcontour, vertices[k], vertices[(k + 1) % 4], cv::Scalar(255, 0, 0));
}
//Compute the four corner points of the quadrilateral
cv::Point2f rPoints[4];
CvUtils::GetPointsFromRect(rPoints, vertices, vtshulls[maxAreaidx]);
for (int k = 0; k < 4; k++) {
line(dstcontour, rPoints[k], rPoints[(k + 1) % 4], cv::Scalar(255, 255, 255));
}
//Reset the range using the points nearest to the four corners of the minimum rectangle, then fit lines through the points in each region and check the result
cv::Point2f newPoints[4];
CvUtils::GetPointsFromFitline(newPoints, rPoints, vertices);
for (int k = 0; k < 4; k++) {
line(dstcontour, newPoints[k], newPoints[(k + 1) % 4], cv::Scalar(255, 100, 255));
}
//Compute the perspective transform from the minimum rectangle and the four extreme points of the fitted polygon
cv::Point2f rectPoint[4];
//Width and height of the rotated rectangle
float rWidth = CvUtils::CalcPointDistance(vertices[0], vertices[1]);
float rHeight = CvUtils::CalcPointDistance(vertices[1], vertices[2]);
//Top-left starting point of the perspective transform
float left = dstcontour.cols;
float top = dstcontour.rows;
for (int i = 0; i < 4; i++) {
if (left > newPoints[i].x) left = newPoints[i].x;
if (top > newPoints[i].y) top = newPoints[i].y;
}
rectPoint[0] = cv::Point2f(left, top);
rectPoint[1] = rectPoint[0] + cv::Point2f(rWidth, 0);
rectPoint[2] = rectPoint[1] + cv::Point2f(0, rHeight);
rectPoint[3] = rectPoint[0] + cv::Point2f(0, rHeight);
//Compute the perspective transform matrix
cv::Mat warpmatrix = getPerspectiveTransform(rPoints, rectPoint);
cv::Mat resultimg;
//Apply the perspective transform
warpPerspective(src, resultimg, warpmatrix, resultimg.size(), cv::INTER_LINEAR);
/*CvUtils::SetShowWindow(resultimg, "resultimg", 200, 20);
imshow("resultimg", resultimg);*/
//Crop the transformed image for display
cv::Rect cutrect = cv::Rect(rectPoint[0], rectPoint[2]);
resultMat = resultimg(cutrect);
//CvUtils::SetShowWindow(resultMat, "resultMat", 600, 20);
//cv::imshow("resultMat", resultMat);
iterations--;
if (iterations > 0) {
resultMat = GetPerspectiveMat(resultMat, iterations);
}
}
else {
src.copyTo(resultMat);
}
return resultMat;
}
std::vector<cv::Mat> PaddleOcrApi::GetNumMat(cv::Mat& src)
{
std::vector<cv::Mat> vts;
cv::Mat tmpsrc, tmpgray, threshsrc;
src.copyTo(tmpsrc);
//Use the Laplacian operator to increase image contrast
cv::Mat Laplancekernel = (cv::Mat_<float>(3, 3) << 1, 1, 1, 1, -8, 1, 1, 1, 1);
cv::Mat imgLaplance, resimg;
cv::filter2D(tmpsrc, imgLaplance, CV_32F, Laplancekernel);
tmpsrc.convertTo(resimg, CV_32F);
resimg = resimg - imgLaplance;
resimg.convertTo(tmpsrc, CV_8UC3);
CvUtils::SetShowWindow(tmpsrc, "resimg", 700, 20);
cv::imshow("resimg", tmpsrc);
cv::cvtColor(tmpsrc, tmpgray, cv::COLOR_BGR2GRAY);
//Binarize
cv::threshold(tmpgray, threshsrc, 0, 255, cv::THRESH_BINARY_INV | cv::THRESH_OTSU);
CvUtils::SetShowWindow(threshsrc, "threshsrc", 700, 20);
cv::imshow("threshsrc", threshsrc);
cv::Mat dst;
cv::distanceTransform(threshsrc, dst, cv::DIST_L1, 3, 5);
CvUtils::SetShowWindow(dst, "dst1", 700, 20);
cv::imshow("dst1", dst);
cv::normalize(dst, dst, 0, 1, cv::NORM_MINMAX);
CvUtils::SetShowWindow(dst, "dst2", 500, 20);
cv::imshow("dst2", dst);
cv::threshold(dst, dst, 0.1, 1, cv::THRESH_BINARY);
CvUtils::SetShowWindow(dst, "dst3", 500, 20);
cv::imshow("dst3", dst);
//std::vector<cv::Vec4f> lines;
//cv::HoughLinesP(dst_8u, lines, 1, CV_PI / 180.0, 200, 50, 40);
//cv::Scalar color = cv::Scalar(0, 0, 255);
//for (int i = 0; i < lines.size(); i++) {
// cv::Vec4f line = lines[i];
// cv::putText(tmpsrc, std::to_string(i), cv::Point(line[0], line[1]), 1, 1, color);
// cv::line(tmpsrc, cv::Point(line[0], line[1]), cv::Point(line[2], line[3]), color);
//}
//CvUtils::SetShowWindow(tmpsrc, "tmpsrc", 300, 20);
//cv::imshow("tmpsrc", tmpsrc);
//Morphological closing (the original comment said opening, but the call below uses MORPH_CLOSE)
cv::Mat morph1, morph2, morphcalc;
cv::Mat kernel = cv::getStructuringElement(cv::MORPH_RECT, cv::Size(5, 1));
cv::morphologyEx(dst, morph1, cv::MORPH_CLOSE, kernel, cv::Point(-1, -1), 1);
CvUtils::SetShowWindow(morph1, "morph1", 500, 20);
cv::imshow("morph1", morph1);
//cv::morphologyEx(threshsrc, morph2, cv::MORPH_TOPHAT, kernel);
//CvUtils::SetShowWindow(morph2, "morph2", 500, 20);
//cv::imshow("morph2", morph2);
//morphcalc = threshsrc - morph2;
//CvUtils::SetShowWindow(morphcalc, "morphcalc", 500, 20);
//cv::imshow("morphcalc", morphcalc);
cv::Mat dst_8u;
morph1.convertTo(dst_8u, CV_8U);
CvUtils::SetShowWindow(dst_8u, "dst_8u", 300, 20);
cv::imshow("dst_8u", dst_8u);
std::vector<std::vector<cv::Point>> contours;
std::vector<cv::Vec4i> hierarchy;
findContours(dst_8u, contours, hierarchy, cv::RETR_TREE, cv::CHAIN_APPROX_SIMPLE);
////Array of fitted polygons
std::vector<std::vector<cv::Point>> vtshulls;
//for (int i = 0; i < contours.size(); i++) {
// //cv::drawContours(tmpsrc, contours, i, cv::Scalar(0, 0, 255));
// //Check the contour shape and ignore anything that is not a quadrilateral
// double lensval = 0.01 * arcLength(contours[i], true);
// std::vector<cv::Point> convexhull;
// approxPolyDP(cv::Mat(contours[i]), convexhull, lensval, true);
// //Skip anything that is not a quadrilateral
// if (convexhull.size() != 4) continue;
// vtshulls.push_back(convexhull);
//}
std::cout << "contourssize:" << contours.size() << std::endl;
cv::Mat dstimg = cv::Mat::zeros(src.size(), CV_8UC1);
for (int i = 0; i < contours.size(); i++) {
cv::drawContours(dstimg, contours, static_cast<int>(i), cv::Scalar::all(255), -1);
}
CvUtils::SetShowWindow(dstimg, "dstimg", 300, 20);
cv::imshow("dstimg", dstimg);
return vts;
}
std::string PaddleOcrApi::wstr2utf8str(const std::wstring& str)
{
static std::wstring_convert<std::codecvt_utf8<wchar_t> > strCnv;
return strCnv.to_bytes(str);
}
std::wstring PaddleOcrApi::utf8str2wstr(const std::string& str)
{
static std::wstring_convert< std::codecvt_utf8<wchar_t> > strCnv;
return strCnv.from_bytes(str);
}
std::string PaddleOcrApi::wstr2str(const std::wstring& str, const std::string& locale)
{
typedef std::codecvt_byname<wchar_t, char, std::mbstate_t> F;
static std::wstring_convert<F> strCnv(new F(locale));
return strCnv.to_bytes(str);
}
std::wstring PaddleOcrApi::str2wstr(const std::string& str, const std::string& locale)
{
typedef std::codecvt_byname<wchar_t, char, std::mbstate_t> F;
static std::wstring_convert<F> strCnv(new F(locale));
return strCnv.from_bytes(str);
}
04
Calling it from main
std::vector<std::pair<std::string, cv::Rect>> vtsocrs;
PaddleOcrApi::GetPaddleOCRTextRect(resultMat, vtsocrs);
//Draw the recognized text
//if (!resultMat.empty()) {
// putText::putTextZH(resultMat, resstr.data(), cv::Point(20, 20), cv::Scalar(0, 0, 255), 1);
// cv::putText(resultMat, resstr, cv::Point(20, 50), 1, 1, cv::Scalar(0, 0, 255));
//}
std::cout << "输出:" << std::endl;
for (int i = 0; i < vtsocrs.size(); i++) {
int B = cv::theRNG().uniform(0, 255);
int G = cv::theRNG().uniform(0, 255);
int R = cv::theRNG().uniform(0, 255);
cv::Rect tmprect = vtsocrs[i].second;
std::string tmptext = "N" + std::to_string(i) + ":" + vtsocrs[i].first;
cv::Point pt = cv::Point(tmprect.x, tmprect.y);
cv::rectangle(resultMat, tmprect, cv::Scalar(B, G, R));
cv::putText(resultMat, tmptext, pt, 1, 1.2, cv::Scalar(B, G, R));
std::cout << tmptext << std::endl;
}
CvUtils::SetShowWindow(resultMat, "cutMat", 600, 20);
cv::imshow("cutMat", resultMat);
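After a successful call, each entry in the list is drawn in a random color, which produces the result shown at the start of this post.
I will upload the modified dynamic library source. The complete PaddleOCR source can be downloaded from the PaddleOCR repository.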