A Simple CAPTCHA Recognition Implementation

2022-01-14 18:06:32

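Results of the new academic proficiency exam (学考) are about to be released, so I'm updating the score lookup I wrote a while back. In the past six months the site finally stopped storing the CAPTCHA answer in a cookie and moved it into the session. So let's take another look at this CAPTCHA: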

The CAPTCHA format is fairly simple. For example:

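It has 4 digits, each in the range 0-8, in random colors. Fortunately, the digit positions are fixed. Some simple distortion is applied, but judging from the border, it looks like the CAPTCHA is generated first and the whole image is distorted afterwards.

Opening it in Photoshop shows that the background noise is mostly small gray speckles. Filtering this kind of noise is very simple: just filter out gray. The typical signature is that the differences between the three RGB components are small. To keep black from being filtered out as well, the pixel must also be bright enough (the code below requires the red component to be greater than 128). Looking further, there is also light-colored noise, such as a pale purple-gray. That is filtered by adding the three RGB components and checking whether the sum exceeds a threshold. The implementation: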

import java.awt.Color;

// Gray speckles: the three channels are nearly equal. Requiring red > 128 keeps black from matching.
private static boolean isBackgroundColor(int colorInt) {
    Color color = new Color(colorInt);
    int inter;
    inter = Math.abs(color.getRed() - color.getGreen()) + Math.abs(color.getGreen() - color.getBlue()) + Math.abs(color.getRed() - color.getBlue());
    return inter < 40 && color.getRed() > 128;
}

// Light noise (e.g. pale purple-gray): the channel sum is close to white.
private static boolean isBackgroundColor2(int colorInt) {
    Color color = new Color(colorInt);
    return color.getRed() + color.getGreen() + color.getBlue() > 550;
}
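
As a quick sanity check of the two rules, the sketch below runs them on a few hand-picked colors. The class name `BackgroundFilterDemo`, the helper names, and the sample colors are made up for illustration; only the thresholds (40, 128, 550) come from the code above.

```java
import java.awt.Color;

class BackgroundFilterDemo {
    // Gray speckle: channels are close together and the pixel is not dark.
    static boolean isGraySpeckle(int rgb) {
        Color c = new Color(rgb);
        int spread = Math.abs(c.getRed() - c.getGreen())
                + Math.abs(c.getGreen() - c.getBlue())
                + Math.abs(c.getRed() - c.getBlue());
        return spread < 40 && c.getRed() > 128;
    }

    // Light tint: the channel sum is high, i.e. the pixel is close to white.
    static boolean isLightTint(int rgb) {
        Color c = new Color(rgb);
        return c.getRed() + c.getGreen() + c.getBlue() > 550;
    }

    static boolean isBackground(int rgb) {
        return isGraySpeckle(rgb) || isLightTint(rgb);
    }

    public static void main(String[] args) {
        // Hypothetical samples: mid gray, pale purple, black, saturated red.
        int[] samples = {0xC0C0C0, 0xD8C8E8, 0x000000, 0xC81E1E};
        for (int rgb : samples) {
            System.out.printf("#%06X -> %s%n", rgb, isBackground(rgb) ? "background" : "digit");
        }
    }
}
```

With these samples, the gray and the pale purple are classified as background, while black and the saturated red are kept as digit pixels. The pale purple is caught only by the sum rule, which is why both rules are needed.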

Then simply binarize:

import java.awt.Color;
import java.awt.image.BufferedImage;

// Binarizes the image in place: background noise becomes white, everything else black.
public static BufferedImage binaryzation(BufferedImage image)
        throws Exception {
    int width = image.getWidth();
    int height = image.getHeight();
    for (int x = 0; x < width; ++x) {
        for (int y = 0; y < height; ++y) {
            int rgb = image.getRGB(x, y);
            if (isBackgroundColor(rgb) || isBackgroundColor2(rgb)) {
                image.setRGB(x, y, Color.WHITE.getRGB());
            } else {
                image.setRGB(x, y, Color.BLACK.getRGB());
            }
        }
    }
    return image;
}
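
The method above overwrites the input image. If the original colors are still needed afterwards (for example while tuning the thresholds), a copying variant is one option. This is only a sketch: `BinarizeCopyDemo`, `binarize`, and the stand-in predicate in `main` are assumptions for illustration, not part of the original code.

```java
import java.awt.image.BufferedImage;
import java.util.function.IntPredicate;

class BinarizeCopyDemo {
    static final int WHITE = 0xFFFFFFFF;
    static final int BLACK = 0xFF000000;

    // Returns a new black/white image and leaves src untouched.
    static BufferedImage binarize(BufferedImage src, IntPredicate isBackground) {
        BufferedImage out = new BufferedImage(src.getWidth(), src.getHeight(), BufferedImage.TYPE_INT_RGB);
        for (int x = 0; x < src.getWidth(); x++) {
            for (int y = 0; y < src.getHeight(); y++) {
                out.setRGB(x, y, isBackground.test(src.getRGB(x, y)) ? WHITE : BLACK);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        BufferedImage img = new BufferedImage(3, 1, BufferedImage.TYPE_INT_RGB);
        img.setRGB(0, 0, 0xC0C0C0); // light gray
        img.setRGB(1, 0, 0xC81E1E); // saturated red
        img.setRGB(2, 0, 0xFFFFFF); // white
        // Stand-in predicate (assumption): channel sum above 550 counts as background.
        IntPredicate light = rgb -> ((rgb >> 16) & 0xFF) + ((rgb >> 8) & 0xFF) + (rgb & 0xFF) > 550;
        BufferedImage bin = binarize(img, light);
        for (int x = 0; x < 3; x++) {
            System.out.println(bin.getRGB(x, 0) == WHITE ? "white" : "black");
        }
    }
}
```

Passing the background test as an `IntPredicate` keeps the loop independent of the specific thresholds, at the cost of allocating a second image.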

Let's run it and see the result:

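Not bad! Next, split the digits. Because each position is stretched to a different degree, I split the image into four positions and recognize each one separately. The split: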

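import java.awt.image.BufferedImage;
import java.util.ArrayList;
import java.util.List;

// Crops the four fixed digit regions, given as (x, y, width, height); all are 40 px tall.
// Note: column x = 35 lies between the second and third regions and is not used.
public static List<BufferedImage> splitImage(BufferedImage image) throws Exception {
    List<BufferedImage> digitImageList = new ArrayList<>();
    digitImageList.add(image.getSubimage(0, 0, 16, 40));
    digitImageList.add(image.getSubimage(16, 0, 19, 40));
    digitImageList.add(image.getSubimage(36, 0, 22, 40));
    digitImageList.add(image.getSubimage(58, 0, 22, 40));
    return digitImageList;
}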
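
`getSubimage` throws a `RasterFormatException` when a region falls outside the image, so checking the size up front can give a clearer error. Below is a possible table-driven sketch, assuming the CAPTCHA is at least 80x40 pixels (which is what the four regions above imply). `SplitDemo`, `REGIONS`, and `split` are hypothetical names.

```java
import java.awt.image.BufferedImage;
import java.util.ArrayList;
import java.util.List;

class SplitDemo {
    // {x, width} of each digit region, copied from the crops above.
    static final int[][] REGIONS = {{0, 16}, {16, 19}, {36, 22}, {58, 22}};
    static final int HEIGHT = 40;

    static List<BufferedImage> split(BufferedImage image) {
        int[] last = REGIONS[REGIONS.length - 1];
        int minWidth = last[0] + last[1];
        if (image.getWidth() < minWidth || image.getHeight() < HEIGHT) {
            throw new IllegalArgumentException("expected at least " + minWidth + "x" + HEIGHT
                    + ", got " + image.getWidth() + "x" + image.getHeight());
        }
        List<BufferedImage> pieces = new ArrayList<>();
        for (int[] r : REGIONS) {
            // getSubimage shares pixel data with the source image; it does not copy.
            pieces.add(image.getSubimage(r[0], 0, r[1], HEIGHT));
        }
        return pieces;
    }

    public static void main(String[] args) {
        List<BufferedImage> pieces = split(new BufferedImage(80, 40, BufferedImage.TYPE_INT_RGB));
        for (BufferedImage p : pieces) {
            System.out.println(p.getWidth() + "x" + p.getHeight());
        }
    }
}
```

Because the sub-images share data with the source, binarizing the full image before or after splitting gives the same pieces.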

The split result:

Once the image is split, samples of each digit can be collected for every position.
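
The post does not show how the samples were gathered. One possible approach, sketched here under the assumption that a CAPTCHA has already been binarized and split and that its answer is typed in by hand, is to save each piece as `<root>/<position>/<digit>.png`, which matches the directory layout the loading code expects. `TemplateCollector` and `saveTemplates` are hypothetical names.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.Arrays;
import java.util.List;
import javax.imageio.ImageIO;

class TemplateCollector {
    // Saves piece i as root/i/<digit>.png, where the digit is the i-th character of the answer.
    static void saveTemplates(List<BufferedImage> pieces, String answer, File root) throws IOException {
        if (pieces.size() != answer.length()) {
            throw new IllegalArgumentException("answer length must match the number of pieces");
        }
        for (int pos = 0; pos < pieces.size(); pos++) {
            char digit = answer.charAt(pos);
            if (!Character.isDigit(digit)) {
                throw new IllegalArgumentException("not a digit: " + digit);
            }
            File target = new File(new File(root, String.valueOf(pos)), digit + ".png");
            target.getParentFile().mkdirs();
            if (!ImageIO.write(pieces.get(pos), "png", target)) {
                throw new IOException("no PNG writer available for " + target);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Blank stand-in pieces; in practice these would come from the split step.
        File root = Files.createTempDirectory("captcha").toFile();
        BufferedImage a = new BufferedImage(16, 40, BufferedImage.TYPE_INT_RGB);
        BufferedImage b = new BufferedImage(19, 40, BufferedImage.TYPE_INT_RGB);
        saveTemplates(Arrays.asList(a, b), "07", root);
        System.out.println(new File(root, "1/7.png").isFile());
    }
}
```

Repeating this over enough CAPTCHAs until every position has seen every digit from 0 to 8 would fill the whole template set.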

Then load them:

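import java.awt.image.BufferedImage;
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import javax.imageio.ImageIO;

// model.get(position).get(digit) is the template read from captcha/<position>/<digit>.png
static List<List<BufferedImage>> model;

static { // Load the templates
    try {
        model = new ArrayList<>();
        List<BufferedImage> list;
        for (int i = 0; i <= 3; i++) {          // four digit positions
            list = new ArrayList<>();
            for (int ii = 0; ii <= 8; ii++) {   // digits 0-8
                list.add(ImageIO.read(new File("captcha/" + i + "/" + ii + ".png")));
            }
            model.add(list);
        }
    } catch (Exception e) {
        System.out.println("Error occurred in reading captcha model: " + e + ", " + e.getLocalizedMessage());
    }
}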

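Since the font does not change either, I simply compare pixel by pixel, count the differing pixels, and pick the digit with the fewest differences. Counting the differences: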

// Counts the pixels that differ between the two images.
// Iterates over img_a's dimensions, so img_b must be at least as large.
private static int diff(BufferedImage img_a, BufferedImage img_b) {
    int diff = 0;
    int width = img_a.getWidth();
    int height = img_a.getHeight();
    for (int x = 0; x < width; ++x) {
        for (int y = 0; y < height; ++y) {
            if (img_a.getRGB(x, y) != img_b.getRGB(x, y)) diff++;
        }
    }
    return diff;
}
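
The pixel count only makes sense when both images have the same size, which holds here because the crops and the templates for a given position share their dimensions. A small worked example with an explicit size check might look like this (`PixelDiffDemo` is a hypothetical name, and the 4x4 images are synthetic):

```java
import java.awt.image.BufferedImage;

class PixelDiffDemo {
    // Number of pixels whose values differ; rejects images of different sizes.
    static int diff(BufferedImage a, BufferedImage b) {
        if (a.getWidth() != b.getWidth() || a.getHeight() != b.getHeight()) {
            throw new IllegalArgumentException("size mismatch");
        }
        int count = 0;
        for (int x = 0; x < a.getWidth(); x++) {
            for (int y = 0; y < a.getHeight(); y++) {
                if (a.getRGB(x, y) != b.getRGB(x, y)) count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // A new TYPE_INT_RGB image starts out entirely black.
        BufferedImage a = new BufferedImage(4, 4, BufferedImage.TYPE_INT_RGB);
        BufferedImage b = new BufferedImage(4, 4, BufferedImage.TYPE_INT_RGB);
        b.setRGB(0, 0, 0xFFFFFF);
        b.setRGB(1, 2, 0xFFFFFF);
        b.setRGB(3, 3, 0xFFFFFF);
        System.out.println(diff(a, b)); // prints 3: exactly three pixels were flipped to white
    }
}
```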

Finally comes the comparison. Combined with binarization, splitting, and so on, it looks like this:

public static String read(BufferedImage image) throws Exception {
    Filtering.binaryzation(image);
    List<BufferedImage> imgs = Infer.splitImage(image);
    BufferedImage cur;
    String result = "";
    int cur_diff, min_diff, min;
    for (int idx = 0; idx <= 3; idx++) {
        cur = imgs.get(idx);
        min_diff = Integer.MAX_VALUE;  // start above any possible diff
        min = 0;
        for (int i = 0; i <= 8; i++) {
            cur_diff = diff(cur, model.get(idx).get(i));
            // System.out.println("Diff for image: " + idx + ", " + i + ", result: " + cur_diff);
            if (cur_diff < min_diff) {
                min_diff = cur_diff;
                min = i;
            }
        }
        result += min;  // append the best-matching digit for this position
    }
    return result;
}
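
The selection step is a nearest-template search: the digit whose template has the fewest differing pixels wins. To illustrate it without the real template files, the sketch below uses three made-up 3x3 glyphs and a piece with one corrupted pixel. `NearestTemplateDemo`, `fromRows`, and `closest` are hypothetical helpers.

```java
import java.awt.image.BufferedImage;
import java.util.Arrays;
import java.util.List;

class NearestTemplateDemo {
    // Builds a black/white image from text rows; '#' is black, anything else white.
    static BufferedImage fromRows(String... rows) {
        BufferedImage img = new BufferedImage(rows[0].length(), rows.length, BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < rows.length; y++) {
            for (int x = 0; x < rows[y].length(); x++) {
                img.setRGB(x, y, rows[y].charAt(x) == '#' ? 0x000000 : 0xFFFFFF);
            }
        }
        return img;
    }

    static int diff(BufferedImage a, BufferedImage b) {
        int count = 0;
        for (int x = 0; x < a.getWidth(); x++) {
            for (int y = 0; y < a.getHeight(); y++) {
                if (a.getRGB(x, y) != b.getRGB(x, y)) count++;
            }
        }
        return count;
    }

    // Index of the template with the fewest differing pixels; ties go to the lowest index.
    static int closest(BufferedImage piece, List<BufferedImage> templates) {
        int best = -1;
        int bestDiff = Integer.MAX_VALUE;
        for (int i = 0; i < templates.size(); i++) {
            int d = diff(piece, templates.get(i));
            if (d < bestDiff) {
                bestDiff = d;
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<BufferedImage> templates = Arrays.asList(
                fromRows("###", "#.#", "###"),   // stand-in for "0"
                fromRows(".#.", ".#.", ".#."),   // stand-in for "1"
                fromRows("##.", ".#.", ".##"));  // stand-in for "2"
        BufferedImage noisy = fromRows("###", "#.#", "##."); // a "0" with one wrong pixel
        System.out.println(closest(noisy, templates)); // prints 0 (diffs are 1, 6 and 6)
    }
}
```

Because the strict `<` comparison keeps the first minimum, ties resolve to the smaller digit, which is also how the loop in `read` behaves.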

In testing, the recognition rate is essentially 100%. Of course, that is mainly because the CAPTCHA is so simple.
