FaceX-Zoo | A PyTorch Toolbox for Face Recognition (Source Code Included)

2021-07-09 16:19:46

Computer Vision Research Institute column

Author: Edison_G

1. Overview

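Deep-learning-based face recognition has made remarkable progress in recent years. Yet both practical model production and further research in deep face recognition are in great need of matching public support.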

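For example, producing a face representation network calls for a modular training scheme, so that the right combination can be chosen from the many state-of-the-art backbones and training supervisions according to real-world face recognition requirements. For performance analysis and comparison, standardized and automatic evaluation of a set of models on multiple benchmarks is an equally desirable tool. A public groundwork for deploying face recognition as a complete pipeline would also be welcome. In addition, new challenges have emerged, such as masked face recognition brought on by the global COVID-19 pandemic, which is drawing growing attention in practical applications. A feasible solution is to build an easy-to-use, unified framework that meets all of these needs.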

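To this end, researchers have introduced a new open-source framework named FaceX-Zoo, aimed at the research and development community for face recognition. Relying on a highly modular and scalable design, FaceX-Zoo provides a training module with a variety of supervisory heads and backbones for state-of-the-art face recognition, along with a standardized evaluation module that tests models on most popular benchmarks by editing a simple configuration. A simple but fully functional Face SDK is also provided for validating trained models and for basic applications. Rather than including as many prior techniques as possible, FaceX-Zoo is designed to be easy to upgrade and extend as face-related fields develop.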

2. The New Framework

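The overall architecture of FaceX-Zoo is shown in the figure above. The project consists of four parts: the training module, the evaluation module, the additional module, and the Face SDK. The first two are the core of the project. The training and evaluation modules contain several components: Pre-Processing, Training Mode, Backbone, Supervisory Head, and Test Protocol. Each is described below.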

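Pre-Processing. This module applies basic transformations to images before they are fed to the network. For training, the common operations are implemented: resizing, normalization, random cropping, random flipping, random rotation, and so on. Custom operations can be added flexibly as needed. For testing, only resizing and normalization are used. Test-time augmentations such as five-crop and horizontal flipping can likewise be added easily.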
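
The test-time path described here (resize, then normalize, with optional flip augmentation) can be sketched as follows. This is a minimal illustration in plain Python on nested lists, not FaceX-Zoo's own code. The function names, the nearest-neighbour resize, and the `(x - 127.5) / 128` normalization constants are assumptions chosen for illustration.

```python
def normalize(image, mean=127.5, scale=1.0 / 128.0):
    """Map 8-bit pixel values to roughly [-1, 1] (constants are an assumption)."""
    return [[(px - mean) * scale for px in row] for row in image]


def horizontal_flip(image):
    """Mirror each row, i.e. flip the image left to right."""
    return [row[::-1] for row in image]


def resize_nearest(image, out_h, out_w):
    """Nearest-neighbour resize of a 2-D grayscale image."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def preprocess_for_test(image, size=(2, 2), flip_augment=False):
    """Resize and normalize; optionally also return the flipped view."""
    base = normalize(resize_nearest(image, *size))
    return [base, horizontal_flip(base)] if flip_augment else [base]


# A 4x4 toy "image" with a checkerboard of 2x2 blocks.
image = [
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [255, 255, 0, 0],
    [255, 255, 0, 0],
]
views = preprocess_for_test(image, size=(2, 2), flip_augment=True)
```

With flip augmentation enabled, a caller would typically extract features from both views and combine them, for example by averaging. That combination step is left out here.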

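Training Mode. Conventional face recognition training serves as the baseline mode. Specifically, a DataLoader schedules the training inputs, the inputs are passed through the backbone network, and a criterion is computed as the training loss for the backward update. The framework also considers a situation that arises in practice: training on shallow distributed data, that is, datasets with only a few images per identity. A recent training strategy is therefore integrated to make training on such shallow face data easier.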
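
The baseline loop (batches in, forward pass, loss, backward update) can be sketched in a few lines. To stay runnable without PyTorch, this sketch uses plain Python: `batches` stands in for a DataLoader, a single linear layer stands in for the backbone plus head, and the gradient of softmax cross-entropy is written out by hand. All names and hyperparameters are hypothetical, and the real framework relies on PyTorch autograd instead.

```python
import math
import random


def batches(samples, batch_size, rng):
    """Stand-in for a DataLoader: shuffle, then yield mini-batches."""
    order = list(range(len(samples)))
    rng.shuffle(order)
    for start in range(0, len(order), batch_size):
        yield [samples[i] for i in order[start:start + batch_size]]


def softmax(logits):
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train(samples, num_classes, dim, epochs=50, lr=0.5, seed=0):
    """Train a linear classifier with softmax cross-entropy via mini-batch SGD."""
    rng = random.Random(seed)
    weights = [[0.0] * dim for _ in range(num_classes)]
    history = []
    for _ in range(epochs):
        epoch_loss = 0.0
        for batch in batches(samples, batch_size=2, rng=rng):
            grads = [[0.0] * dim for _ in range(num_classes)]
            for features, label in batch:
                # Forward pass: one linear layer plays "backbone + head".
                logits = [sum(w * x for w, x in zip(row, features)) for row in weights]
                probs = softmax(logits)
                epoch_loss += -math.log(probs[label])
                # Backward pass: d(loss)/d(logit_k) = p_k - 1[k == label].
                for k in range(num_classes):
                    delta = probs[k] - (1.0 if k == label else 0.0)
                    for j in range(dim):
                        grads[k][j] += delta * features[j] / len(batch)
            # Parameter update with the batch-averaged gradient.
            for k in range(num_classes):
                for j in range(dim):
                    weights[k][j] -= lr * grads[k][j]
        history.append(epoch_loss / len(samples))
    return weights, history


# Two toy "identities", two samples each.
toy_data = [([1.0, 0.0], 0), ([0.9, 0.1], 0), ([0.0, 1.0], 1), ([0.1, 0.9], 1)]
weights, history = train(toy_data, num_classes=2, dim=2)
```

`history` holds the mean loss per epoch, which is the quantity a training log would normally report.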

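Backbone. The backbone network extracts features from face images. FaceX-Zoo provides a range of state-of-the-art backbones, listed below. Because the framework is built on PyTorch, any other architecture can be plugged in by editing the configuration file and adding an architecture definition file.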

  • MobileFaceNet: An efficient network for applications on mobile devices.
  • ResNet: A series of classic architectures for general vision tasks.
  • SE-ResNet: ResNet equipped with SE blocks that recalibrate the channel-wise feature responses.
  • HRNet: A network for deep high-resolution representation learning.
  • EfficientNet: A family of architectures that scale across depth, width and resolution.
  • GhostNet: A model aiming at generating more feature maps from cheap operations.
  • AttentionNet: A network built by stacking attention modules to learn attention-aware features.
  • TF-NAS: A series of architectures searched by NAS under a latency constraint.
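
Selecting a backbone "by editing the configuration file and adding an architecture definition file" is commonly done with a name-to-class registry. The sketch below shows that generic pattern. It is not FaceX-Zoo's actual factory code, and the stub classes, parameter names, and config layout are all hypothetical.

```python
BACKBONE_REGISTRY = {}


def register_backbone(name):
    """Decorator that records a backbone class under a config-friendly name."""
    def decorator(cls):
        BACKBONE_REGISTRY[name] = cls
        return cls
    return decorator


@register_backbone("MobileFaceNet")
class MobileFaceNetStub:
    """Placeholder for a real architecture definition."""
    def __init__(self, feat_dim=512):
        self.feat_dim = feat_dim


@register_backbone("ResNet")
class ResNetStub:
    """Placeholder for a real architecture definition."""
    def __init__(self, depth=50, feat_dim=512):
        self.depth = depth
        self.feat_dim = feat_dim


def build_backbone(name, config):
    """Look up the class by name and pass it the matching config section."""
    if name not in BACKBONE_REGISTRY:
        raise KeyError(f"unknown backbone {name!r}; known: {sorted(BACKBONE_REGISTRY)}")
    return BACKBONE_REGISTRY[name](**config.get(name, {}))


# In practice this dictionary would be loaded from a configuration file.
config = {"ResNet": {"depth": 152, "feat_dim": 512}}
backbone = build_backbone("ResNet", config)
```

Under this pattern, adding a new architecture means writing one decorated class and one config section, with no change to the training script.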

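Supervisory Head. To learn discriminative features for face recognition, the predicted logits are usually processed by specific operations such as normalization, scaling, and adding a margin. FaceX-Zoo implements a series of softmax-style losses: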

  • AM-Softmax: An additive margin loss that adds a cosine margin penalty to the target logit.
  • ArcFace: An additive angular margin loss that adds a margin penalty to the target angle.
  • AdaCos: A cosine-based softmax loss that is hyperparameter-free and scales adaptively.
  • AdaM-Softmax: An adaptive margin loss that can adjust the margins for different classes adaptively.
  • CircleLoss: A unified formula that learns with class-level labels and pair-wise labels.
  • CurricularFace: A loss function that adaptively adjusts the importance of easy and hard samples during different training stages.
  • MV-Softmax: A loss function that adaptively emphasizes the misclassified feature vectors to guide discriminative feature learning.
  • NPCFace: A loss function that emphasizes training on both the negative and positive hard cases.
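
To make "normalization, scaling, adding margin" concrete, the sketch below computes the logits for the first two heads in the list. AM-Softmax uses s·(cos θ − m) for the target class and ArcFace uses s·cos(θ + m). It is a plain-Python illustration, not FaceX-Zoo's implementation. The scale and margin defaults are illustrative assumptions, and the special handling that practical ArcFace code applies when θ + m exceeds π is omitted.

```python
import math


def l2_normalize(vec):
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec]


def cosine_logits(feature, class_weights):
    """Cosine similarity between the normalized feature and each class center."""
    f = l2_normalize(feature)
    return [sum(a * b for a, b in zip(f, l2_normalize(w))) for w in class_weights]


def am_softmax_logits(cosines, label, scale=32.0, margin=0.35):
    """AM-Softmax: subtract a cosine margin from the target logit, then scale."""
    return [scale * (c - margin if k == label else c) for k, c in enumerate(cosines)]


def arcface_logits(cosines, label, scale=32.0, margin=0.5):
    """ArcFace: add an angular margin to the target angle, then scale."""
    out = []
    for k, c in enumerate(cosines):
        if k == label:
            theta = math.acos(max(-1.0, min(1.0, c)))  # clamp against rounding
            c = math.cos(theta + margin)
        out.append(scale * c)
    return out


def cross_entropy(logits, label):
    """Softmax cross-entropy computed with the log-sum-exp trick."""
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum - logits[label]


# One feature vector, two class centers, ground-truth class 0.
cosines = cosine_logits([1.0, 0.2], [[1.0, 0.0], [0.0, 1.0]])
plain = [32.0 * c for c in cosines]          # scaled cosine logits, no margin
am = am_softmax_logits(cosines, label=0)
arc = arcface_logits(cosines, label=0)
```

Both margins lower only the target logit, so the same sample incurs a larger loss than with plain scaled cosine logits. That extra penalty is what pushes features of the same identity closer together.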

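Test Protocol. Many benchmarks exist for measuring the accuracy of face recognition models. Many of them target a specific challenge, such as cross-age, cross-pose, or cross-race recognition. The most commonly used protocols are based on LFW and MegaFace. FaceX-Zoo integrates these protocols with simple usage and clear instructions, so a model can be tested on one or several benchmarks through simple configuration. Additional test protocols can be added by supplying the test data and parsing the test pairs. Notably, a masked face recognition benchmark based on MegaFace is also provided.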

  • LFW: It contains 13,233 web-collected images of 5,749 identities with pose, expression and illumination variations. We report the mean accuracy of 10-fold cross validation on this classic benchmark.
  • CPLFW: It contains 11,652 images of 3,930 identities and focuses on cross-pose face verification. Following the official protocol, the mean accuracy of 10-fold cross validation is adopted.
  • CALFW: It contains 12,174 images of 4,025 identities, aiming at cross-age face verification. The mean accuracy of 10-fold cross validation is adopted.
  • AgeDB30: It contains 12,240 images of 440 identities, where each test pair has an age gap of 30 years. We report the mean accuracy of 10-fold cross validation.
  • RFW: It contains 40,607 images of 11,430 identities and is proposed to measure potential racial bias in face recognition. There are four test subsets in RFW, named African, Asian, Caucasian and Indian, and we report the mean accuracy of each subset.
  • MegaFace: It contains 80 probe identities with 1 million gallery distractors, aiming at evaluating large-scale face recognition performance. We report the Rank-K identification accuracy on MegaFace.
  • MegaFace-Mask: It contains the same probe identities and gallery distractors as MegaFace, but each probe image is overlaid with a virtual mask. This protocol is designed to evaluate large-scale masked face recognition performance.
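
The "mean accuracy of 10-fold cross validation" used by the LFW-style benchmarks is commonly computed by tuning a similarity threshold on nine folds and scoring the held-out fold with it. The sketch below follows that common scheme on toy data. The function names, the threshold grid, the tie-breaking rule, and the strided fold assignment are assumptions for illustration, not the official protocol code.

```python
def accuracy(scores, labels, threshold):
    """Fraction of pairs classified correctly: 'same' if score >= threshold."""
    hits = sum((s >= threshold) == bool(y) for s, y in zip(scores, labels))
    return hits / len(scores)


def best_threshold(scores, labels, candidates):
    """Return a threshold with the highest accuracy; ties go to the middle one."""
    accs = [accuracy(scores, labels, t) for t in candidates]
    top = max(accs)
    winners = [t for t, a in zip(candidates, accs) if a == top]
    return winners[len(winners) // 2]


def k_fold_accuracy(scores, labels, num_folds=10):
    """Mean held-out accuracy; the threshold is tuned on the remaining folds."""
    candidates = [i / 100 for i in range(101)]  # grid over [0, 1]
    indices = list(range(len(scores)))
    folds = [indices[i::num_folds] for i in range(num_folds)]
    fold_accuracies = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in indices if i not in held]
        threshold = best_threshold(
            [scores[i] for i in train], [labels[i] for i in train], candidates
        )
        fold_accuracies.append(
            accuracy([scores[i] for i in held_out], [labels[i] for i in held_out], threshold)
        )
    return sum(fold_accuracies) / num_folds


# Toy similarity scores: matching pairs (label 1) score higher than non-matching ones.
scores = [0.9, 0.1, 0.8, 0.2, 0.7, 0.3, 0.85, 0.15, 0.75, 0.25] * 2
labels = [1, 0, 1, 0, 1, 0, 1, 0, 1, 0] * 2
mean_acc = k_fold_accuracy(scores, labels)
```

Choosing the threshold on the training folds only keeps the reported accuracy from being tuned on the pairs it is measured on.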

3. Design Details

Training module

Evaluation module

Face SDK

4. In Practice

Adding a mask to a face image. The mask template can be sampled from a range of options according to the input masked face.

Top: the original face images without masks. Bottom: the masked face images synthesized by FMA-3D.
