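Detectron2 is Facebook AI Research's detection and segmentation framework. It is built on PyTorch but has a more modular design, which makes it flexible and easy to extend. For an overview, see the GitHub repository and the Meta AI blog post.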
@misc{wu2019detectron2,
  author =       {Yuxin Wu and Alexander Kirillov and Francisco Massa and
                  Wan-Yen Lo and Ross Girshick},
  title =        {Detectron2},
  howpublished = {\url{https://github.com/facebookresearch/detectron2}},
  year =         {2019}
}
1. Installing Detectron2
The official requirements are as follows:
- Linux or macOS with Python ≥ 3.6
- PyTorch ≥ 1.8 and torchvision that matches the PyTorch installation. Install them together at pytorch.org to make sure of this
- OpenCV is optional but needed by demo and visualization
- gcc & g++ ≥ 5.4 are required
- ninja is optional but recommended for faster build
- CUDA & cuDNN
So, to install and use Detectron2 you need:
- Environment: Python, CUDA, cuDNN, gcc & g++
- Python packages: pytorch, torchvision, opencv-python
- Recommended: Anaconda
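Before building anything, it can help to check these prerequisites from Python. The following is only a minimal sketch and not part of the official instructions: the helper name check_environment is made up for illustration, and it only checks whether the packages and tools can be found, not whether their versions are compatible with each other.

```python
import importlib.util
import shutil
import sys


def check_environment(min_python=(3, 6)):
    """Return a dict describing which prerequisites were found (hypothetical helper)."""
    report = {"python_ok": sys.version_info[:2] >= min_python}
    # find_spec returns None when a top-level package is not installed
    for package in ("torch", "torchvision", "cv2"):
        report[package] = importlib.util.find_spec(package) is not None
    # shutil.which returns None when the executable is not on PATH
    for tool in ("gcc", "g++", "ninja", "nvcc"):
        report[tool] = shutil.which(tool) is not None
    return report


if __name__ == "__main__":
    for name, found in check_environment().items():
        print(f"{name}: {'found' if found else 'missing'}")
```

A "missing" entry for ninja or cv2 is not fatal, since both are listed as optional above.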
1.1 Linux
On Linux, simply follow the installation steps in the official documentation:
python -m pip install 'git+https://github.com/facebookresearch/detectron2.git'
# (add --user if you don't have permission)
# Or, to install it from a local clone:
git clone https://github.com/facebookresearch/detectron2.git
python -m pip install -e detectron2
# On macOS, you may need to prepend the above commands with a few environment variables:
CC=clang CXX=clang++ ARCHFLAGS="-arch x86_64" python -m pip install ...
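If the installation above fails, you can try installing the pre-built packages directly; these are also provided in the official documentation.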
1.2 Windows
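1.2.1 VS2019 C++ build environment
To install Detectron2 on Windows, first install Microsoft Visual Studio 2019 and select the "Desktop development with C++" workload; leave everything else at its default.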
1.2.2 pycocotools
Option 1:
pip install git+https://github.com/philferriere/cocoapi.git#subdirectory=PythonAPI
Option 2:
git clone https://github.com/pdollar/coco.git
cd coco/PythonAPI
python setup.py build_ext --inplace
python setup.py build_ext install
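If the installation fails (it usually does), try downloading "Microsoft Visual C++ Build Tools.exe" from the official link: https://go.microsoft.com/fwlink/?LinkId=691126.
If that installer fails because of network problems, you can use the offline package from this network-drive link: https://pan.baidu.com/s/1GeJ2c8MxnZP8lAYAwQACzg (extraction code: 1114).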
1.2.3 Detectron2
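Install the build dependencies with conda or pip. I recommend conda: I have had cases where the same package worked when installed with conda but not when installed with pip.
conda install cython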
conda install ninja
conda install pywin32
Clone Detectron2 locally and install it:
git clone https://github.com/facebookresearch/detectron2.git
python -m pip install -e detectron2
or
git clone https://github.com/facebookresearch/detectron2.git
cd detectron2
python setup.py build develop
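2. Custom datasets
2.1 The COCO format
Detectron2 already ships loaders for images and annotations in COCO format, so the usual way to save work is to write a script that converts your own dataset's annotations to COCO format.
The COCO directory layout is: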
-coco/
    |-train2017/
        |-1.jpg
        |-2.jpg
    |-val2017/
        |-3.jpg
        |-4.jpg
    |-test2017/
        |-5.jpg
        |-6.jpg
    |-annotations/
        |-instances_train2017.json
        |-instances_val2017.json
        |-*.json
The annotation file (json) is the most important part. Its format is:
{
    "info": {  // dataset information; not important for training
        "year": int,
        "version": str,
        "description": str,
        "contributor": str,
        "url": str,
        "date_created": datetime,
    },
    "images": [{
        "id": int,         // required
        "width": int,      // required
        "height": int,     // required
        "file_name": str,  // required
        "license": int,
        "flickr_url": str,
        "coco_url": str,
        "date_captured": datetime,
    }, {...}],  // list
    "annotations": [{
        "id": int,                         // annotation id
        "image_id": int,                   // id of the image it belongs to
        "category_id": int,                // category id
        "segmentation": RLE or [polygon],  // segmentation annotation
        "area": float,                     // region area
        "bbox": [x,y,width,height],        // top-left corner of the box plus its width and height
        "iscrowd": 0 or 1,                 // whether the region is a crowd (a dense group of objects)
    }, {...}],  // list
    "categories": [{
        "id": int,             // category id
        "name": str,           // category name
        "supercategory": str,  // parent category
    }],  // list
    "licenses": [{  // not important for training
        "id": int,
        "name": str,
        "url": str,
    }],  // list
}
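As a rough illustration of such a conversion script, the sketch below builds a minimal COCO-style annotation dict from in-memory records. The input layout (a list of dicts with file_name, width, height and boxes given as (x1, y1, x2, y2, category_id)) and the function name to_coco are assumptions made for this example; adapt them to however your own labels are stored.

```python
import json


def to_coco(records, categories):
    """Convert simple records to a COCO-style dict.

    records: list of {"file_name", "width", "height", "boxes"}, where each box
             is (x1, y1, x2, y2, category_id). This input layout is hypothetical.
    categories: list of {"id", "name"} dicts.
    """
    images, annotations = [], []
    ann_id = 1
    for image_id, rec in enumerate(records, start=1):
        images.append({
            "id": image_id,
            "file_name": rec["file_name"],
            "width": rec["width"],
            "height": rec["height"],
        })
        for x1, y1, x2, y2, category_id in rec["boxes"]:
            w, h = x2 - x1, y2 - y1
            annotations.append({
                "id": ann_id,
                "image_id": image_id,
                "category_id": category_id,
                "bbox": [x1, y1, w, h],  # COCO boxes are [x, y, width, height]
                "area": float(w * h),    # box area, since no mask is available here
                "iscrowd": 0,
                "segmentation": [],
            })
            ann_id += 1
    return {"images": images, "annotations": annotations, "categories": categories}


coco = to_coco(
    [{"file_name": "001.jpg", "width": 1356, "height": 939,
      "boxes": [(903, 57, 1032, 180, 1)]}],
    [{"id": 1, "name": "airplane"}],
)
print(json.dumps(coco["annotations"][0]["bbox"]))  # [903, 57, 129, 123]
```

Writing the result to disk is then a single json.dump(coco, f) call on an open file.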
2.2 Registering a dataset
import os

from detectron2.data import DatasetCatalog, MetadataCatalog
from detectron2.data.datasets import register_coco_instances

DATA_ALL_CATEGORIES = [
    {'id': 1, 'name': 'airplane'},
    {'id': 2, 'name': 'ship'},
    {'id': 3, 'name': 'storage tank'},
    {'id': 4, 'name': 'baseball diamond'},
    {'id': 5, 'name': 'tennis court'},
    {'id': 6, 'name': 'basketball court'},
    {'id': 7, 'name': 'ground track field'},
    {'id': 8, 'name': 'harbor'},
    {'id': 9, 'name': 'bridge'},
    {'id': 10, 'name': 'vehicle'}
]

# split name -> (image directory, annotation file);
# paths are relative to the root passed to register_all()
DATA_SPLITS = {}
DATA_SPLITS['nwpu_all'] = {
    'nwpu_all_trainval': ("positive image set", "trainval.json"),
    'nwpu_all_test': ("positive image set", "test.json"),
}


def _get_data_all_instance_meta():
    thing_ids = [k["id"] for k in DATA_ALL_CATEGORIES]
    # map dataset category ids (1..10) to contiguous ids (0..9)
    thing_dataset_id_to_contiguous_id = {k: i for i, k in enumerate(thing_ids)}
    thing_classes = [k["name"] for k in DATA_ALL_CATEGORIES]
    ret = {
        "thing_dataset_id_to_contiguous_id": thing_dataset_id_to_contiguous_id,
        "thing_classes": thing_classes,
    }
    return ret


def _get_builtin_metadata(dataset_name):
    if dataset_name == "nwpu_all":
        return _get_data_all_instance_meta()
    raise KeyError("No built-in metadata for dataset {}".format(dataset_name))


def register_all(root):
    for dataset_name, splits_per_dataset in DATA_SPLITS.items():
        for key, (image_root, json_file) in splits_per_dataset.items():
            if "://" not in json_file:
                json_file = os.path.join(root, json_file)
                assert os.path.exists(json_file), json_file
            register_coco_instances(
                key,
                _get_builtin_metadata(dataset_name),
                json_file,
                os.path.join(root, image_root),
            )


DATA_ROOT = "D:/GISP/XIEMINGJIE/Code/Detection/dataset/NWPU VHR-10 dataset/"
register_all(DATA_ROOT)
The nwpu_all_trainval and nwpu_all_test datasets are now registered. You can inspect them with:
print(DatasetCatalog.get("nwpu_all_trainval"))
print(DatasetCatalog.get("nwpu_all_test"))
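The thing_dataset_id_to_contiguous_id mapping built in _get_data_all_instance_meta exists because dataset category ids need not start at 0 or be consecutive, while the class indices used for training are expected to be. The plain-Python sketch below reproduces that mapping on a made-up category list with a gap in its ids.

```python
# Hypothetical categories whose ids are 1-based and skip some values
categories = [
    {"id": 1, "name": "airplane"},
    {"id": 2, "name": "ship"},
    {"id": 5, "name": "tennis court"},
]

thing_ids = [c["id"] for c in categories]
# dataset id -> contiguous index in [0, num_classes)
dataset_id_to_contiguous_id = {k: i for i, k in enumerate(thing_ids)}
thing_classes = [c["name"] for c in categories]

print(dataset_id_to_contiguous_id)  # {1: 0, 2: 1, 5: 2}
print(thing_classes[dataset_id_to_contiguous_id[5]])  # tennis court
```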
Of course, if you do not want to use a COCO-format dataset, you can write your own registration function, using the code of register_coco_instances as a reference:
def register_coco_instances(name, metadata, json_file, image_root):
    """
    Args:
        name (str): the name that identifies a dataset, e.g. "coco_2014_train".
        metadata (dict): extra metadata associated with this dataset. You can
            leave it as an empty dict.
        json_file (str): path to the json instance annotation file.
        image_root (str or path-like): directory which contains all the images.
    """
    assert isinstance(name, str), name
    assert isinstance(json_file, (str, os.PathLike)), json_file
    assert isinstance(image_root, (str, os.PathLike)), image_root
    # 1. register a function which returns dicts
    DatasetCatalog.register(name, lambda: load_coco_json(json_file, image_root, name))
    # 2. Optionally, add metadata about this dataset,
    # since they might be useful in evaluation, visualization or logging
    MetadataCatalog.get(name).set(
        json_file=json_file, image_root=image_root, evaluator_type="coco", **metadata
    )
Here load_coco_json reads the dataset's annotation file and returns its contents in a fixed format; see the official documentation for details:
# load_coco_json returns a list
# in the following format:
def load_coco_json(json_file, image_root, dataset_name=None, extra_annotation_keys=None):
    # read and do something
    # ...
    # generate dataset_dicts like: ↓
    dataset_dicts = [{'file_name': '...\images\001.jpg', 'height': 939, 'width': 1356, 'image_id': 0, 'annotations': [{'iscrowd': 0, 'bbox': [903, 57, 129, 123], 'category_id': 0, 'bbox_mode': <BoxMode.XYWH_ABS: 1>}]}, {...}]
    return dataset_dicts
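For a dataset that is not in COCO format, the same idea can be sketched as follows: write a function that produces a list of dicts in the format above, then register it. The text label format parsed here (one "x1,y1,x2,y2,class_id" line per object) and all helper names are invented for the example. Only DatasetCatalog.register, MetadataCatalog.get(...).set and BoxMode come from Detectron2, and they are imported lazily so the parsing part runs without it.

```python
def parse_label_lines(lines):
    """Parse hypothetical 'x1,y1,x2,y2,class_id' lines into annotation dicts."""
    annotations = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        x1, y1, x2, y2, class_id = (int(v) for v in line.split(","))
        annotations.append({
            "bbox": [x1, y1, x2, y2],
            "category_id": class_id,  # assumed to already be a contiguous 0-based index
            "iscrowd": 0,
        })
    return annotations


def make_record(image_id, file_name, height, width, lines):
    """Build one dataset dict; bbox_mode is filled in at registration time."""
    return {
        "file_name": file_name,
        "image_id": image_id,
        "height": height,
        "width": width,
        "annotations": parse_label_lines(lines),
    }


def register_custom(name, records, thing_classes):
    """Register pre-built records with Detectron2 (needs detectron2 installed)."""
    from detectron2.data import DatasetCatalog, MetadataCatalog
    from detectron2.structures import BoxMode

    for record in records:
        for anno in record["annotations"]:
            anno["bbox_mode"] = BoxMode.XYXY_ABS  # the boxes above are corner coordinates
    DatasetCatalog.register(name, lambda: records)
    MetadataCatalog.get(name).set(thing_classes=thing_classes)


record = make_record(0, "images/001.jpg", 939, 1356, ["903,57,1032,180,0", ""])
print(record["annotations"])
```

With Detectron2 available, register_custom("my_dataset", [record], ["airplane"]) would make the records retrievable through DatasetCatalog.get("my_dataset"); the dataset name here is a placeholder.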
2.3 Visualization
import random
import cv2
from detectron2.data import DatasetCatalog, MetadataCatalog
from detectron2.data import detection_utils as utils
from detectron2.utils.visualizer import Visualizer

# use a dataset name that was registered in 2.2
datasets_dicts = DatasetCatalog.get("nwpu_all_trainval")
for data in random.sample(datasets_dicts, 1):
    img = utils.read_image(data["file_name"])
    visual = Visualizer(img, metadata=MetadataCatalog.get("nwpu_all_trainval"), scale=0.5)
    vis = visual.draw_dataset_dict(data)
    # Visualizer works in RGB while OpenCV expects BGR, hence the channel flip
    cv2.imshow("window", vis.get_image()[:, :, ::-1])
    cv2.waitKey()
2.4 Custom data augmentation
Once a dataset is registered, you can use detectron2.data.build_detection_train_loader and detectron2.data.build_detection_test_loader to build a DataLoader, which defines how the dataset is loaded.
from detectron2 import model_zoo
from detectron2.config import get_cfg
import detectron2.data.transforms as T
from detectron2.data import build_detection_train_loader
from detectron2.data import DatasetMapper  # the default mapper

cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file("COCO-Detection/retinanet_R_50_FPN_1x.yaml"))
cfg.DATASETS.TRAIN = ("nwpu_all_trainval",)
mapper = DatasetMapper(cfg, is_train=True, augmentations=[T.Resize((800, 800))])
train_loader = build_detection_train_loader(cfg, mapper=mapper)
The parameters of build_detection_train_loader() are:
build_detection_train_loader(
    dataset,
    *,
    mapper,
    sampler=None,
    total_batch_size,
    aspect_ratio_grouping=True,
    num_workers=0,
    collate_fn=None,
)
Here mapper is the data-augmentation part and defaults to detectron2.data.DatasetMapper, while sampler is the sampling strategy. Usually you only need to care about mapper.
class DatasetMapper:
    @configurable
    def __init__(self, is_train: bool):
        pass

    @classmethod
    def from_config(cls, cfg, is_train: bool = True):
        pass

    def _transform_annotations(self, dataset_dict, transforms, image_shape):
        pass

    def __call__(self, dataset_dict):
        pass
        return dataset_dict
The simplified custom mapper from the official documentation:
import copy

import torch
from detectron2.data import build_detection_train_loader
from detectron2.data import detection_utils as utils
import detectron2.data.transforms as T


def mapper(dataset_dict):
    dataset_dict = copy.deepcopy(dataset_dict)  # it will be modified by code below
    # can use other ways to read image
    image = utils.read_image(dataset_dict["file_name"], format="BGR")
    # "Data Augmentation"
    auginput = T.AugInput(image)
    transform = T.Resize((800, 800))(auginput)
    # HWC numpy array -> CHW tensor
    image = torch.from_numpy(auginput.image.transpose(2, 0, 1))
    annos = [
        utils.transform_instance_annotations(annotation, [transform], image.shape[1:])
        for annotation in dataset_dict.pop("annotations")
    ]
    return {
        # create the format that the model expects
        "image": image,
        "instances": utils.annotations_to_instances(annos, image.shape[1:])
    }


dataloader = build_detection_train_loader(cfg, mapper=mapper)
So a custom mapper must take a dataset_dict as input and return:
{
    "image": image_tensor,
    "instances": instances  # an Instances object, e.g. built by utils.annotations_to_instances
}
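The reason the mapper passes every annotation through utils.transform_instance_annotations is that the boxes have to follow the image through each transform. As a plain-Python illustration of what a resize does to an [x, y, width, height] box (the function name and the numbers are made up, and this is not how Detectron2 implements it internally):

```python
def resize_xywh_box(box, old_hw, new_hw):
    """Scale an [x, y, width, height] box from an old image size to a new one."""
    old_h, old_w = old_hw
    new_h, new_w = new_hw
    scale_x = new_w / old_w
    scale_y = new_h / old_h
    x, y, w, h = box
    return [x * scale_x, y * scale_y, w * scale_x, h * scale_y]


# A 1600x400 (width x height) image resized to 800x800
print(resize_xywh_box([100, 50, 200, 100], old_hw=(400, 1600), new_hw=(800, 800)))
# [50.0, 100.0, 100.0, 200.0]
```

Because the aspect ratio changes, the x and y coordinates are scaled by different factors, which is exactly why image and annotations must be transformed together.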
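3. Custom models
Detectron2 models are modular: an object detector is split into four core modules: backbone, proposal_generator, roi_heads, and meta_arch.
3.1 Feature extractor (backbone)
Under detectron2.modeling.backbone you can see that, at the time of writing, only ResNet, FPN, and RegNet backbones are available out of the box: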
build_resnet_backbone
build_resnet_fpn_backbone
build_retinanet_resnet_fpn_backbone
The official example of a custom backbone:
import torch.nn as nn
from detectron2.modeling import BACKBONE_REGISTRY, Backbone, ShapeSpec


@BACKBONE_REGISTRY.register()
class ToyBackbone(Backbone):
    def __init__(self, cfg, input_shape):
        super().__init__()
        # create your own backbone
        self.conv1 = nn.Conv2d(3, 64, kernel_size=7, stride=16, padding=3)

    def forward(self, image):
        return {"conv1": self.conv1(image)}

    def output_shape(self):
        return {"conv1": ShapeSpec(channels=64, stride=16)}
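The @BACKBONE_REGISTRY.register() decorator works because a registry is essentially a name-to-class table that a builder looks up using a name from the config. The toy registry below is a simplified stand-in written for this article, not Detectron2's actual Registry class; it only shows the mechanism.

```python
class ToyRegistry:
    """A minimal name -> object registry (simplified stand-in for illustration)."""

    def __init__(self, name):
        self._name = name
        self._obj_map = {}

    def register(self):
        # Used as a decorator: @REGISTRY.register()
        def deco(obj):
            key = obj.__name__
            if key in self._obj_map:
                raise KeyError(f"'{key}' is already registered in '{self._name}'")
            self._obj_map[key] = obj
            return obj
        return deco

    def get(self, name):
        if name not in self._obj_map:
            raise KeyError(f"No object named '{name}' in '{self._name}' registry")
        return self._obj_map[name]


BACKBONES = ToyRegistry("BACKBONE")


@BACKBONES.register()
class TinyBackbone:
    def __init__(self, channels):
        self.channels = channels


# A builder looks the class up by the name stored in the config
backbone_name = "TinyBackbone"  # in Detectron2 this would come from cfg.MODEL.BACKBONE.NAME
backbone = BACKBONES.get(backbone_name)(channels=64)
print(type(backbone).__name__, backbone.channels)  # TinyBackbone 64
```

This is also why a custom class must be imported (so the decorator runs) before the model is built from the config.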
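3.2 Proposal generator (proposal_generator)
Proposal generators can be customized and registered in the same way:
from detectron2.modeling.proposal_generator import PROPOSAL_GENERATOR_REGISTRY, RPN_HEAD_REGISTRY, RPN, StandardRPNHead


@PROPOSAL_GENERATOR_REGISTRY.register()
class ToyRPN(RPN):
    def __init__(self, *args, **kwargs):
        # forward the arguments: the parent constructor has required parameters
        super().__init__(*args, **kwargs)


@RPN_HEAD_REGISTRY.register()
class ToyRPNHead(StandardRPNHead):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)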
3.3 Detection heads (roi_heads)
ROI heads and their sub-heads are registered the same way, each through its own registry:
@ROI_MASK_HEAD_REGISTRY.register()
@ROI_KEYPOINT_HEAD_REGISTRY.register()
@ROI_HEADS_REGISTRY.register()
@ROI_BOX_HEAD_REGISTRY.register()
3.4 Meta-architecture (meta_arch)
import torch
from torch import nn
from detectron2.config import configurable
from detectron2.modeling import META_ARCH_REGISTRY


@META_ARCH_REGISTRY.register()
class ToyNet(nn.Module):
    @configurable
    def __init__(self, *args, **kwargs):
        super().__init__()
        pass

    @classmethod
    def from_config(cls, cfg):
        pass

    def forward_training(self, *args, **kwargs):
        pass

    def loss(self, *args, **kwargs):
        pass

    @torch.no_grad()
    def label_anchors(self, *args, **kwargs):
        pass

    def forward_inference(self, *args, **kwargs):
        pass

    def inference_single_image(self, *args, **kwargs):
        pass
For details, see the projects reproduced in the official repository.
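Once registered, a component is selected by writing its name into the config. The sketch below sets the relevant keys for the toy classes named in this section. It is written against any object with the same attribute layout, so it can be tried with a stand-in before passing in a real cfg from get_cfg(); the helper name use_toy_components is invented here.

```python
from types import SimpleNamespace


def use_toy_components(cfg):
    """Point the config at the custom classes from this section (hypothetical helper)."""
    cfg.MODEL.META_ARCHITECTURE = "ToyNet"
    cfg.MODEL.BACKBONE.NAME = "ToyBackbone"
    cfg.MODEL.PROPOSAL_GENERATOR.NAME = "ToyRPN"
    cfg.MODEL.RPN.HEAD_NAME = "ToyRPNHead"
    return cfg


# Stand-in with the same attribute layout as the MODEL part of a Detectron2 config
fake_cfg = SimpleNamespace(MODEL=SimpleNamespace(
    META_ARCHITECTURE="GeneralizedRCNN",
    BACKBONE=SimpleNamespace(NAME="build_resnet_backbone"),
    PROPOSAL_GENERATOR=SimpleNamespace(NAME="RPN"),
    RPN=SimpleNamespace(HEAD_NAME="StandardRPNHead"),
))
use_toy_components(fake_cfg)
print(fake_cfg.MODEL.BACKBONE.NAME)  # ToyBackbone
```

With Detectron2 installed and the toy classes imported, the same function could be applied to the object returned by get_cfg(). The skeletons above would of course need real implementations before a model built from that config could run.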
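4. Training
4.1 Default training
In general we can subclass the default object-detection trainer, DefaultTrainer, which itself inherits from TrainerBase, which in turn uses HookBase. My understanding is that HookBase and TrainerBase abstract a training run and split it into stages. First, HookBase: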
class HookBase:
    def before_train(self):
        """
        Called before the first iteration.
        """
        pass

    def after_train(self):
        """
        Called after the last iteration.
        """
        pass

    def before_step(self):
        """
        Called before each iteration.
        """
        pass

    def after_step(self):
        """
        Called after each iteration.
        """
        pass

    def state_dict(self):
        return {}
TrainerBase, in turn, splits a complete training run into:
import weakref
from typing import List, Optional

from detectron2.utils.events import EventStorage


class TrainerBase:
    def __init__(self):
        self._hooks: List[HookBase] = []

    def register_hooks(self, hooks: List[Optional[HookBase]]) -> None:
        hooks = [h for h in hooks if h is not None]
        for h in hooks:
            assert isinstance(h, HookBase)
            h.trainer = weakref.proxy(self)
        self._hooks.extend(hooks)

    def train(self, start_iter: int, max_iter: int):
        self.iter = self.start_iter = start_iter
        self.max_iter = max_iter
        with EventStorage(start_iter) as self.storage:
            try:
                self.before_train()
                for self.iter in range(start_iter, max_iter):
                    self.before_step()
                    self.run_step()
                    self.after_step()
            finally:
                self.after_train()

    def before_train(self):
        for h in self._hooks:
            h.before_train()

    def after_train(self):
        self.storage.iter = self.iter
        for h in self._hooks:
            h.after_train()

    def before_step(self):
        self.storage.iter = self.iter
        for h in self._hooks:
            h.before_step()

    def after_step(self):
        for h in self._hooks:
            h.after_step()

    def run_step(self):
        raise NotImplementedError

    def state_dict(self):
        pass

    def load_state_dict(self, state_dict):
        pass
Put more simply, it abstracts a training run as:
hook.before_train()
for iter in range(start_iter, max_iter):
    hook.before_step()
    trainer.run_step()
    hook.after_step()
iter += 1
hook.after_train()
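To make the hook mechanism concrete, here is a self-contained miniature of the same loop in plain Python. MiniHook, PeriodicLogger and MiniTrainer are simplified stand-ins invented for illustration; they mirror the method names of HookBase and TrainerBase but omit everything else (event storage, weak references, checkpointing).

```python
class MiniHook:
    """Stand-in for HookBase: every callback defaults to a no-op."""
    trainer = None

    def before_train(self): pass
    def after_train(self): pass
    def before_step(self): pass
    def after_step(self): pass


class PeriodicLogger(MiniHook):
    """Records the iteration number every `period` steps."""

    def __init__(self, period):
        self.period = period
        self.logged = []

    def after_step(self):
        if (self.trainer.iter + 1) % self.period == 0:
            self.logged.append(self.trainer.iter)


class MiniTrainer:
    """Stand-in for TrainerBase with a trivial run_step."""

    def __init__(self):
        self.hooks = []
        self.steps_run = 0

    def register_hooks(self, hooks):
        for h in hooks:
            h.trainer = self  # hooks reach trainer state through this attribute
        self.hooks.extend(hooks)

    def run_step(self):
        self.steps_run += 1  # a real trainer would run forward/backward/optimizer here

    def train(self, start_iter, max_iter):
        for h in self.hooks:
            h.before_train()
        for self.iter in range(start_iter, max_iter):
            for h in self.hooks:
                h.before_step()
            self.run_step()
            for h in self.hooks:
                h.after_step()
        for h in self.hooks:
            h.after_train()


trainer = MiniTrainer()
logger = PeriodicLogger(period=2)
trainer.register_hooks([logger])
trainer.train(0, 6)
print(trainer.steps_run, logger.logged)  # 6 [1, 3, 5]
```

The point of the design is that periodic work such as logging, evaluation or checkpointing lives in hooks, so run_step stays focused on a single optimization step.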
For the object-detection task specifically, there is DefaultTrainer:
class DefaultTrainer(TrainerBase):
    def __init__(self, cfg):
        super().__init__()
        pass

    def build_hooks(self):
        pass

    def build_writers(self):
        pass

    def train(self):
        pass

    @classmethod
    def test(cls, cfg, model, evaluators=None):
        pass

    #
    # a lot of
    # def ...(...):
    # .....
    #

    @classmethod
    def build_train_loader(cls, cfg):
        return build_detection_train_loader(cfg)

    @classmethod
    def build_test_loader(cls, cfg, dataset_name):
        return build_detection_test_loader(cfg, dataset_name)

    @classmethod
    def build_evaluator(cls, cfg, dataset_name):
        pass
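4.2 Custom training
Because Detectron2 has modularized the training process, you only need to modify the relevant module. Usually that means just the data loading and the evaluation parts:
from detectron2.data import build_detection_test_loader, build_detection_train_loader
from detectron2.engine import DefaultTrainer
from detectron2.evaluation import DatasetEvaluators


class Trainer(DefaultTrainer):
    @classmethod
    def build_evaluator(cls, cfg, dataset_name, output_folder=None):
        evaluator_list = []
        # append the evaluators you need to evaluator_list here
        return DatasetEvaluators(evaluator_list)

    @classmethod
    def build_test_loader(cls, cfg, dataset_name):
        # my_mapper is a placeholder for your own mapper factory
        return build_detection_test_loader(cfg, dataset_name, mapper=my_mapper(cfg, "test"))

    @classmethod
    def build_train_loader(cls, cfg):
        return build_detection_train_loader(cfg, mapper=my_mapper(cfg, "train"))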
4.3 Complete training workflow
See tools/train_net.py or tools/plain_train_net.py, which are already very detailed.
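For orientation, the overall flow of those scripts can be condensed into the sketch below. It is not a replacement for tools/train_net.py: the override values, the output directory and the helper name train_overrides are placeholders chosen for illustration, and the Detectron2 imports sit inside main() so the file can be imported without Detectron2 installed. It also assumes the datasets were already registered as in section 2.2.

```python
def train_overrides(train_name, test_name, num_classes, max_iter=1000):
    """Flat KEY, VALUE list in the form accepted by cfg.merge_from_list (placeholder values)."""
    return [
        "DATASETS.TRAIN", (train_name,),
        "DATASETS.TEST", (test_name,),
        "MODEL.RETINANET.NUM_CLASSES", num_classes,
        "SOLVER.MAX_ITER", max_iter,
        "OUTPUT_DIR", "./output",
    ]


def main():
    # Lazy imports: only needed when training actually runs
    from detectron2 import model_zoo
    from detectron2.config import get_cfg
    from detectron2.engine import DefaultTrainer

    cfg = get_cfg()
    cfg.merge_from_file(model_zoo.get_config_file("COCO-Detection/retinanet_R_50_FPN_1x.yaml"))
    cfg.merge_from_list(train_overrides("nwpu_all_trainval", "nwpu_all_test", num_classes=10))

    trainer = DefaultTrainer(cfg)  # or a subclass such as the Trainer in 4.2
    trainer.resume_or_load(resume=False)  # load initial weights instead of resuming
    trainer.train()


overrides = train_overrides("nwpu_all_trainval", "nwpu_all_test", num_classes=10)
print(overrides[:2])  # ['DATASETS.TRAIN', ('nwpu_all_trainval',)]
```

Calling main() starts training; swapping DefaultTrainer for a custom subclass is the only change needed to plug in your own data loading and evaluation.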