[Learning Mask RCNN from Scratch] Part 3: Mask RCNN Network Architecture and the Interaction between TensorFlow and Keras

2020-07-02

0. Preface

The previous article covered the overall logic of the Mask RCNN project. This one analyzes the interaction between TensorFlow and Keras and the network architecture of Mask RCNN.

1. How TensorFlow and Keras Interact

Anyone familiar with Keras has probably seen this line of code:

import keras.backend as K

If Keras is running on the TensorFlow backend, then K is essentially TensorFlow, which naturally raises a question: why not just import tensorflow directly? Isn't this redundant? The answer lies in how TensorFlow and Keras interact. The model-building file of this Mask RCNN project (mrcnn/model.py) uses many such interaction techniques. They mostly operate on Keras's functional API, and converting a Keras functional graph into a Model object is straightforward: just call KM.Model(input_tensors, output_tensors), where KM stands for:

import keras.models as KM
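
As a quick illustration (the layer sizes and tensor names below are made up, not from model.py), a functional-API graph becomes a trainable Model object like this:

import keras.layers as KL
import keras.models as KM

# Hypothetical two-layer graph built with the functional API
input_tensor = KL.Input(shape=(64,))
hidden = KL.Dense(32, activation='relu')(input_tensor)
output_tensor = KL.Dense(1, activation='sigmoid')(hidden)

# Wrap the input/output tensors into a Model in one call
model = KM.Model(input_tensor, output_tensor)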

Next, let's walk through mrcnn/model.py to demonstrate some of the ways TensorFlow and Keras interact.

1.1 Building a New Keras Layer with TensorFlow

In model.py you can see many new classes that inherit from keras.engine.Layer, such as DetectionTargetLayer and PyramidROIAlign. The reason is that TensorFlow functions can operate on Keras tensors, but the TensorFlow tensors they return cannot be consumed by Keras directly. We therefore build new Keras layers to bridge the gap: the layer is configured through its __init__ function, the fine-grained TensorFlow processing happens in the call method, and the result comes back as the output of a Keras layer object.
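The skeleton of this pattern looks roughly like the following minimal sketch (TFOpLayer and the op inside call are hypothetical, for illustration only):

import tensorflow as tf
import keras.engine as KE

class TFOpLayer(KE.Layer):
    """Hypothetical minimal layer that wraps a TensorFlow op."""
    def __init__(self, scale, **kwargs):
        super(TFOpLayer, self).__init__(**kwargs)
        self.scale = scale  # layer configuration goes through __init__

    def call(self, inputs):
        # Fine-grained TensorFlow processing happens here
        return tf.multiply(inputs, self.scale)

    def compute_output_shape(self, input_shape):
        # Same shape as the input in this toy example
        return input_shape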

Let's first confirm that Keras tensors can be fed directly into TensorFlow ops. For example, the Dice coefficient computation common in semantic segmentation:

import tensorflow as tf
from keras.layers import Flatten

def Dice_coeff(y_true, y_pred):
    smooth = 1.
    # Outputs of Keras layers go straight into TensorFlow ops
    y_true_f = Flatten()(y_true)
    y_pred_f = Flatten()(y_pred)
    intersection = tf.reduce_sum(y_true_f * y_pred_f)
    score = (2. * intersection + smooth) / (tf.reduce_sum(y_true_f) +
                    tf.reduce_sum(y_pred_f) + smooth)
    return score

Next, let's look at the implementation of the PyramidROIAlign class in model.py:

############################################################
#  ROIAlign Layer
############################################################

def log2_graph(x):
    """Implementation of log2. TF doesn't have a native implementation."""
    return tf.log(x) / tf.log(2.0)

# Where TensorFlow and Keras interact
class PyramidROIAlign(KE.Layer):
    """Implements ROI Pooling on multiple levels of the feature pyramid.

    Params:
    - pool_shape: [pool_height, pool_width] of the pooled output, e.g. [7, 7]

    Inputs:
    - boxes: [batch, num_boxes, (y1, x1, y2, x2)] in normalized coordinates.
             Padded with zeros if there are not enough boxes to fill the array.
    - image_meta: [batch, (meta data)] Image details. See compose_image_meta()
    - feature_maps: list of feature maps from different levels of the pyramid,
                    each of shape [batch, height, width, channels]

    Output:
    Pooled regions of shape [batch, num_boxes, pool_height, pool_width, channels].
    The width and height are the values given in pool_shape in the layer constructor.
    """

    def __init__(self, pool_shape, **kwargs):
        super(PyramidROIAlign, self).__init__(**kwargs)
        self.pool_shape = tuple(pool_shape)

    def call(self, inputs):
        # Crop boxes [batch, num_boxes, (y1, x1, y2, x2)] in normalized coords
        # num_boxes is the number of proposals. All of them apply to every
        # image, but different proposals map to different feature levels.
        # Below we loop over the levels, pick the proposals that match each
        # one, and apply ROIAlign.
        boxes = inputs[0]

        # Image meta
        # Holds details about the image. See compose_image_meta()
        image_meta = inputs[1]

        # Feature Maps. List of feature maps from different level of the
        # feature pyramid. Each is [batch, height, width, channels]
        feature_maps = inputs[2:]

        # Assign each ROI to a level in the pyramid based on the ROI area.
        y1, x1, y2, x2 = tf.split(boxes, 4, axis=2)
        h = y2 - y1
        w = x2 - x1
        # Use the shape of the first image. Images in a batch must be the same size.
        image_shape = parse_image_meta_graph(image_meta)['image_shape'][0]
        # Equation 1 in the Feature Pyramid Networks paper. Our coordinates
        # are normalized here, e.g. a 224x224 ROI (in pixels) maps to P4.
        image_area = tf.cast(image_shape[0] * image_shape[1], tf.float32)
        # h and w are already normalized
        roi_level = log2_graph(tf.sqrt(h * w) / (224.0 / tf.sqrt(image_area)))
        # Clamp the level to the range [2, 5]
        roi_level = tf.minimum(5, tf.maximum(
            2, 4 + tf.cast(tf.round(roi_level), tf.int32)))
        # Squeeze out the singleton second dim; remaining shape: [batch, num_boxes]
        roi_level = tf.squeeze(roi_level, 2)

        # Loop through levels and apply ROI pooling to each. P2 to P5.
        pooled = []
        box_to_level = []
        for i, level in enumerate(range(2, 6)):
            # tf.where returns coordinates as [coord1, coord2, ...]
            # (np.where would return [[coord1.x, coord2.x, ...], [coord1.y, coord2.y, ...]])

            # Each returned coordinate reads: the i-th proposal of the n-th image
            ix = tf.where(tf.equal(roi_level, level))

            # tf.gather_nd(params, indices, name=None)
            # extracts slices from params at the positions given by indices
            # (merged into a single tensor). indices is an integer tensor of rank K.
            # Result shape: [num proposals at this level, 4]
            level_boxes = tf.gather_nd(boxes, ix)
            # Box indices for crop_and_resize.
            box_indices = tf.cast(ix[:, 0], tf.int32)

            # Keep track of which box is mapped to which level
            box_to_level.append(ix)

            # Stop gradient propagation to ROI proposals
            level_boxes = tf.stop_gradient(level_boxes)
            box_indices = tf.stop_gradient(box_indices)

            # Crop and Resize
            # From the Mask R-CNN paper: "We sample four regular locations, so
            # that we can evaluate either max or average pooling. In fact,
            # interpolating only a single value at each bin center (without
            # pooling) is nearly as effective."
            # Here we use the simplified approach of a single value per bin,
            # which is how tf.image.crop_and_resize() works.
            # Result: [batch * num_boxes, pool_height, pool_width, channels]
            pooled.append(tf.image.crop_and_resize(
                feature_maps[i], level_boxes, box_indices, self.pool_shape,
                method="bilinear"))

        # Pack pooled features into one tensor
        pooled = tf.concat(pooled, axis=0)

        # Pack box_to_level mapping into one array and add a column
        # representing the order of the pooled boxes
        box_to_level = tf.concat(box_to_level, axis=0)    # [batch*num_boxes, 2]
        box_range = tf.expand_dims(tf.range(tf.shape(box_to_level)[0]), 1) # [batch*num_boxes, 1]
        box_to_level = tf.concat([tf.cast(box_to_level, tf.int32), box_range],
                                 axis=1) # [batch*num_boxes, 3]

        # At this point `pooled` holds all the ROIAlign output features and
        # `box_to_level` records their bookkeeping info. Because of how they
        # were extracted, the features are no longer in the original order
        # (first by batch, then by box index), so we now restore that order.
        # Rearrange pooled features to match the order of the original boxes
        # Sort box_to_level by batch then box index
        # TF doesn't have a way to sort by two columns, so merge them and sort.
        # box_to_level[i, 0] is the image index a feature belongs to,
        # box_to_level[i, 1] is its box index
        sorting_tensor = box_to_level[:, 0] * 100000 + box_to_level[:, 1] # [batch*num_boxes]
        ix = tf.nn.top_k(sorting_tensor, k=tf.shape(
            box_to_level)[0]).indices[::-1]
        # tf.gather: slice by indices
        ix = tf.gather(box_to_level[:, 2], ix)
        pooled = tf.gather(pooled, ix)

        # Re-add the batch dimension
        shape = tf.concat([tf.shape(boxes)[:2], tf.shape(pooled)[1:]], axis=0)
        pooled = tf.reshape(pooled, shape)
        return pooled

    def compute_output_shape(self, input_shape):
        return input_shape[0][:2] + self.pool_shape + (input_shape[2][-1], )
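
For reference, this is how model.py itself invokes the layer inside the classifier head (fpn_classifier_graph); rois, image_meta, pool_size and feature_maps are defined elsewhere in the file:

x = PyramidROIAlign([pool_size, pool_size],
                    name="roi_align_classifier")([rois, image_meta] + feature_maps)
# x: [batch, num_rois, pool_size, pool_size, channels]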

1.2 Bringing TensorFlow Functions into Keras via the Lambda Layer

Besides the method above, we can also use Keras's Lambda layer to turn a TensorFlow operation into part of the Keras data flow. For example:

rpn_bbox = KL.Lambda(lambda t: tf.reshape(t, [tf.shape(t)[0], -1, 4]))(x)

This converts the output of a function written in TensorFlow directly into a type that Keras modules can accept.
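A self-contained sketch of the same idea (the shapes and names here are illustrative, not from model.py):

import tensorflow as tf
import keras.layers as KL

x = KL.Input(shape=(8, 8, 256))
# Wrap an arbitrary TF op; the Lambda output is a proper Keras tensor
# that downstream Keras layers can consume
reshaped = KL.Lambda(lambda t: tf.reshape(t, [tf.shape(t)[0], -1, 4]))(x)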

1.3 Subclassing an Existing Keras Layer

A third approach is to inherit directly from an existing keras layer. Compared with the first method this also involves implementing a call method, but here we usually delegate to the parent class in order to tweak behavior Keras has already implemented. In the example below, a new BatchNorm layer is defined by subclassing Keras's BatchNormalization.

class BatchNorm(KL.BatchNormalization):
    """Extends the Keras BatchNormalization class to allow a central place
    to make changes if needed.

    Batch normalization has a negative effect on training if batches are small
    so this layer is often frozen (via setting in Config class) and functions
    as linear layer.
    """
    def call(self, inputs, training=None):
        """
        Note about training values:
            None: Train BN layers. This is the normal mode
            False: Freeze BN layers. Good when batch size is small
            True: (don't use). Set layer in training mode even when making inferences
        """
        return super(self.__class__, self).call(inputs, training=training)
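
In practice the project applies this subclass wherever a BN layer is needed, passing the training flag explicitly, e.g. in the backbone code shown later:

x = BatchNorm(name='bn_conv1')(x, training=train_bn)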

2. Network Architecture Analysis

First, let's focus on the feature extraction part of the overall Mask RCNN architecture diagram, i.e. the red portion of the figure:

(Figure: the feature extraction part of Mask RCNN)

As covered in the previous article, this part is an FPN with a ResNet101 backbone. Let's analyze the code, starting with the core MaskRCNN class. First, the __init__ function and the preparation part of the build function:

class MaskRCNN():
    """Encapsulates the Mask RCNN model functionality.

    The actual Keras model is in the keras_model property.
    """

    def __init__(self, mode, config, model_dir):
        """
        mode: Either "training" or "inference"
        config: A Sub-class of the Config class
        model_dir: Directory to save training logs and trained weights
        """
        assert mode in ['training', 'inference']
        self.mode = mode
        self.config = config
        self.model_dir = model_dir
        self.set_log_dir()
        self.keras_model = self.build(mode=mode, config=config)

    def build(self, mode, config):
        """Build Mask R-CNN architecture.
            input_shape: The shape of the input image.
            mode: Either "training" or "inference". The inputs and
                outputs of the model differ accordingly.
        """
        assert mode in ['training', 'inference']

        # Image size must be dividable by 2 multiple times
        h, w = config.IMAGE_SHAPE[:2]
        if h / 2**6 != int(h / 2**6) or w / 2**6 != int(w / 2**6):
            raise Exception("Image size must be dividable by 2 at least 6 times "
                            "to avoid fractions when downscaling and upscaling."
                            "For example, use 256, 320, 384, 448, 512, ... etc. ")

        # Inputs
        input_image = KL.Input(
            shape=[None, None, config.IMAGE_SHAPE[2]], name="input_image")
        input_image_meta = KL.Input(shape=[config.IMAGE_META_SIZE],
                                    name="input_image_meta")
        if mode == "training":
            # RPN GT
            input_rpn_match = KL.Input(
                shape=[None, 1], name="input_rpn_match", dtype=tf.int32)
            input_rpn_bbox = KL.Input(
                shape=[None, 4], name="input_rpn_bbox", dtype=tf.float32)

            # Detection GT (class IDs, bounding boxes, and masks)
            # 1. GT Class IDs (zero padded)
            input_gt_class_ids = KL.Input(
                shape=[None], name="input_gt_class_ids", dtype=tf.int32)
            # 2. GT Boxes in pixels (zero padded)
            # [batch, MAX_GT_INSTANCES, (y1, x1, y2, x2)] in image coordinates
            input_gt_boxes = KL.Input(
                shape=[None, 4], name="input_gt_boxes", dtype=tf.float32)
            # Normalize coordinates
            gt_boxes = KL.Lambda(lambda x: norm_boxes_graph(
                x, K.shape(input_image)[1:3]))(input_gt_boxes)
            # 3. GT Masks (zero padded)
            # [batch, height, width, MAX_GT_INSTANCES]
            if config.USE_MINI_MASK:
                input_gt_masks = KL.Input(
                    shape=[config.MINI_MASK_SHAPE[0],
                           config.MINI_MASK_SHAPE[1], None],
                    name="input_gt_masks", dtype=bool)
            else:
                input_gt_masks = KL.Input(
                    shape=[config.IMAGE_SHAPE[0], config.IMAGE_SHAPE[1], None],
                    name="input_gt_masks", dtype=bool)
        elif mode == "inference":
            # Anchors in normalized coordinates
            input_anchors = KL.Input(shape=[None, 4], name="input_anchors")

Here the input image size is required to be divisible by 2 at least six times, i.e. a multiple of 2^6 = 64, so that repeated downsampling and upsampling never produce fractional sizes. Notice that the constructor needs only three external arguments, and that build() defines a large number of variables with KL.Input. Let's look at the API of the Keras Input layer:
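
A quick numeric check of this constraint:

# 512 = 64 * 8 passes the check; 500 does not (500 / 64 = 7.8125)
h = 500
print(h / 2**6 != int(h / 2**6))   # True, so build() would raise an Exception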

tf.keras.Input(
    shape=None,
    batch_size=None,
    name=None,
    dtype=None,
    sparse=False,
    tensor=None,
    **kwargs
)

where the parameters are as follows:

shape: a shape tuple (integers), not including the batch size. For example, shape=(32,) means the expected input will be batches of 32-dimensional vectors.
batch_size: optional static batch size (integer).
name: an optional name string for the layer. It should be unique within a model (don't reuse the same name twice) and is auto-generated if not provided.
dtype: the data type expected by the input, as a string (float32, float64, int32...).
sparse: a boolean specifying whether the placeholder to be created is sparse.
tensor: an optional existing tensor to wrap into the Input layer. If set, the layer will not create a placeholder tensor.
**kwargs: deprecated argument support.

Since the code above does not specify the batch_size parameter, the actual shapes gain a leading batch dimension:

input_image: the input image, [batch, None, None, config.IMAGE_SHAPE[2]]
input_image_meta: image information (shape, preprocessing details, etc., introduced later), [batch, config.IMAGE_META_SIZE]
input_anchors: anchor boxes, [batch, None, 4]

Next comes the backbone (ResNet101) code:

## ResNet101 part
        # Build the shared convolutional layers.
        # Bottom-up Layers
        # Returns a list of the last layers of each stage, 5 in total.
        # Don't create the head (stage 5), so we pick the 4th item in the list.
        if callable(config.BACKBONE):
            _, C2, C3, C4, C5 = config.BACKBONE(input_image, stage5=True,
                                                train_bn=config.TRAIN_BN)
        else:
            _, C2, C3, C4, C5 = resnet_graph(input_image, config.BACKBONE,
                                             stage5=True, train_bn=config.TRAIN_BN)

The code offers two options: a custom backbone defined as a callable in config, or the default implementation, the ResNet101 backbone built by resnet_graph. Let's analyze the resnet_graph function:

############################################################
#  Resnet Graph
############################################################

# Code adopted from:
# https://github.com/fchollet/deep-learning-models/blob/master/resnet50.py

def identity_block(input_tensor, kernel_size, filters, stage, block,
                   use_bias=True, train_bn=True):
    """The identity_block is the block that has no conv layer at shortcut
    # Arguments
        input_tensor: input tensor
        kernel_size: default 3, the kernel size of middle conv layer at main path
        filters: list of integers, the nb_filters of 3 conv layer at main path
        stage: integer, current stage label, used for generating layer names
        block: 'a','b'..., current block label, used for generating layer names
        use_bias: Boolean. To use or not use a bias in conv layers.
        train_bn: Boolean. Train or freeze Batch Norm layers
    """
    nb_filter1, nb_filter2, nb_filter3 = filters
    conv_name_base = 'res' + str(stage) + block + '_branch'
    bn_name_base = 'bn' + str(stage) + block + '_branch'

    x = KL.Conv2D(nb_filter1, (1, 1), name=conv_name_base + '2a',
                  use_bias=use_bias)(input_tensor)
    x = BatchNorm(name=bn_name_base + '2a')(x, training=train_bn)
    x = KL.Activation('relu')(x)

    x = KL.Conv2D(nb_filter2, (kernel_size, kernel_size), padding='same',
                  name=conv_name_base + '2b', use_bias=use_bias)(x)
    x = BatchNorm(name=bn_name_base + '2b')(x, training=train_bn)
    x = KL.Activation('relu')(x)

    x = KL.Conv2D(nb_filter3, (1, 1), name=conv_name_base + '2c',
                  use_bias=use_bias)(x)
    x = BatchNorm(name=bn_name_base + '2c')(x, training=train_bn)

    x = KL.Add()([x, input_tensor])
    x = KL.Activation('relu', name='res' + str(stage) + block + '_out')(x)
    return x


def conv_block(input_tensor, kernel_size, filters, stage, block,
               strides=(2, 2), use_bias=True, train_bn=True):
    """conv_block is the block that has a conv layer at shortcut
    # Arguments
        input_tensor: input tensor
        kernel_size: default 3, the kernel size of middle conv layer at main path
        filters: list of integers, the nb_filters of 3 conv layer at main path
        stage: integer, current stage label, used for generating layer names
        block: 'a','b'..., current block label, used for generating layer names
        use_bias: Boolean. To use or not use a bias in conv layers.
        train_bn: Boolean. Train or freeze Batch Norm layers
    Note that from stage 3, the first conv layer at main path is with subsample=(2,2)
    And the shortcut should have subsample=(2,2) as well
    """
    nb_filter1, nb_filter2, nb_filter3 = filters
    conv_name_base = 'res' + str(stage) + block + '_branch'
    bn_name_base = 'bn' + str(stage) + block + '_branch'

    x = KL.Conv2D(nb_filter1, (1, 1), strides=strides,
                  name=conv_name_base + '2a', use_bias=use_bias)(input_tensor)
    x = BatchNorm(name=bn_name_base + '2a')(x, training=train_bn)
    x = KL.Activation('relu')(x)

    x = KL.Conv2D(nb_filter2, (kernel_size, kernel_size), padding='same',
                  name=conv_name_base + '2b', use_bias=use_bias)(x)
    x = BatchNorm(name=bn_name_base + '2b')(x, training=train_bn)
    x = KL.Activation('relu')(x)

    x = KL.Conv2D(nb_filter3, (1, 1), name=conv_name_base +
                  '2c', use_bias=use_bias)(x)
    x = BatchNorm(name=bn_name_base + '2c')(x, training=train_bn)

    shortcut = KL.Conv2D(nb_filter3, (1, 1), strides=strides,
                         name=conv_name_base + '1', use_bias=use_bias)(input_tensor)
    shortcut = BatchNorm(name=bn_name_base + '1')(shortcut, training=train_bn)

    x = KL.Add()([x, shortcut])
    x = KL.Activation('relu', name='res' + str(stage) + block + '_out')(x)
    return x


def resnet_graph(input_image, architecture, stage5=False, train_bn=True):
    """Build a ResNet graph.
        architecture: Can be resnet50 or resnet101
        stage5: Boolean. If False, stage5 of the network is not created
        train_bn: Boolean. Train or freeze Batch Norm layers
    """
    assert architecture in ["resnet50", "resnet101"]
    # Stage 1
    x = KL.ZeroPadding2D((3, 3))(input_image)
    x = KL.Conv2D(64, (7, 7), strides=(2, 2), name='conv1', use_bias=True)(x)
    x = BatchNorm(name='bn_conv1')(x, training=train_bn)
    x = KL.Activation('relu')(x)
    C1 = x = KL.MaxPooling2D((3, 3), strides=(2, 2), padding="same")(x)
    # Stage 2
    x = conv_block(x, 3, [64, 64, 256], stage=2, block='a', strides=(1, 1), train_bn=train_bn)
    x = identity_block(x, 3, [64, 64, 256], stage=2, block='b', train_bn=train_bn)
    C2 = x = identity_block(x, 3, [64, 64, 256], stage=2, block='c', train_bn=train_bn)
    # Stage 3
    x = conv_block(x, 3, [128, 128, 512], stage=3, block='a', train_bn=train_bn)
    x = identity_block(x, 3, [128, 128, 512], stage=3, block='b', train_bn=train_bn)
    x = identity_block(x, 3, [128, 128, 512], stage=3, block='c', train_bn=train_bn)
    C3 = x = identity_block(x, 3, [128, 128, 512], stage=3, block='d', train_bn=train_bn)
    # Stage 4
    x = conv_block(x, 3, [256, 256, 1024], stage=4, block='a', train_bn=train_bn)
    block_count = {"resnet50": 5, "resnet101": 22}[architecture]
    for i in range(block_count):
        x = identity_block(x, 3, [256, 256, 1024], stage=4, block=chr(98 + i), train_bn=train_bn)
    C4 = x
    # Stage 5
    if stage5:
        x = conv_block(x, 3, [512, 512, 2048], stage=5, block='a', train_bn=train_bn)
        x = identity_block(x, 3, [512, 512, 2048], stage=5, block='b', train_bn=train_bn)
        C5 = x = identity_block(x, 3, [512, 512, 2048], stage=5, block='c', train_bn=train_bn)
    else:
        C5 = None
    return [C1, C2, C3, C4, C5]
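
As a hedged sketch of exercising the backbone in isolation (the 512x512 input size is just an example; KL is keras.layers as in model.py):

input_image = KL.Input(shape=[512, 512, 3])
_, C2, C3, C4, C5 = resnet_graph(input_image, "resnet101",
                                 stage5=True, train_bn=False)
# Strides 4/8/16/32 relative to the input:
# C2 = 128x128, C3 = 64x64, C4 = 32x32, C5 = 16x16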

From this implementation we can see that ResNet101 is built from two kinds of sub-structures, one with a convolutional shortcut (conv_block) and one with an identity shortcut (identity_block), as shown below:

(Figure: the two typical convolution blocks in ResNet101)

Next is the FPN part of the Mask RCNN architecture diagram. The code is as follows:

# FPN part
        # Top-down Layers
        # TODO: add assert to verify feature map sizes match what's in config
        P5 = KL.Conv2D(config.TOP_DOWN_PYRAMID_SIZE, (1, 1), name='fpn_c5p5')(C5)
        P4 = KL.Add(name="fpn_p4add")([
            KL.UpSampling2D(size=(2, 2), name="fpn_p5upsampled")(P5),
            KL.Conv2D(config.TOP_DOWN_PYRAMID_SIZE, (1, 1), name='fpn_c4p4')(C4)])
        P3 = KL.Add(name="fpn_p3add")([
            KL.UpSampling2D(size=(2, 2), name="fpn_p4upsampled")(P4),
            KL.Conv2D(config.TOP_DOWN_PYRAMID_SIZE, (1, 1), name='fpn_c3p3')(C3)])
        P2 = KL.Add(name="fpn_p2add")([
            KL.UpSampling2D(size=(2, 2), name="fpn_p3upsampled")(P3),
            KL.Conv2D(config.TOP_DOWN_PYRAMID_SIZE, (1, 1), name='fpn_c2p2')(C2)])
        # Attach 3x3 conv to all P layers to get the final feature maps.
        P2 = KL.Conv2D(config.TOP_DOWN_PYRAMID_SIZE, (3, 3), padding="SAME", name="fpn_p2")(P2)
        P3 = KL.Conv2D(config.TOP_DOWN_PYRAMID_SIZE, (3, 3), padding="SAME", name="fpn_p3")(P3)
        P4 = KL.Conv2D(config.TOP_DOWN_PYRAMID_SIZE, (3, 3), padding="SAME", name="fpn_p4")(P4)
        P5 = KL.Conv2D(config.TOP_DOWN_PYRAMID_SIZE, (3, 3), padding="SAME", name="fpn_p5")(P5)
        # P6 is used for the 5th anchor scale in RPN. Generated by
        # subsampling from P5 with stride of 2.
        P6 = KL.MaxPooling2D(pool_size=(1, 1), strides=2, name="fpn_p6")(P5)

Finally, the extracted feature sets are collected:

 # Collect the extracted feature maps
        # Note that P6 is used in RPN, but not in the classifier heads.
        rpn_feature_maps = [P2, P3, P4, P5, P6]
        mrcnn_feature_maps = [P2, P3, P4, P5]
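
To summarize the geometry (assuming the repo's default 1024x1024 input, matching BACKBONE_STRIDES = [4, 8, 16, 32, 64] in config.py):

# Spatial sizes of the pyramid levels for a 1024x1024 input:
# P2: 256x256 (stride 4)    P3: 128x128 (stride 8)
# P4: 64x64   (stride 16)   P5: 32x32   (stride 32)
# P6: 16x16   (stride 64, used by the RPN only)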

3. Summary

That's it for this article. We have now obtained the collected sets of extracted feature maps; on top of them we will generate anchors and proposals. The next article covers anchor generation and proposal (candidate box) generation. Feel free to leave questions in the comments.
