Hello everyone, good to see you again. I'm your friend 全栈君.
ASPP
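Atrous spatial pyramid pooling (ASPP) samples a given input in parallel with atrous (dilated) convolutions at different sampling rates, which amounts to capturing the image context at multiple scales.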
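The figure above shows the ASPP module of DeepLab v2; DeepLabv3 adds BN layers to ASPP. The rate of an atrous convolution means that, relative to an ordinary convolution, adjacent weights are spaced rate-1 positions apart. An ordinary convolution has rate 1 by default, so the effective kernel size of an atrous convolution is $k + (k-1)(rate-1)$, where $k$ is the original kernel size.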
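As a quick check of the effective kernel size $k + (k-1)(rate-1)$, here is a minimal sketch. The helper name `effective_kernel_size` is made up for illustration, and the rates 6, 12 and 18 are simply the ones used in the listing further down.

```python
def effective_kernel_size(k, rate):
    """Span covered by a k-tap kernel with (rate - 1) gaps between adjacent taps."""
    return k + (k - 1) * (rate - 1)


# A 3x3 kernel at the ordinary rate and at the three ASPP rates.
for rate in (1, 6, 12, 18):
    print(rate, effective_kernel_size(3, rate))
# prints:
# 1 3
# 6 13
# 12 25
# 18 37
```

So a 3x3 kernel with rate 18 spans 37 positions per side while still having only 9 weights.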
How is the output size computed?
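One way to answer this is the standard convolution output-size formula, the same one given in the PyTorch `nn.Conv2d` documentation: $out = \lfloor (in + 2 \cdot padding - dilation \cdot (k-1) - 1) / stride \rfloor + 1$. The sketch below applies it along one spatial dimension; the function name `conv_output_size` and the input size 65 are assumptions chosen for illustration.

```python
def conv_output_size(in_size, kernel_size, stride=1, padding=0, dilation=1):
    """Output length of a convolution along one spatial dimension."""
    # Effective kernel size: dilation * (k - 1) + 1 == k + (k - 1) * (dilation - 1)
    effective = dilation * (kernel_size - 1) + 1
    return (in_size + 2 * padding - effective) // stride + 1


# 3x3 atrous conv, rate 6, no padding: the map shrinks.
print(conv_output_size(65, 3, dilation=6))             # 53
# Same conv with padding == rate: the spatial size is preserved.
print(conv_output_size(65, 3, padding=6, dilation=6))  # 65
```

For a 3x3 kernel at stride 1, setting the padding equal to the rate always preserves the input size, which is why every atrous branch can later be concatenated along the channel dimension.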
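Problem: as the rate approaches the size of the feature map, a $3 \times 3$ filter no longer captures whole-image context; it degenerates into a simple $1 \times 1$ filter, because only the center weight of the filter is effective.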
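The degeneration can be made concrete by counting how many of the nine filter taps land on real feature-map positions rather than on zero padding. The sketch below is only an illustration: the helper `mean_valid_taps` and the 65 x 65 feature-map size are assumptions, not something taken from the text above.

```python
def mean_valid_taps(size, rate, k=3):
    """Average number of kernel taps that fall inside a size x size feature map.

    Taps that land on zero padding contribute nothing to the output.
    """
    half = k // 2
    offsets = [(i - half) * rate for i in range(k)]  # e.g. [-rate, 0, rate]
    total = 0
    for y in range(size):
        for x in range(size):
            rows = sum(0 <= y + dy < size for dy in offsets)
            cols = sum(0 <= x + dx < size for dx in offsets)
            total += rows * cols  # valid taps at this position
    return total / (size * size)


for rate in (1, 6, 12, 18, 30, 65):
    print(rate, mean_valid_taps(65, rate))
# With rate 65 only the center tap is ever valid, so the last line prints 65 1.0
```

The average falls steadily as the rate grows and reaches exactly 1 once the rate is at least the feature-map size, at which point the filter behaves like a $1 \times 1$ convolution.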
Improvement: Concat($1 \times 1$ convolution, three $3 \times 3$ atrous convolutions, pooled image feature), where every convolution has 256 filters and is followed by a BN layer.
# without bn version
import torch
import torch.nn as nn
import torch.nn.functional as F


class ASPP(nn.Module):
    def __init__(self, in_channel=512, depth=256):
        super(ASPP, self).__init__()
        # Image-level branch: global average pooling to a 1x1 map, then a 1x1 conv
        self.mean = nn.AdaptiveAvgPool2d((1, 1))  # (1, 1) is the output size
        self.conv = nn.Conv2d(in_channel, depth, 1, 1)
        # Parallel branches: one 1x1 conv and three 3x3 atrous convs (rates 6, 12, 18);
        # padding == dilation keeps the spatial size unchanged
        self.atrous_block1 = nn.Conv2d(in_channel, depth, 1, 1)
        self.atrous_block6 = nn.Conv2d(in_channel, depth, 3, 1, padding=6, dilation=6)
        self.atrous_block12 = nn.Conv2d(in_channel, depth, 3, 1, padding=12, dilation=12)
        self.atrous_block18 = nn.Conv2d(in_channel, depth, 3, 1, padding=18, dilation=18)
        # 1x1 conv that fuses the five concatenated branches
        self.conv_1x1_output = nn.Conv2d(depth * 5, depth, 1, 1)

    def forward(self, x):
        size = x.shape[2:]

        image_features = self.mean(x)
        image_features = self.conv(image_features)
        # F.upsample is deprecated; F.interpolate is its replacement
        image_features = F.interpolate(image_features, size=size, mode='bilinear',
                                       align_corners=False)

        atrous_block1 = self.atrous_block1(x)
        atrous_block6 = self.atrous_block6(x)
        atrous_block12 = self.atrous_block12(x)
        atrous_block18 = self.atrous_block18(x)

        # Concatenate along the channel dimension, then project back to `depth` channels
        net = self.conv_1x1_output(torch.cat([image_features, atrous_block1, atrous_block6,
                                              atrous_block12, atrous_block18], dim=1))
        return net
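The listing above leaves out BN. A variant that puts a BN layer after every convolution, as the improvement describes, might look like the sketch below. The names `conv_bn_relu` and `ASPPWithBN`, the ReLU after each BN, and `bias=False` are my own assumptions and are not taken from the original post.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def conv_bn_relu(in_ch, out_ch, kernel_size, dilation=1):
    # For a 3x3 kernel at stride 1, padding == dilation keeps the spatial size;
    # a 1x1 kernel needs no padding.
    padding = 0 if kernel_size == 1 else dilation
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size, padding=padding,
                  dilation=dilation, bias=False),  # bias is redundant before BN (assumption)
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),                     # ReLU is an assumption, see lead-in
    )


class ASPPWithBN(nn.Module):
    def __init__(self, in_channel=512, depth=256, rates=(6, 12, 18)):
        super().__init__()
        # Image-level branch: global average pooling followed by a 1x1 conv + BN
        self.pool = nn.AdaptiveAvgPool2d((1, 1))
        self.pool_conv = conv_bn_relu(in_channel, depth, 1)
        # One 1x1 branch plus one 3x3 atrous branch per rate
        self.branches = nn.ModuleList(
            [conv_bn_relu(in_channel, depth, 1)]
            + [conv_bn_relu(in_channel, depth, 3, dilation=r) for r in rates]
        )
        # Fuse pooled branch + 1x1 branch + atrous branches
        self.project = conv_bn_relu(depth * (len(rates) + 2), depth, 1)

    def forward(self, x):
        size = x.shape[2:]
        pooled = self.pool_conv(self.pool(x))
        pooled = F.interpolate(pooled, size=size, mode='bilinear', align_corners=False)
        feats = [pooled] + [branch(x) for branch in self.branches]
        return self.project(torch.cat(feats, dim=1))


# Usage: small channel counts keep the example fast. eval() makes BN use its
# running statistics; in training mode the pooled 1x1 branch needs a batch
# size greater than 1.
model = ASPPWithBN(in_channel=8, depth=4).eval()
with torch.no_grad():
    out = model(torch.randn(2, 8, 33, 33))
print(out.shape)  # torch.Size([2, 4, 33, 33])
```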
Publisher: 全栈程序员栈长. Please credit the source when reposting: https://javaforall.cn/171645.html Original link: https://javaforall.cn