PyTorch Technical Notes

2022-03-24 11:11:30

We'll start with a simple example to see the difference between PyTorch and TensorFlow.

import torch
import tensorflow as tf

if __name__ == "__main__":

    A = torch.Tensor([0])
    B = torch.Tensor([10])
    while (A < B)[0]:
        A += 2
        B += 1
    print(A, B)

    C = tf.constant([0])
    D = tf.constant([10])
    while (C < D)[0]:
        C = tf.add(C, 2)
        D = tf.add(D, 1)
    print(C, D)

Output:

tensor([20.]) tensor([20.])
tf.Tensor([20], shape=(1,), dtype=int32) tf.Tensor([20], shape=(1,), dtype=int32)

Here we can see that PyTorch is more concise: it needs fewer dedicated API calls and reads much closer to plain Python.

Tensors

import torch

if __name__ == "__main__":

    a = torch.Tensor([[1, 2], [3, 4]])
    print(a)
    print(a.shape)
    print(a.type())

    b = torch.Tensor(2, 3)
    print(b)
    print(b.type())

    c = torch.ones(2, 2)
    print(c)
    print(c.type())

    d = torch.zeros(2, 2)
    print(d)
    print(d.type())
    # define an identity matrix
    e = torch.eye(2, 2)
    print(e)
    print(e.type())
    # an all-zeros matrix with the same shape as b
    f = torch.zeros_like(b)
    print(f)
    # an all-ones matrix with the same shape as b
    g = torch.ones_like(b)
    print(g)
    # define a sequence
    h = torch.arange(0, 11, 1)
    print(h)
    print(h.type())
    # 4 evenly spaced values between 2 and 10
    i = torch.linspace(2, 10, 4)
    print(i)

Output:

tensor([[1., 2.],
        [3., 4.]])
torch.Size([2, 2])
torch.FloatTensor
tensor([[7.0976e+22, 4.1828e+09, 4.2320e+21],
        [1.1818e+22, 7.0976e+22, 1.8515e+28]])
torch.FloatTensor
tensor([[1., 1.],
        [1., 1.]])
torch.FloatTensor
tensor([[0., 0.],
        [0., 0.]])
torch.FloatTensor
tensor([[1., 0.],
        [0., 1.]])
torch.FloatTensor
tensor([[0., 0., 0.],
        [0., 0., 0.]])
tensor([[1., 1., 1.],
        [1., 1., 1.]])
tensor([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10])
torch.LongTensor
tensor([ 2.0000,  4.6667,  7.3333, 10.0000])

From these results we can see that PyTorch lets us build a tensor either from concrete values or just from a shape; building from a shape gives uninitialized (essentially random) values. Then there are all-zeros, all-ones and identity matrices, zeros_like/ones_like matching another tensor's shape, and arange sequences, all of which TensorFlow also provides. There are, however, some parts that differ from TensorFlow:

# matrix of random values uniformly drawn from [0, 1)
j = torch.rand(2, 2)
print(j)
# a normal distribution with the given means and standard deviations
k = torch.normal(mean=0.0, std=torch.rand(5))
print(k)
# a uniform distribution on [-1, 1)
l = torch.Tensor(2, 2).uniform_(-1, 1)
print(l)
# a shuffled sequence (random permutation)
m = torch.randperm(10)
print(m)
print(m.type())

Output:

tensor([[0.6257, 0.5132],
        [0.5961, 0.8336]])
tensor([ 0.4738,  0.3879,  0.0394, -0.3446,  0.4863])
tensor([[ 0.6498, -0.8387],
        [ 0.3767, -0.9012]])
tensor([9, 6, 1, 3, 0, 2, 7, 5, 8, 4])
torch.LongTensor

Note that both torch.arange(0, 11, 1) and torch.randperm(10) return torch.LongTensor, while everything else above is torch.FloatTensor.
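
If you need floating-point values from these factory functions, you can request the dtype explicitly or convert afterwards (a minimal sketch):

import torch

h = torch.arange(0, 11, 1, dtype=torch.float32)   # float sequence instead of a LongTensor
m = torch.randperm(10).float()                     # convert an integer permutation to float
print(h.type(), m.type())                          # torch.FloatTensor torch.FloatTensor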

Tensor attributes

Every Tensor has three attributes: torch.dtype, torch.device and torch.layout. torch.dtype is simply the element data type, which is easy to understand.

  1. torch.device identifies the device on which a torch.Tensor is stored after creation, e.g. CPU or GPU. GPUs are referred to via cuda; with multiple GPUs they are addressed as cuda:0, cuda:1, cuda:2 and so on.

torch.layout describes how a torch.Tensor is laid out in memory. By default a tensor is dense (strided), i.e. the everyday kind of tensor, which corresponds to one contiguous block of memory. Alternatively a tensor can be stored in a sparse format, which records only the coordinates of the non-zero elements. The fully explicit definition of a dense tensor looks like this:

a = torch.tensor([1, 2, 3], dtype=torch.float32, device=torch.device('cpu'))
print(a)

Output:

tensor([1., 2., 3.])

If the tensor should live on a GPU, it is defined like this:

a = torch.tensor([1, 2, 3], dtype=torch.float32, device=torch.device('cuda:0'))

If there is only one GPU we can simply select the default device:

a = torch.tensor([1, 2, 3], dtype=torch.float32, device=torch.device('cuda'))

Sparsity refers to the number of zero elements in a tensor: the more zeros, the sparser the data (a tensor with no zeros is not sparse at all). Low rank, by contrast, describes the correlation within the data: the rank expresses the linear dependence between the vectors of a matrix; see the section on matrix rank in 线性代数整理(二). Sparsity can make a model much simpler: for a parametric model, if many parameters are zero, the corresponding terms can be dropped, the number of parameters shrinks and the model becomes simpler, which is why sparse parameterization is an important property in machine learning. Sparse storage also reduces memory use. Suppose a 100×100 matrix has a single non-zero value and 9999 zeros: stored as a dense tensor it needs 10000 memory cells, whereas stored sparsely we only need to record the coordinates and values of the non-zero entries. A sparse tensor is therefore defined by its coordinates and values:

# coordinates of the non-zero elements
indices = torch.tensor([[0, 1, 1], [2, 0, 2]])
# values of the non-zero elements
values = torch.tensor([3, 4, 5], dtype=torch.float32)
# build a sparse COO tensor
a = torch.sparse_coo_tensor(indices, values, [2, 4])
print(a)
# convert to a dense tensor
print(a.to_dense())

Output:

tensor(indices=tensor([[0, 1, 1],
                       [2, 0, 2]]),
       values=tensor([3., 4., 5.]),
       size=(2, 4), nnz=3, layout=torch.sparse_coo)
tensor([[0., 0., 3., 0.],
        [4., 0., 5., 0.]])

After converting to a dense tensor we can see that the sparse definition placed the non-zero values 3, 4 and 5 at positions (0,2), (1,0) and (1,2); every other entry is 0.

We can also define a diagonal matrix as a sparse tensor:

# coordinates of the non-zero elements
indices = torch.tensor([[0, 1, 2, 3], [0, 1, 2, 3]])
# values of the non-zero elements
values = torch.tensor([3, 4, 5, 6], dtype=torch.float32)
# build a sparse tensor and convert it straight to dense
a = torch.sparse_coo_tensor(indices, values, [4, 4]).to_dense()
print(a)

Output:

tensor([[3., 0., 0., 0.],
        [0., 4., 0., 0.],
        [0., 0., 5., 0.],
        [0., 0., 0., 6.]])

Why do some tensors live on the CPU and others on the GPU? In an image pipeline we want to place data sensibly: data loading and preprocessing are usually done on the CPU, while parameter computation, inference and back-propagation usually run on the GPU. Allocating resources this way maximizes utilization and keeps training and iteration fast.
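
A minimal sketch of this placement pattern (the tensor and model here are purely illustrative; it falls back to the CPU if no CUDA device is available):

import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# data loading / preprocessing typically stays on the CPU ...
x = torch.rand(8, 3, 32, 32)
# ... then the batch and the model are moved to the chosen device for compute
x = x.to(device)
model = torch.nn.Linear(3 * 32 * 32, 10).to(device)
out = model(x.view(8, -1))
print(out.device)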

Arithmetic operations on tensors

  • Addition
import torch

if __name__ == "__main__":

    a = torch.tensor([1], dtype=torch.int32)
    b = torch.tensor([2], dtype=torch.int32)
    c = a + b
    print(c)
    c = torch.add(a, b)
    print(c)
    c = a.add(b)
    print(c)
    a.add_(b)
    print(a)

Output:

tensor([3], dtype=torch.int32)
tensor([3], dtype=torch.int32)
tensor([3], dtype=torch.int32)
tensor([3], dtype=torch.int32)

PyTorch offers four ways to add: the first three are equivalent, while the fourth (add_) writes the result of a + b back into a in place.

A vector or matrix can also be added to a scalar, in which case the scalar is added to every component of that vector or matrix:

import torch

if __name__ == "__main__":

    a = torch.tensor([[1, 2, 3], [4, 5, 6]], dtype=torch.int32)
    b = torch.tensor([2], dtype=torch.int32)
    c = a + b
    print(c)
    c = torch.add(a, b)
    print(c)
    c = a.add(b)
    print(c)
    a.add_(b)
    print(a)

Output:

tensor([[3, 4, 5],
        [6, 7, 8]], dtype=torch.int32)
tensor([[3, 4, 5],
        [6, 7, 8]], dtype=torch.int32)
tensor([[3, 4, 5],
        [6, 7, 8]], dtype=torch.int32)
tensor([[3, 4, 5],
        [6, 7, 8]], dtype=torch.int32)

If both operands are vectors or matrices, the lengths of their last dimensions must match (or be broadcast-compatible, see the broadcasting section below); otherwise an error is raised:

import torch

if __name__ == "__main__":

    a = torch.tensor([[1, 2, 3], [4, 5, 6]], dtype=torch.int32)
    b = torch.tensor([2, 4, 6], dtype=torch.int32)
    c = a + b
    print(c)
    c = torch.add(a, b)
    print(c)
    c = a.add(b)
    print(c)
    a.add_(b)
    print(a)

Output:

tensor([[ 3,  6,  9],
        [ 6,  9, 12]], dtype=torch.int32)
tensor([[ 3,  6,  9],
        [ 6,  9, 12]], dtype=torch.int32)
tensor([[ 3,  6,  9],
        [ 6,  9, 12]], dtype=torch.int32)
tensor([[ 3,  6,  9],
        [ 6,  9, 12]], dtype=torch.int32)
  • Subtraction
import torch

if __name__ == "__main__":

    a = torch.tensor([1], dtype=torch.int32)
    b = torch.tensor([2], dtype=torch.int32)
    c = b - a
    print(c)
    c = torch.sub(b, a)
    print(c)
    c = b.sub(a)
    print(c)
    b.sub_(a)
    print(b)

Output:

tensor([1], dtype=torch.int32)
tensor([1], dtype=torch.int32)
tensor([1], dtype=torch.int32)
tensor([1], dtype=torch.int32)

All other rules for subtraction are the same as for addition.

  • Multiplication

Hadamard product (element-wise multiplication):

import torch

if __name__ == "__main__":

    a = torch.tensor([[1, 2, 3], [4, 5, 6]], dtype=torch.int32)
    b = torch.tensor([2, 4, 6], dtype=torch.int32)
    c = a * b
    print(c)
    c = torch.mul(a, b)
    print(c)
    c = a.mul(b)
    print(c)
    a.mul_(b)
    print(a)

Output:

tensor([[ 2,  8, 18],
        [ 8, 20, 36]], dtype=torch.int32)
tensor([[ 2,  8, 18],
        [ 8, 20, 36]], dtype=torch.int32)
tensor([[ 2,  8, 18],
        [ 8, 20, 36]], dtype=torch.int32)
tensor([[ 2,  8, 18],
        [ 8, 20, 36]], dtype=torch.int32)
  • Division
import torch

if __name__ == "__main__":

    a = torch.tensor([[1, 2, 3], [4, 5, 6]], dtype=torch.float32)
    b = torch.tensor([2, 4, 6], dtype=torch.int32)
    c = a / b
    print(c)
    c = torch.div(a, b)
    print(c)
    c = a.div(b)
    print(c)
    a.div_(b)
    print(a)

Output:

tensor([[0.5000, 0.5000, 0.5000],
        [2.0000, 1.2500, 1.0000]])
tensor([[0.5000, 0.5000, 0.5000],
        [2.0000, 1.2500, 1.0000]])
tensor([[0.5000, 0.5000, 0.5000],
        [2.0000, 1.2500, 1.0000]])
tensor([[0.5000, 0.5000, 0.5000],
        [2.0000, 1.2500, 1.0000]])
  • Matrix operations

2-D matrix multiplication:

import torch

if __name__ == "__main__":

    a = torch.Tensor([[1, 2, 3], [4, 5, 6]])
    b = torch.Tensor([[2, 4], [11, 13], [7, 9]])
    c = a @ b
    print(c)
    c = torch.mm(a, b)
    print(c)
    c = torch.matmul(a, b)
    print(c)
    c = a.matmul(b)
    print(c)
    c = a.mm(b)
    print(c)

Output:

tensor([[ 45.,  57.],
        [105., 135.]])
tensor([[ 45.,  57.],
        [105., 135.]])
tensor([[ 45.,  57.],
        [105., 135.]])
tensor([[ 45.,  57.],
        [105., 135.]])
tensor([[ 45.,  57.],
        [105., 135.]])

For the mathematical meaning of matrix multiplication, see the section on matrix–matrix products in 线性代数整理.

For higher-dimensional tensors (dim > 2), matrix multiplication is defined only over the last two dimensions; the leading (batch) dimensions must agree (or be broadcastable), much like batched indexing, and the operation is provided only by torch.matmul():

import torch

if __name__ == "__main__":

    a = torch.ones(1, 2, 3, 4)
    b = torch.ones(1, 2, 4, 3)
    c = torch.matmul(a, b)
    print(c)
    c = a.matmul(b)
    print(c)

Output:

tensor([[[[4., 4., 4.],
          [4., 4., 4.],
          [4., 4., 4.]],

         [[4., 4., 4.],
          [4., 4., 4.],
          [4., 4., 4.]]]])
tensor([[[[4., 4., 4.],
          [4., 4., 4.],
          [4., 4., 4.]],

         [[4., 4., 4.],
          [4., 4., 4.],
          [4., 4., 4.]]]])

This multiplication actually acts on the trailing dimensions 3×4 of a and 4×3 of b, while the leading dimensions must agree; here they are both (1, 2).
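
A small shape-only sketch (assuming a reasonably recent PyTorch, where matmul also broadcasts the batch dimensions):

import torch

a = torch.ones(2, 1, 3, 4)   # batch dims (2, 1), matrices of shape 3x4
b = torch.ones(2, 4, 5)      # batch dim (2,), matrices of shape 4x5
c = torch.matmul(a, b)       # batch dims broadcast to (2, 2), matrices multiply to 3x5
print(c.shape)               # torch.Size([2, 2, 3, 5])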

  • Exponentiation
import torch

if __name__ == "__main__":

    a = torch.Tensor([1, 2, 3])
    c = torch.pow(a, 2)
    print(c)
    c = a**2
    print(c)
    a.pow_(2)
    print(a)
    # compute e raised to the power a
    a = torch.Tensor([2])
    c = torch.exp(a)
    print(c)
    c = a.exp_()
    print(c)

Output:

tensor([1., 4., 9.])
tensor([1., 4., 9.])
tensor([1., 4., 9.])
tensor([7.3891])
tensor([7.3891])
  • Square root
import torch

if __name__ == "__main__":

    a = torch.Tensor([1, 2, 3])
    c = torch.sqrt(a)
    print(c)
    c = a.sqrt()
    print(c)
    a.sqrt_()
    print(a)

Output:

tensor([1.0000, 1.4142, 1.7321])
tensor([1.0000, 1.4142, 1.7321])
tensor([1.0000, 1.4142, 1.7321])
  • Logarithms
import torch

if __name__ == "__main__":

    a = torch.Tensor([1, 2, 3])
    c = torch.log2(a)
    print(c)
    c = torch.log10(a)
    print(c)
    c = torch.log(a)
    print(c)
    torch.log_(a)
    print(a)

Output:

tensor([0.0000, 1.0000, 1.5850])
tensor([0.0000, 0.3010, 0.4771])
tensor([0.0000, 0.6931, 1.0986])
tensor([0.0000, 0.6931, 1.0986])

In-place operations and broadcasting

  • An in-place operation modifies a tensor's storage directly instead of allocating a temporary result tensor, i.e. it operates "in place".
import torch

if __name__ == "__main__":

    a = torch.Tensor([1, 2, 3])
    b = torch.Tensor([4])
    a = a + b
    print(a)

Output:

tensor([5., 6., 7.])

The methods add_, sub_, mul_ and so on mentioned earlier are all in-place operations.
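
A small sketch showing that an in-place op keeps the same underlying storage while the plain op allocates a new tensor:

import torch

a = torch.Tensor([1, 2, 3])
ptr = a.data_ptr()
a.add_(1)                       # in-place: the same underlying storage is modified
print(a, a.data_ptr() == ptr)   # tensor([2., 3., 4.]) True
b = a.add(1)                    # out-of-place: a brand-new tensor is allocated
print(b.data_ptr() == ptr)      # False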

  • Broadcasting: tensor arguments are automatically expanded to the same size. Two conditions must hold:
  • every tensor has at least one dimension;
  • the shapes can be right-aligned (trailing dimensions are equal, or one of them is 1).
import torch

if __name__ == "__main__":

    a = torch.rand(2, 1, 1)
    print(a)
    b = torch.rand(3)
    print(b)
    c = a   b
    print(c)
    print(c.shape)

Output:

tensor([[[0.9496]],

        [[0.5661]]])
tensor([0.0402, 0.8962, 0.5040])
tensor([[[0.9898, 1.8458, 1.4536]],

        [[0.6063, 1.4623, 1.0701]]])
torch.Size([2, 1, 3])

In this example the last dimension of a has length 1 but a has extra leading dimensions, while b is a plain 3-vector with no leading dimensions. From the shape of c we can see that its leading dimensions follow a and its last dimension follows b. In other words, when the shapes are right-aligned, each pair of dimensions must either be equal or have one side equal to 1 (or missing); then the operation succeeds, otherwise an error is raised. This is the broadcasting mechanism.

import torch

if __name__ == "__main__":

    a = torch.rand(2, 4, 1, 3)
    print(a)
    b = torch.rand(4, 2, 3)
    print(b)
    c = a   b
    print(c)
    print(c.shape)

Output:

tensor([[[[0.1862, 0.8673, 0.2926]],

         [[0.6385, 0.6885, 0.8268]],

         [[0.3837, 0.3433, 0.0975]],

         [[0.4689, 0.4580, 0.4023]]],


        [[[0.1647, 0.5968, 0.5279]],

         [[0.8252, 0.7446, 0.1916]],

         [[0.9649, 0.6015, 0.5151]],

         [[0.7504, 0.8202, 0.7865]]]])
tensor([[[0.7811, 0.8357, 0.2585],
         [0.8866, 0.3935, 0.4450]],

        [[0.2543, 0.7985, 0.1959],
         [0.5357, 0.3883, 0.4426]],

        [[0.8317, 0.2597, 0.9586],
         [0.2829, 0.8665, 0.2853]],

        [[0.7220, 0.7107, 0.9395],
         [0.8345, 0.0955, 0.3690]]])
tensor([[[[0.9673, 1.7029, 0.5510],
          [1.0727, 1.2608, 0.7375]],

         [[0.8928, 1.4871, 1.0227],
          [1.1742, 1.0768, 1.2694]],

         [[1.2154, 0.6029, 1.0561],
          [0.6666, 1.2098, 0.3828]],

         [[1.1909, 1.1687, 1.3418],
          [1.3034, 0.5535, 0.7713]]],


        [[[0.9458, 1.4325, 0.7864],
          [1.0512, 0.9904, 0.9729]],

         [[1.0795, 1.5432, 0.3876],
          [1.3609, 1.1329, 0.6342]],

         [[1.7966, 0.8611, 1.4737],
          [1.2478, 1.4680, 0.8004]],

         [[1.4724, 1.5309, 1.7260],
          [1.5849, 0.9157, 1.1555]]]])
torch.Size([2, 4, 2, 3])
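
Conversely, shapes that cannot be right-aligned fail; a minimal sketch of the error case:

import torch

a = torch.rand(2, 3)
b = torch.rand(4)
try:
    c = a + b   # trailing dims 3 vs 4: neither equal nor 1, so broadcasting fails
except RuntimeError as e:
    print("broadcast failed:", e)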

Rounding and remainder operations

import torch

if __name__ == "__main__":

    a = torch.rand(2, 2)
    a.mul_(10)
    print(a)
    # round down (floor)
    print(torch.floor(a))
    # round up (ceil)
    print(torch.ceil(a))
    # round to nearest integer
    print(torch.round(a))
    # integer part (truncate toward zero)
    print(torch.trunc(a))
    # fractional part
    print(torch.frac(a))
    # remainder
    print(a % 2)

Output:

tensor([[5.8996, 9.2745],
        [1.0162, 8.2628]])
tensor([[5., 9.],
        [1., 8.]])
tensor([[ 6., 10.],
        [ 2.,  9.]])
tensor([[6., 9.],
        [1., 8.]])
tensor([[5., 9.],
        [1., 8.]])
tensor([[0.8996, 0.2745],
        [0.0162, 0.2628]])
tensor([[1.8996, 1.2745],
        [1.0162, 0.2628]])

Comparison operations

import torch
import numpy as np

if __name__ == "__main__":

    a = torch.Tensor([[1, 2, 3], [4, 5, 6]])
    b = torch.Tensor([[1, 4, 9], [6, 5, 7]])
    c = torch.rand(2, 4)
    d = a
    print(a)
    print(b)
    # element-wise equality (the tensors must have the same shape); returns a boolean tensor of the same shape
    print(torch.eq(a, b))
    print(torch.eq(a, d))
    # torch.equal checks whether both shape and values are identical;
    # the two tensors may have different shapes, in which case it simply returns False
    print(torch.equal(a, b))
    print(torch.equal(a, c))
    # element-wise greater-or-equal (same shape required); returns a boolean tensor of the same shape
    print(torch.ge(a, b))
    # element-wise greater-than (same shape required); returns a boolean tensor of the same shape
    print(torch.gt(a, b))
    # element-wise less-or-equal (same shape required); returns a boolean tensor of the same shape
    print(torch.le(a, b))
    # element-wise less-than (same shape required); returns a boolean tensor of the same shape
    print(torch.lt(a, b))
    # element-wise not-equal (same shape required); returns a boolean tensor of the same shape
    print(torch.ne(a, b))

Output:

tensor([[1., 2., 3.],
        [4., 5., 6.]])
tensor([[1., 4., 9.],
        [6., 5., 7.]])
tensor([[ True, False, False],
        [False,  True, False]])
tensor([[True, True, True],
        [True, True, True]])
False
False
tensor([[ True, False, False],
        [False,  True, False]])
tensor([[False, False, False],
        [False, False, False]])
tensor([[True, True, True],
        [True, True, True]])
tensor([[False,  True,  True],
        [ True, False,  True]])
tensor([[False,  True,  True],
        [ True, False,  True]])

Sorting

a = torch.Tensor([1, 4, 4, 3, 5])
# sort in ascending order
print(torch.sort(a))
# sort in descending order
print(torch.sort(a, descending=True))
b = torch.Tensor([[1, 4, 4, 3, 5], [2, 3, 1, 3, 5]])
print(b.shape)
# ascending sort (along the last dimension by default)
print(torch.sort(b))
# ascending sort along dim 0
print(torch.sort(b, dim=0))
# descending sort
print(torch.sort(b, descending=True))
# descending sort along dim 0
print(torch.sort(b, dim=0, descending=True))

Output:

torch.return_types.sort(
values=tensor([1., 3., 4., 4., 5.]),
indices=tensor([0, 3, 1, 2, 4]))
torch.return_types.sort(
values=tensor([5., 4., 4., 3., 1.]),
indices=tensor([4, 1, 2, 3, 0]))
torch.Size([2, 5])
torch.return_types.sort(
values=tensor([[1., 3., 4., 4., 5.],
        [1., 2., 3., 3., 5.]]),
indices=tensor([[0, 3, 1, 2, 4],
        [2, 0, 1, 3, 4]]))
torch.return_types.sort(
values=tensor([[1., 3., 1., 3., 5.],
        [2., 4., 4., 3., 5.]]),
indices=tensor([[0, 1, 1, 0, 0],
        [1, 0, 0, 1, 1]]))
torch.return_types.sort(
values=tensor([[5., 4., 4., 3., 1.],
        [5., 3., 3., 2., 1.]]),
indices=tensor([[4, 1, 2, 3, 0],
        [4, 1, 3, 0, 2]]))
torch.return_types.sort(
values=tensor([[2., 4., 4., 3., 5.],
        [1., 3., 1., 3., 5.]]),
indices=tensor([[1, 0, 0, 0, 0],
        [0, 1, 1, 1, 1]]))

Top K

a = torch.Tensor([[1, 4, 4, 3, 5], [2, 3, 1, 3, 6]])
# top-1 along dim 0
print(torch.topk(a, k=1, dim=0))
# top-2 along dim 0; a only has 2 rows in dim 0, so k can be at most 2 here
print(torch.topk(a, k=2, dim=0))
# top-2 along dim 1; here k can be up to 5, since dim 1 has 5 values
print(torch.topk(a, k=2, dim=1))

Output:

torch.return_types.topk(
values=tensor([[2., 4., 4., 3., 6.]]),
indices=tensor([[1, 0, 0, 0, 1]]))
torch.return_types.topk(
values=tensor([[2., 4., 4., 3., 6.],
        [1., 3., 1., 3., 5.]]),
indices=tensor([[1, 0, 0, 0, 1],
        [0, 1, 1, 1, 0]]))
torch.return_types.topk(
values=tensor([[5., 4.],
        [6., 3.]]),
indices=tensor([[4, 1],
        [4, 1]]))

k-th smallest value

a = torch.Tensor([[1, 4, 4, 3, 5], [2, 3, 1, 3, 6], [4, 5, 6, 7, 8]])
# the 2nd smallest value along dim 0
print(torch.kthvalue(a, k=2, dim=0))
# the 2nd smallest value along dim 1
print(torch.kthvalue(a, k=2, dim=1))

Output:

torch.return_types.kthvalue(
values=tensor([2., 4., 4., 3., 6.]),
indices=tensor([1, 0, 0, 0, 1]))
torch.return_types.kthvalue(
values=tensor([3., 2., 5.]),
indices=tensor([3, 0, 1]))

Validity checks on data

a = torch.rand(2, 3)
b = torch.Tensor([1, 2, np.nan])
print(a)
# element-wise check for finite values
print(torch.isfinite(a))
print(torch.isfinite(a / 0))
# element-wise check for infinite values
print(torch.isinf(a / 0))
# element-wise check for NaN values
print(torch.isnan(a))
print(torch.isnan(b))

Output:

tensor([[0.8657, 0.4002, 0.3988],
        [0.2066, 0.5564, 0.3181]])
tensor([[True, True, True],
        [True, True, True]])
tensor([[False, False, False],
        [False, False, False]])
tensor([[True, True, True],
        [True, True, True]])
tensor([[False, False, False],
        [False, False, False]])
tensor([False, False,  True])

Trigonometric functions

import torch

if __name__ == '__main__':

    a = torch.Tensor([0, 0, 0])
    print(torch.cos(a))

Output:

tensor([1., 1., 1.])

Statistical functions

import torch

if __name__ == '__main__':

    a = torch.rand(2, 2)
    print(a)
    # mean of all elements
    print(torch.mean(a))
    # mean along dim 0
    print(torch.mean(a, dim=0))
    # sum of all elements
    print(torch.sum(a))
    # sum along dim 0
    print(torch.sum(a, dim=0))
    # product of all elements
    print(torch.prod(a))
    # product along dim 0
    print(torch.prod(a, dim=0))
    # index of the maximum along dim 0
    print(torch.argmax(a, dim=0))
    # index of the minimum along dim 0
    print(torch.argmin(a, dim=0))
    # standard deviation
    print(torch.std(a))
    # variance
    print(torch.var(a))
    # median
    print(torch.median(a))
    # mode (most frequent value)
    print(torch.mode(a))
    a = torch.rand(2, 2) * 10
    print(a)
    # histogram with 6 bins; min=0 and max=0 means use the data's own min and max
    print(torch.histc(a, 6, 0, 0))
    a = torch.randint(0, 10, [10])
    print(a)
    print(torch.bincount(a))

Output:

tensor([[0.3333, 0.3611],
        [0.4208, 0.6395]])
tensor(0.4387)
tensor([0.3771, 0.5003])
tensor(1.7547)
tensor([0.7541, 1.0006])
tensor(0.0324)
tensor([0.1403, 0.2309])
tensor([1, 1])
tensor([0, 0])
tensor(0.1388)
tensor(0.0193)
tensor(0.3611)
torch.return_types.mode(
values=tensor([0.3333, 0.4208]),
indices=tensor([0, 0]))
tensor([[1.9862, 7.6381],
        [9.2323, 7.4402]])
tensor([1., 0., 0., 0., 2., 1.])
tensor([1, 1, 4, 1, 2, 7, 9, 1, 4, 7])
tensor([0, 4, 1, 0, 2, 0, 0, 2, 0, 1])

Random sampling

import torch

if __name__ == '__main__':
    # fix the random seed for reproducibility
    torch.manual_seed(1)
    # means
    mean = torch.rand(1, 2)
    # standard deviations
    std = torch.rand(1, 2)
    # sample from a normal distribution with those means and stds
    print(torch.normal(mean, std))

Output:

tensor([[0.7825, 0.7358]])

Norms

import torch

if __name__ == '__main__':

    a = torch.rand(2, 1)
    b = torch.rand(2, 1)
    print(a)
    print(b)
    # p-norm distances between a and b for p = 1, 2, 3
    print(torch.dist(a, b, p=1))
    print(torch.dist(a, b, p=2))
    print(torch.dist(a, b, p=3))
    # 2-norm of a
    print(torch.norm(a))
    # 3-norm of a
    print(torch.norm(a, p=3))
    # Frobenius norm of a (for a vector/matrix this coincides with the 2-norm)
    print(torch.norm(a, p='fro'))

Output:

tensor([[0.3291],
        [0.8294]])
tensor([[0.0810],
        [0.9734]])
tensor(0.3921)
tensor(0.2869)
tensor(0.2633)
tensor(0.8923)
tensor(0.8463)
tensor(0.8923)

For more on norms, see the sections on Euclidean and Minkowski distances in 机器学习算法整理 and on L1/L2 regularization in 机器学习算法整理(二).

Tensor clamping

import torch

if __name__ == '__main__':

    a = torch.rand(2, 2) * 10
    print(a)
    # clamp: values below 2 become 2, values above 5 become 5, values in [2, 5] are unchanged
    a.clamp_(2, 5)
    print(a)

Output:

tensor([[1.6498, 5.2090],
        [9.7682, 2.3269]])
tensor([[2.0000, 5.0000],
        [5.0000, 2.3269]])

Tensor indexing and selection

import torch

if __name__ == '__main__':

    a = torch.rand(4, 4)
    b = torch.rand(4, 4)
    print(a)
    print(b)
    # where a > 0.5 take the value from a, otherwise take the value from b
    out = torch.where(a > 0.5, a, b)
    print(out)
    # list the coordinates where a is greater than b
    out = torch.where(a > b)
    print(out)
    # pick rows 0, 3 and 2 of a (dim 0) to build a new tensor
    out = torch.index_select(a, dim=0, index=torch.tensor([0, 3, 2]))
    print(out)
    # pick columns 0, 3 and 2 of a (dim 1) to build a new tensor
    out = torch.index_select(a, dim=1, index=torch.tensor([0, 3, 2]))
    print(out)

    a = torch.linspace(1, 16, 16).view(4, 4)
    print(a)
    # gather along dim 0: out[i][j] = a[index[i][j]][j], i.e. index picks the row and the entry's own position picks the column
    out = torch.gather(a, dim=0, index=torch.tensor([[0, 1, 1, 1], [0, 1, 2, 2], [0, 1, 3, 3]]))
    print(out)
    # gather along dim 1: out[i][j] = a[i][index[i][j]]
    out = torch.gather(a, dim=1, index=torch.tensor([[0, 1, 1, 1], [0, 1, 2, 2], [0, 1, 3, 3]]))
    print(out)
    mask = torch.gt(a, 8)
    print(mask)
    # select the elements of a where mask is True
    out = torch.masked_select(a, mask)
    print(out)
    # flatten a into a vector first, then take the values at the given indices; the output is also a vector
    out = torch.take(a, index=torch.tensor([0, 15, 13, 10]))
    print(out)
    a = torch.tensor([[0, 1, 2, 0], [2, 3, 0, 1]])
    # coordinates of the non-zero elements
    out = torch.nonzero(a)
    print(out)

Output:

tensor([[0.2301, 0.4003, 0.6914, 0.4822],
        [0.7194, 0.8242, 0.1110, 0.6387],
        [0.7997, 0.9174, 0.6136, 0.7631],
        [0.9998, 0.5568, 0.9027, 0.7765]])
tensor([[0.4136, 0.4748, 0.0058, 0.3138],
        [0.7306, 0.7052, 0.5451, 0.1708],
        [0.0622, 0.9961, 0.7769, 0.2812],
        [0.5140, 0.5198, 0.2314, 0.2854]])
tensor([[0.4136, 0.4748, 0.6914, 0.3138],
        [0.7194, 0.8242, 0.5451, 0.6387],
        [0.7997, 0.9174, 0.6136, 0.7631],
        [0.9998, 0.5568, 0.9027, 0.7765]])
(tensor([0, 0, 1, 1, 2, 2, 3, 3, 3, 3]), tensor([2, 3, 1, 3, 0, 3, 0, 1, 2, 3]))
tensor([[0.2301, 0.4003, 0.6914, 0.4822],
        [0.9998, 0.5568, 0.9027, 0.7765],
        [0.7997, 0.9174, 0.6136, 0.7631]])
tensor([[0.2301, 0.4822, 0.6914],
        [0.7194, 0.6387, 0.1110],
        [0.7997, 0.7631, 0.6136],
        [0.9998, 0.7765, 0.9027]])
tensor([[ 1.,  2.,  3.,  4.],
        [ 5.,  6.,  7.,  8.],
        [ 9., 10., 11., 12.],
        [13., 14., 15., 16.]])
tensor([[ 1.,  6.,  7.,  8.],
        [ 1.,  6., 11., 12.],
        [ 1.,  6., 15., 16.]])
tensor([[ 1.,  2.,  2.,  2.],
        [ 5.,  6.,  7.,  7.],
        [ 9., 10., 12., 12.]])
tensor([[False, False, False, False],
        [False, False, False, False],
        [ True,  True,  True,  True],
        [ True,  True,  True,  True]])
tensor([ 9., 10., 11., 12., 13., 14., 15., 16.])
tensor([ 1., 16., 14., 11.])
tensor([[0, 1],
        [0, 2],
        [1, 0],
        [1, 1],
        [1, 3]])

Combining and concatenating tensors

import torch

if __name__ == '__main__':

    a = torch.zeros((2, 4))
    b = torch.ones((2, 4))
    # concatenate along dim 0: no new dimension is created, that dimension simply gets longer
    out = torch.cat((a, b), dim=0)
    print(out)
    a = torch.linspace(1, 6, 6).view(2, 3)
    b = torch.linspace(7, 12, 6).view(2, 3)
    print(a)
    print(b)
    # stack a and b as two separate elements: a new dimension of size 2 is created for them
    out = torch.stack((a, b), dim=0)
    print(out)
    print(out.shape)
    # stack along dim 1: row i of a is paired with row i of b, creating a new dimension
    out = torch.stack((a, b), dim=1)
    print(out)
    print(out.shape)
    # this recovers the original a
    print(out[:, 0, :])
    # this recovers the original b
    print(out[:, 1, :])
    # stack along dim 2: element (i, j) of a is paired with element (i, j) of b, creating a new last dimension
    out = torch.stack((a, b), dim=2)
    print(out)
    print(out.shape)
    # this recovers the original a
    print(out[:, :, 0])
    # this recovers the original b
    print(out[:, :, 1])

Output:

tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])
tensor([[1., 2., 3.],
        [4., 5., 6.]])
tensor([[ 7.,  8.,  9.],
        [10., 11., 12.]])
tensor([[[ 1.,  2.,  3.],
         [ 4.,  5.,  6.]],

        [[ 7.,  8.,  9.],
         [10., 11., 12.]]])
torch.Size([2, 2, 3])
tensor([[[ 1.,  2.,  3.],
         [ 7.,  8.,  9.]],

        [[ 4.,  5.,  6.],
         [10., 11., 12.]]])
torch.Size([2, 2, 3])
tensor([[1., 2., 3.],
        [4., 5., 6.]])
tensor([[ 7.,  8.,  9.],
        [10., 11., 12.]])
tensor([[[ 1.,  7.],
         [ 2.,  8.],
         [ 3.,  9.]],

        [[ 4., 10.],
         [ 5., 11.],
         [ 6., 12.]]])
torch.Size([2, 3, 2])
tensor([[1., 2., 3.],
        [4., 5., 6.]])
tensor([[ 7.,  8.,  9.],
        [10., 11., 12.]])

Tensor chunking and splitting

import torch

if __name__ == '__main__':

    a = torch.rand(3, 4)
    print(a)
    # chunk a along dim 0; dim 0 has 3 rows, which cannot be split into two equal chunks,
    # so the first chunk gets 2 rows and the second gets 1
    out = torch.chunk(a, 2, dim=0)
    print(out)
    # chunk a along dim 1; dim 1 has 4 columns, so it splits evenly into two chunks of 2
    out = torch.chunk(a, 2, dim=1)
    print(out)
    out = torch.split(a, 2, dim=0)
    print(out)
    out = torch.split(a, 2, dim=1)
    print(out)
    # split into pieces whose lengths are given by the list
    out = torch.split(a, [1, 1, 1], dim=0)
    print(out)

Output:

tensor([[0.5683, 0.0800, 0.2068, 0.8908],
        [0.8924, 0.8733, 0.6078, 0.8697],
        [0.0428, 0.0265, 0.3515, 0.3164]])
(tensor([[0.5683, 0.0800, 0.2068, 0.8908],
        [0.8924, 0.8733, 0.6078, 0.8697]]), tensor([[0.0428, 0.0265, 0.3515, 0.3164]]))
(tensor([[0.5683, 0.0800],
        [0.8924, 0.8733],
        [0.0428, 0.0265]]), tensor([[0.2068, 0.8908],
        [0.6078, 0.8697],
        [0.3515, 0.3164]]))
(tensor([[0.5683, 0.0800, 0.2068, 0.8908],
        [0.8924, 0.8733, 0.6078, 0.8697]]), tensor([[0.0428, 0.0265, 0.3515, 0.3164]]))
(tensor([[0.5683, 0.0800],
        [0.8924, 0.8733],
        [0.0428, 0.0265]]), tensor([[0.2068, 0.8908],
        [0.6078, 0.8697],
        [0.3515, 0.3164]]))
(tensor([[0.5683, 0.0800, 0.2068, 0.8908]]), tensor([[0.8924, 0.8733, 0.6078, 0.8697]]), tensor([[0.0428, 0.0265, 0.3515, 0.3164]]))

Tensor reshaping

import torch

if __name__ == '__main__':

    a = torch.rand(2, 3)
    print(a)
    # flatten a first, then reshape it to the requested shape
    out = torch.reshape(a, (3, 2))
    print(out)
    # transpose
    print(torch.t(out))
    a = torch.rand(1, 2, 3)
    print(a)
    # swap two dimensions
    out = torch.transpose(a, 0, 1)
    print(out)
    print(out.shape)
    # remove dimensions of size 1
    out = torch.squeeze(a)
    print(out)
    print(out.shape)
    # insert a dimension of size 1 at the given position (here the last position)
    out = torch.unsqueeze(a, -1)
    print(out)
    print(out.shape)
    # split along dim 1, removing that dimension
    out = torch.unbind(a, dim=1)
    print(out)
    # flip along dim 1
    print(torch.flip(a, dims=[1]))
    # flip along dim 2
    print(torch.flip(a, dims=[2]))
    # flip along both dim 1 and dim 2
    print(torch.flip(a, dims=[1, 2]))
    # rotate 90 degrees counter-clockwise (in the first two dims by default)
    out = torch.rot90(a)
    print(out)
    print(out.shape)
    # rotate 180 degrees counter-clockwise
    out = torch.rot90(a, 2)
    print(out)
    print(out.shape)
    # rotate 360 degrees counter-clockwise (back to the original)
    out = torch.rot90(a, 4)
    print(out)
    print(out.shape)
    # rotate 90 degrees clockwise
    out = torch.rot90(a, -1)
    print(out)
    print(out.shape)

Output:

tensor([[7.7492e-01, 8.1314e-01, 6.4422e-01],
        [9.8577e-01, 5.3938e-01, 2.3049e-04]])
tensor([[7.7492e-01, 8.1314e-01],
        [6.4422e-01, 9.8577e-01],
        [5.3938e-01, 2.3049e-04]])
tensor([[7.7492e-01, 6.4422e-01, 5.3938e-01],
        [8.1314e-01, 9.8577e-01, 2.3049e-04]])
tensor([[[0.4003, 0.3144, 0.7292],
         [0.0459, 0.5821, 0.7332]]])
tensor([[[0.4003, 0.3144, 0.7292]],

        [[0.0459, 0.5821, 0.7332]]])
torch.Size([2, 1, 3])
tensor([[0.4003, 0.3144, 0.7292],
        [0.0459, 0.5821, 0.7332]])
torch.Size([2, 3])
tensor([[[[0.4003],
          [0.3144],
          [0.7292]],

         [[0.0459],
          [0.5821],
          [0.7332]]]])
torch.Size([1, 2, 3, 1])
(tensor([[0.4003, 0.3144, 0.7292]]), tensor([[0.0459, 0.5821, 0.7332]]))
tensor([[[0.0459, 0.5821, 0.7332],
         [0.4003, 0.3144, 0.7292]]])
tensor([[[0.7292, 0.3144, 0.4003],
         [0.7332, 0.5821, 0.0459]]])
tensor([[[0.7332, 0.5821, 0.0459],
         [0.7292, 0.3144, 0.4003]]])
tensor([[[0.0459, 0.5821, 0.7332]],

        [[0.4003, 0.3144, 0.7292]]])
torch.Size([2, 1, 3])
tensor([[[0.0459, 0.5821, 0.7332],
         [0.4003, 0.3144, 0.7292]]])
torch.Size([1, 2, 3])
tensor([[[0.4003, 0.3144, 0.7292],
         [0.0459, 0.5821, 0.7332]]])
torch.Size([1, 2, 3])
tensor([[[0.4003, 0.3144, 0.7292]],

        [[0.0459, 0.5821, 0.7332]]])
torch.Size([2, 1, 3])

Tensor filling

import torch

if __name__ == '__main__':
    # a 2x3 matrix filled with the value 10
    a = torch.full((2, 3), 10)
    print(a)

Output:

tensor([[10, 10, 10],
        [10, 10, 10]])

Computing derivatives (autograd)

import torch
from torch.autograd import Variable

if __name__ == '__main__':

    # equivalent to x = Variable(torch.ones(2, 2), requires_grad=True)
    # requires_grad=True adds x to the backward graph so gradients are computed for it
    x = torch.ones(2, 2, requires_grad=True)
    y = x + 2
    z = y**2 * 3
    # differentiate with respect to x
    # equivalent to torch.autograd.backward(z, grad_tensors=torch.ones(2, 2))
    z.backward(torch.ones(2, 2))
    print(x.grad)
    print(y.grad)
    print(x.grad_fn)
    print(y.grad_fn)
    print(z.grad_fn)

Output:

tensor([[18., 18.],
        [18., 18.]])
None
None
<AddBackward0 object at 0x7fad8f1b0c50>
<MulBackward0 object at 0x7fad8f1b0c50>

This is a composite-function derivative: with y = x + 2 and z = 3y², dz/dx = 6y·(x + 2)' = 6y·1 = 6·(1 + 2) = 18; since x contains four ones, we get four 18s. For how to compute derivatives, see the section on derivatives and differentials of single-variable functions in 高等数学整理.

Variable has since been merged into Tensor; every tensor controls gradient computation through its requires_grad flag. This is how layers are frozen, e.g. keeping the parameters of a pretrained backbone fixed and training only the layers added after it, or, in a multi-task network, computing gradients only for one branch while the other branches stay frozen.
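
A minimal sketch of freezing pretrained parameters this way (the two-layer model here is purely illustrative):

import torch

backbone = torch.nn.Linear(10, 10)   # stands in for a pretrained part
head = torch.nn.Linear(10, 2)        # the new part we actually want to train
# freeze the backbone: its parameters receive no gradients and are never updated
for p in backbone.parameters():
    p.requires_grad = False
# hand only the still-trainable parameters to the optimizer
params = [p for p in list(backbone.parameters()) + list(head.parameters()) if p.requires_grad]
optimizer = torch.optim.Adam(params, lr=0.01)
x = torch.rand(4, 10)
loss = head(backbone(x)).sum()
loss.backward()
print(backbone.weight.grad)     # None: the frozen layer got no gradient
print(head.weight.grad.shape)   # torch.Size([2, 10])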

  • A few autograd concepts

1. Leaf tensors (leaf)

Consider a computation graph built in PyTorch from an input X through intermediate results to an output: X is called a leaf tensor, and it is also a leaf node of the graph. By default PyTorch only populates .grad for leaf tensors, so if we print the gradient of an intermediate tensor such as Y, we get None.

2. grad vs. grad_fn

  1. grad: the gradient value of this Tensor. Before each backward call the previous gradient must be zeroed out, otherwise gradients keep accumulating (see the sketch below).
  2. grad_fn: usually None for leaf nodes; only result nodes have a meaningful grad_fn, which indicates what kind of function produced the gradient.
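
A minimal sketch of that accumulation behaviour (illustrative only):

import torch

x = torch.ones(2, requires_grad=True)
y = (x * 3).sum()
y.backward(retain_graph=True)
print(x.grad)        # tensor([3., 3.])
y.backward()         # without zeroing first, gradients accumulate
print(x.grad)        # tensor([6., 6.])
x.grad.zero_()       # zero the gradient before the next backward pass
print(x.grad)        # tensor([0., 0.])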

3. The backward function

torch.autograd.backward(tensors, grad_tensors=None, retain_graph=None, create_graph=False)

  1. tensors: the tensor(s) to differentiate; torch.autograd.backward(z) is equivalent to z.backward(). This works directly when z is a scalar; otherwise grad_tensors must be supplied.
  2. grad_tensors: used when computing gradients of a non-scalar (matrix) output. It is itself a tensor whose shape generally has to match the tensor being differentiated.
  3. retain_graph: normally PyTorch frees the computation graph after one backward call, so to call backward on some variable repeatedly this must be set to True.
  4. create_graph: if True, a graph of the derivative is constructed, which makes it easy to compute higher-order derivatives.
  5. def grad(outputs: _TensorOrTensors, inputs: _TensorOrTensors, grad_outputs: Optional[_TensorOrTensors] = None, retain_graph: Optional[bool] = None, create_graph: bool = False, only_inputs: bool = True, allow_unused: bool = False)

torch.autograd.grad computes and returns the sum of gradients of outputs with respect to inputs. outputs: the dependent variables, i.e. the function to differentiate. inputs: the independent variables. grad_outputs: same meaning as in backward. only_inputs: compute gradients only for inputs. allow_unused (bool, optional): if False, it is an error to pass inputs that were not used when computing outputs (and whose gradient would therefore always be zero).

4. Other utilities in the torch.autograd package

  1. torch.autograd.enable_grad: context manager that enables gradient computation.
  2. torch.autograd.no_grad: context manager that disables gradient computation (see the sketch below).
  3. torch.autograd.set_grad_enabled(mode): context manager that turns gradient computation on or off according to mode.
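
A minimal sketch of these context managers (illustrative only):

import torch

x = torch.ones(2, requires_grad=True)
with torch.no_grad():
    y = x * 2            # no graph is recorded inside the block
print(y.requires_grad)   # False

with torch.enable_grad():
    z = x * 2
print(z.requires_grad)   # True

torch.set_grad_enabled(False)   # can also be used as a plain on/off switch
w = x * 2
print(w.requires_grad)          # False
torch.set_grad_enabled(True)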

Custom autograd Functions

import torch

class line(torch.autograd.Function):

    @staticmethod
    def forward(ctx, w, x, b):
        '''
        Forward pass
        :param ctx: context object for stashing tensors needed in backward
        :param w:
        :param x:
        :param b:
        :return:
        '''
        # y = wx + b
        ctx.save_for_backward(w, x, b)
        return w * x + b

    @staticmethod
    def backward(ctx, grad_out):
        '''
        Backward pass
        :param ctx: context object
        :param grad_out: gradient flowing in from the next operation
        :return:
        '''
        w, x, b = ctx.saved_tensors
        # partial derivative with respect to w
        grad_w = grad_out * x
        # partial derivative with respect to x
        grad_x = grad_out * w
        # partial derivative with respect to b
        grad_b = grad_out
        return grad_w, grad_x, grad_b

if __name__ == '__main__':

    w = torch.rand(2, 2, requires_grad=True)
    x = torch.rand(2, 2, requires_grad=True)
    b = torch.rand(2, 2, requires_grad=True)
    out = line.apply(w, x, b)
    out.backward(torch.ones(2, 2))
    print(w, x, b)
    print(w.grad, x.grad, b.grad)

Output:

tensor([[0.7784, 0.2882],
        [0.7826, 0.8178]], requires_grad=True) tensor([[0.4062, 0.4722],
        [0.7921, 0.9470]], requires_grad=True) tensor([[0.7012, 0.9489],
        [0.2466, 0.1548]], requires_grad=True)
tensor([[0.4062, 0.4722],
        [0.7921, 0.9470]]) tensor([[0.7784, 0.2882],
        [0.7826, 0.8178]]) tensor([[1., 1.],
        [1., 1.]])

This computes the partial derivatives of a linear function; from the result we can see that the partial derivative with respect to w is x, with respect to x is w, and with respect to b is 1.

  • Each primitive autograd operation is really two functions that operate on Tensors
  • The forward function computes the output Tensors from the input Tensors
  • The backward function receives the gradient of some scalar value with respect to the output Tensors and computes the gradient of that same scalar with respect to the input Tensors
  • Finally, apply is used to run the operation; this method is defined in _FunctionBase, the parent class of Function (a numerical check of the custom backward is sketched below)
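
As a sanity check, torch.autograd.gradcheck can compare the custom backward of the line Function above against numerical gradients (a minimal sketch; gradcheck expects double-precision inputs):

import torch

w = torch.rand(2, 2, dtype=torch.float64, requires_grad=True)
x = torch.rand(2, 2, dtype=torch.float64, requires_grad=True)
b = torch.rand(2, 2, dtype=torch.float64, requires_grad=True)
# line is the custom Function defined above
print(torch.autograd.gradcheck(line.apply, (w, x, b)))  # True if backward matches the numerical gradients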

Non-zero padding

This is in contrast to ordinary (zero) padding:

import torch

if __name__ == '__main__':

    a = torch.arange(9, dtype=torch.float).reshape((1, 3, 3))
    print(a)
    m = torch.nn.ReflectionPad2d(1)
    # pad the border of a with reflected (non-zero) values
    out = m(a)
    print(out)

Output:

tensor([[[0., 1., 2.],
         [3., 4., 5.],
         [6., 7., 8.]]])
tensor([[[4., 3., 4., 5., 4.],
         [1., 0., 1., 2., 1.],
         [4., 3., 4., 5., 4.],
         [7., 6., 7., 8., 7.],
         [4., 3., 4., 5., 4.]]])

Building neural networks

For background on neural networks see Tensorflow深度学习算法整理; it is not repeated here.

Boston housing price prediction

Let's look at the data first.

import numpy as np
import torch
from sklearn import datasets

if __name__ == '__main__':

    # note: load_boston was removed in scikit-learn 1.2, so this requires an older scikit-learn version
    boston = datasets.load_boston()
    X = torch.from_numpy(boston.data)
    y = torch.from_numpy(boston.target)
    y = torch.unsqueeze(y, -1)
    data = torch.cat((X, y), dim=-1)
    print(data)
    print(data.shape)

Output:

tensor([[6.3200e-03, 1.8000e+01, 2.3100e+00,  ..., 3.9690e+02, 4.9800e+00,
         2.4000e+01],
        [2.7310e-02, 0.0000e+00, 7.0700e+00,  ..., 3.9690e+02, 9.1400e+00,
         2.1600e+01],
        [2.7290e-02, 0.0000e+00, 7.0700e+00,  ..., 3.9283e+02, 4.0300e+00,
         3.4700e+01],
        ...,
        [6.0760e-02, 0.0000e+00, 1.1930e+01,  ..., 3.9690e+02, 5.6400e+00,
         2.3900e+01],
        [1.0959e-01, 0.0000e+00, 1.1930e+01,  ..., 3.9345e+02, 6.4800e+00,
         2.2000e+01],
        [4.7410e-02, 0.0000e+00, 1.1930e+01,  ..., 3.9690e+02, 7.8800e+00,
         1.1900e+01]], dtype=torch.float64)
torch.Size([506, 14])
Now the full model and training loop:
import torch
from sklearn import datasets

if __name__ == '__main__':

    boston = datasets.load_boston()
    X = torch.from_numpy(boston.data)
    y = torch.from_numpy(boston.target)
    y = torch.unsqueeze(y, -1)
    data = torch.cat((X, y), dim=-1)
    print(data)
    print(data.shape)
    y = torch.squeeze(y)
    X_train = X[:496]
    y_train = y[:496]
    X_test = X[496:]
    y_test = y[496:]

    class Net(torch.nn.Module):

        def __init__(self, n_feature, n_output):
            super(Net, self).__init__()
            self.hidden = torch.nn.Linear(n_feature, 100)
            self.predict = torch.nn.Linear(100, n_output)

        def forward(self, x):
            out = self.hidden(x)
            out = torch.relu(out)
            out = self.predict(out)
            return out

    net = Net(13, 1)
    loss_func = torch.nn.MSELoss()
    optimizer = torch.optim.Adam(net.parameters(), lr=0.01)
    for i in range(10000):
        pred = net.forward(X_train.float())
        pred = torch.squeeze(pred)
        loss = loss_func(pred, y_train.float()) * 0.001
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        print("item:{},loss:{}".format(i, loss))
        print(pred[:10])
        print(y_train[:10])

        pred = net.forward(X_test.float())
        pred = torch.squeeze(pred)
        loss_test = loss_func(pred, y_test.float()) * 0.001
        print("item:{},loss_test:{}".format(i, loss_test))
        print(pred[:10])
        print(y_test[:10])

Output (final training result):

item:9999,loss:0.0034487966913729906
tensor([26.7165, 22.6610, 33.0955, 34.6687, 36.8087, 29.3654, 22.9609, 20.9920,
        17.1832, 20.8744], grad_fn=<SliceBackward0>)
tensor([24.0000, 21.6000, 34.7000, 33.4000, 36.2000, 28.7000, 22.9000, 27.1000,
        16.5000, 18.9000], dtype=torch.float64)
item:9999,loss_test:0.007662008982151747
tensor([14.5801, 18.2911, 21.3332, 16.9826, 19.6432, 21.8298, 18.5557, 23.6807,
        22.3610, 18.0118], grad_fn=<SliceBackward0>)
tensor([19.7000, 18.3000, 21.2000, 17.5000, 16.8000, 22.4000, 20.6000, 23.9000,
        22.0000, 11.9000], dtype=torch.float64)

Handwritten digit recognition (MNIST)

import torch
import torchvision.datasets as dataset
import torchvision.transforms as transforms
import torch.utils.data as data_utils

if __name__ == '__main__':

    train_data = dataset.MNIST(root='mnist', train=True,
                               transform=transforms.ToTensor(),
                               download=True)
    test_data = dataset.MNIST(root='mnist', train=False,
                              transform=transforms.ToTensor(),
                              download=False)
    train_loader = data_utils.DataLoader(dataset=train_data, batch_size=64, shuffle=True)
    test_loader = data_utils.DataLoader(dataset=test_data, batch_size=64, shuffle=True)

    class CNN(torch.nn.Module):

        def __init__(self):
            super(CNN, self).__init__()
            self.conv = torch.nn.Sequential(
                torch.nn.Conv2d(1, 32, kernel_size=5, padding=2),
                torch.nn.BatchNorm2d(32),
                torch.nn.ReLU(),
                torch.nn.MaxPool2d(2)
            )
            self.fc = torch.nn.Linear(14 * 14 * 32, 10)

        def forward(self, x):
            out = self.conv(x)
            out = out.view(out.size()[0], -1)
            out = self.fc(out)
            return out

    cnn = CNN()
    loss_func = torch.nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(cnn.parameters(), lr=0.01)
    for epoch in range(10):
        for i, (images, labels) in enumerate(train_loader):
            outputs = cnn(images)
            loss = loss_func(outputs, labels)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
        print("epoch is {}, ite is {}/{}, loss is {}".format(epoch   1, i,
              len(train_data) // 64, loss.item()))
        loss_test = 0
        accuracy = 0
        for i, (images, labels) in enumerate(test_loader):
            outputs = cnn(images)
            loss_test += loss_func(outputs, labels)
            _, pred = outputs.max(1)
            accuracy += (pred == labels).sum().item()
        accuracy = accuracy / len(test_data)
        loss_test = loss_test / (len(test_data) // 64)
        print("epoch is {}, accuracy is {}, loss test is {}".format(epoch + 1, accuracy, loss_test.item()))

Output:

epoch is 1, ite is 937/937, loss is 0.08334837108850479
epoch is 1, accuracy is 0.9814, loss test is 0.06306721270084381
epoch is 2, ite is 937/937, loss is 0.08257070928812027
epoch is 2, accuracy is 0.9824, loss test is 0.05769834667444229
epoch is 3, ite is 937/937, loss is 0.02539072372019291
epoch is 3, accuracy is 0.9823, loss test is 0.05558949336409569
epoch is 4, ite is 937/937, loss is 0.014101949520409107
epoch is 4, accuracy is 0.982, loss test is 0.05912528932094574
epoch is 5, ite is 937/937, loss is 0.0016860843170434237
epoch is 5, accuracy is 0.9835, loss test is 0.05862809345126152
epoch is 6, ite is 937/937, loss is 0.04285441339015961
epoch is 6, accuracy is 0.9817, loss test is 0.06716518104076385
epoch is 7, ite is 937/937, loss is 0.0026565147563815117
epoch is 7, accuracy is 0.9831, loss test is 0.05950026586651802
epoch is 8, ite is 937/937, loss is 0.02730828896164894
epoch is 8, accuracy is 0.9824, loss test is 0.058563172817230225
epoch is 9, ite is 937/937, loss is 0.00010762683814391494
epoch is 9, accuracy is 0.9828, loss test is 0.0673145055770874
epoch is 10, ite is 937/937, loss is 0.0021532117389142513
epoch is 10, accuracy is 0.9852, loss test is 0.0562417209148407

CIFAR-10 image classification

  • VGGNet architecture

VGGNet is a plain sequential (stacked) architecture; it should not be made too deep, otherwise gradients vanish.

import torch
import torch.nn.functional as F
import torchvision.datasets as dataset
import torchvision.transforms as transforms
import torch.utils.data as data_utils

if __name__ == '__main__':

    train_data = dataset.CIFAR10(root='cifa', train=True,
                                 transform=transforms.ToTensor(),
                                 download=True)
    test_data = dataset.CIFAR10(root='cifa', train=False,
                                transform=transforms.ToTensor(),
                                download=False)
    train_loader = data_utils.DataLoader(dataset=train_data, batch_size=64, shuffle=True)
    test_loader = data_utils.DataLoader(dataset=test_data, batch_size=64, shuffle=True)

    class VGGbase(torch.nn.Module):

        def __init__(self):
            super(VGGbase, self).__init__()
            self.conv1 = torch.nn.Sequential(
                torch.nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=1),
                torch.nn.BatchNorm2d(64),
                torch.nn.ReLU()
            )
            self.max_pooling1 = torch.nn.MaxPool2d(kernel_size=2, stride=2)
            self.conv2_1 = torch.nn.Sequential(
                torch.nn.Conv2d(64, 128, kernel_size=3, stride=1, padding=1),
                torch.nn.BatchNorm2d(128),
                torch.nn.ReLU()
            )
            self.conv2_2 = torch.nn.Sequential(
                torch.nn.Conv2d(128, 128, kernel_size=3, stride=1, padding=1),
                torch.nn.BatchNorm2d(128),
                torch.nn.ReLU()
            )
            self.max_pooling2 = torch.nn.MaxPool2d(kernel_size=2, stride=2)
            self.conv3_1 = torch.nn.Sequential(
                torch.nn.Conv2d(128, 256, kernel_size=3, stride=1, padding=1),
                torch.nn.BatchNorm2d(256),
                torch.nn.ReLU()
            )
            self.conv3_2 = torch.nn.Sequential(
                torch.nn.Conv2d(256, 256, kernel_size=3, stride=1, padding=1),
                torch.nn.BatchNorm2d(256),
                torch.nn.ReLU()
            )
            self.max_pooling3 = torch.nn.MaxPool2d(kernel_size=2, stride=2, padding=1)
            self.conv4_1 = torch.nn.Sequential(
                torch.nn.Conv2d(256, 512, kernel_size=3, stride=1, padding=1),
                torch.nn.BatchNorm2d(512),
                torch.nn.ReLU()
            )
            self.conv4_2 = torch.nn.Sequential(
                torch.nn.Conv2d(512, 512, kernel_size=3, stride=1, padding=1),
                torch.nn.BatchNorm2d(512),
                torch.nn.ReLU()
            )
            self.max_pooling4 = torch.nn.MaxPool2d(kernel_size=2, stride=2)
            self.fc = torch.nn.Linear(512 * 4, 10)

        def forward(self, x):
            batchsize = x.size()[0]
            out = self.conv1(x)
            out = self.max_pooling1(out)
            out = self.conv2_1(out)
            out = self.conv2_2(out)
            out = self.max_pooling2(out)
            out = self.conv3_1(out)
            out = self.conv3_2(out)
            out = self.max_pooling3(out)
            out = self.conv4_1(out)
            out = self.conv4_2(out)
            out = self.max_pooling4(out)
            out = out.view(batchsize, -1)
            out = self.fc(out)
            out = F.log_softmax(out, dim=1)
            return out

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    epoch_num = 200
    lr = 0.01
    net = VGGbase().to(device)
    loss_func = torch.nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(net.parameters(), lr=lr)
    # decay the learning rate to 0.9x its previous value every 5 epochs
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=5, gamma=0.9)
    for epoch in range(epoch_num):
        net.train()
        for i, (images, labels) in enumerate(train_loader):
            images, labels = images.to(device), labels.to(device)
            outputs = net(images)
            loss = loss_func(outputs, labels)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            _, pred = torch.max(outputs, dim=1)
            correct = pred.eq(labels.data).cpu().sum()
            print("epoch is ", epoch, "step ", i, "loss is: ", loss.item(),
                  "mini-batch correct is: ", (100.0 * correct / 64).item())

Output (partial):

epoch is  32 step  605 loss is:  0.010804953053593636 mini-batch correct is:  100.0
epoch is  32 step  606 loss is:  0.02166593447327614 mini-batch correct is:  98.4375
epoch is  32 step  607 loss is:  0.1924218237400055 mini-batch correct is:  95.3125
epoch is  32 step  608 loss is:  0.04531310871243477 mini-batch correct is:  96.875
epoch is  32 step  609 loss is:  0.03866473212838173 mini-batch correct is:  98.4375
epoch is  32 step  610 loss is:  0.0039138575084507465 mini-batch correct is:  100.0
epoch is  32 step  611 loss is:  0.009379544295370579 mini-batch correct is:  100.0
epoch is  32 step  612 loss is:  0.2707091271877289 mini-batch correct is:  93.75
epoch is  32 step  613 loss is:  0.016424348577857018 mini-batch correct is:  100.0
epoch is  32 step  614 loss is:  0.001230329042300582 mini-batch correct is:  100.0
epoch is  32 step  615 loss is:  0.013688713312149048 mini-batch correct is:  100.0
epoch is  32 step  616 loss is:  0.0062867505475878716 mini-batch correct is:  100.0
epoch is  32 step  617 loss is:  0.005267560016363859 mini-batch correct is:  100.0
  • ResNet architecture

ResNet combines stacked layers with shortcut (skip) connections, so the network can be made very deep without the gradients vanishing.

import torch
import torch.nn.functional as F
import torchvision.datasets as dataset
import torchvision.transforms as transforms
import torch.utils.data as data_utils

if __name__ == '__main__':

    train_data = dataset.CIFAR10(root='cifa', train=True,
                                 transform=transforms.ToTensor(),
                                 download=True)
    test_data = dataset.CIFAR10(root='cifa', train=False,
                                transform=transforms.ToTensor(),
                                download=False)
    train_loader = data_utils.DataLoader(dataset=train_data, batch_size=64, shuffle=True)
    test_loader = data_utils.DataLoader(dataset=test_data, batch_size=64, shuffle=True)

    class ResBlock(torch.nn.Module):

        def __init__(self, in_channel, out_channel, stride=1):
            super(ResBlock, self).__init__()
            self.layer = torch.nn.Sequential(
                torch.nn.Conv2d(in_channel, out_channel,
                                kernel_size=3, stride=stride, padding=1),
                torch.nn.BatchNorm2d(out_channel),
                torch.nn.ReLU(),
                torch.nn.Conv2d(out_channel, out_channel,
                                kernel_size=3, stride=1, padding=1),
                torch.nn.BatchNorm2d(out_channel)
            )
            self.shortcut = torch.nn.Sequential()
            if in_channel != out_channel or stride > 1:
                self.shortcut = torch.nn.Sequential(
                    torch.nn.Conv2d(in_channel, out_channel,
                                    kernel_size=3, stride=stride, padding=1),
                    torch.nn.BatchNorm2d(out_channel)
                )

        def forward(self, x):
            out1 = self.layer(x)
            out2 = self.shortcut(x)
            out = out1 + out2
            out = F.relu(out)
            return out

    class ResNet(torch.nn.Module):

        def make_layer(self, block, out_channel, stride, num_block):
            layers_list = []
            for i in range(num_block):
                if i == 0:
                    in_stride = stride
                else:
                    in_stride = 1
                layers_list.append(block(self.in_channel, out_channel, in_stride))
                self.in_channel = out_channel
            return torch.nn.Sequential(*layers_list)

        def __init__(self):
            super(ResNet, self).__init__()
            self.in_channel = 32
            self.conv1 = torch.nn.Sequential(
                torch.nn.Conv2d(3, 32, kernel_size=3, stride=1, padding=1),
                torch.nn.BatchNorm2d(32),
                torch.nn.ReLU()
            )
            self.layer1 = self.make_layer(ResBlock, 64, 2, 2)
            self.layer2 = self.make_layer(ResBlock, 128, 2, 2)
            self.layer3 = self.make_layer(ResBlock, 256, 2, 2)
            self.layer4 = self.make_layer(ResBlock, 512, 2, 2)
            self.fc = torch.nn.Linear(512, 10)

        def forward(self, x):
            out = self.conv1(x)
            out = self.layer1(out)
            out = self.layer2(out)
            out = self.layer3(out)
            out = self.layer4(out)
            out = F.avg_pool2d(out, 2)
            out = out.view(out.size()[0], -1)
            out = self.fc(out)
            out = F.log_softmax(out, dim=1)
            return out


    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    epoch_num = 200
    lr = 0.01
    net = ResNet().to(device)
    loss_func = torch.nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(net.parameters(), lr=lr)
    # decay the learning rate to 0.9x its previous value every 5 epochs
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=5, gamma=0.9)
    for epoch in range(epoch_num):
        net.train()
        for i, (images, labels) in enumerate(train_loader):
            images, labels = images.to(device), labels.to(device)
            outputs = net(images)
            loss = loss_func(outputs, labels)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            _, pred = torch.max(outputs, dim=1)
            correct = pred.eq(labels.data).cpu().sum()
            print("epoch is ", epoch, "step ", i, "loss is: ", loss.item(),
                  "mini-batch correct is: ", (100.0 * correct / 64).item())
  • MobileNet architecture

For details on MobileNet see the MobileNet section of Tensorflow深度学习算法整理. MobileNet replaces a standard convolution unit with a grouped (depthwise) convolution followed by a 1×1 convolution, which compresses both the computation and the parameter count and yields a much lighter network.

import torch
import torch.nn.functional as F
import torchvision.datasets as dataset
import torchvision.transforms as transforms
import torch.utils.data as data_utils

if __name__ == '__main__':

    train_data = dataset.CIFAR10(root='cifa', train=True,
                                 transform=transforms.ToTensor(),
                                 download=True)
    test_data = dataset.CIFAR10(root='cifa', train=False,
                                transform=transforms.ToTensor(),
                                download=False)
    train_loader = data_utils.DataLoader(dataset=train_data, batch_size=64, shuffle=True)
    test_loader = data_utils.DataLoader(dataset=test_data, batch_size=64, shuffle=True)

    class MobileNet(torch.nn.Module):
        def conv_dw(self, in_channel, out_channel, stride):
            return torch.nn.Sequential(
                # depthwise separable convolution: depthwise 3x3 followed by pointwise 1x1
                torch.nn.Conv2d(in_channel, in_channel, kernel_size=3, stride=stride,
                                padding=1, groups=in_channel, bias=False),
                torch.nn.BatchNorm2d(in_channel),
                torch.nn.ReLU(),
                torch.nn.Conv2d(in_channel, out_channel, kernel_size=1, stride=1,
                                padding=0, bias=False),
                torch.nn.BatchNorm2d(out_channel),
                torch.nn.ReLU()
            )

        def __init__(self):
            super(MobileNet, self).__init__()
            self.conv1 = torch.nn.Sequential(
                torch.nn.Conv2d(3, 32, kernel_size=3, stride=1, padding=1),
                torch.nn.BatchNorm2d(32),
                torch.nn.ReLU()
            )
            self.conv_dw2 = self.conv_dw(32, 32, 1)
            self.conv_dw3 = self.conv_dw(32, 64, 2)
            self.conv_dw4 = self.conv_dw(64, 64, 1)
            self.conv_dw5 = self.conv_dw(64, 128, 2)
            self.conv_dw6 = self.conv_dw(128, 128, 1)
            self.conv_dw7 = self.conv_dw(128, 256, 2)
            self.conv_dw8 = self.conv_dw(256, 256, 1)
            self.conv_dw9 = self.conv_dw(256, 512, 2)
            self.fc = torch.nn.Linear(512, 10)

        def forward(self, x):
            out = self.conv1(x)
            out = self.conv_dw2(out)
            out = self.conv_dw3(out)
            out = self.conv_dw4(out)
            out = self.conv_dw5(out)
            out = self.conv_dw6(out)
            out = self.conv_dw7(out)
            out = self.conv_dw8(out)
            out = self.conv_dw9(out)
            out = F.avg_pool2d(out, 2)
            out = out.view(out.size()[0], -1)
            out = self.fc(out)
            out = F.log_softmax(out, dim=1)
            return out


    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    epoch_num = 200
    lr = 0.01
    net = MobileNet().to(device)
    loss_func = torch.nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(net.parameters(), lr=lr)
    # decay the learning rate to 0.9x its previous value every 5 epochs
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=5, gamma=0.9)
    for epoch in range(epoch_num):
        net.train()
        for i, (images, labels) in enumerate(train_loader):
            images, labels = images.to(device), labels.to(device)
            outputs = net(images)
            loss = loss_func(outputs, labels)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            _, pred = torch.max(outputs, dim=1)
            correct = pred.eq(labels.data).cpu().sum()
            print("epoch is ", epoch, "step ", i, "loss is: ", loss.item(),
                  "mini-batch correct is: ", (100.0 * correct / 64).item())
  • InceptionNet architecture

For details on InceptionNet see the InceptionNet section of Tensorflow深度学习算法整理. InceptionNet combines parallel branches with stacked layers; by widening the network it improves performance.

import torch
import torch.nn.functional as F
import torchvision.datasets as dataset
import torchvision.transforms as transforms
import torch.utils.data as data_utils

if __name__ == '__main__':

    train_data = dataset.CIFAR10(root='cifa', train=True,
                                 transform=transforms.ToTensor(),
                                 download=True)
    test_data = dataset.CIFAR10(root='cifa', train=False,
                                transform=transforms.ToTensor(),
                                download=False)
    train_loader = data_utils.DataLoader(dataset=train_data, batch_size=64, shuffle=True)
    test_loader = data_utils.DataLoader(dataset=test_data, batch_size=64, shuffle=True)

    class BaseInception(torch.nn.Module):
        def ConvBNRelu(self, in_channel, out_channel, kernel_size):
            return torch.nn.Sequential(
                torch.nn.Conv2d(in_channel, out_channel, kernel_size=kernel_size,
                                stride=1, padding=kernel_size // 2),
                torch.nn.BatchNorm2d(out_channel),
                torch.nn.ReLU()
            )

        def __init__(self, in_channel, out_channel_list, reduce_channel_list):
            super(BaseInception, self).__init__()
            self.branch1_conv = self.ConvBNRelu(in_channel, out_channel_list[0], 1)
            self.branch2_conv1 = self.ConvBNRelu(in_channel, reduce_channel_list[0], 1)
            self.branch2_conv2 = self.ConvBNRelu(reduce_channel_list[0], out_channel_list[1], 3)
            self.branch3_conv1 = self.ConvBNRelu(in_channel, reduce_channel_list[1], 1)
            self.branch3_conv2 = self.ConvBNRelu(reduce_channel_list[1], out_channel_list[2], 5)
            self.branch4_pool = torch.nn.MaxPool2d(kernel_size=3, stride=1, padding=1)
            self.branch4_conv = self.ConvBNRelu(in_channel, out_channel_list[3], 3)

        def forward(self, x):
            out1 = self.branch1_conv(x)
            out2 = self.branch2_conv1(x)
            out2 = self.branch2_conv2(out2)
            out3 = self.branch3_conv1(x)
            out3 = self.branch3_conv2(out3)
            out4 = self.branch4_pool(x)
            out4 = self.branch4_conv(out4)
            out = torch.cat([out1, out2, out3, out4], dim=1)
            return out


    class InceptionNet(torch.nn.Module):

        def __init__(self):
            super(InceptionNet, self).__init__()
            self.block1 = torch.nn.Sequential(
                torch.nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=1),
                torch.nn.BatchNorm2d(64),
                torch.nn.ReLU()
            )
            self.block2 = torch.nn.Sequential(
                torch.nn.Conv2d(64, 128, kernel_size=3, stride=2, padding=1),
                torch.nn.BatchNorm2d(128),
                torch.nn.ReLU()
            )
            self.block3 = torch.nn.Sequential(
                BaseInception(in_channel=128,
                              out_channel_list=[64, 64, 64, 64],
                              reduce_channel_list=[16, 16]),
                torch.nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
            )
            self.block4 = torch.nn.Sequential(
                BaseInception(in_channel=256,
                              out_channel_list=[96, 96, 96, 96],
                              reduce_channel_list=[32, 32]),
                torch.nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
            )
            self.fc = torch.nn.Linear(1536, 10)

        def forward(self, x):
            out = self.block1(x)
            out = self.block2(out)
            out = self.block3(out)
            out = self.block4(out)
            out = F.avg_pool2d(out, 2)
            out = out.view(out.size()[0], -1)
            out = self.fc(out)
            out = F.log_softmax(out, dim=1)
            return out

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    epoch_num = 200
    lr = 0.01
    net = InceptionNet().to(device)
    loss_func = torch.nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(net.parameters(), lr=lr)
    # every 5 epochs, decay the learning rate to 0.9 times its previous value
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=5, gamma=0.9)
    for epoch in range(epoch_num):
        net.train()
        for i, (images, labels) in enumerate(train_loader):
            images, labels = images.to(device), labels.to(device)
            outputs = net(images)
            loss = loss_func(outputs, labels)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            _, pred = torch.max(outputs, dim=1)
            correct = pred.eq(labels.data).cpu().sum()
            print("epoch is ", epoch, "step ", i, "loss is: ", loss.item(),
                  "mini-batch correct is: ", (100.0 * correct / 64).item())
  • Using a pretrained model

Here we take ResNet18 as an example. With a pretrained model we do not have to build the network ourselves, and as long as the model does not need to be pruned, using a pretrained model speeds up development.

代码语言:javascript复制
import torch
import torchvision.datasets as dataset
import torchvision.transforms as transforms
import torch.utils.data as data_utils
from torchvision import models

if __name__ == '__main__':

    train_data = dataset.CIFAR10(root='cifa', train=True,
                                 transform=transforms.ToTensor(),
                                 download=True)
    test_data = dataset.CIFAR10(root='cifa', train=False,
                                transform=transforms.ToTensor(),
                                download=False)
    train_loader = data_utils.DataLoader(dataset=train_data, batch_size=64, shuffle=True)
    test_loader = data_utils.DataLoader(dataset=test_data, batch_size=64, shuffle=True)

    class ResNet18(torch.nn.Module):

        def __init__(self):
            super(ResNet18, self).__init__()
            # load the pretrained model
            self.model = models.resnet18(pretrained=True)
            # replace the final fully connected layer so it outputs the 10 CIFAR-10 classes
            self.num_features = self.model.fc.in_features
            self.model.fc = torch.nn.Linear(self.num_features, 10)

        def forward(self, x):
            out = self.model(x)
            return out

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    epoch_num = 200
    lr = 0.01
    net = ResNet18().to(device)
    loss_func = torch.nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(net.parameters(), lr=lr)
    # every 5 epochs, decay the learning rate to 0.9 times its previous value
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=5, gamma=0.9)
    for epoch in range(epoch_num):
        net.train()
        for i, (images, labels) in enumerate(train_loader):
            images, labels = images.to(device), labels.to(device)
            outputs = net(images)
            loss = loss_func(outputs, labels)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            _, pred = torch.max(outputs, dim=1)
            correct = pred.eq(labels.data).cpu().sum()
            print("epoch is ", epoch, "step ", i, "loss is: ", loss.item(),
                  "mini-batch correct is: ", (100.0 * correct / 64).item())

Image augmentation

Image augmentation is a way of preprocessing images before they are fed into the neural network. In the figure above, the leftmost picture is the original, and the four pictures on the right are generated from it. On the one hand this expands the dataset, here by a factor of four. On the other hand, the transformed pictures may well occur in real scenes even though they are not in the original dataset. After the model has learned from these transformed images, it becomes more robust to the wide variety of pictures it will meet in real scenarios.

The figure above shows some commonly used methods; there are in fact many more augmentation techniques. The leftmost image is the original. Rotation means random rotation: when a photo is taken the camera may be tilted, and feeding rotated versions of the original to the model lets it still recognize the content of tilted pictures correctly. Blur simulates photos taken through a fogged-up lens, so that the model stays robust to such images. Contrast is a random adjustment of contrast: different people prefer different contrast levels, some like vivid pictures and others darker ones. Scaling simulates photos taken at different distances, so that the model can handle objects that are nearer or farther away. Illumination simulates different exposure and lighting conditions, so the model can still recognize objects under them. Projective is a perspective transform: it warps the spatial positions in the image to mimic photos taken from different angles, so that the model can cope with those pictures as well.

代码语言:javascript复制
import torchvision.transforms as transforms
from PIL import Image

if __name__ == '__main__':

    trans = transforms.Compose([
        transforms.ToTensor(),          # scale to [0, 1] and convert to float32
        transforms.RandomRotation(45),    # random rotation of up to 45 degrees
        transforms.RandomAffine(45),      # random affine transform
        # standardize: the first tuple is the per-channel (r, g, b) mean, the second the per-channel std
        transforms.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225))
    ])
    unloader = transforms.ToPILImage()
    image = Image.open("/Users/admin/Documents/444.jpeg")
    print(image)
    image.show()
    image_out = trans(image)
    image = unloader(image_out)
    print(image_out.size())
    image.show()

Running results

代码语言:javascript复制
<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=1896x1279 at 0x7F801385F750>
torch.Size([3, 1279, 1896])
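
In practice the augmentations are usually attached to the training dataset itself, so that every epoch sees slightly different versions of the images. A sketch of how this could be plugged into the CIFAR-10 pipeline used earlier (my own example; the mean/std values are the commonly quoted CIFAR-10 channel statistics):

代码语言:javascript复制
import torchvision.datasets as dataset
import torchvision.transforms as transforms
import torch.utils.data as data_utils

train_transform = transforms.Compose([
    transforms.RandomHorizontalFlip(),   # random left-right flip
    transforms.RandomRotation(15),       # small random rotation
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2470, 0.2435, 0.2616))
])
train_data = dataset.CIFAR10(root='cifa', train=True,
                             transform=train_transform, download=True)
train_loader = data_utils.DataLoader(dataset=train_data, batch_size=64, shuffle=True)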

GAN networks

For the main ideas behind GANs, see Tensorflow深度学习算法整理(三).

Here we will implement a CycleGAN. The dataset can be downloaded from https://people.eecs.berkeley.edu/~taesung_park/CycleGAN/datasets/ ; we use the apple2orange.zip dataset for training.

First, reading the dataset

代码语言:javascript复制
import glob
import random
from torch.utils.data import Dataset, DataLoader
from PIL import Image
import torchvision.transforms as transforms
import os

class ImageDataset(Dataset):

    def __init__(self, root='', transform=None, model='train'):
        self.transform = transforms.Compose(transform)
        self.pathA = os.path.join(root, model + "A/*")
        self.pathB = os.path.join(root, model + "B/*")
        self.list_A = glob.glob(self.pathA)
        self.list_B = glob.glob(self.pathB)

    def __getitem__(self, index):
        im_pathA = self.list_A[index % len(self.list_A)]
        im_pathB = random.choice(self.list_B)
        im_A = Image.open(im_pathA)
        im_B = Image.open(im_pathB)
        item_A = self.transform(im_A)
        item_B = self.transform(im_B)
        return {"A": item_A, "B": item_B}

    def __len__(self):
        return max(len(self.list_A), len(self.list_B))

if __name__ == '__main__':

    root = "/Users/admin/Downloads/apple2orange"
    transform_ = [transforms.Resize(256, Image.BILINEAR),
                  transforms.ToTensor()]
    dataloader = DataLoader(dataset=ImageDataset(root, transform_, 'train'),
                            batch_size=1, shuffle=True,
                            num_workers=1)
    for i, batch in enumerate(dataloader):
        print(i)
        print(batch)

The generator and discriminator models

代码语言:javascript复制
import torch
import torch.nn.functional as F


class ResBlock(torch.nn.Module):

    def __init__(self, in_channel):
        super(ResBlock, self).__init__()
        self.conv_block = torch.nn.Sequential(
            # reflection padding around the border instead of zero padding
            torch.nn.ReflectionPad2d(1),
            torch.nn.Conv2d(in_channel, in_channel, kernel_size=3),
            # normalize within each channel (instance normalization)
            torch.nn.InstanceNorm2d(in_channel),
            torch.nn.ReLU(inplace=True),
            torch.nn.ReflectionPad2d(1),
            torch.nn.Conv2d(in_channel, in_channel, kernel_size=3),
            torch.nn.InstanceNorm2d(in_channel)
        )

    def forward(self, x):
        return x + self.conv_block(x)

class Generator(torch.nn.Module):
    '''Generator'''

    def __init__(self):
        super(Generator, self).__init__()
        net = [
            torch.nn.ReflectionPad2d(3),
            torch.nn.Conv2d(3, 64, kernel_size=7),
            torch.nn.InstanceNorm2d(64),
            torch.nn.ReLU(inplace=True)
        ]
        in_channel = 64
        out_channel = in_channel * 2
        # downsample twice
        for _ in range(2):
            net += [
                torch.nn.Conv2d(in_channel, out_channel, kernel_size=3, stride=2, padding=1),
                torch.nn.InstanceNorm2d(out_channel),
                torch.nn.ReLU(inplace=True)
            ]
            in_channel = out_channel
            out_channel = in_channel * 2
        # 9 residual blocks
        for _ in range(9):
            net += [ResBlock(in_channel)]
        # upsample twice
        out_channel = in_channel // 2
        for _ in range(2):
            net += [
                torch.nn.ConvTranspose2d(in_channel, out_channel, kernel_size=3,
                                         stride=2, padding=1, output_padding=1),
                torch.nn.InstanceNorm2d(out_channel),
                torch.nn.ReLU(inplace=True)
            ]
            in_channel = out_channel
            out_channel = in_channel // 2
        # output layer
        net += [
            torch.nn.ReflectionPad2d(3),
            torch.nn.Conv2d(in_channel, 3, kernel_size=7),
            torch.nn.Tanh()
        ]
        self.model = torch.nn.Sequential(*net)

    def forward(self, x):
        return self.model(x)

class Discriminator(torch.nn.Module):
    '''Discriminator'''

    def __init__(self):
        super(Discriminator, self).__init__()
        model = [
            torch.nn.Conv2d(3, 64, kernel_size=4, stride=2, padding=1),
            torch.nn.LeakyReLU(0.2, inplace=True)
        ]
        model += [
            torch.nn.Conv2d(64, 128, kernel_size=4, stride=2, padding=1),
            torch.nn.InstanceNorm2d(128),
            torch.nn.LeakyReLU(0.2, inplace=True)
        ]
        model += [
            torch.nn.Conv2d(128, 256, kernel_size=4, stride=2, padding=1),
            torch.nn.InstanceNorm2d(256),
            torch.nn.LeakyReLU(0.2, inplace=True)
        ]
        model += [
            torch.nn.Conv2d(256, 512, kernel_size=4, stride=2, padding=1),
            torch.nn.InstanceNorm2d(512),
            torch.nn.LeakyReLU(0.2, inplace=True)
        ]
        model += [torch.nn.Conv2d(512, 1, kernel_size=4, padding=1)]
        self.model = torch.nn.Sequential(*model)

    def forward(self, x):
        x = self.model(x)
        return F.avg_pool2d(x, x.size()[2:]).view(x.size()[0], -1)

if __name__ == '__main__':

    G = Generator()
    D = Discriminator()
    input_tensor = torch.ones((1, 3, 256, 256), dtype=torch.float)
    out = G(input_tensor)
    print(out.size())
    out = D(input_tensor)
    print(out.size())

Running results

代码语言:javascript复制
torch.Size([1, 3, 256, 256])
torch.Size([1, 1])

Utility classes and functions

代码语言:javascript复制
import random
import torch
import numpy as np

def tensor2image(tensor):
    # map a tensor in [-1, 1] back to an 8-bit image in [0, 255]
    image = 127.5 * (tensor[0].cpu().float().numpy() + 1.0)
    if image.shape[0] == 1:
        image = np.tile(image, (3, 1, 1))
    return image.astype(np.uint8)

class ReplayBuffer():

    def __init__(self, max_size=50):
        assert (max_size > 0), "Empty buffer or trying to create a black hole. Be careful."
        self.max_size = max_size
        self.data = []

    def push_and_pop(self, data):
        to_return = []
        for element in data.data:
            element = torch.unsqueeze(element, 0)
            if len(self.data) < self.max_size:
                self.data.append(element)
                to_return.append(element)
            else:
                if random.uniform(0, 1) > 0.5:
                    i = random.randint(0, self.max_size - 1)
                    to_return.append(self.data[i].clone())
                    self.data[i] = element
                else:
                    to_return.append(element)
        return torch.cat(to_return)

class LambdaLR():

    def __init__(self, n_epochs, offset, decay_start_epoch):
        assert ((n_epochs - decay_start_epoch) > 0), "Decay must start before the training session ends!"
        self.n_epochs = n_epochs
        self.offset = offset
        self.decay_start_epoch = decay_start_epoch

    def step(self, epoch):
        return 1.0 - max(0, epoch + self.offset - self.decay_start_epoch) / (self.n_epochs - self.decay_start_epoch)

def weights_init_normal(m):
    classname = m.__class__.__name__
    if classname.find('Conv') != -1:
        torch.nn.init.normal_(m.weight.data, 0.0, 0.02)
    elif classname.find('BatchNorm2d') != -1:
        torch.nn.init.normal_(m.weight.data, 1.0, 0.02)
        torch.nn.init.constant_(m.bias.data, 0.0)
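
A quick sanity check of the LambdaLR rule defined above (my own example, using the same n_epochs=200 and decay_start_epoch=100 as the training script below): the multiplier stays at 1.0 until the decay epoch and then falls linearly towards 0.

代码语言:javascript复制
rule = LambdaLR(n_epochs=200, offset=0, decay_start_epoch=100)
for e in (0, 50, 100, 150, 199):
    print(e, rule.step(e))   # 1.0, 1.0, 1.0, 0.5, 0.01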

Model training

代码语言:javascript复制
import torchvision.transforms as transforms
from torch.utils.data import DataLoader
from PIL import Image
import torch
from pytorch.gan.models import Generator, Discriminator
from pytorch.gan.utils import ReplayBuffer, LambdaLR, weights_init_normal
from pytorch.gan.dataset import ImageDataset
import itertools
import tensorboardX
import os

if __name__ == '__main__':

    os.environ["OMP_NUM_THREADS"] = "1"
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    batchsize = 1
    size = 256
    lr = 0.0002
    n_epoch = 200
    epoch = 0
    decay_epoch = 100

    netG_A2B = Generator().to(device)
    netG_B2A = Generator().to(device)
    netD_A = Discriminator().to(device)
    netD_B = Discriminator().to(device)

    loss_GAN = torch.nn.MSELoss()
    loss_Cycle = torch.nn.L1Loss()
    loss_identity = torch.nn.L1Loss()

    opt_G = torch.optim.Adam(itertools.chain(netG_A2B.parameters(), netG_B2A.parameters()),
                             lr=lr, betas=(0.5, 0.9999))
    opt_DA = torch.optim.Adam(netD_A.parameters(), lr=lr, betas=(0.5, 0.9999))
    opt_DB = torch.optim.Adam(netD_B.parameters(), lr=lr, betas=(0.5, 0.9999))

    lr_scheduler_G = torch.optim.lr_scheduler.LambdaLR(opt_G,
                                                       lr_lambda=LambdaLR(n_epoch,
                                                                          epoch,
                                                                          decay_epoch).step)
    lr_scheduler_DA = torch.optim.lr_scheduler.LambdaLR(opt_DA,
                                                        lr_lambda=LambdaLR(n_epoch,
                                                                           epoch,
                                                                           decay_epoch).step)
    lr_scheduler_DB = torch.optim.lr_scheduler.LambdaLR(opt_DB,
                                                        lr_lambda=LambdaLR(n_epoch,
                                                                           epoch,
                                                                           decay_epoch).step)
    data_root = "/Users/admin/Downloads/apple2orange"
    input_A = torch.ones([1, 3, size, size], dtype=torch.float).to(device)
    input_B = torch.ones([1, 3, size, size], dtype=torch.float).to(device)
    label_real = torch.ones([1], dtype=torch.float, requires_grad=False).to(device)
    label_fake = torch.zeros([1], dtype=torch.float, requires_grad=False).to(device)
    fake_A_buffer = ReplayBuffer()
    fake_B_Buffer = ReplayBuffer()
    log_path = "logs"
    writer_log = tensorboardX.SummaryWriter(log_path)
    transforms_ = [
        transforms.Resize(int(256 * 1.12), Image.BICUBIC),
        transforms.RandomCrop(256),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
    ]
    dataloader = DataLoader(dataset=ImageDataset(data_root, transforms_),
                            batch_size=batchsize, shuffle=True, num_workers=1)
    step = 0
    for epoch in range(n_epoch):
        for i, batch in enumerate(dataloader):
            real_A = torch.tensor(input_A.copy_(batch['A']), dtype=torch.float).to(device)
            real_B = torch.tensor(input_B.copy_(batch['B']), dtype=torch.float).to(device)
            # generator update (one gradient step for both generators)
            opt_G.zero_grad()
            same_B = netG_A2B(real_B)
            loss_identity_B = loss_identity(same_B, real_B) * 5.0
            same_A = netG_B2A(real_A)
            loss_identity_A = loss_identity(same_A, real_A) * 5.0
            fake_B = netG_A2B(real_A)
            pred_fake = netD_B(fake_B)
            loss_GAN_A2B = loss_GAN(pred_fake, label_real)
            fake_A = netG_B2A(real_B)
            pred_fake = netD_A(fake_A)
            loss_GAN_B2A = loss_GAN(pred_fake, label_real)

            recovered_A = netG_B2A(fake_B)
            loss_cycle_ABA = loss_Cycle(recovered_A, real_A) * 10.0
            recovered_B = netG_A2B(fake_A)
            loss_cycle_BAB = loss_Cycle(recovered_B, real_B) * 10.0

            loss_G = loss_identity_A + loss_identity_B + loss_GAN_A2B + loss_GAN_B2A + \
                     loss_cycle_ABA + loss_cycle_BAB
            loss_G.backward()
            opt_G.step()
            # discriminator updates (gradient steps for D_A and D_B)
            opt_DA.zero_grad()
            pred_real = netD_A(real_A)
            loss_D_real = loss_GAN(pred_real, label_real)
            fake_A = fake_A_buffer.push_and_pop(fake_A)
            pred_fake = netD_A(fake_A.detach())
            loss_D_fake = loss_GAN(pred_fake, label_fake)

            loss_D_A = (loss_D_real   loss_D_fake) * 0.5
            loss_D_A.backward()
            opt_DA.step()

            opt_DB.zero_grad()
            pred_real = netD_B(real_B)
            loss_D_real = loss_GAN(pred_real, label_real)
            fake_B = fake_B_Buffer.push_and_pop(fake_B)
            pred_fake = netD_B(fake_B.detach())
            loss_D_fake = loss_GAN(pred_fake, label_fake)

            loss_D_B = (loss_D_real   loss_D_fake) * 0.5
            loss_D_B.backward()
            opt_DB.step()

            print("loss_G:{}, loss_G_identity:{}, loss_G_GAN:{}, "
                  "loss_G_cycle:{}, loss_D_A:{}, loss_D_B:{}".format(
                loss_G, loss_identity_A + loss_identity_B,
                loss_GAN_A2B + loss_GAN_B2A,
                loss_cycle_ABA + loss_cycle_BAB,
                loss_D_A, loss_D_B
            ))

            writer_log.add_scalar("loss_G", loss_G, global_step=step   1)
            writer_log.add_scalar("loss_G_identity", loss_identity_A   loss_identity_B, global_step=step   1)
            writer_log.add_scalar("loss_G_GAN", loss_GAN_A2B   loss_GAN_B2A, global_step=step   1)
            writer_log.add_scalar("loss_G_cycle", loss_cycle_ABA   loss_cycle_BAB, global_step=step   1)
            writer_log.add_scalar("loss_D_A", loss_D_A, global_step=step   1)
            writer_log.add_scalar("loss_D_B", loss_D_B, global_step=step   1)

            step += 1

        lr_scheduler_G.step()
        lr_scheduler_DA.step()
        lr_scheduler_DB.step()

        # save the latest weights at the end of every epoch
        os.makedirs("models", exist_ok=True)
        torch.save(netG_A2B.state_dict(), "models/netG_A2B.pth")
        torch.save(netG_B2A.state_dict(), "models/netG_B2A.pth")
        torch.save(netD_A.state_dict(), "models/netD_A.pth")
        torch.save(netD_B.state_dict(), "models/netD_B.pth")
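
Once training has produced models/netG_A2B.pth, the generator can be used on its own to turn an apple picture into an orange one. A minimal inference sketch (my own example; the input and output file names are placeholders):

代码语言:javascript复制
import torch
import torchvision.transforms as transforms
from PIL import Image
from pytorch.gan.models import Generator

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
netG_A2B = Generator().to(device)
netG_A2B.load_state_dict(torch.load("models/netG_A2B.pth", map_location=device))
netG_A2B.eval()

preprocess = transforms.Compose([
    transforms.Resize(256, Image.BICUBIC),
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
])
img = preprocess(Image.open("apple.jpg").convert("RGB")).unsqueeze(0).to(device)  # placeholder path
with torch.no_grad():
    fake_B = netG_A2B(img)
# map the generator output from [-1, 1] back to [0, 1] and save it as an image
out = transforms.ToPILImage()((fake_B[0].cpu() * 0.5 + 0.5).clamp(0, 1))
out.save("fake_orange.jpg")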
