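Model training fundamentally depends on knowing where the current model falls short. A loss function quantifies that gap and gives optimization both a direction and a distance. This article introduces the loss functions available in PyTorch.

These notes follow 深入浅出PyTorch (thorough-pytorch) to systematically fill in the fundamentals.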

Contents of this section

Common loss functions in deep learning and how they are defined
How to call loss functions in PyTorch

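Binary cross-entropy loss

    torch.nn.BCELoss(weight=None, size_average=None, reduce=None, reduction='mean')

Purpose: computes the cross entropy for binary classification, where each label is in {0, 1}. The input must already be a probability in [0, 1], so it is usually the output of a sigmoid layer (or a probability taken from a softmax output).

Main parameters:

weight: a manual rescaling weight applied to the loss of each batch element.

size_average: bool. When True, the returned loss is the mean; when False, it is the sum of the per-sample losses. This parameter is deprecated in favor of reduction and will be removed in a future release; use reduction instead.

reduce: bool. When True, the loss is returned as a scalar. It is likewise deprecated in favor of reduction.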

Core implementation:

    def forward(self, input: Tensor, target: Tensor) -> Tensor:
        return F.binary_cross_entropy(input, target, weight=self.weight, reduction=self.reduction)

Let d be the predicted probability and t the target label; then the loss is:

$$ \text{loss} = t \log \frac{1}{d} + (1-t) \log \frac{1}{1-d} $$

    import torch
    from torch import nn
    import numpy as np

    loss = nn.BCELoss()
    m = nn.Sigmoid()
    # One-element input so that its shape matches the target
    data = torch.tensor([0.0], requires_grad=True)
    target = torch.ones(1)
    l = loss(m(data), target)  # sigmoid(0) = 0.5, so the loss is -log(0.5)
    print(l)
    print(np.log(2))

Output:

    tensor(0.6931, grad_fn=<BinaryCrossEntropyBackward0>)
    0.6931471805599453
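To make the formula concrete, the following is a minimal pure-Python sketch of the same calculation. The helper names `sigmoid` and `bce` are illustrative only and are not part of PyTorch; the sketch handles a single prediction and ignores the weight and reduction options.

```python
import math

def sigmoid(z):
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def bce(d, t):
    """Binary cross entropy for one prediction d in (0, 1) and one label t in {0, 1}."""
    return t * math.log(1.0 / d) + (1.0 - t) * math.log(1.0 / (1.0 - d))

# Same setting as the PyTorch example: logit 0.0 and label 1
print(bce(sigmoid(0.0), 1.0))  # equals log(2)

# A confident, correct prediction is penalized far less
print(bce(0.99, 1.0))
```

With a logit of 0 the predicted probability is 0.5, which is why the loss equals log 2 regardless of the label.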

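Cross-entropy loss

    torch.nn.CrossEntropyLoss(weight=None, size_average=None, ignore_index=-100, reduce=None, reduction='mean')

Purpose: computes the cross-entropy loss from raw, unnormalized scores (logits). It is equivalent to applying LogSoftmax followed by NLLLoss.

Main parameters:

weight: a weight for the loss of each class.

size_average: bool. When True, the returned loss is the mean; when False, it is the sum of the per-sample losses.

ignore_index: a target value that is ignored and does not contribute to the loss.

reduce: bool. When True, the loss is returned as a scalar.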

    import torch
    import torch.nn as nn

    x_input = torch.randn(3, 5)  # random logits: 3 samples, 5 classes
    print('x_input:\n', x_input)
    y_target = torch.tensor([4, 2, 0])  # target class index for each sample
    print('y_target\n', y_target)

    # Softmax over the class dimension; each row now sums to 1
    softmax_func = nn.Softmax(dim=1)
    soft_output = softmax_func(x_input)
    print('soft_output:\n', soft_output)

    # Take the log of the softmax output
    log_output = torch.log(soft_output)
    print('log_output:\n', log_output)

    # nn.LogSoftmax gives the same result as softmax followed by log
    logsoftmax_func = nn.LogSoftmax(dim=1)
    logsoftmax_output = logsoftmax_func(x_input)
    print('logsoftmax_output:\n', logsoftmax_output)

    # NLLLoss defaults to reduction='mean'; use 'none' to see per-sample losses
    nllloss_func = nn.NLLLoss(reduction="none")
    nlloss_output = nllloss_func(logsoftmax_output, y_target)
    print('nlloss_output:\n', nlloss_output)

    # CrossEntropyLoss applied to the raw logits matches LogSoftmax + NLLLoss
    crossentropyloss = nn.CrossEntropyLoss(reduction="none")
    crossentropyloss_output = crossentropyloss(x_input, y_target)
    print('crossentropyloss_output:\n', crossentropyloss_output)

Output:

    x_input:
     tensor([[-1.7327, -0.1885, -0.7649,  0.8701,  0.4981],
            [-2.1903,  0.5137, -0.3262,  0.1239,  0.0126],
            [ 0.8400,  1.4696, -0.2860, -2.8149, -0.3208]])
    y_target
     tensor([4, 2, 0])
    soft_output:
     tensor([[0.0321, 0.1505, 0.0846, 0.4338, 0.2990],
            [0.0241, 0.3595, 0.1552, 0.2435, 0.2178],
            [0.2825, 0.5302, 0.0916, 0.0073, 0.0885]])
    log_output:
     tensor([[-3.4380, -1.8939, -2.4702, -0.8352, -1.2072],
            [-3.7271, -1.0231, -1.8630, -1.4128, -1.5242],
            [-1.2643, -0.6346, -2.3902, -4.9191, -2.4250]])
    logsoftmax_output:
     tensor([[-3.4380, -1.8939, -2.4702, -0.8352, -1.2072],
            [-3.7271, -1.0231, -1.8630, -1.4128, -1.5242],
            [-1.2643, -0.6346, -2.3902, -4.9191, -2.4250]])
    nlloss_output:
     tensor([1.2072, 1.8630, 1.2643])
    crossentropyloss_output:
     tensor([1.2072, 1.8630, 1.2643])
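The equivalence between cross entropy and "log-softmax, then pick the target entry" can also be checked by hand. The sketch below is a pure-Python illustration, not PyTorch's implementation; the helper names are hypothetical, and the logits are copied from the printed run, read as 3 samples with 5 classes each and targets 4, 2, 0.

```python
import math

def log_softmax(row):
    """Numerically stable log-softmax of one row of logits."""
    m = max(row)
    log_sum = m + math.log(sum(math.exp(v - m) for v in row))
    return [v - log_sum for v in row]

def cross_entropy(row, target):
    """Per-sample cross entropy: negative log-probability of the target class."""
    return -log_softmax(row)[target]

# Logits and targets taken from the printed run (rounded to 4 decimals)
x_input = [
    [-1.7327, -0.1885, -0.7649, 0.8701, 0.4981],
    [-2.1903, 0.5137, -0.3262, 0.1239, 0.0126],
    [0.8400, 1.4696, -0.2860, -2.8149, -0.3208],
]
y_target = [4, 2, 0]

losses = [cross_entropy(row, t) for row, t in zip(x_input, y_target)]
print([round(v, 3) for v in losses])
```

Because the inputs are rounded, the results should agree with the printed losses (1.2072, 1.8630, 1.2643) only to about three decimal places.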

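L1 loss

    torch.nn.L1Loss(size_average=None, reduce=None, reduction='mean')

Purpose: computes the absolute value of the difference between the output y and the ground-truth target.

The reduction parameter determines how the per-element results are aggregated. Three modes are available:

none: computed element by element, no aggregation.
sum: all elements are summed and a scalar is returned.
mean: the average over all elements, returned as a scalar.

With none, the result has the same shape as the input. The default is mean.

The formula is: $ L_{n}=\left|x_{n}-y_{n}\right| $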

    import torch
    import torch.nn as nn

    data = torch.randn(2, 4, requires_grad=True)
    target = torch.empty(2, 4).random_(2)  # random 0/1 targets
    print(data)
    print(target)
    loss = nn.L1Loss(reduction="none")
    res = loss(data, target)
    print(res)

Output:

    tensor([[ 0.7438, -0.7181,  1.7000,  0.2125],
            [-0.8243,  1.0593, -1.5408, -0.9641]], requires_grad=True)
    tensor([[0., 1., 1., 1.],
            [1., 0., 1., 1.]])
    tensor([[0.7438, 1.7181, 0.7000, 0.7875],
            [1.8243, 1.0593, 2.5408, 1.9641]], grad_fn=<AbsBackward0>)
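Since the reduction parameter recurs in every loss that follows, a small pure-Python sketch of the three modes may help. The function `l1_loss` here is a hypothetical stand-in written for illustration, and the numbers are the first four data and target values from the run above.

```python
def l1_loss(xs, ys, reduction="mean"):
    """Absolute error |x - y| per element, followed by the chosen reduction."""
    per_element = [abs(x - y) for x, y in zip(xs, ys)]
    if reduction == "none":
        return per_element
    if reduction == "sum":
        return sum(per_element)
    if reduction == "mean":
        return sum(per_element) / len(per_element)
    raise ValueError(f"unknown reduction: {reduction}")

# First four data and target values from the printed run
data = [0.7438, -0.7181, 1.7000, 0.2125]
target = [0.0, 1.0, 1.0, 1.0]

print(l1_loss(data, target, "none"))  # one value per element
print(l1_loss(data, target, "sum"))   # a single scalar
print(l1_loss(data, target, "mean"))  # the sum divided by the element count
```

The mean is simply the sum divided by the number of elements, so the two scalar modes differ only by a constant factor for a fixed input size.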

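MSE loss

    torch.nn.MSELoss(size_average=None, reduce=None, reduction='mean')

Purpose: computes the squared difference between the output y and the ground-truth target.

As with L1Loss, the reduction parameter determines how the result is aggregated. Three modes are available:

none: computed element by element.
sum: all elements are summed and a scalar is returned.
mean: the average over all elements, returned as a scalar. This is the default.

The formula is: $ l_{n}=\left(x_{n}-y_{n}\right)^{2} $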

    import torch
    import torch.nn as nn

    loss = nn.MSELoss()
    input = torch.randn(3, 5, requires_grad=True)
    target = torch.randn(3, 5)
    output = loss(input, target)
    output.backward()
    print('MSELoss result:', output)

Output:

    MSELoss result: tensor(1.6968, grad_fn=<MseLossBackward>)

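Smooth L1 loss

    torch.nn.SmoothL1Loss(size_average=None, reduce=None, reduction='mean', beta=1.0)

Purpose: a smoothed version of L1 that reduces the influence of outliers.

The reduction parameter determines how the result is aggregated. Three modes are available:

none: computed element by element.
sum: all elements are summed and a scalar is returned.
mean: the average over all elements, returned as a scalar. This is the default.

Note: the reduction parameter also appears in the loss functions that follow, so it is not explained again.

The formula (for the default beta = 1.0) is:

$$ \operatorname{loss}(x, y)=\frac{1}{n} \sum_{i=1}^{n} z_{i} $$

where

$$ z_{i}=\begin{cases}0.5\left(x_{i}-y_{i}\right)^{2}, & \text{if } \left|x_{i}-y_{i}\right|<1 \\ \left|x_{i}-y_{i}\right|-0.5, & \text{otherwise}\end{cases} $$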

    import torch
    import torch.nn as nn

    loss = nn.SmoothL1Loss()
    input = torch.randn(3, 5, requires_grad=True)
    target = torch.randn(3, 5)
    output = loss(input, target)
    output.backward()
    print('SmoothL1Loss result:', output)

Output:

    SmoothL1Loss result: tensor(0.7808, grad_fn=<SmoothL1LossBackward>)
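The piecewise definition is easy to evaluate by hand. Below is a pure-Python sketch for a single element; `smooth_l1` is an illustrative helper rather than a PyTorch function. It uses the general form with a `beta` threshold, which reduces to the formula above when beta = 1.0.

```python
def smooth_l1(x, y, beta=1.0):
    """Smooth L1 for one element: quadratic for |x - y| < beta, linear otherwise."""
    diff = abs(x - y)
    if diff < beta:
        return 0.5 * diff * diff / beta
    return diff - 0.5 * beta

# Compare against plain L1 for a few error magnitudes
for diff in (0.1, 0.5, 1.0, 3.0, 10.0):
    print(diff, smooth_l1(diff, 0.0), abs(diff))
```

For small errors the loss is quadratic and therefore much smaller than L1; for large errors it grows linearly, staying a constant 0.5 * beta below L1, so outliers are not amplified the way they are under MSE.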

Smooth L1 versus L1

Plotting the two loss curves makes the difference between Smooth L1 and L1 visible.

    import torch
    import torch.nn as nn
    import matplotlib.pyplot as plt

    inputs = torch.linspace(-10, 10, steps=5000)
    target = torch.zeros_like(inputs)

    loss_f_smooth = nn.SmoothL1Loss(reduction='none')
    loss_smooth = loss_f_smooth(inputs, target)
    loss_f_l1 = nn.L1Loss(reduction='none')
    loss_l1 = loss_f_l1(inputs, target)

    plt.plot(inputs.numpy(), loss_smooth.numpy(), label='Smooth L1 Loss')
    plt.plot(inputs.numpy(), loss_l1.numpy(), label='L1 loss')
    plt.xlabel('x_i - y_i')
    plt.ylabel('loss value')
    plt.legend()
    plt.grid()
    plt.show()

The plot shows that Smooth L1 has a smoother transition at 0, where L1 has a sharp corner.

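Negative log-likelihood loss with a Poisson-distributed target

    torch.nn.PoissonNLLLoss(log_input=True, full=False, size_average=None, eps=1e-08, reduce=None, reduction='mean')

Purpose: the negative log-likelihood loss for a Poisson distribution. It applies when the network outputs the Poisson parameter $\lambda$. Because the output is $\lambda$ rather than a probability, it has to be converted into a probability before the likelihood can be evaluated.

Main parameters:

log_input: whether the input is in log form; this determines which formula is used.

full: whether to compute the full loss, that is, whether the $\log(y_{n}!)$ term is kept. Default: False. When it is kept, the term is approximated as follows:

When $ y_{n} \leq 1 $, $ \log\left(y_{n}!\right) $ is approximated as 0.
When $ y_{n}>1 $, Stirling's formula is used: $ \log\left(y_{n}!\right) \approx y_{n} \log\left(y_{n}\right)-y_{n}+0.5 \log\left(2 \pi y_{n}\right) $.

eps: a small correction term that avoids evaluating log(0), which would otherwise produce nan, when input is 0.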

Principle:

The probability mass function of the Poisson distribution is:

$$ \mathrm{P}(\mathrm{Y}=\mathrm{k})=\frac{\lambda^{\mathrm{k}}}{\mathrm{k}!} \mathrm{e}^{-\lambda} $$

Suppose the network outputs the parameter $x_n$ for a sample whose label is $y_n$.

If x is the network output and has not been log-transformed, the loss $l_{n}$ for the n-th sample is:

$$ \mathrm{P}\left(\mathrm{Y}=\mathrm{y}_{\mathrm{n}}\right)=\frac{\mathrm{x}_{\mathrm{n}}^{\mathrm{y}_{\mathrm{n}}}}{\mathrm{y}_{\mathrm{n}}!} \mathrm{e}^{-\mathrm{x}_{\mathrm{n}}} $$

$$ \mathrm{l}_{\mathrm{n}}=-\log \mathrm{P}\left(\mathrm{Y}=\mathrm{y}_{\mathrm{n}}\right)=\mathrm{x}_{\mathrm{n}}-\mathrm{y}_{\mathrm{n}} \log \mathrm{x}_{\mathrm{n}}+\log\left(\mathrm{y}_{\mathrm{n}}!\right) $$

If x is the network output and has already been log-transformed, the loss $l_{n}$ for the n-th sample is obtained by replacing $ \mathrm{x}_{\mathrm{n}} $ with $ \exp\left(\mathrm{x}_{\mathrm{n}}\right) $:

$$ \mathrm{l}_{\mathrm{n}}=-\log \mathrm{P}\left(\mathrm{Y}=\mathrm{y}_{\mathrm{n}}\right)=\exp\left(\mathrm{x}_{\mathrm{n}}\right)-\mathrm{y}_{\mathrm{n}} \mathrm{x}_{\mathrm{n}}+\log\left(\mathrm{y}_{\mathrm{n}}!\right) $$

The last term, $\log(y_{n}!)$, can be omitted or approximated with Stirling's formula.

Formulas:

When log_input=True: $ \operatorname{loss}\left(x_{n}, y_{n}\right)=e^{x_{n}}-x_{n} \cdot y_{n} $
When log_input=False: $ \operatorname{loss}\left(x_{n}, y_{n}\right)=x_{n}-y_{n} \cdot \log\left(x_{n}+\text{eps}\right) $
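The two formulas, together with the optional Stirling term, can be sketched in pure Python for a single sample. The function names below are hypothetical and the code is an illustration of the formulas in this section, not PyTorch's implementation.

```python
import math

def stirling_log_factorial(y):
    """Approximation of log(y!) used when the full loss is requested."""
    if y <= 1:
        return 0.0
    return y * math.log(y) - y + 0.5 * math.log(2 * math.pi * y)

def poisson_nll(x, y, log_input=True, full=False, eps=1e-8):
    """Poisson negative log-likelihood for one sample."""
    if log_input:
        loss = math.exp(x) - y * x          # x is log(lambda)
    else:
        loss = x - y * math.log(x + eps)    # x is lambda itself
    if full:
        loss += stirling_log_factorial(y)
    return loss

# The same rate expressed in both forms gives (almost) the same loss
print(poisson_nll(math.log(3.0), 2.0, log_input=True))
print(poisson_nll(3.0, 2.0, log_input=False))

# Stirling's approximation versus the exact value of log(5!)
print(stirling_log_factorial(5), math.log(math.factorial(5)))
```

The two modes differ only in how the rate is parameterized; the tiny discrepancy between them comes from the eps term.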

    import torch
    import torch.nn as nn

    loss = nn.PoissonNLLLoss(reduction='none')
    log_input = torch.randn(5, 2, requires_grad=True)
    target = torch.empty(5, 2).random_(5)  # integer-valued targets in [0, 5)
    output = loss(log_input, target)
    print('PoissonNLLLoss result:', output)

Output:

    PoissonNLLLoss result: tensor([[1.8573, 2.2177],
            [1.9914, 1.0427],
            [4.5823, 0.8821],
            [4.5176, 1.0008],
            [2.6423, 0.3343]], grad_fn=<SubBackward0>)

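KL divergence

    torch.nn.KLDivLoss(size_average=None, reduce=None, reduction='mean', log_target=False)

Purpose: computes the KL divergence, also called relative entropy. It is a useful distance measure for continuous distributions and is often useful when performing direct regression over the space of (discretely sampled) continuous output distributions.

Main parameters:

reduction: the aggregation mode, one of none/sum/mean/batchmean.

none: computed element by element.
sum: all elements are summed and a scalar is returned.
mean: the average over all elements, returned as a scalar.
batchmean: the sum divided by the batch size.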

The formula is:

$$ \begin{aligned} D_{\mathrm{KL}}(P, Q) &= \mathrm{E}_{X \sim P}\left[\log \frac{P(X)}{Q(X)}\right] \\ &= \mathrm{E}_{X \sim P}\left[\log P(X)-\log Q(X)\right] \\ &= \sum_{i=1}^{n} P\left(x_{i}\right)\left(\log P\left(x_{i}\right)-\log Q\left(x_{i}\right)\right) \end{aligned} $$
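As a worked instance of the last line of the formula, the following pure-Python sketch evaluates the sum for two small discrete distributions. The function name and the example distributions are chosen here purely for illustration.

```python
import math

def kl_divergence(p, q):
    """D_KL(P || Q) for two discrete distributions given as lists of probabilities."""
    return sum(pi * (math.log(pi) - math.log(qi)) for pi, qi in zip(p, q) if pi > 0)

p = [0.5, 0.5]
q = [0.9, 0.1]

print(kl_divergence(p, q))  # P measured against Q
print(kl_divergence(q, p))  # Q measured against P: a different value
print(kl_divergence(p, p))  # identical distributions give 0
```

The two directions give different values, which is why KL divergence is a divergence rather than a true (symmetric) distance.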

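Usage:

The loss takes input and target, where target corresponds to P in the formula. Here target holds probabilities and input holds log-probabilities, so what is actually computed is $ \sum \text{target} \times (\ln \text{target} - \text{input}) $.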

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    x = torch.randn((1, 8))
    y = torch.randn((1, 8))
    # Convert to probabilities, then take the log
    x_log = F.log_softmax(x, dim=1)
    # Convert to probabilities only
    y = F.softmax(y, dim=1)
    kl = nn.KLDivLoss(reduction='batchmean')
    out = kl(x_log, y)
    print(x)
    print(y)
    print(out)

Output:

    tensor([[-0.9543, -0.4117,  0.0377, -0.3320,  0.2467, -0.4887,  0.1111,  1.2274]])
    tensor([[0.0630, 0.0266, 0.0735, 0.2664, 0.1959, 0.1449, 0.1859, 0.0438]])
    tensor(0.4630)

Verification example:

    import math
    import torch
    import torch.nn as nn

    def validate_loss(output, target):
        # Reference implementation: accumulate y * (log y - x) over all elements
        val = 0
        for li_x, li_y in zip(output, target):
            for x, y in zip(li_x, li_y):
                val += y * (math.log(y) - x)
        return val / output.nelement()

    torch.manual_seed(20)
    loss = nn.KLDivLoss()
    # 4 samples with 3 classes each; every target row sums to 1
    input = torch.Tensor([[-2, -6, -8], [-7, -1, -2], [-1, -9, -2.3], [-1.9, -2.8, -5.4]])
    target = torch.Tensor([[0.8, 0.1, 0.1], [0.1, 0.7, 0.2], [0.5, 0.2, 0.3], [0.4, 0.3, 0.3]])
    output = loss(input, target)
    print("default loss:", output)

    output = validate_loss(input, target)
    print("validate loss:", output)

    loss = nn.KLDivLoss(reduction="batchmean")
    output = loss(input, target)
    print("batchmean loss:", output)

    loss = nn.KLDivLoss(reduction="mean")
    output = loss(input, target)
    print("mean loss:", output)

    loss = nn.KLDivLoss(reduction="none")
    output = loss(input, target)
    print("none loss:", output)

Output:

    default loss: tensor(0.6209)
    validate loss: tensor(0.6209)
    batchmean loss: tensor(1.8626)
    mean loss: tensor(0.6209)
    none loss: tensor([[1.4215, 0.3697, 0.5697],
            [0.4697, 0.4503, 0.0781],
            [0.1534, 1.4781, 0.3288],
            [0.3935, 0.4788, 1.2588]])

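MarginRankingLoss

    torch.nn.MarginRankingLoss(margin=0.0, size_average=None, reduce=None, reduction='mean')

Purpose: compares two inputs for ranking tasks. The label y = 1 means x1 should be ranked higher than x2, and y = -1 means the opposite; the loss measures how far the two inputs are from satisfying that order.

Main parameters:

margin: the margin, that is, the required difference between $x_{1}$ and $x_{2}$.

reduction: the aggregation mode, one of none/sum/mean.

The formula is:

$$ \operatorname{loss}(x_1, x_2, y)=\max\left(0, -y \cdot (x_1-x_2)+\operatorname{margin}\right) $$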
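A short pure-Python sketch shows how the sign of y and the margin interact for a single pair. The helper name is hypothetical and the sketch is only meant to illustrate the formula.

```python
def margin_ranking(x1, x2, y, margin=0.0):
    """Ranking loss for one pair; y = 1 means x1 should score higher than x2."""
    return max(0.0, -y * (x1 - x2) + margin)

# Correct order, no margin: nothing to penalize
print(margin_ranking(0.8, 0.3, y=1))

# Wrong order: the loss is the size of the violation
print(margin_ranking(0.3, 0.8, y=1))

# Correct order, but the gap is smaller than the margin
print(margin_ranking(0.8, 0.3, y=1, margin=1.0))
```

With a positive margin, the pair must not only be in the right order but also be separated by at least that margin before the loss reaches zero.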

    import torch
    import torch.nn as nn

    loss = nn.MarginRankingLoss()
    input1 = torch.randn(3, requires_grad=True)
    input2 = torch.randn(3, requires_grad=True)
    target = torch.randn(3).sign()  # random labels in {-1, 1}
    output = loss(input1, input2, target)
    output.backward()
    print('MarginRankingLoss result:', output)

Output:

    MarginRankingLoss result: tensor(0.7740, grad_fn=<MeanBackward0>)

Multi-label margin loss

    torch.nn.MultiLabelMarginLoss(size_average=None, reduce=None, reduction='mean')

Purpose: computes the loss for multi-label classification problems.

Main parameters:

reduction: the aggregation mode, one of none/sum/mean.

The formula is:

$$ \operatorname{loss}(x, y)=\sum_{ij} \frac{\max\left(0, 1-\left(x[y[j]]-x[i]\right)\right)}{x.\operatorname{size}(0)} $$

where $ i=0, \ldots, x.\operatorname{size}(0)-1 $ and $ j=0, \ldots, y.\operatorname{size}(0)-1 $, and for all i and j, $ y[j] \geq 0 $ and $ i \neq y[j] $.

    import torch
    import torch.nn as nn

    loss = nn.MultiLabelMarginLoss()
    x = torch.FloatTensor([0.9, 0.2, 0.4, 0.8])
    # For target y, only labels 3 and 0 are considered; everything after -1 is ignored
    y = torch.LongTensor([3, 0, -1, 1])  # the true classes are class 3 and class 0
    output = loss(x, y)
    print('MultiLabelMarginLoss result:', output)

Output:

    MultiLabelMarginLoss result: tensor(0.4500)
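The value 0.4500 can be reproduced by hand. The sketch below is a pure-Python reading of the formula for one sample; the function name is hypothetical, and the loop over "target classes versus non-target classes" is my interpretation of the index conditions.

```python
def multilabel_margin(x, y):
    """Multi-label margin loss for one sample; y lists target classes, ended by -1."""
    targets = []
    for label in y:
        if label < 0:  # labels after the first -1 are ignored
            break
        targets.append(label)
    total = 0.0
    for j in targets:
        for i in range(len(x)):
            if i not in targets:  # compare each target only with non-target classes
                total += max(0.0, 1.0 - (x[j] - x[i]))
    return total / len(x)

print(multilabel_margin([0.9, 0.2, 0.4, 0.8], [3, 0, -1, 1]))
```

Here the targets are classes 3 and 0 and the non-targets are classes 1 and 2, giving four hinge terms 0.4, 0.6, 0.3 and 0.5, whose sum 1.8 divided by 4 is 0.45.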

Binary classification loss (soft margin)

    torch.nn.SoftMarginLoss(size_average=None, reduce=None, reduction='mean')

Purpose: computes the logistic loss for binary classification.

Main parameters:

reduction: the aggregation mode, one of none/sum/mean.

The formula is:

$$ \operatorname{loss}(x, y)=\sum_{i} \frac{\log\left(1+\exp\left(-y[i] \cdot x[i]\right)\right)}{x.\operatorname{nelement}()} $$

where x.nelement() is the number of elements in the input x. Note that y takes only the two values 1 and -1.

    import torch
    import torch.nn as nn

    inputs = torch.tensor([[0.3, 0.7], [0.5, 0.5]])  # two samples, two neurons
    # The loss is computed per neuron, so each neuron needs its own label
    target = torch.tensor([[-1, 1], [1, -1]], dtype=torch.float)
    loss_f = nn.SoftMarginLoss()
    output = loss_f(inputs, target)
    print('SoftMarginLoss result:', output)

Output:

    SoftMarginLoss result: tensor(0.6764)
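The result 0.6764 follows directly from the formula. The pure-Python sketch below flattens the inputs and labels into lists, since the loss is an average over all elements; the function name is an illustrative choice.

```python
import math

def soft_margin(xs, ys):
    """Mean of log(1 + exp(-y * x)) over all elements; labels are 1 or -1."""
    return sum(math.log(1.0 + math.exp(-y * x)) for x, y in zip(xs, ys)) / len(xs)

# Inputs and labels from the example, flattened
print(soft_margin([0.3, 0.7, 0.5, 0.5], [-1, 1, 1, -1]))
```

Elements whose sign agrees with the label (0.7 and the first 0.5) contribute less than log 2, while the mismatched ones contribute more.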

Multi-class hinge loss

    torch.nn.MultiMarginLoss(p=1, margin=1.0, weight=None, size_average=None, reduce=None, reduction='mean')

Purpose: computes the hinge loss for multi-class classification.

Main parameters:

reduction: the aggregation mode, one of none/sum/mean.

p: the exponent in the formula; either 1 or 2.

weight: a weight for the loss of each class.

margin: the margin.

The formula is:

$$ \operatorname{loss}(x, y)=\frac{\sum_{i} \max\left(0, \operatorname{margin}-x[y]+x[i]\right)^{p}}{x.\operatorname{size}(0)} $$

where $ i \in\{0, \ldots, x.\operatorname{size}(0)-1\} $, $ 0 \leq y \leq x.\operatorname{size}(0)-1 $, and $ i \neq y $.

    import torch
    import torch.nn as nn

    inputs = torch.tensor([[0.3, 0.7], [0.5, 0.5]])  # two samples, two classes
    target = torch.tensor([0, 1], dtype=torch.long)   # target class of each sample
    loss_f = nn.MultiMarginLoss()
    output = loss_f(inputs, target)
    print('MultiMarginLoss result:', output)

Output:

    MultiMarginLoss result: tensor(0.6000)
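To see where 0.6000 comes from, here is a pure-Python sketch of the formula. It assumes the four input values form two samples with two classes each and that the targets are classes 0 and 1; the function name is hypothetical.

```python
def multi_margin(x, y, p=1, margin=1.0):
    """Multi-class hinge loss for one sample x with target class index y."""
    total = sum(
        max(0.0, margin - x[y] + x[i]) ** p
        for i in range(len(x))
        if i != y
    )
    return total / len(x)

batch = [[0.3, 0.7], [0.5, 0.5]]  # assumed layout: 2 samples x 2 classes
targets = [0, 1]

per_sample = [multi_margin(x, y) for x, y in zip(batch, targets)]
print(per_sample)
print(sum(per_sample) / len(per_sample))  # mean reduction over the batch
```

The first sample gives (1 - 0.3 + 0.7) / 2 = 0.7 and the second gives (1 - 0.5 + 0.5) / 2 = 0.5, so the batch mean is 0.6.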

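Triplet loss

    torch.nn.TripletMarginLoss(margin=1.0, p=2.0, eps=1e-06, swap=False, size_average=None, reduce=None, reduction='mean')

Purpose: computes the triplet loss.

Triplet: a format for storing or using data, <entity 1, relation, entity 2>. In practice it can also be written as <anchor, positive examples, negative examples>.

With this loss we want the anchor to be closer to the positive examples and farther from the negative examples.

Main parameters:

reduction: the aggregation mode, one of none/sum/mean.

p: the degree of the norm used for the distance; typically 1 or 2.

margin: the margin.

The formula is:

$ L(a, p, n)=\max\left\{d\left(a_{i}, p_{i}\right)-d\left(a_{i}, n_{i}\right)+\operatorname{margin}, 0\right\} $, where $ d\left(x_{i}, y_{i}\right)=\left\|\mathbf{x}_{i}-\mathbf{y}_{i}\right\|_{p} $.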
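A small pure-Python sketch makes the roles of the two distances and the margin explicit. The helper names are hypothetical, and the sketch deliberately ignores the eps and swap options of the PyTorch class.

```python
def p_norm_distance(u, v, p=2.0):
    """Distance between two vectors under the p-norm."""
    return sum(abs(a - b) ** p for a, b in zip(u, v)) ** (1.0 / p)

def triplet_margin(anchor, positive, negative, margin=1.0, p=2.0):
    """Triplet loss for one (anchor, positive, negative) triple."""
    d_pos = p_norm_distance(anchor, positive, p)
    d_neg = p_norm_distance(anchor, negative, p)
    return max(d_pos - d_neg + margin, 0.0)

anchor = [0.0, 0.0]
near = [0.0, 1.0]  # distance 1 from the anchor
far = [3.0, 4.0]   # distance 5 from the anchor

# Positive is near and negative is far: the constraint is satisfied
print(triplet_margin(anchor, near, far))

# Roles swapped: the positive is farther away than the negative
print(triplet_margin(anchor, far, near))
```

The loss is zero once the negative is at least margin farther from the anchor than the positive; otherwise it equals the size of the violation.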

    import torch
    import torch.nn as nn

    triplet_loss = nn.TripletMarginLoss(margin=1.0, p=2)
    anchor = torch.randn(100, 128, requires_grad=True)
    positive = torch.randn(100, 128, requires_grad=True)
    negative = torch.randn(100, 128, requires_grad=True)
    output = triplet_loss(anchor, positive, negative)
    output.backward()
    print('TripletMarginLoss result:', output)

Output:

    TripletMarginLoss result: tensor(1.1667, grad_fn=<MeanBackward0>)

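HingeEmbeddingLoss

    torch.nn.HingeEmbeddingLoss(margin=1.0, size_average=None, reduce=None, reduction='mean')

Purpose: applies a hinge loss to embedding outputs.

Main parameters:

reduction: the aggregation mode, one of none/sum/mean.

margin: the margin.

The formula is:

$$ l_{n}=\begin{cases}x_{n}, & \text{if } y_{n}=1 \\ \max\left\{0, \Delta-x_{n}\right\}, & \text{if } y_{n}=-1\end{cases} $$

Note: the input x should be the absolute value of the difference between two inputs, that is, a distance.

One way to read this: for a positive example ($y_n=1$) the loss is simply $x_n$; for a negative example ($y_n=-1$) the loss is obtained by comparing $x_n$ with the margin $\Delta$, and it is nonzero only when $x_n$ is smaller than $\Delta$.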

    import torch
    import torch.nn as nn

    loss_f = nn.HingeEmbeddingLoss()
    inputs = torch.tensor([1., 0.8, 0.5])
    target = torch.tensor([1, 1, -1])
    output = loss_f(inputs, target)
    print('HingeEmbeddingLoss result:', output)

Output:

    HingeEmbeddingLoss result: tensor(0.7667)
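The value 0.7667 can be checked with a few lines of pure Python. The function name is illustrative, and the default margin of 1.0 plays the role of Δ in the formula.

```python
def hinge_embedding(xs, ys, margin=1.0):
    """Mean hinge embedding loss; xs are distances, ys are labels in {1, -1}."""
    per_element = [
        x if y == 1 else max(0.0, margin - x)
        for x, y in zip(xs, ys)
    ]
    return sum(per_element) / len(per_element)

# Same inputs and targets as the example
print(hinge_embedding([1.0, 0.8, 0.5], [1, 1, -1]))
```

The three per-element losses are 1.0, 0.8 and max(0, 1 - 0.5) = 0.5, and their mean is about 0.7667.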

Cosine similarity

    torch.nn.CosineEmbeddingLoss(margin=0.0, size_average=None, reduce=None, reduction='mean')

Purpose: computes a loss from the cosine similarity of two vectors.

Main parameters:

reduction: the aggregation mode, one of none/sum/mean.

margin: may take values in [-1, 1]; values from 0 to 0.5 are recommended.

The formula is:

$$ \operatorname{loss}(x, y)=\begin{cases}1-\cos\left(x_{1}, x_{2}\right), & \text{if } y=1 \\ \max\left\{0, \cos\left(x_{1}, x_{2}\right)-\operatorname{margin}\right\}, & \text{if } y=-1\end{cases} $$

where

$$ \cos(\theta)=\frac{A \cdot B}{\|A\|\|B\|}=\frac{\sum_{i=1}^{n} A_{i} \times B_{i}}{\sqrt{\sum_{i=1}^{n}\left(A_{i}\right)^{2}} \times \sqrt{\sum_{i=1}^{n}\left(B_{i}\right)^{2}}} $$

This is probably the most widely known of these loss functions. It takes the cosine similarity of two vectors and uses it as a distance measure: for a pair labeled y = 1, the closer the two vectors are, the smaller the loss, and vice versa.

    import torch
    import torch.nn as nn

    loss_f = nn.CosineEmbeddingLoss()
    inputs_1 = torch.tensor([[0.3, 0.5, 0.7], [0.3, 0.5, 0.7]])
    inputs_2 = torch.tensor([[0.1, 0.3, 0.5], [0.1, 0.3, 0.5]])
    target = torch.tensor([1, -1], dtype=torch.float)  # one label per pair of vectors
    output = loss_f(inputs_1, inputs_2, target)
    print('CosineEmbeddingLoss result:', output)

Output:

    CosineEmbeddingLoss result: tensor(0.5000)
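The result 0.5000 has a simple explanation that a pure-Python sketch makes visible. It assumes the inputs form two identical pairs of 3-dimensional vectors, one labeled 1 and one labeled -1; the helper names are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity of two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def cosine_embedding(a, b, y, margin=0.0):
    """Cosine embedding loss for one pair with label y in {1, -1}."""
    c = cosine(a, b)
    return 1.0 - c if y == 1 else max(0.0, c - margin)

a = [0.3, 0.5, 0.7]
b = [0.1, 0.3, 0.5]

loss_pos = cosine_embedding(a, b, 1)   # 1 - cos
loss_neg = cosine_embedding(a, b, -1)  # cos, since margin is 0
print(loss_pos, loss_neg, (loss_pos + loss_neg) / 2)
```

Because the same pair of vectors appears once with each label and the margin is 0, the two losses are 1 - cos and cos, which always average to 0.5 whenever the cosine is non-negative.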

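CTC loss

    torch.nn.CTCLoss(blank=0, reduction='mean', zero_infinity=False)

Purpose: classification of sequential (time-series) data.

It computes the loss between a continuous time series and a target sequence. CTCLoss sums over the probabilities of the possible alignments of input to target, producing a loss value that is differentiable with respect to each input node. The alignment of input to target is assumed to be "many-to-one", which limits the length of the target sequence: it must be ≤ the input length.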
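The "many-to-one" alignment can be illustrated with the standard CTC collapsing rule: merge consecutive repeated labels, then drop the blank label. The pure-Python sketch below shows only this mapping, not the loss itself; `ctc_collapse` is a hypothetical helper name, and blank is assumed to be label 0 as in the default.

```python
def ctc_collapse(path, blank=0):
    """Map a frame-level label path to its target: merge repeats, then drop blanks."""
    out = []
    prev = None
    for label in path:
        if label != prev and label != blank:
            out.append(label)
        prev = label
    return out

# Two different 8-frame paths that collapse to the same 3-label target
print(ctc_collapse([0, 1, 1, 0, 1, 2, 2, 0]))
print(ctc_collapse([1, 1, 1, 0, 1, 0, 2, 2]))
```

Both paths collapse to the target [1, 1, 2]. Note that a blank is needed between the two 1s to keep them distinct, which is one reason the target cannot be longer than the input.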

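Main parameters:

reduction: the aggregation mode, one of none/sum/mean.

blank: the index of the blank label.

zero_infinity: whether infinite losses and the associated gradients are set to zero.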

    import torch
    import torch.nn as nn

    # Targets are to be padded
    T = 50      # Input sequence length
    C = 20      # Number of classes (including blank)
    N = 16      # Batch size
    S = 30      # Target sequence length of longest target in batch (padding length)
    S_min = 10  # Minimum target length, for demonstration purposes

    # Initialize random batch of input vectors, for size = (T, N, C)
    input = torch.randn(T, N, C).log_softmax(2).detach().requires_grad_()
    # Initialize random batch of targets (0 = blank, 1:C = classes)
    target = torch.randint(low=1, high=C, size=(N, S), dtype=torch.long)
    input_lengths = torch.full(size=(N,), fill_value=T, dtype=torch.long)
    target_lengths = torch.randint(low=S_min, high=S, size=(N,), dtype=torch.long)
    ctc_loss = nn.CTCLoss()
    loss = ctc_loss(input, target, input_lengths, target_lengths)
    loss.backward()

    # Targets are to be un-padded
    T = 50  # Input sequence length
    C = 20  # Number of classes (including blank)
    N = 16  # Batch size

    # Initialize random batch of input vectors, for size = (T, N, C)
    input = torch.randn(T, N, C).log_softmax(2).detach().requires_grad_()
    input_lengths = torch.full(size=(N,), fill_value=T, dtype=torch.long)
    # Initialize random batch of targets (0 = blank, 1:C = classes)
    target_lengths = torch.randint(low=1, high=T, size=(N,), dtype=torch.long)
    target = torch.randint(low=1, high=C, size=(sum(target_lengths),), dtype=torch.long)
    ctc_loss = nn.CTCLoss()
    loss = ctc_loss(input, target, input_lengths, target_lengths)
    loss.backward()
    print('CTCLoss result:', loss)

Output:

    CTCLoss result: tensor(16.0885, grad_fn=<MeanBackward0>)

References

https://datawhalechina.github.io/thorough-pytorch/第三章/3.6 损失函数.html
https://blog.csdn.net/weixin_46566663/article/details/127911813
https://blog.csdn.net/ltochange/article/details/117935410
https://zhuanlan.zhihu.com/p/98785902
https://blog.csdn.net/qq_50001789/article/details/128974654
https://www.jianshu.com/p/98ec08ea3bec
https://blog.csdn.net/qq_39707285/article/details/124032964

Article link: https://cloud.tencent.com/developer/article/2304594

