torch.nn.NLLLoss()

torch.nn.NLLLoss comes up frequently among the loss functions used for classification problems. It is usually not used as a standalone loss; instead it is combined with operations such as softmax and log to form the actual loss function.

  • Input shape: (N, C)
  • Target shape: (N)

Official documentation link for torch.nn.NLLLoss

1. Source code

class NLLLoss(_WeightedLoss):
    r"""The negative log likelihood loss. It is useful to train a classification
    problem with `C` classes.

    If provided, the optional argument :attr:`weight` should be a 1D Tensor assigning
    weight to each of the classes. This is particularly useful when you have an
    unbalanced training set.

    The `input` given through a forward call is expected to contain
    log-probabilities of each class. `input` has to be a Tensor of size either
    :math:`(minibatch, C)` or :math:`(minibatch, C, d_1, d_2, ..., d_K)`
    with :math:`K \geq 1` for the `K`-dimensional case (described later).

    Obtaining log-probabilities in a neural network is easily achieved by
    adding a  `LogSoftmax`  layer in the last layer of your network.
    You may use `CrossEntropyLoss` instead, if you prefer not to add an extra
    layer.

    The `target` that this loss expects should be a class index in the range :math:`[0, C-1]`
    where `C = number of classes`; if `ignore_index` is specified, this loss also accepts
    this class index (this index may not necessarily be in the class range).

    The unreduced (i.e. with :attr:`reduction` set to ``'none'``) loss can be described as:

    .. math::
        \ell(x, y) = L = \{l_1,\dots,l_N\}^\top, \quad
        l_n = - w_{y_n} x_{n,y_n}, \quad
        w_{c} = \text{weight}[c] \cdot \mathbb{1}\{c \not= \text{ignore\_index}\},

    where :math:`x` is the input, :math:`y` is the target, :math:`w` is the weight, and
    :math:`N` is the batch size. If :attr:`reduction` is not ``'none'``
    (default ``'mean'``), then

    .. math::
        \ell(x, y) = \begin{cases}
            \sum_{n=1}^N \frac{1}{\sum_{n=1}^N w_{y_n}} l_n, &
            \text{if reduction} = \text{`mean';}\\
            \sum_{n=1}^N l_n,  &
            \text{if reduction} = \text{`sum'.}
        \end{cases}

    Can also be used for higher dimension inputs, such as 2D images, by providing
    an input of size :math:`(minibatch, C, d_1, d_2, ..., d_K)` with :math:`K \geq 1`,
    where :math:`K` is the number of dimensions, and a target of appropriate shape
    (see below). In the case of images, it computes NLL loss per-pixel.

    Args:
        weight (Tensor, optional): a manual rescaling weight given to each
            class. If given, it has to be a Tensor of size `C`. Otherwise, it is
            treated as if having all ones.
        size_average (bool, optional): Deprecated (see :attr:`reduction`). By default,
            the losses are averaged over each loss element in the batch. Note that for
            some losses, there are multiple elements per sample. If the field :attr:`size_average`
            is set to ``False``, the losses are instead summed for each minibatch. Ignored
            when :attr:`reduce` is ``False``. Default: ``True``
        ignore_index (int, optional): Specifies a target value that is ignored
            and does not contribute to the input gradient. When
            :attr:`size_average` is ``True``, the loss is averaged over
            non-ignored targets.
        reduce (bool, optional): Deprecated (see :attr:`reduction`). By default, the
            losses are averaged or summed over observations for each minibatch depending
            on :attr:`size_average`. When :attr:`reduce` is ``False``, returns a loss per
            batch element instead and ignores :attr:`size_average`. Default: ``True``
        reduction (string, optional): Specifies the reduction to apply to the output:
            ``'none'`` | ``'mean'`` | ``'sum'``. ``'none'``: no reduction will
            be applied, ``'mean'``: the weighted mean of the output is taken,
            ``'sum'``: the output will be summed. Note: :attr:`size_average`
            and :attr:`reduce` are in the process of being deprecated, and in
            the meantime, specifying either of those two args will override
            :attr:`reduction`. Default: ``'mean'``

    Shape:
        - Input: :math:`(N, C)` where `C = number of classes`, or
          :math:`(N, C, d_1, d_2, ..., d_K)` with :math:`K \geq 1`
          in the case of `K`-dimensional loss.
        - Target: :math:`(N)` where each value is :math:`0 \leq \text{targets}[i] \leq C-1`, or
          :math:`(N, d_1, d_2, ..., d_K)` with :math:`K \geq 1` in the case of
          K-dimensional loss.
        - Output: scalar.
          If :attr:`reduction` is ``'none'``, then the same size as the target: :math:`(N)`, or
          :math:`(N, d_1, d_2, ..., d_K)` with :math:`K \geq 1` in the case
          of K-dimensional loss.

    Examples::

        >>> m = nn.LogSoftmax(dim=1)
        >>> loss = nn.NLLLoss()
        >>> # input is of size N x C = 3 x 5
        >>> input = torch.randn(3, 5, requires_grad=True)
        >>> # each element in target has to have 0 <= value < C
        >>> target = torch.tensor([1, 0, 4])
        >>> output = loss(m(input), target)
        >>> output.backward()
        >>>
        >>>
        >>> # 2D loss example (used, for example, with image inputs)
        >>> N, C = 5, 4
        >>> loss = nn.NLLLoss()
        >>> # input is of size N x C x height x width
        >>> data = torch.randn(N, 16, 10, 10)
        >>> conv = nn.Conv2d(16, C, (3, 3))
        >>> m = nn.LogSoftmax(dim=1)
        >>> # each element in target has to have 0 <= value < C
        >>> target = torch.empty(N, 8, 8, dtype=torch.long).random_(0, C)
        >>> output = loss(m(conv(data)), target)
        >>> output.backward()
    """
    __constants__ = ['ignore_index', 'reduction']
    ignore_index: int

    def __init__(self, weight: Optional[Tensor] = None, size_average=None, ignore_index: int = -100, reduce=None, reduction: str = 'mean') -> None:
        super(NLLLoss, self).__init__(weight, size_average, reduce, reduction)
        self.ignore_index = ignore_index

    def forward(self, input: Tensor, target: Tensor) -> Tensor:
        assert self.weight is None or isinstance(self.weight, Tensor)
        return F.nll_loss(input, target, weight=self.weight, ignore_index=self.ignore_index, reduction=self.reduction)

2. Example

from torch import nn
import torch

# NLLLoss must be instantiated first
nllloss = nn.NLLLoss(reduction='mean')  # reduction can be 'mean', 'sum' or 'none'; the default is 'mean'

# NLLLoss takes two tensors: the prediction and the label

# --------------------------- 1. predict with shape (1, category) ---------------------------
# predict holds one raw score per class; e.g. the vector (2, 5, 3) gives classes 0, 1, 2 the scores 2, 5 and 3
predict01 = torch.Tensor([[2, 5, 3]])  # shape: (n, category)
# label has shape (n,) and holds the correct class index of each of the n rows; here label is 2, so the correct class for (2, 5, 3) is class 2
label01 = torch.tensor([2])  # shape: (n,)
# NLLLoss takes, from each row of predict, the entry at the index given by label and negates it. The label is 2, so element 2 of (2, 5, 3), i.e. 3, is taken and returned with a minus sign.
loss01 = nllloss(predict01, label01)
print('loss01 = ', loss01)  # loss01 =  tensor(-3.)

# --------------------------- 2. predict with shape (n, category) ---------------------------
predict02 = torch.Tensor([[2, 5, 3],
                          [3, 1, 6]])
label02 = torch.tensor([1, 2])
# NLLLoss again takes, from each row of predict, the entry at the index given by label and negates it.
# The first label is 1, so element 1 of (2, 5, 3), i.e. 5, is taken; the second label is 2, so element 2 of (3, 1, 6), i.e. 6, is taken.
# The two values 5 and 6 are averaged and the result is negated.
loss02 = nllloss(predict02, label02)
print('loss02 = ', loss02)  # loss02 =  tensor(-5.5000)

Output:

loss01 =  tensor(-3.)
loss02 =  tensor(-5.5000)
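
The take-the-indexed-entry-and-negate behavior described in the comments can also be reproduced by hand. Below is a minimal sketch, reusing the predict02 and label02 tensors from above (the variable names picked and manual_loss are only for illustration):

import torch

predict02 = torch.Tensor([[2, 5, 3],
                          [3, 1, 6]])
label02 = torch.tensor([1, 2])

# Pick predict02[n, label02[n]] for each row, negate, then average over the batch
picked = predict02.gather(1, label02.unsqueeze(1)).squeeze(1)  # tensor([5., 6.])
manual_loss = -picked.mean()
print(manual_loss)  # tensor(-5.5000), same as nllloss(predict02, label02)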

torch.nn.CrossEntropyLoss (cross-entropy loss)

The relationship to nn.NLLLoss can be described as: softmax(x) + log(x) + nn.NLLLoss ====> nn.CrossEntropyLoss. In other words, nn.CrossEntropyLoss is equivalent to applying softmax, taking the log, and then feeding the result to nn.NLLLoss (a quick check of this equivalence is sketched after the shape notes below).

  • Input shape: (N, C)
  • Target shape: (N)
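
As a quick sanity check of this equivalence, the functional API can be used; this is only a sketch with random inputs:

import torch
import torch.nn.functional as F

x = torch.randn(4, 3)           # (N, C) raw scores
y = torch.tensor([0, 2, 1, 2])  # (N,) class indices

# cross_entropy == log_softmax followed by nll_loss
a = F.cross_entropy(x, y)
b = F.nll_loss(F.log_softmax(x, dim=1), y)
print(torch.allclose(a, b))  # True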

1. Source code

class CrossEntropyLoss(_WeightedLoss):
    r"""This criterion combines :class:`~torch.nn.LogSoftmax` and :class:`~torch.nn.NLLLoss` in one single class.

    It is useful when training a classification problem with `C` classes.
    If provided, the optional argument :attr:`weight` should be a 1D `Tensor`
    assigning weight to each of the classes.
    This is particularly useful when you have an unbalanced training set.

    The `input` is expected to contain raw, unnormalized scores for each class.

    `input` has to be a Tensor of size either :math:`(minibatch, C)` or
    :math:`(minibatch, C, d_1, d_2, ..., d_K)`
    with :math:`K \geq 1` for the `K`-dimensional case (described later).

    This criterion expects a class index in the range :math:`[0, C-1]` as the
    `target` for each value of a 1D tensor of size `minibatch`; if `ignore_index`
    is specified, this criterion also accepts this class index (this index may not
    necessarily be in the class range).

    The loss can be described as:

    .. math::
        \text{loss}(x, class) = -\log\left(\frac{\exp(x[class])}{\sum_j \exp(x[j])}\right)
                       = -x[class] + \log\left(\sum_j \exp(x[j])\right)

    or in the case of the :attr:`weight` argument being specified:

    .. math::
        \text{loss}(x, class) = weight[class] \left(-x[class] + \log\left(\sum_j \exp(x[j])\right)\right)

    The losses are averaged across observations for each minibatch. If the
    :attr:`weight` argument is specified then this is a weighted average:

    .. math::
        \text{loss} = \frac{\sum^{N}_{i=1} loss(i, class[i])}{\sum^{N}_{i=1} weight[class[i]]}

    Can also be used for higher dimension inputs, such as 2D images, by providing
    an input of size :math:`(minibatch, C, d_1, d_2, ..., d_K)` with :math:`K \geq 1`,
    where :math:`K` is the number of dimensions, and a target of appropriate shape
    (see below).


    Args:
        weight (Tensor, optional): a manual rescaling weight given to each class.
            If given, has to be a Tensor of size `C`
        size_average (bool, optional): Deprecated (see :attr:`reduction`). By default,
            the losses are averaged over each loss element in the batch. Note that for
            some losses, there are multiple elements per sample. If the field :attr:`size_average`
            is set to ``False``, the losses are instead summed for each minibatch. Ignored
            when :attr:`reduce` is ``False``. Default: ``True``
        ignore_index (int, optional): Specifies a target value that is ignored
            and does not contribute to the input gradient. When :attr:`size_average` is
            ``True``, the loss is averaged over non-ignored targets.
        reduce (bool, optional): Deprecated (see :attr:`reduction`). By default, the
            losses are averaged or summed over observations for each minibatch depending
            on :attr:`size_average`. When :attr:`reduce` is ``False``, returns a loss per
            batch element instead and ignores :attr:`size_average`. Default: ``True``
        reduction (string, optional): Specifies the reduction to apply to the output:
            ``'none'`` | ``'mean'`` | ``'sum'``. ``'none'``: no reduction will
            be applied, ``'mean'``: the weighted mean of the output is taken,
            ``'sum'``: the output will be summed. Note: :attr:`size_average`
            and :attr:`reduce` are in the process of being deprecated, and in
            the meantime, specifying either of those two args will override
            :attr:`reduction`. Default: ``'mean'``

    Shape:
        - Input: :math:`(N, C)` where `C = number of classes`, or
          :math:`(N, C, d_1, d_2, ..., d_K)` with :math:`K \geq 1`
          in the case of `K`-dimensional loss.
        - Target: :math:`(N)` where each value is :math:`0 \leq \text{targets}[i] \leq C-1`, or
          :math:`(N, d_1, d_2, ..., d_K)` with :math:`K \geq 1` in the case of
          K-dimensional loss.
        - Output: scalar.
          If :attr:`reduction` is ``'none'``, then the same size as the target:
          :math:`(N)`, or
          :math:`(N, d_1, d_2, ..., d_K)` with :math:`K \geq 1` in the case
          of K-dimensional loss.

    Examples::

        >>> loss = nn.CrossEntropyLoss()
        >>> input = torch.randn(3, 5, requires_grad=True)
        >>> target = torch.empty(3, dtype=torch.long).random_(5)
        >>> output = loss(input, target)
        >>> output.backward()
    """
    __constants__ = ['ignore_index', 'reduction']
    ignore_index: int

    def __init__(self, weight: Optional[Tensor] = None, size_average=None, ignore_index: int = -100, reduce=None, reduction: str = 'mean') -> None:
        super(CrossEntropyLoss, self).__init__(weight, size_average, reduce, reduction)
        self.ignore_index = ignore_index

    def forward(self, input: Tensor, target: Tensor) -> Tensor:
        assert self.weight is None or isinstance(self.weight, Tensor)
        return F.cross_entropy(input, target, weight=self.weight, ignore_index=self.ignore_index, reduction=self.reduction)

2. Example

2.1 Using nn.CrossEntropyLoss directly

from torch import nn
import torch

# CrossEntropyLoss must be instantiated first
crossentropyloss = nn.CrossEntropyLoss(reduction='mean')  # reduction can be 'mean', 'sum' or 'none'; the default is 'mean'

# CrossEntropyLoss takes two tensors: the prediction and the label

# --------------------------- 1. predict with shape (1, category) ---------------------------
# predict holds one raw, unnormalized score per class; e.g. the vector (2, 5, 3) gives classes 0, 1, 2 the scores 2, 5 and 3
predict01 = torch.Tensor([[2, 5, 3]])  # shape: (n, category)
# label has shape (n,) and holds the correct class index of each of the n rows; here label is 2, so the correct class for (2, 5, 3) is class 2
label01 = torch.tensor([2])  # shape: (n,)
loss01 = crossentropyloss(predict01, label01)
print('loss01 = ', loss01)  # loss01 =  tensor(2.1698)

# --------------------------- 2. predict with shape (n, category) ---------------------------
predict02 = torch.Tensor([[2, 5, 3],
                          [3, 1, 6]])
label02 = torch.tensor([1, 2])
loss02 = crossentropyloss(predict02, label02)
print('loss02 = ', loss02)  # loss02 =  tensor(0.1124)

Output:

loss01 =  tensor(2.1698)
loss02 =  tensor(0.1124)
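
These numbers can be checked directly against the formula from the docstring, loss(x, class) = -x[class] + log(sum_j exp(x[j])). A small sketch of that check, reusing predict02 and label02 from the example above:

import torch

predict02 = torch.Tensor([[2, 5, 3],
                          [3, 1, 6]])
label02 = torch.tensor([1, 2])

# loss(x, class) = -x[class] + log(sum_j exp(x[j])), then average over the batch
picked = predict02.gather(1, label02.unsqueeze(1)).squeeze(1)  # tensor([5., 6.])
per_sample = -picked + torch.logsumexp(predict02, dim=1)       # tensor([0.1698, 0.0550])
print(per_sample.mean())  # tensor(0.1124), matches nn.CrossEntropyLoss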

2.2 Reproducing nn.CrossEntropyLoss with softmax(x) + log(x) + nn.NLLLoss

from torch import nn
import torch

# NLLLoss must be instantiated first
nllloss = nn.NLLLoss(reduction='mean')  # reduction can be 'mean', 'sum' or 'none'; the default is 'mean'
softmax_fn = nn.Softmax(dim=1)

# NLLLoss takes two tensors: the prediction and the label

# --------------------------- 1. predict with shape (1, category) ---------------------------
# predict holds one raw score per class; e.g. the vector (2, 5, 3) gives classes 0, 1, 2 the scores 2, 5 and 3
predict01 = torch.Tensor([[2, 5, 3]])  # shape: (n, category)
# Apply softmax to the input; each row now sums to 1
soft_output01 = softmax_fn(predict01)
print("soft_output01 = ", soft_output01)
# Take the log of the softmax output
log_output01 = torch.log(soft_output01)
print("log_output01 = ", log_output01)

# label has shape (n,) and holds the correct class index of each row; here label is 2, so the correct class for (2, 5, 3) is class 2
label01 = torch.tensor([2])  # shape: (n,)
loss01 = nllloss(log_output01, label01)
print('\nloss01 = ', loss01)  # loss01 =  tensor(2.1698)

# --------------------------- 2. predict with shape (n, category) ---------------------------
predict02 = torch.Tensor([[2, 5, 3],
                          [3, 1, 6]])
# Apply softmax to the input; each row now sums to 1
soft_output02 = softmax_fn(predict02)
print("soft_output02 = ", soft_output02)
# Take the log of the softmax output
log_output02 = torch.log(soft_output02)
print("log_output02 = ", log_output02)

label02 = torch.tensor([1, 2])
loss02 = nllloss(log_output02, label02)
print('\nloss02 = ', loss02)  # loss02 =  tensor(0.1124)

Output:

soft_output01 =  tensor([[0.0420, 0.8438, 0.1142]])
log_output01 =  tensor([[-3.1698, -0.1698, -2.1698]])

loss01 =  tensor(2.1698)

soft_output02 =  tensor([[0.0420, 0.8438, 0.1142],
        [0.0471, 0.0064, 0.9465]])
log_output02 =  tensor([[-3.1698, -0.1698, -2.1698],
        [-3.0550, -5.0550, -0.0550]])

loss02 =  tensor(0.1124)

The results above show that computing the loss directly with loss_func = nn.CrossEntropyLoss() gives exactly the same value as the softmax-log-NLLLoss pipeline.
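
Note that the same equivalence is usually written with nn.LogSoftmax, which fuses the softmax and the log into one numerically more stable step. A minimal sketch, with the same predict02 and label02 as above:

from torch import nn
import torch

log_softmax = nn.LogSoftmax(dim=1)
nllloss = nn.NLLLoss(reduction='mean')

predict02 = torch.Tensor([[2, 5, 3],
                          [3, 1, 6]])
label02 = torch.tensor([1, 2])

# LogSoftmax followed by NLLLoss reproduces the nn.CrossEntropyLoss result
loss02 = nllloss(log_softmax(predict02), label02)
print(loss02)  # tensor(0.1124)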

torch.nn.BCELoss

Input and Target have the same shape, (N, *).

1. Source code

class BCELoss(_WeightedLoss):
    r"""Creates a criterion that measures the Binary Cross Entropy
    between the target and the output:

    The unreduced (i.e. with :attr:`reduction` set to ``'none'``) loss can be described as:

    .. math::
        \ell(x, y) = L = \{l_1,\dots,l_N\}^\top, \quad
        l_n = - w_n \left[ y_n \cdot \log x_n + (1 - y_n) \cdot \log (1 - x_n) \right],

    where :math:`N` is the batch size. If :attr:`reduction` is not ``'none'``
    (default ``'mean'``), then

    .. math::
        \ell(x, y) = \begin{cases}
            \operatorname{mean}(L), & \text{if reduction} = \text{`mean';}\\
            \operatorname{sum}(L),  & \text{if reduction} = \text{`sum'.}
        \end{cases}

    This is used for measuring the error of a reconstruction in for example
    an auto-encoder. Note that the targets :math:`y` should be numbers
    between 0 and 1.

    Notice that if :math:`x_n` is either 0 or 1, one of the log terms would be
    mathematically undefined in the above loss equation. PyTorch chooses to set
    :math:`\log (0) = -\infty`, since :math:`\lim_{x\to 0} \log (x) = -\infty`.
    However, an infinite term in the loss equation is not desirable for several reasons.

    For one, if either :math:`y_n = 0` or :math:`(1 - y_n) = 0`, then we would be
    multiplying 0 with infinity. Secondly, if we have an infinite loss value, then
    we would also have an infinite term in our gradient, since
    :math:`\lim_{x\to 0} \frac{d}{dx} \log (x) = \infty`.
    This would make BCELoss's backward method nonlinear with respect to :math:`x_n`,
    and using it for things like linear regression would not be straight-forward.

    Our solution is that BCELoss clamps its log function outputs to be greater than
    or equal to -100. This way, we can always have a finite loss value and a linear
    backward method.


    Args:
        weight (Tensor, optional): a manual rescaling weight given to the loss
            of each batch element. If given, has to be a Tensor of size `nbatch`.
        size_average (bool, optional): Deprecated (see :attr:`reduction`). By default,
            the losses are averaged over each loss element in the batch. Note that for
            some losses, there are multiple elements per sample. If the field :attr:`size_average`
            is set to ``False``, the losses are instead summed for each minibatch. Ignored
            when :attr:`reduce` is ``False``. Default: ``True``
        reduce (bool, optional): Deprecated (see :attr:`reduction`). By default, the
            losses are averaged or summed over observations for each minibatch depending
            on :attr:`size_average`. When :attr:`reduce` is ``False``, returns a loss per
            batch element instead and ignores :attr:`size_average`. Default: ``True``
        reduction (string, optional): Specifies the reduction to apply to the output:
            ``'none'`` | ``'mean'`` | ``'sum'``. ``'none'``: no reduction will be applied,
            ``'mean'``: the sum of the output will be divided by the number of
            elements in the output, ``'sum'``: the output will be summed. Note: :attr:`size_average`
            and :attr:`reduce` are in the process of being deprecated, and in the meantime,
            specifying either of those two args will override :attr:`reduction`. Default: ``'mean'``

    Shape:
        - Input: :math:`(N, *)` where :math:`*` means, any number of additional
          dimensions
        - Target: :math:`(N, *)`, same shape as the input
        - Output: scalar. If :attr:`reduction` is ``'none'``, then :math:`(N, *)`, same
          shape as input.

    Examples::

        >>> m = nn.Sigmoid()
        >>> loss = nn.BCELoss()
        >>> input = torch.randn(3, requires_grad=True)
        >>> target = torch.empty(3).random_(2)
        >>> output = loss(m(input), target)
        >>> output.backward()
    """
    __constants__ = ['reduction']

    def __init__(self, weight: Optional[Tensor] = None, size_average=None, reduce=None, reduction: str = 'mean') -> None:
        super(BCELoss, self).__init__(weight, size_average, reduce, reduction)

    def forward(self, input: Tensor, target: Tensor) -> Tensor:
        assert self.weight is None or isinstance(self.weight, Tensor)
        return F.binary_cross_entropy(input, target, weight=self.weight, reduction=self.reduction)

2. Example

from torch import nn
import torch

# BCELoss must be instantiated first
bce_loss = nn.BCELoss()  # reduction can be 'mean', 'sum' or 'none'; the default is 'mean'
sigmoid = nn.Sigmoid()

input = torch.Tensor([[2, 6, 7]])
m_input = sigmoid(input)
print('m_input = ', m_input)

target = torch.Tensor([[0, 1, 0]])

output = bce_loss(m_input, target)
print('output = ', output)

Output:

m_input =  tensor([[0.8808, 0.9975, 0.9991]])
output =  tensor(3.0435)
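
The printed value can also be reproduced by evaluating the BCE formula element-wise; a small sketch, reusing the input and target from above:

import torch

input = torch.Tensor([[2, 6, 7]])
target = torch.Tensor([[0, 1, 0]])

x = torch.sigmoid(input)
# l_n = -[y_n * log(x_n) + (1 - y_n) * log(1 - x_n)], then mean over all elements
manual = -(target * torch.log(x) + (1 - target) * torch.log(1 - x)).mean()
print(manual)  # tensor(3.0435), matches nn.BCELoss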


