This is mainly the official PyTorch tutorial text, with Ascend NPU examples and hands-on practice added.
Tensors
Tensors are a specialized data structure that are very similar to arrays and matrices. In PyTorch, we use tensors to encode the inputs and outputs of a model, as well as the model’s parameters.
Tensors are similar to NumPy’s ndarrays, except that tensors can run on GPUs or other specialized hardware to accelerate computing. If you’re familiar with ndarrays, you’ll be right at home with the Tensor API. If not, follow along in this quick API walkthrough.
```python
import torch
import numpy as np
```
Tensor Initialization
Tensors can be initialized in various ways. Take a look at the following examples:
Directly from data
Tensors can be created directly from data. The data type is automatically inferred.
```python
data = [[1, 2], [3, 4]]
x_data = torch.tensor(data)
```
From a NumPy array
Tensors can be created from NumPy arrays (and vice versa - see Bridge with NumPy).
```python
np_array = np.array(data)
x_np = torch.from_numpy(np_array)
```
From another tensor:
The new tensor retains the properties (shape, datatype) of the argument tensor, unless explicitly overridden.
```python
x_ones = torch.ones_like(x_data) # retains the properties of x_data
print(f"Ones Tensor: \n {x_ones} \n")

x_rand = torch.rand_like(x_data, dtype=torch.float) # overrides the datatype of x_data
print(f"Random Tensor: \n {x_rand} \n")
```
```text
Ones Tensor:
 tensor([[1, 1],
        [1, 1]])

Random Tensor:
 tensor([[0.8823, 0.9150],
        [0.3829, 0.9593]])
```
With random or constant values:
shape is a tuple of tensor dimensions. In the functions below, it determines the dimensionality of the output tensor.
```python
shape = (2, 3,)
rand_tensor = torch.rand(shape)
ones_tensor = torch.ones(shape)
zeros_tensor = torch.zeros(shape)

print(f"Random Tensor: \n {rand_tensor} \n")
print(f"Ones Tensor: \n {ones_tensor} \n")
print(f"Zeros Tensor: \n {zeros_tensor}")
```
```text
Random Tensor:
 tensor([[0.3904, 0.6009, 0.2566],
        [0.7936, 0.9408, 0.1332]])

Ones Tensor:
 tensor([[1., 1., 1.],
        [1., 1., 1.]])

Zeros Tensor:
 tensor([[0., 0., 0.],
        [0., 0., 0.]])
```
Tensor Attributes
Tensor attributes describe their shape, datatype, and the device on which they are stored.
```python
tensor = torch.rand(3, 4)

print(f"Shape of tensor: {tensor.shape}")
print(f"Datatype of tensor: {tensor.dtype}")
print(f"Device tensor is stored on: {tensor.device}")
```
Tensor Operations
Over 100 tensor operations, including transposing, indexing, slicing, mathematical operations, linear algebra, random sampling, and more are comprehensively described at https://pytorch.org/docs/stable/torch.html.
Each of them can be run on the GPU (at typically higher speeds than on a CPU). If you’re using Colab, allocate a GPU by going to Edit > Notebook Settings.
```python
# We move our tensor to the GPU if available
if torch.cuda.is_available():
    tensor = tensor.to('cuda')
    print(f"Device tensor is stored on: {tensor.device}")
```
```text
Device tensor is stored on: cuda:0
```
Try out some of the operations from the list. If you’re familiar with the NumPy API, you’ll find the Tensor API a breeze to use.
Ascend NPU
If you are running on an Ascend NPU, the simplest option is from torch_npu.contrib import transfer_to_npu, which migrates CUDA calls automatically.
```python
>>> torch.cuda.is_available()
False
>>> import torch_npu
>>> from torch_npu.contrib import transfer_to_npu
>>> torch.cuda.is_available()
True
>>> tensor = tensor.to('cuda')
>>> print(f"Device tensor is stored on: {tensor.device}")
Device tensor is stored on: npu:0
```
As you can see, once transfer_to_npu has been imported, CUDA-related calls are automatically redirected to the NPU, and the corresponding availability checks pass as well.
Alternatively, you can use the NPU-specific APIs directly, for example:
```python
>>> import torch_npu
>>> torch.npu.is_available()
True
>>> tensor = tensor.npu()
>>> print(f"Device tensor is stored on: {tensor.device}")
Device tensor is stored on: npu:0
```
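A device-agnostic pattern that works on CUDA, NPU, or CPU can also be handy. The sketch below is not from the original tutorial and assumes torch_npu is installed when an NPU is present:

```python
import torch

try:
    import torch_npu  # noqa: F401 -- only available on Ascend machines
    device = "npu" if torch.npu.is_available() else "cpu"
except ImportError:
    device = "cuda" if torch.cuda.is_available() else "cpu"

tensor = torch.rand(3, 4).to(device)
print(f"Device tensor is stored on: {tensor.device}")
```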
Standard numpy-like indexing and slicing:
```python
tensor = torch.ones(4, 4)
tensor[:,1] = 0
print(tensor)
```
```text
tensor([[1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.]])
```
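A few more slicing patterns following the same NumPy conventions (a small sketch, not part of the excerpt above):

```python
tensor = torch.ones(4, 4)
print(f"First row: {tensor[0]}")
print(f"First column: {tensor[:, 0]}")
print(f"Last column: {tensor[..., -1]}")
tensor[:, 1] = 0          # assign to a whole column at once
print(tensor)
```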
Joining tensors
You can use torch.cat to concatenate a sequence of tensors along a given dimension. See also torch.stack, another tensor joining op that is subtly different from torch.cat.
```python
t1 = torch.cat([tensor, tensor, tensor], dim=1)
print(t1)
```
```text
tensor([[1., 0., 1., 1., 1., 0., 1., 1., 1., 0., 1., 1.],
        [1., 0., 1., 1., 1., 0., 1., 1., 1., 0., 1., 1.],
        [1., 0., 1., 1., 1., 0., 1., 1., 1., 0., 1., 1.],
        [1., 0., 1., 1., 1., 0., 1., 1., 1., 0., 1., 1.]])
```
The dim argument of torch.cat and torch.stack can be a bit confusing; the cases from the official docs are a good way to build intuition.
For torch.cat with dim=0, the concatenation happens along dimension 0 of the tensor's shape, i.e. the rows; with dim=1 it happens along dimension 1, i.e. the columns. For higher-dimensional tensors the length/width/height analogy breaks down, so reason instead about which entry of the shape grows.
```python
>>> x = torch.randn(2, 3)
>>> x
tensor([[ 0.0679, -0.3655, -1.5670],
        [-0.6854,  0.1267, -0.8296]])
>>> x0 = torch.cat((x, x, x), 0)
>>> x0.shape
torch.Size([6, 3])
>>> x0
tensor([[ 0.0679, -0.3655, -1.5670],
        [-0.6854,  0.1267, -0.8296],
        [ 0.0679, -0.3655, -1.5670],
        [-0.6854,  0.1267, -0.8296],
        [ 0.0679, -0.3655, -1.5670],
        [-0.6854,  0.1267, -0.8296]])
>>> x1 = torch.cat((x, x, x), 1)
>>> x1.shape
torch.Size([2, 9])
>>> x1
tensor([[ 0.0679, -0.3655, -1.5670,  0.0679, -0.3655, -1.5670,  0.0679, -0.3655,
         -1.5670],
        [-0.6854,  0.1267, -0.8296, -0.6854,  0.1267, -0.8296, -0.6854,  0.1267,
         -0.8296]])
```
```python
>>> x_0_t
tensor([[1.1100, 1.1200, 1.1300, 1.1400],
        [1.2100, 1.2200, 1.2300, 1.2400],
        [1.3100, 1.3200, 1.3300, 1.3400]])
>>> x_1_t
tensor([[2.1100, 2.1200, 2.1300, 2.1400],
        [2.2100, 2.2200, 2.2300, 2.2400],
        [2.3100, 2.3200, 2.3300, 2.3400]])
>>> y0 = torch.stack((x_0_t, x_1_t), dim=0)
>>> y0.shape
torch.Size([2, 3, 4])
>>> y0
tensor([[[1.1100, 1.1200, 1.1300, 1.1400],
         [1.2100, 1.2200, 1.2300, 1.2400],
         [1.3100, 1.3200, 1.3300, 1.3400]],

        [[2.1100, 2.1200, 2.1300, 2.1400],
         [2.2100, 2.2200, 2.2300, 2.2400],
         [2.3100, 2.3200, 2.3300, 2.3400]]])
>>> y1 = torch.stack((x_0_t, x_1_t), dim=1)
>>> y1.shape
torch.Size([3, 2, 4])
>>> y1
tensor([[[1.1100, 1.1200, 1.1300, 1.1400],
         [2.1100, 2.1200, 2.1300, 2.1400]],

        [[1.2100, 1.2200, 1.2300, 1.2400],
         [2.2100, 2.2200, 2.2300, 2.2400]],

        [[1.3100, 1.3200, 1.3300, 1.3400],
         [2.3100, 2.3200, 2.3300, 2.3400]]])
>>> y2 = torch.stack((x_0_t, x_1_t), dim=2)
>>> y2.shape
torch.Size([3, 4, 2])
>>> y2
tensor([[[1.1100, 2.1100],
         [1.1200, 2.1200],
         [1.1300, 2.1300],
         [1.1400, 2.1400]],

        [[1.2100, 2.2100],
         [1.2200, 2.2200],
         [1.2300, 2.2300],
         [1.2400, 2.2400]],

        [[1.3100, 2.3100],
         [1.3200, 2.3200],
         [1.3300, 2.3300],
         [1.3400, 2.3400]]])
```
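The shapes alone summarize the difference: torch.cat grows an existing dimension, while torch.stack inserts a new one. A quick sketch:

```python
a = torch.zeros(3, 4)
b = torch.zeros(3, 4)
print(torch.cat((a, b), dim=0).shape)    # torch.Size([6, 4])   -- existing dim grows
print(torch.stack((a, b), dim=0).shape)  # torch.Size([2, 3, 4]) -- new dim inserted
```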
Multiplying tensors
```python
tensor = torch.ones(4, 4)
tensor[:,1] = 0
print(tensor)

# This computes the element-wise product
print(f"tensor.mul(tensor) \n {tensor.mul(tensor)} \n")
# Alternative syntax:
print(f"tensor * tensor \n {tensor * tensor}")
```
```text
tensor.mul(tensor)
 tensor([[1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.]])

tensor * tensor
 tensor([[1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.]])
```
This computes the matrix multiplication between two tensors
```python
print(f"tensor.matmul(tensor.T) \n {tensor.matmul(tensor.T)} \n")
# Alternative syntax:
print(f"tensor @ tensor.T \n {tensor @ tensor.T}")
```
```text
tensor.matmul(tensor.T)
 tensor([[3., 3., 3., 3.],
        [3., 3., 3., 3.],
        [3., 3., 3., 3.],
        [3., 3., 3., 3.]])

tensor @ tensor.T
 tensor([[3., 3., 3., 3.],
        [3., 3., 3., 3.],
        [3., 3., 3., 3.],
        [3., 3., 3., 3.]])
```
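Since both results above happen to be 4x4, the shape behavior is easier to see with non-square inputs. A quick sketch (the shapes are only illustrative):

```python
a = torch.rand(2, 3)
b = torch.rand(3, 4)
print((a * a).shape)   # torch.Size([2, 3]) -- element-wise, shape preserved
print((a @ b).shape)   # torch.Size([2, 4]) -- inner dimension contracted
```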
In-place operations
Operations that have a _ suffix are in-place. For example: x.copy_(y), x.t_(), will change x.
```python
print(tensor, "\n")
tensor.add_(5)
print(tensor)
```
```text
tensor([[1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.]])

tensor([[6., 5., 6., 6.],
        [6., 5., 6., 6.],
        [6., 5., 6., 6.],
        [6., 5., 6., 6.]])
```
In-place operations save some memory, but can be problematic when computing derivatives because of an immediate loss of history. Hence, their use is discouraged.
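A minimal sketch of the autograd problem this note refers to (the exact error message may vary by PyTorch version):

```python
x = torch.ones(3, requires_grad=True)
y = torch.exp(x)      # exp saves its output for the backward pass
y.add_(1)             # in-place update overwrites that saved output
y.sum().backward()    # RuntimeError: one of the variables needed for gradient
                      # computation has been modified by an inplace operation
```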
Bridge with NumPy
Tensors on the CPU and NumPy arrays can share their underlying memory locations, and changing one will change the other.
Tensor to NumPy array
```python
t = torch.ones(5)
print(f"t: {t}")
n = t.numpy()
print(f"n: {n}")
```
```text
t: tensor([1., 1., 1., 1., 1.])
n: [1. 1. 1. 1. 1.]
```
A change in the tensor reflects in the NumPy array.
```python
t.add_(1)
print(f"t: {t}")
print(f"n: {n}")
```
```text
t: tensor([2., 2., 2., 2., 2.])
n: [2. 2. 2. 2. 2.]
```
For data on an accelerator (GPU/NPU), however, the two are not linked: modifying the tensor does not update the NumPy array.
```python
>>> t = torch.ones(5).npu()
>>> t
tensor([1., 1., 1., 1., 1.], device='npu:0')
>>> n = t.cpu().numpy()
>>> n
array([1., 1., 1., 1., 1.], dtype=float32)
>>> t.add_(1)
tensor([2., 2., 2., 2., 2.], device='npu:0')
>>> t
tensor([2., 2., 2., 2., 2.], device='npu:0')
>>> n
array([1., 1., 1., 1., 1.], dtype=float32)
```
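To read the updated device-side values on the host, copy the tensor back explicitly; the result is a new, independent NumPy array (a sketch continuing the session above):

```python
>>> t.cpu().numpy()
array([2., 2., 2., 2., 2.], dtype=float32)
```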
NumPy array to Tensor
```python
n = np.ones(5)
t = torch.from_numpy(n)
```
Changes in the NumPy array reflect in the tensor.
```python
np.add(n, 1, out=n)
print(f"t: {t}")
print(f"n: {n}")
```
```text
t: tensor([2., 2., 2., 2., 2.], dtype=torch.float64)
n: [2. 2. 2. 2. 2.]
```
Likewise, this does not affect data on the GPU/NPU, because device memory is separate from CPU memory; the data has to be copied from the device to the CPU first.
```python
>>> import numpy as np
>>> n = np.ones(5)
>>> t = torch.from_numpy(n).npu()
>>> n
array([1., 1., 1., 1., 1.])
>>> t
tensor([1., 1., 1., 1., 1.], device='npu:0', dtype=torch.float64)
>>> np.add(n, 1, out=n)
array([2., 2., 2., 2., 2.])
>>> n
array([2., 2., 2., 2., 2.])
>>> t
tensor([1., 1., 1., 1., 1.], device='npu:0', dtype=torch.float64)
```
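Getting the updated host values onto the device likewise requires another explicit copy, since the tensor on the NPU is not a view of the NumPy array (a sketch continuing the session above):

```python
>>> t = torch.from_numpy(n).npu()
>>> t
tensor([2., 2., 2., 2., 2.], device='npu:0', dtype=torch.float64)
```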
Mainly translated from https://pytorch.org/tutorials/beginner/blitz/tensor_tutorial.html