numpy: Understanding PyTorch einsum

ibrsph3r  posted 5 months ago  in  Other

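I am familiar with how einsum works in NumPy. PyTorch offers similar functionality: torch.einsum(). What are the similarities and differences between the two, in terms of functionality or performance? The information in the PyTorch documentation is rather scanty and gives no insight into this.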

uplii1fm #1

Because the torch documentation says very little about einsum, I wrote this post to document, compare, and contrast the behavior of torch.einsum() and numpy.einsum().

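Differences:

  • NumPy allows both lowercase and uppercase letters [a-zA-Z] in the subscript string, whereas PyTorch allows only lowercase letters [a-z].
  • NumPy accepts nd-arrays, plain Python lists (or tuples), lists of lists (or tuples of tuples, lists of tuples, tuples of lists), and even PyTorch tensors as operands (i.e. inputs). This is because the operands only have to be array_like, not strictly NumPy nd-arrays. PyTorch, by contrast, strictly requires the operands (i.e. inputs) to be PyTorch tensors. It raises a TypeError if you pass plain Python lists/tuples (or combinations of them) or NumPy nd-arrays.
  • NumPy supports many keyword arguments (e.g. optimize) in addition to the nd-arrays, while PyTorch does not offer that flexibility yet.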
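
As a small illustration of the NumPy side of these points, the sketch below passes nested lists and tuples directly to np.einsum, uses uppercase subscripts, and sets the optimize keyword. The operand values a and b are made up for this example and do not come from the original post.

```python
import numpy as np

# Operands only need to be array_like: nested lists and tuples work as-is.
a = [[1, 2], [3, 4]]
b = ((5, 6), (7, 8))
prod = np.einsum("ij, jk -> ik", a, b)
print(prod)

# Uppercase subscript letters are accepted as well.
prod_upper = np.einsum("IJ, JK -> IK", a, b)

# optimize is one of the keyword arguments np.einsum accepts.
prod_opt = np.einsum("ij, jk -> ik", a, b, optimize=True)

print(np.array_equal(prod, prod_upper), np.array_equal(prod, prod_opt))
```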

Here are implementations of some examples in PyTorch and NumPy:

# input tensors to work with

In [16]: vec
Out[16]: tensor([0, 1, 2, 3])

In [17]: aten
Out[17]: 
tensor([[11, 12, 13, 14],
        [21, 22, 23, 24],
        [31, 32, 33, 34],
        [41, 42, 43, 44]])

In [18]: bten
Out[18]: 
tensor([[1, 1, 1, 1],
        [2, 2, 2, 2],
        [3, 3, 3, 3],
        [4, 4, 4, 4]])
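
The transcript above prints the inputs but does not show how they were created. One way to build tensors with exactly these values is sketched below; the constructor calls are an assumption, since any construction producing the same values would do.

```python
import torch

# Assumed construction; only the printed values come from the post.
vec = torch.arange(4)
aten = torch.tensor([[11, 12, 13, 14],
                     [21, 22, 23, 24],
                     [31, 32, 33, 34],
                     [41, 42, 43, 44]])
bten = torch.tensor([[1, 1, 1, 1],
                     [2, 2, 2, 2],
                     [3, 3, 3, 3],
                     [4, 4, 4, 4]])

print(vec.shape, aten.shape, bten.shape)
```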

1) Matrix multiplication

PyTorch: torch.matmul(aten, bten) ; aten.mm(bten)
NumPy: np.einsum("ij, jk -> ik", arr1, arr2)

In [19]: torch.einsum('ij, jk -> ik', aten, bten)
Out[19]: 
tensor([[130, 130, 130, 130],
        [230, 230, 230, 230],
        [330, 330, 330, 330],
        [430, 430, 430, 430]])

2) Extract elements along the main diagonal

PyTorch: torch.diag(aten)
NumPy: np.einsum("ii -> i", arr)

In [28]: torch.einsum('ii -> i', aten)
Out[28]: tensor([11, 22, 33, 44])

3) Hadamard product (i.e. element-wise product of two tensors)

PyTorch: aten * bten
NumPy: np.einsum("ij, ij -> ij", arr1, arr2)

In [34]: torch.einsum('ij, ij -> ij', aten, bten)
Out[34]: 
tensor([[ 11,  12,  13,  14],
        [ 42,  44,  46,  48],
        [ 93,  96,  99, 102],
        [164, 168, 172, 176]])

4) Element-wise squaring

PyTorch: aten ** 2
NumPy: np.einsum("ij, ij -> ij", arr, arr)

In [37]: torch.einsum('ij, ij -> ij', aten, aten)
Out[37]: 
tensor([[ 121,  144,  169,  196],
        [ 441,  484,  529,  576],
        [ 961, 1024, 1089, 1156],
        [1681, 1764, 1849, 1936]])

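In general, the element-wise nth power can be computed by repeating the subscript string and the tensor n times. For example, the element-wise 4th power of a tensor can be computed with: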
# NumPy: np.einsum('ij, ij, ij, ij -> ij', arr, arr, arr, arr)
In [38]: torch.einsum('ij, ij, ij, ij -> ij', aten, aten, aten, aten)
Out[38]: 
tensor([[  14641,   20736,   28561,   38416],
        [ 194481,  234256,  279841,  331776],
        [ 923521, 1048576, 1185921, 1336336],
        [2825761, 3111696, 3418801, 3748096]])
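
The repetition rule can be wrapped in a small helper. This is only a sketch: elementwise_power is a hypothetical name introduced here, it assumes a 2-D tensor and n >= 1, and in practice t ** n is the simpler choice.

```python
import torch

def elementwise_power(t, n):
    """Element-wise n-th power of a 2-D tensor via einsum (hypothetical helper, n >= 1)."""
    # Build e.g. "ij, ij, ij -> ij" for n = 3 and pass the same tensor n times.
    spec = ", ".join(["ij"] * n) + " -> ij"
    return torch.einsum(spec, *([t] * n))

# Small made-up input, not the aten tensor from the post.
t = torch.tensor([[11, 12], [21, 22]])
fourth = elementwise_power(t, 4)
print(torch.equal(fourth, t ** 4))
```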

5) Trace (i.e. sum of the main-diagonal elements)

PyTorch: torch.trace(aten)
NumPy einsum: np.einsum("ii -> ", arr)

In [44]: torch.einsum('ii -> ', aten)
Out[44]: tensor(110)

6) Matrix transpose

PyTorch: torch.transpose(aten, 1, 0)
NumPy einsum: np.einsum("ij -> ji", arr)

In [58]: torch.einsum('ij -> ji', aten)
Out[58]: 
tensor([[11, 21, 31, 41],
        [12, 22, 32, 42],
        [13, 23, 33, 43],
        [14, 24, 34, 44]])

7) Outer product (of vectors)

PyTorch: torch.ger(vec, vec) (torch.outer in newer releases)
NumPy einsum: np.einsum("i, j -> ij", vec, vec)

In [73]: torch.einsum('i, j -> ij', vec, vec)
Out[73]: 
tensor([[0, 0, 0, 0],
        [0, 1, 2, 3],
        [0, 2, 4, 6],
        [0, 3, 6, 9]])

8) Inner product (of vectors)

PyTorch: torch.dot(vec1, vec2)
NumPy einsum: np.einsum("i, i -> ", vec1, vec2)

In [76]: torch.einsum('i, i -> ', vec, vec)
Out[76]: tensor(14)

9) Sum along axis 0

PyTorch: torch.sum(aten, 0)
NumPy einsum: np.einsum("ij -> j", arr)

In [85]: torch.einsum('ij -> j', aten)
Out[85]: tensor([104, 108, 112, 116])

10) Sum along axis 1

PyTorch: torch.sum(aten, 1)
NumPy einsum: np.einsum("ij -> i", arr)

In [86]: torch.einsum('ij -> i', aten)
Out[86]: tensor([ 50,  90, 130, 170])
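
The NumPy one-liners quoted in examples 1 to 10 are not executed in the transcript. The sketch below runs them on arrays holding the same values as vec, aten, and bten and compares each result with the corresponding NumPy built-in; the names arr1, arr2, and checks are chosen for this illustration.

```python
import numpy as np

vec = np.arange(4)
arr1 = np.array([[11, 12, 13, 14],
                 [21, 22, 23, 24],
                 [31, 32, 33, 34],
                 [41, 42, 43, 44]])
arr2 = np.array([[1] * 4, [2] * 4, [3] * 4, [4] * 4])

# Each entry pairs an einsum expression with the built-in it should match.
checks = {
    "matmul": (np.einsum("ij, jk -> ik", arr1, arr2), arr1 @ arr2),
    "diagonal": (np.einsum("ii -> i", arr1), np.diag(arr1)),
    "hadamard": (np.einsum("ij, ij -> ij", arr1, arr2), arr1 * arr2),
    "trace": (np.einsum("ii -> ", arr1), np.trace(arr1)),
    "transpose": (np.einsum("ij -> ji", arr1), arr1.T),
    "outer": (np.einsum("i, j -> ij", vec, vec), np.outer(vec, vec)),
    "inner": (np.einsum("i, i -> ", vec, vec), np.dot(vec, vec)),
    "sum axis 0": (np.einsum("ij -> j", arr1), arr1.sum(axis=0)),
    "sum axis 1": (np.einsum("ij -> i", arr1), arr1.sum(axis=1)),
}
for name, (via_einsum, builtin) in checks.items():
    print(name, np.array_equal(via_einsum, builtin))
```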

11) Batch matrix multiplication

PyTorch: torch.bmm(batch_tensor_1, batch_tensor_2)
NumPy: np.einsum("bij, bjk -> bik", batch_tensor_1, batch_tensor_2)

# input batch tensors to work with
In [13]: batch_tensor_1 = torch.arange(2 * 4 * 3).reshape(2, 4, 3)
In [14]: batch_tensor_2 = torch.arange(2 * 3 * 4).reshape(2, 3, 4) 

In [15]: torch.bmm(batch_tensor_1, batch_tensor_2)  
Out[15]: 
tensor([[[  20,   23,   26,   29],
         [  56,   68,   80,   92],
         [  92,  113,  134,  155],
         [ 128,  158,  188,  218]],

        [[ 632,  671,  710,  749],
         [ 776,  824,  872,  920],
         [ 920,  977, 1034, 1091],
         [1064, 1130, 1196, 1262]]])

# sanity check with the shapes
In [16]: torch.bmm(batch_tensor_1, batch_tensor_2).shape 
Out[16]: torch.Size([2, 4, 4])

# batch matrix multiply using einsum
In [17]: torch.einsum("bij, bjk -> bik", batch_tensor_1, batch_tensor_2)
Out[17]: 
tensor([[[  20,   23,   26,   29],
         [  56,   68,   80,   92],
         [  92,  113,  134,  155],
         [ 128,  158,  188,  218]],

        [[ 632,  671,  710,  749],
         [ 776,  824,  872,  920],
         [ 920,  977, 1034, 1091],
         [1064, 1130, 1196, 1262]]])

# sanity check with the shapes
In [18]: torch.einsum("bij, bjk -> bik", batch_tensor_1, batch_tensor_2).shape
Out[18]: torch.Size([2, 4, 4])

12) Sum along axis 2

PyTorch: torch.sum(batch_ten, 2)
NumPy einsum: np.einsum("ijk -> ij", arr3D)

In [99]: torch.einsum("ijk -> ij", batch_ten)
Out[99]: 
tensor([[ 50,  90, 130, 170],
        [  4,   8,  12,  16]])

13) Sum all the elements of an nD tensor

PyTorch: torch.sum(batch_ten)
NumPy einsum: np.einsum("ijk -> ", arr3D)

In [101]: torch.einsum("ijk -> ", batch_ten)
Out[101]: tensor(480)
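
Examples 12 and 13 use batch_ten without showing its definition. The printed sums are consistent with aten and bten stacked along a new leading axis, so the sketch below assumes that construction; treat it as an inference from the outputs, not as the author's code.

```python
import torch

aten = torch.tensor([[11, 12, 13, 14],
                     [21, 22, 23, 24],
                     [31, 32, 33, 34],
                     [41, 42, 43, 44]])
bten = torch.tensor([[1, 1, 1, 1],
                     [2, 2, 2, 2],
                     [3, 3, 3, 3],
                     [4, 4, 4, 4]])

# Assumption: batch_ten stacks the two matrices into a (2, 4, 4) tensor.
batch_ten = torch.stack((aten, bten), dim=0)

row_sums = torch.einsum("ijk -> ij", batch_ten)  # sum along axis 2
total = torch.einsum("ijk -> ", batch_ten)       # sum of all elements

print(row_sums)
print(total)
```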

14) Sum over multiple axes (i.e. marginalization)

PyTorch: torch.sum(arr, dim=(dim0, dim1, dim2, dim3, dim4, dim6, dim7))
NumPy: np.einsum("ijklmnop -> n", nDarr)

# 8D tensor
In [103]: nDten = torch.randn((3,5,4,6,8,2,7,9))
In [104]: nDten.shape
Out[104]: torch.Size([3, 5, 4, 6, 8, 2, 7, 9])

# keep axis 5 (i.e. "n" here) and marginalize out all other axes
In [111]: esum = torch.einsum("ijklmnop -> n", nDten)
In [112]: esum
Out[112]: tensor([  98.6921, -206.0575])

# the same reduction with torch.sum: sum over every axis except axis 5
In [113]: tsum = torch.sum(nDten, dim=(0, 1, 2, 3, 4, 6, 7))

In [115]: torch.allclose(tsum, esum)
Out[115]: True

15) Double dot product / Frobenius inner product (same as torch.sum of the Hadamard product, cf. 3)

PyTorch: torch.sum(aten * bten)
NumPy: np.einsum("ij, ij -> ", arr1, arr2)

In [120]: torch.einsum("ij, ij -> ", aten, bten)
Out[120]: tensor(1300)

sz81bmfz #2

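The current answer seems to be outdated, as some of its statements about torch.einsum are no longer accurate (the examples still hold). Going through its three points in order:
1. You can use both uppercase and lowercase letters.
2. Still the same (by design).
3. There is no optimize keyword argument because torch.einsum already uses an optimized order for the tensor network contraction. Also, the other keywords of np.einsum relate to the output [out, dtype, order, casting].
There is a note worth reading in the torch.einsum docs about the optimal contraction order, which also shows how to disable the optimized order if you want to.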
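
Point 1 can be checked with a short sketch, assuming a reasonably recent PyTorch release. The tensors a and b are made up for this illustration. The torch.backends.opt_einsum module only exists in newer releases, so the lookup is guarded rather than relied on.

```python
import torch

a = torch.arange(6).reshape(2, 3)
b = torch.arange(12).reshape(3, 4)

# Mixed-case subscripts: upper- and lowercase letters are distinct labels.
out = torch.einsum("Ij, jK -> IK", a, b)
print(torch.equal(out, a @ b))

# Assumption: torch.backends.opt_einsum is present only in newer releases,
# so fall back to None when the attribute is missing.
backend = getattr(torch.backends, "opt_einsum", None)
if backend is not None:
    print("opt_einsum available:", backend.is_available())
```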

Additional: sublist notation

Since this post was written, both np and torch have added support for sublist notation (example from the PyTorch docs):

>>> # with sublist format and ellipsis
>>> torch.einsum(As, [..., 0, 1], Bs, [..., 1, 2], [..., 0, 2])
tensor([[[-1.0564, -1.5904,  3.2023,  3.1271],
        [-1.6706, -0.8097, -0.8025, -2.1183]],

        [[ 4.2239,  0.3107, -0.5756, -0.2354],
        [-1.4558, -0.3460,  1.5087, -0.8530]],

        [[ 2.8153,  1.8787, -4.3839, -1.2112],
        [ 0.3728, -2.1131,  0.0921,  0.8305]]])

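This example also uses the ellipsis ..., which stands for all dimensions that do not need a subscript of their own (this is mentioned in the answer's comments but not explained). Again, see the torch docs for an in-depth explanation of how it works.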
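
The snippet above does not define As and Bs. A self-contained sketch follows, assuming random inputs of shapes (3, 2, 5) and (3, 5, 4), which match the 3 x 2 x 4 output printed above. It also compares the sublist form with the equivalent subscript string.

```python
import torch

# Assumed shapes, chosen to match the printed (3, 2, 4) result.
As = torch.randn(3, 2, 5)
Bs = torch.randn(3, 5, 4)

# Integers label the axes; Ellipsis (...) covers the leading batch axes.
out = torch.einsum(As, [..., 0, 1], Bs, [..., 1, 2], [..., 0, 2])

# The same contraction written as a subscript string.
same = torch.einsum("...ij, ...jk -> ...ik", As, Bs)

print(out.shape)
print(torch.allclose(out, same, atol=1e-5))
```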
