A brief look at the dim argument of torch.max and F.softmax in PyTorch
When using torch.max and F.softmax, it is easy to get confused about which dim to pass, so here is a summary.
First, an example with a 2-D tensor:
import torch
import torch.nn.functional as F

input = torch.randn(3, 4)
print(input)
# tensor([[-0.5526, -0.0194,  2.1469, -0.2567],
#         [-0.3337, -0.9229,  0.0376, -0.0801],
#         [ 1.4721,  0.1181, -2.6214,  1.7721]])

b = F.softmax(input, dim=0)  # softmax over columns: each column sums to 1
print(b)
# tensor([[0.1018, 0.3918, 0.8851, 0.1021],
#         [0.1268, 0.1587, 0.1074, 0.1218],
#         [0.7714, 0.4495, 0.0075, 0.7762]])

c = F.softmax(input, dim=1)  # softmax over rows: each row sums to 1
print(c)
# tensor([[0.0529, 0.0901, 0.7860, 0.0710],
#         [0.2329, 0.1292, 0.3377, 0.3002],
#         [0.3810, 0.0984, 0.0064, 0.5143]])

d = torch.max(input, dim=0)  # max over columns
print(d)
# torch.return_types.max(
#     values=tensor([1.4721, 0.1181, 2.1469, 1.7721]),
#     indices=tensor([2, 2, 0, 2]))

e = torch.max(input, dim=1)  # max over rows
print(e)
# torch.return_types.max(
#     values=tensor([2.1469, 0.0376, 1.7721]),
#     indices=tensor([2, 2, 3]))
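A handy rule of thumb that the example above illustrates, stated here as a small sketch (not from the original post): the dim you pass is the dimension that gets "consumed" — softmax normalizes along it, max reduces it away.

```python
import torch
import torch.nn.functional as F

x = torch.randn(3, 4)

# softmax normalizes along the given dim: every row now sums to 1
row_sums = F.softmax(x, dim=1).sum(dim=1)

# max reduces the given dim away: one value per row, shape (3,)
max_vals = torch.max(x, dim=1).values

# dim=-1 is shorthand for the last dimension
same = torch.equal(F.softmax(x, dim=-1), F.softmax(x, dim=1))
```

The same rule carries over unchanged to tensors with more dimensions.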
Now an example with a 3-D tensor:
The softmax function turns the given tensor into a probability distribution along the chosen dimension. With a = torch.rand(3, 16, 20) and b = F.softmax(a, dim=0), b holds the distribution along dim 0, so for any fixed position, b[0][5][6] + b[1][5][6] + b[2][5][6] = 1.
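This sum-to-1 property can be verified directly (a quick sketch; the tensor entries are random):

```python
import torch
import torch.nn.functional as F

a = torch.rand(3, 16, 20)
b = F.softmax(a, dim=0)

# the three entries along dim 0 at any fixed position sum to 1
total = b[0][5][6] + b[1][5][6] + b[2][5][6]

# equivalently, summing out dim 0 leaves an all-ones (16, 20) tensor
ones = b.sum(dim=0)
```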
import torch as t
import torch.nn.functional as F

a = t.rand(3, 4, 5)
b = F.softmax(a, dim=0)   # normalizes across dim 0 (size 3)
c = F.softmax(a, dim=1)   # normalizes across dim 1 (size 4)
d = F.softmax(a, dim=2)   # normalizes across dim 2 (size 5)

b.sum()                               # tensor(20.) -- 4*5 = 20 groups, each summing to 1
b[0][0][0] + b[1][0][0] + b[2][0][0]  # tensor(1.)
c.sum()                               # tensor(15.) -- 3*5 = 15 groups
d.sum()                               # tensor(12.) -- 3*4 = 12 groups
d[0][0].sum()                         # tensor(1.)

n = t.rand(3, 4)
print(n)
# tensor([[0.2769, 0.3475, 0.8914, 0.6845],
#         [0.9251, 0.3976, 0.8690, 0.4510],
#         [0.8249, 0.1157, 0.3075, 0.3799]])

m = t.argmax(n, dim=0)  # index of the max in each column
print(m)  # tensor([1, 1, 0, 0])

p = t.argmax(n, dim=1)  # index of the max in each row
print(p)  # tensor([2, 0, 0])
A supplementary note: using torch.nn.Softmax for multi-class problems
Why bring this up? In my work I ran into a semantic-segmentation model whose prediction has 16 output feature maps, i.e. a 16-class problem.
The value of each pixel in a channel indicates how strongly that pixel belongs to the class of that channel. To render the classes in different colors on a single image, I had to learn how to use torch.nn.Softmax.
First, a simple example: suppose the output has shape (3, 4, 4), i.e. three 4x4 feature maps.
import torch

img = torch.rand((3, 4, 4))
print(img)
The output is:
tensor([[[0.0413, 0.8728, 0.8926, 0.0693],
         [0.4072, 0.0302, 0.9248, 0.6676],
         [0.4699, 0.9197, 0.3334, 0.4809],
         [0.3877, 0.7673, 0.6132, 0.5203]],

        [[0.4940, 0.7996, 0.5513, 0.8016],
         [0.1157, 0.8323, 0.9944, 0.2127],
         [0.3055, 0.4343, 0.8123, 0.3184],
         [0.8246, 0.6731, 0.3229, 0.1730]],

        [[0.0661, 0.1905, 0.4490, 0.7484],
         [0.4013, 0.1468, 0.2145, 0.8838],
         [0.0083, 0.5029, 0.0141, 0.8998],
         [0.8673, 0.2308, 0.8808, 0.0532]]])
There are three feature maps; at any given position, the larger the value in a map, the more likely that pixel belongs to the class of that map.
import torch.nn as nn

softmax = nn.Softmax(dim=0)
img = softmax(img)
print(img)
The output is:
tensor([[[0.2780, 0.4107, 0.4251, 0.1979],
         [0.3648, 0.2297, 0.3901, 0.3477],
         [0.4035, 0.4396, 0.2993, 0.2967],
         [0.2402, 0.4008, 0.3273, 0.4285]],

        [[0.4371, 0.3817, 0.3022, 0.4117],
         [0.2726, 0.5122, 0.4182, 0.2206],
         [0.3423, 0.2706, 0.4832, 0.2522],
         [0.3718, 0.3648, 0.2449, 0.3028]],

        [[0.2849, 0.2076, 0.2728, 0.3904],
         [0.3627, 0.2581, 0.1917, 0.4317],
         [0.2543, 0.2898, 0.2175, 0.4511],
         [0.3880, 0.2344, 0.4278, 0.2686]]])
The code above applies Softmax across the three feature maps at each pixel position: for any fixed position, the three values (one per map) sum to 1.
Softmax maps each pixel value into (0, 1) along the chosen dimension (here dim=0, the first dimension) while leaving the relative ordering of the values unchanged.
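Because softmax is strictly increasing, it preserves the ordering along the chosen dimension, so the winning channel at every pixel is the same before and after it is applied. A quick check (not from the original post):

```python
import torch
import torch.nn as nn

img = torch.rand((3, 4, 4))
soft = nn.Softmax(dim=0)(img)

# the winning channel at every pixel is unchanged by softmax
same_winner = torch.equal(img.argmax(dim=0), soft.argmax(dim=0))
```

In other words, if only the class map is needed, applying softmax first changes nothing about which class wins.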
print(torch.max(img,0))
The output is:
torch.return_types.max(
values=tensor([[0.4371, 0.4107, 0.4251, 0.4117],
        [0.3648, 0.5122, 0.4182, 0.4317],
        [0.4035, 0.4396, 0.4832, 0.4511],
        [0.3880, 0.4008, 0.4278, 0.4285]]),
indices=tensor([[1, 0, 0, 1],
        [0, 1, 1, 2],
        [0, 0, 1, 2],
        [2, 0, 2, 0]]))
Here the 3x4x4 tensor collapses to 4x4: for each pixel, values holds the maximum over the three channels, and indices holds the index of the channel it came from, i.e. the predicted class.
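When only the class map is needed, torch.argmax returns the indices directly; torch.max gives both fields. A small sketch:

```python
import torch

img = torch.rand((3, 4, 4))
values, indices = torch.max(img, dim=0)   # both shaped (4, 4)

# the indices field alone can also be obtained with argmax
same = torch.equal(indices, torch.argmax(img, dim=0))
```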
With the flow above understood, the real case is straightforward.
In the concrete case, the network output has shape 16x416x416.
import cv2
import numpy as np
import torch
import torch.nn as nn

output = torch.tensor(output)               # shape: 16 x 416 x 416
sm = nn.Softmax(dim=0)
output = sm(output)                         # per-pixel probabilities over the 16 classes
mask = torch.max(output, 0).indices.numpy() # 416 x 416 map of class indices

# one RGB color per class index (0-15)
colors = [
    (255, 255, 255), (255, 180, 0), (255, 180, 180), (255, 180, 255),
    (255, 255, 180), (255, 255, 0), (255, 0, 180), (255, 0, 255),
    (255, 0, 0), (180, 0, 0), (180, 255, 255), (180, 0, 180),
    (180, 0, 255), (180, 255, 180), (0, 180, 255), (0, 0, 0),
]

# add a third axis so each pixel holds an RGB triple
rgb_img = np.zeros((output.shape[1], output.shape[2], 3))
for i in range(len(mask)):
    for j in range(len(mask[0])):
        rgb_img[i][j] = colors[mask[i][j]]

cv2.imwrite('output.jpg', rgb_img)
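The nested per-pixel loop above can also be replaced with NumPy fancy indexing, which colors all pixels at once. A sketch using the same per-class color table, with a small random tensor standing in for the real 16x416x416 output:

```python
import numpy as np
import torch
import torch.nn as nn

output = torch.rand(16, 8, 8)   # stand-in for the real 16 x 416 x 416 output
mask = torch.max(nn.Softmax(dim=0)(output), 0).indices.numpy()

# the per-class color table as a (16, 3) array
colors = np.array([
    (255, 255, 255), (255, 180, 0), (255, 180, 180), (255, 180, 255),
    (255, 255, 180), (255, 255, 0), (255, 0, 180), (255, 0, 255),
    (255, 0, 0), (180, 0, 0), (180, 255, 255), (180, 0, 180),
    (180, 0, 255), (180, 255, 180), (0, 180, 255), (0, 0, 0),
], dtype=np.uint8)

# indexing a (16, 3) table with an (H, W) index map yields an (H, W, 3) image
rgb_img = colors[mask]
```

This produces the same image as the loop, and is much faster at 416x416.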
The colored mask is then saved as output.jpg.
That concludes this brief look at the dim argument of torch.max and F.softmax in PyTorch; I hope it serves as a useful reference.