Implementing a Deep Learning Framework from Scratch (8): Computation Graphs for Common Operations (Part 1)

Introduction
In the spirit of "What I cannot create, I do not understand," this series builds a deep learning framework from scratch using pure Python and NumPy, one that, like PyTorch, supports automatic differentiation.
To truly understand deep learning, the experience of building it from zero matters: start from what you can understand yourself and implement the models you want while avoiding fully-fledged external frameworks as much as possible. The aim of this series is that, through this process, you genuinely grasp how deep learning works under the hood instead of remaining a mere "import-and-call" user.
In the previous article we implemented the skeleton code for backpropagation and got addition and multiplication working correctly. Starting from this article we implement the remaining operations; this installment covers subtraction, division, matrix multiplication, summation, and a few more.
Implementing Subtraction
We first write the test cases, then implement the computation graph for subtraction.
test_sub.py:
import numpy as np
from core.tensor import Tensor

def test_simple_sub():
    x = Tensor(1, requires_grad=True)
    y = Tensor(2, requires_grad=True)
    z = x - y
    z.backward()
    assert x.grad.data == 1.0
    assert y.grad.data == -1.0

def test_array_sub():
    x = Tensor([1, 2, 3], requires_grad=True)
    y = Tensor([4, 5, 6], requires_grad=True)
    z = x - y
    assert z.data.tolist() == [-3., -3., -3.]
    z.backward(Tensor([1, 1, 1]))
    assert x.grad.data.tolist() == [1, 1, 1]
    assert y.grad.data.tolist() == [-1, -1, -1]
    x -= 0.1
    assert x.grad is None
    np.testing.assert_array_almost_equal(x.data, [0.9, 1.9, 2.9])

def test_broadcast_sub():
    x = Tensor([[1, 2, 3], [4, 5, 6]], requires_grad=True)  # (2, 3)
    y = Tensor([7, 8, 9], requires_grad=True)  # (3,)
    z = x - y  # shape (2, 3)
    assert z.data.tolist() == [[-6, -6, -6], [-3, -3, -3]]
    z.backward(Tensor(np.ones_like(x.data)))
    assert x.grad.data.tolist() == [[1, 1, 1], [1, 1, 1]]
    assert y.grad.data.tolist() == [-2, -2, -2]
Next, the computation graph for subtraction.

class Sub(_Function):
    def forward(ctx, x: np.ndarray, y: np.ndarray) -> np.ndarray:
        '''
        Computes z = x - y
        '''
        ctx.save_for_backward(x.shape, y.shape)
        return x - y

    def backward(ctx, grad: Any) -> Any:
        shape_x, shape_y = ctx.saved_tensors
        # dz/dx = 1 and dz/dy = -1; unbroadcast reduces each gradient
        # back to the original shape of its operand
        return unbroadcast(grad, shape_x), unbroadcast(-grad, shape_y)
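Both returns go through the unbroadcast helper from the earlier broadcasting article: it reduces a gradient back to the pre-broadcast shape of an operand by summing over the broadcast axes. As a refresher, here is a minimal sketch of such a helper; the framework's actual version may differ in details:

import numpy as np

def unbroadcast(grad: np.ndarray, shape: tuple) -> np.ndarray:
    # sum away the leading axes that broadcasting prepended
    while grad.ndim > len(shape):
        grad = grad.sum(axis=0)
    # sum over axes that broadcasting stretched from size 1
    for i, dim in enumerate(shape):
        if dim == 1:
            grad = grad.sum(axis=i, keepdims=True)
    return grad

In test_broadcast_sub above, unbroadcast(-grad, (3,)) collapses the (2, 3) gradient down to [-2, -2, -2], exactly what the test asserts.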
These classes all go into ops.py. Running the tests gives:
============================= test session starts ==============================
collecting ... collected 3 items

test_sub.py::test_simple_sub PASSED                                      [ 33%]
test_sub.py::test_array_sub PASSED                                       [ 66%]
test_sub.py::test_broadcast_sub PASSED                                   [100%]

============================== 3 passed in 0.36s ===============================
Implementing Division
Write the test cases:
import numpy as np
from core.tensor import Tensor

def test_simple_div():
    '''
    Test simple division
    '''
    x = Tensor(1, requires_grad=True)
    y = Tensor(2, requires_grad=True)
    z = x / y
    z.backward()
    assert x.grad.data == 0.5
    assert y.grad.data == -0.25

def test_array_div():
    x = Tensor([1, 2, 3], requires_grad=True)
    y = Tensor([2, 4, 6], requires_grad=True)
    z = x / y
    assert z.data.tolist() == [0.5, 0.5, 0.5]
    assert x.data.tolist() == [1, 2, 3]
    z.backward(Tensor([1, 1, 1]))
    np.testing.assert_array_almost_equal(x.grad.data, [0.5, 0.25, 1 / 6])
    np.testing.assert_array_almost_equal(y.grad.data, [-0.25, -1 / 8, -1 / 12])
    x /= 0.1
    assert x.grad is None
    assert x.data.tolist() == [10, 20, 30]

def test_broadcast_div():
    x = Tensor([[1, 1, 1], [2, 2, 2]], requires_grad=True)  # (2, 3)
    y = Tensor([4, 4, 4], requires_grad=True)  # (3,)
    z = x / y  # (2,3) / (3,) => (2,3) / (2,3) -> (2,3)
    assert z.data.tolist() == [[0.25, 0.25, 0.25], [0.5, 0.5, 0.5]]
    z.backward(Tensor([[1, 1, 1], [1, 1, 1]]))
    assert x.grad.data.tolist() == [[1/4, 1/4, 1/4], [1/4, 1/4, 1/4]]
    assert y.grad.data.tolist() == [-3/16, -3/16, -3/16]

# Python 3 only has the __truediv__ family of magic methods (no __div__)
class TrueDiv(_Function):
    def forward(ctx, x: ndarray, y: ndarray) -> ndarray:
        '''
        Computes z = x / y
        '''
        ctx.save_for_backward(x, y)
        return x / y

    def backward(ctx, grad: ndarray) -> Tuple[ndarray, ndarray]:
        x, y = ctx.saved_tensors
        # dz/dx = 1/y, dz/dy = -x/y^2
        return unbroadcast(grad / y, x.shape), unbroadcast(grad * (-x / y ** 2), y.shape)
Since Python 3 only provides the __truediv__ family of magic methods, for simplicity we also name our division operation TrueDiv.
We additionally need to update the register method in tensor so that the / operator dispatches to it.
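As a rough idea of what that wiring could look like, here is a hypothetical sketch; the names register, apply, and ensure_tensor are illustrative assumptions and may not match the actual code in core/tensor.py:

# Hypothetical sketch -- register/apply/ensure_tensor are illustrative names,
# not necessarily the real API in core/tensor.py.
def register(name, fn_cls):
    def op(self, other):
        # apply() runs forward() and records the node for the backward pass
        return fn_cls.apply(fn_cls, self, ensure_tensor(other))
    setattr(Tensor, f"__{name}__", op)

register("truediv", TrueDiv)  # now x / y dispatches to TrueDiv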
With that, addition, subtraction, multiplication, and division are all in place. Next up: matrix multiplication.
Implementing Matrix Multiplication
Test cases first:
import numpy as np
import torch
from core.tensor import Tensor
from torch import tensor

def test_simple_matmul():
    x = Tensor([[1, 2], [3, 4], [5, 6]], requires_grad=True)  # (3, 2)
    y = Tensor([[2], [3]], requires_grad=True)  # (2, 1)
    z = x @ y  # (3,2) @ (2,1) -> (3,1)
    assert z.data.tolist() == [[8], [18], [28]]
    grad = Tensor(np.ones_like(z.data))
    z.backward(grad)
    np.testing.assert_array_equal(x.grad.data, grad.data @ y.data.T)
    np.testing.assert_array_equal(y.grad.data, x.data.T @ grad.data)

def test_broadcast_matmul():
    x = Tensor(np.arange(2 * 2 * 4).reshape((2, 2, 4)), requires_grad=True)  # (2, 2, 4)
    y = Tensor(np.arange(2 * 4).reshape((4, 2)), requires_grad=True)  # (4, 2)
    z = x @ y  # (2,2,4) @ (4,2) -> (2,2,4) @ (1,4,2) => (2,2,4) @ (2,4,2) -> (2,2,2)
    assert z.shape == (2, 2, 2)
    # bring in torch.tensor to cross-check the results
    tx = tensor(x.data, dtype=torch.float, requires_grad=True)
    ty = tensor(y.data, dtype=torch.float, requires_grad=True)
    tz = tx @ ty
    assert z.data.tolist() == tz.data.tolist()
    grad = np.ones_like(z.data)
    z.backward(Tensor(grad))
    tz.backward(tensor(grad))
    # agreeing with big brother PyTorch is good enough for us
    assert np.allclose(x.grad.data, tx.grad.numpy())
    assert np.allclose(y.grad.data, ty.grad.numpy())
Matrix multiplication is a bit trickier, but everything needed was analyzed in the earlier articles on understanding broadcasting and on computation graphs for common operations. Note that we import torch purely for testing.

Building on the analysis of the matmul backward pass from the article on computation graphs for common operations, let's implement it.
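Recall the key result of that analysis: for $Z = XY$ with upstream gradient $G = \partial L / \partial Z$,

$$\frac{\partial L}{\partial X} = G\,Y^{\top}, \qquad \frac{\partial L}{\partial Y} = X^{\top} G$$

which is exactly what test_simple_matmul asserts above. The implementation follows directly: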
class Matmul(_Function):
    def forward(ctx, x: ndarray, y: ndarray) -> ndarray:
        '''
        Computes z = x @ y
        '''
        assert x.ndim > 1 and y.ndim > 1, f"the dim number of x or y must be >= 2, actual x: {x.ndim} and y: {y.ndim}"
        ctx.save_for_backward(x, y)
        return x @ y

    def backward(ctx, grad: ndarray) -> Tuple[ndarray, ndarray]:
        x, y = ctx.saved_tensors
        # dL/dx = grad @ y^T and dL/dy = x^T @ grad, transposing only the last two axes
        return unbroadcast(grad @ y.swapaxes(-2, -1), x.shape), unbroadcast(x.swapaxes(-2, -1) @ grad, y.shape)
To handle cases like (2,2,4) @ (4,2) -> (2,2,4) @ (1,4,2) => (2,2,4) @ (2,4,2) -> (2,2,2), we swap only the last two axes with swapaxes rather than using a plain transpose T, which would reverse all axes of a batched array.
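As a quick sanity check of the swapaxes rule in the broadcast case (plain NumPy, purely illustrative, not part of the framework):

import numpy as np

x = np.random.randn(2, 2, 4)
y = np.random.randn(4, 2)
grad = np.ones((2, 2, 2))  # upstream gradient for z = x @ y

gx = grad @ y.swapaxes(-2, -1)                # (2,2,2) @ (2,4) -> (2,2,4)
gy = (x.swapaxes(-2, -1) @ grad).sum(axis=0)  # (2,4,2) summed down to (4,2)

# compare gy[0, 0] against a finite-difference estimate of d(sum z)/d y[0, 0]
eps = 1e-6
y2 = y.copy()
y2[0, 0] += eps
numeric = ((x @ y2).sum() - (x @ y).sum()) / eps
assert np.isclose(gy[0, 0], numeric, atol=1e-3)

The sum over the leading axis here is exactly the job unbroadcast does when collapsing the (2, 4, 2) gradient back to y's shape (4, 2).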
Next we implement reduction operations such as Sum and Max.
Implementing Sum
First the test cases:
import numpy as np
from core.tensor import Tensor

def test_simple_sum():
    x = Tensor([1, 2, 3], requires_grad=True)
    y = x.sum()
    assert y.data == 6
    y.backward()
    assert x.grad.data.tolist() == [1, 1, 1]

def test_sum_with_grad():
    x = Tensor([1, 2, 3], requires_grad=True)
    y = x.sum()
    y.backward(Tensor(3))
    assert x.grad.data.tolist() == [3, 3, 3]

def test_matrix_sum():
    x = Tensor([[1, 2, 3], [4, 5, 6]], requires_grad=True)  # (2, 3)
    y = x.sum()
    assert y.data == 21
    y.backward()
    assert x.grad.data.tolist() == np.ones_like(x.data).tolist()

def test_matrix_with_axis():
    x = Tensor([[1, 2, 3], [4, 5, 6]], requires_grad=True)  # (2, 3)
    y = x.sum(axis=0)  # keepdims = False
    assert y.shape == (3,)
    assert y.data.tolist() == [5, 7, 9]
    y.backward([1, 1, 1])
    assert x.grad.data.tolist() == [[1, 1, 1], [1, 1, 1]]

def test_matrix_with_keepdims():
    x = Tensor([[1, 2, 3], [4, 5, 6]], requires_grad=True)  # (2, 3)
    y = x.sum(axis=0, keepdims=True)  # keepdims = True
    assert y.shape == (1, 3)
    assert y.data.tolist() == [[5, 7, 9]]
    y.backward([1, 1, 1])
    assert x.grad.data.tolist() == [[1, 1, 1], [1, 1, 1]]

class Sum(_Function):
    def forward(ctx, x: ndarray, axis=None, keepdims=False) -> ndarray:
        ctx.save_for_backward(x.shape, axis, keepdims)
        return x.sum(axis, keepdims=keepdims)

    def backward(ctx, grad: ndarray) -> ndarray:
        x_shape, axis, keepdims = ctx.saved_tensors
        # with keepdims=False the reduced axis vanishes from grad; reinsert it,
        # otherwise a gradient reduced over a non-leading axis cannot broadcast back
        if axis is not None and not keepdims:
            grad = np.expand_dims(grad, axis)
        # broadcast the gradient back to the input shape; the gradient's
        # dimensions must match the input's dimensions
        return np.broadcast_to(grad, x_shape)
We support the keepdims parameter here. Note that backward also saves axis and keepdims: when keepdims=False the reduced axis disappears from the gradient and has to be reinserted before broadcasting back to the input shape, as the illustration below shows.
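To see why the reduced axis matters: NumPy broadcasting aligns trailing dimensions, so a (3,) gradient from sum(axis=0) broadcasts to (2, 3) directly, but a (2,) gradient from sum(axis=1) does not until the axis is reinserted. A quick illustration in plain NumPy:

import numpy as np

x_shape = (2, 3)

g0 = np.ones(3)                       # grad of x.sum(axis=0)
print(np.broadcast_to(g0, x_shape))   # (3,) -> (2, 3): works

g1 = np.ones(2)                       # grad of x.sum(axis=1)
# np.broadcast_to(g1, x_shape) would raise: (2,) doesn't align with (2, 3)
g1 = np.expand_dims(g1, 1)            # reinsert the axis: (2, 1)
print(np.broadcast_to(g1, x_shape))   # (2, 1) -> (2, 3): works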
Next up are the unary operations. These are straightforward: the code can be written directly from the computation graph.
Implementing Log
Test cases:
import math
import numpy as np
from core.tensor import Tensor

def test_simple_log():
    x = Tensor(10, requires_grad=True)
    z = x.log()
    np.testing.assert_array_almost_equal(z.data, math.log(10))
    z.backward()
    np.testing.assert_array_almost_equal(x.grad.data.tolist(), 0.1)

def test_array_log():
    x = Tensor([1, 2, 3], requires_grad=True)
    z = x.log()
    np.testing.assert_array_almost_equal(z.data, np.log([1, 2, 3]))
    z.backward([1, 1, 1])
    np.testing.assert_array_almost_equal(x.grad.data.tolist(), [1, 0.5, 1 / 3])

class Log(_Function):
    def forward(ctx, x: ndarray) -> ndarray:
        ctx.save_for_backward(x)
        # np.log is the natural logarithm (ln)
        return np.log(x)

    def backward(ctx, grad: ndarray) -> ndarray:
        x, = ctx.saved_tensors
        # d(ln x)/dx = 1/x
        return grad / x
Implementing Exp
Test cases:
import numpy as np
from core.tensor import Tensor

def test_simple_exp():
    x = Tensor(2, requires_grad=True)
    z = x.exp()  # e^2
    np.testing.assert_array_almost_equal(z.data, np.exp(2))
    z.backward()
    np.testing.assert_array_almost_equal(x.grad.data, np.exp(2))

def test_array_exp():
    x = Tensor([1, 2, 3], requires_grad=True)
    z = x.exp()
    np.testing.assert_array_almost_equal(z.data, np.exp([1, 2, 3]))
    z.backward([1, 1, 1])
    np.testing.assert_array_almost_equal(x.grad.data, np.exp([1, 2, 3]))

class Exp(_Function):
    def forward(ctx, x: ndarray) -> ndarray:
        ctx.save_for_backward(x)
        return np.exp(x)

    def backward(ctx, grad: ndarray) -> ndarray:
        x, = ctx.saved_tensors
        # d(e^x)/dx = e^x; don't forget to scale by the upstream grad --
        # returning np.exp(x) alone only passes tests where grad is all ones
        return grad * np.exp(x)
Implementing Pow
Test cases:
from core.tensor import Tensor

def test_simple_pow():
    x = Tensor(2, requires_grad=True)
    y = 2
    z = x ** y
    assert z.data == 4
    z.backward()
    assert x.grad.data == 4

def test_array_pow():
    x = Tensor([1, 2, 3], requires_grad=True)
    y = 3
    z = x ** y
    assert z.data.tolist() == [1, 8, 27]
    z.backward([1, 1, 1])
    assert x.grad.data.tolist() == [3, 12, 27]

class Pow(_Function):
    def forward(ctx, x: ndarray, c: ndarray) -> ndarray:
        ctx.save_for_backward(x, c)
        return x ** c

    def backward(ctx, grad: ndarray) -> Tuple[ndarray, None]:
        x, c = ctx.saved_tensors
        # treat c as a constant, so it needs no gradient
        return grad * c * x ** (c - 1), None
This implements $z = x^c$, where $c$ is treated as a constant and $x$ is the variable. Constants need no gradient, so we simply return None in its place.
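In formula form, backward simply applies the power rule:

$$\frac{\partial z}{\partial x} = c\,x^{c-1} \quad \text{for } z = x^{c}$$

For test_simple_pow, with x = 2 and c = 2 this gives 2 · 2 = 4, matching the asserted gradient.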
Implementing Negation
This is simply attaching a minus sign: y = -x.
import numpy as np
from core.tensor import Tensor

def test_simple_neg():
    x = Tensor(2, requires_grad=True)
    z = -x  # -2
    assert z.data == -2
    z.backward()
    assert x.grad.data == -1

def test_array_neg():
    x = Tensor([1, 2, 3], requires_grad=True)
    z = -x
    np.testing.assert_array_equal(z.data, [-1, -2, -3])
    z.backward([1, 1, 1])
    np.testing.assert_array_equal(x.grad.data, [-1, -1, -1])

class Neg(_Function):
    def forward(ctx, x: ndarray) -> ndarray:
        return -x

    def backward(ctx, grad: ndarray) -> ndarray:
        # d(-x)/dx = -1
        return -grad
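Before wrapping up: every backward rule in this article can be validated with a numerical gradient check. Below is a minimal sketch; check_grad is an ad-hoc helper written here for illustration (not part of the framework) and only assumes the Tensor interface already used in the tests above:

import numpy as np
from core.tensor import Tensor

def check_grad(fn, data, eps=1e-6, tol=1e-4):
    """Compare the analytic gradient of sum(fn(x)) against central differences."""
    data = np.array(data, dtype=float)
    x = Tensor(data, requires_grad=True)
    y = fn(x)
    y.backward(Tensor(np.ones_like(y.data)))   # d(sum y)/dy is all ones

    numeric = np.zeros_like(data)
    it = np.nditer(data, flags=['multi_index'])
    for _ in it:
        idx = it.multi_index
        plus, minus = data.copy(), data.copy()
        plus[idx] += eps
        minus[idx] -= eps
        numeric[idx] = (fn(Tensor(plus)).data.sum()
                        - fn(Tensor(minus)).data.sum()) / (2 * eps)
    assert np.allclose(x.grad.data, numeric, atol=tol)

check_grad(lambda t: t.exp(), [1.0, 2.0, 3.0])
check_grad(lambda t: t.log(), [1.0, 2.0, 3.0])
check_grad(lambda t: -t, [1.0, 2.0, 3.0])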
Conclusion
This article implemented computation graphs for the common operations. The next article will implement the remaining ones, such as max, slicing, reshaping, and transposing.
One last word: BUG, off you go!


