Gradient Checking

$$\lim_{\varepsilon\to0} \frac{f(x+\varepsilon)-f(x-\varepsilon)}{2\varepsilon} \tag{1}$$

$$\lim_{\varepsilon\to0} \frac{f(x+\varepsilon)-f(x)}{\varepsilon} \tag{2}$$
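The two-sided difference (1) is more accurate than the one-sided difference (2): for a small but nonzero ε, the approximation error of (1) is O(ε²), while that of (2) is O(ε).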
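
A quick way to see the difference between (1) and (2) is to apply both to a function whose derivative is known. The sketch below assumes f(x) = x³ at x = 1 with ε = 0.01; the function, the point, and the helper names `one_sided` and `two_sided` are illustrative choices, not part of these notes.

```python
def two_sided(f, x, eps):
    # Formula (1): centered difference
    return (f(x + eps) - f(x - eps)) / (2 * eps)

def one_sided(f, x, eps):
    # Formula (2): forward difference
    return (f(x + eps) - f(x)) / eps

f = lambda x: x ** 3      # illustrative function, exact derivative is 3x^2
x, eps = 1.0, 1e-2
exact = 3 * x ** 2

err_two = abs(two_sided(f, x, eps) - exact)  # roughly eps**2
err_one = abs(one_sided(f, x, eps) - exact)  # roughly 3 * eps

print(f"two-sided error: {err_two:.6f}")
print(f"one-sided error: {err_one:.6f}")
```

For this particular function the two-sided error works out to ε² and the one-sided error to 3ε + ε², so shrinking ε by a factor of 10 shrinks the first by about 100 and the second by only about 10.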

Take all parameters (every W and b), reshape each into a vector, and concatenate them into a single vector θ.
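
A minimal NumPy sketch of this step, assuming a toy linear model with a squared-error cost. The model, the helper names (`params_to_vector`, `vector_to_params`, `numerical_gradient`, `relative_difference`), and the step size `eps=1e-5` are all illustrative assumptions, not something these notes prescribe. It flattens `W` and `b` into θ, estimates each partial derivative with the two-sided difference (1), and compares the result against the analytic gradient.

```python
import numpy as np

def params_to_vector(params):
    """Flatten each array and concatenate into one 1-D vector theta."""
    return np.concatenate([p.ravel() for p in params])

def vector_to_params(theta, shapes):
    """Split theta back into arrays with the given shapes."""
    params, start = [], 0
    for shape in shapes:
        size = int(np.prod(shape))
        params.append(theta[start:start + size].reshape(shape))
        start += size
    return params

def numerical_gradient(cost, theta, eps=1e-5):
    """Two-sided difference, one component of theta at a time."""
    grad = np.zeros_like(theta)
    for i in range(theta.size):
        step = np.zeros_like(theta)
        step[i] = eps
        grad[i] = (cost(theta + step) - cost(theta - step)) / (2 * eps)
    return grad

def relative_difference(a, b):
    """Norm of the difference, scaled by the norms of the two gradients."""
    return np.linalg.norm(a - b) / (np.linalg.norm(a) + np.linalg.norm(b))

# Toy data and parameters (hypothetical, for illustration only).
rng = np.random.default_rng(0)
X = rng.standard_normal((5, 3))
y = rng.standard_normal((5, 1))
W = rng.standard_normal((3, 1))
b = np.zeros(1)
shapes = [W.shape, b.shape]

def cost(theta):
    W_, b_ = vector_to_params(theta, shapes)
    err = X @ W_ + b_ - y
    return 0.5 * np.mean(err ** 2)

def analytic_gradient(theta):
    W_, b_ = vector_to_params(theta, shapes)
    err = X @ W_ + b_ - y
    m = X.shape[0]
    dW = X.T @ err / m
    db = err.sum(axis=0) / m
    return params_to_vector([dW, db])

theta = params_to_vector([W, b])
diff = relative_difference(numerical_gradient(cost, theta), analytic_gradient(theta))
print(f"relative difference: {diff:.2e}")
```

A very small relative difference suggests the analytic gradient agrees with the numerical estimate; a value that is not small points to a bug in one of the partial derivatives. What counts as "small" depends on ε and floating-point precision, so any fixed threshold is a rule of thumb rather than a guarantee.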

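  1. The gradient might be correct when W and b are close to 0 but incorrect once they are larger, so run gradient checks at both points: near zero (e.g. right after initialization) and again when W and b have grown (e.g. after some training).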
