Take all parameters (every W and b), flatten them, and concatenate them into a single vector θ
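A minimal sketch of the flattening step, assuming the parameters live in a dict of NumPy arrays. The helper names `params_to_vector` and `vector_to_params`, the keys `W1` and `b1`, and the sorted-key ordering are illustrative choices, not something these notes prescribe.

```python
import numpy as np

def params_to_vector(params):
    """Flatten every array (in sorted key order) and concatenate into one 1-D vector."""
    return np.concatenate([params[k].ravel() for k in sorted(params)])

def vector_to_params(theta, shapes):
    """Inverse of params_to_vector: slice theta back into arrays of the given shapes."""
    params = {}
    offset = 0
    for k in sorted(shapes):
        size = int(np.prod(shapes[k]))
        params[k] = theta[offset:offset + size].reshape(shapes[k])
        offset += size
    return params

# Hypothetical one-layer parameter set used only for illustration.
params = {"W1": np.arange(6.0).reshape(2, 3), "b1": np.zeros((2, 1))}
theta = params_to_vector(params)

# Round trip: the shapes are all that is needed to rebuild the dict from theta.
shapes = {k: v.shape for k, v in params.items()}
restored = vector_to_params(theta, shapes)
```

Using the same key order in both directions is what makes the round trip safe. The gradients dW and db can be flattened the same way so that each entry of the gradient vector lines up with the matching entry of θ.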
The gradient might be correct when W and b are close to 0 but incorrect when W and b are larger, so run gradient checks at both points
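The sketch below illustrates why one check near zero is not enough. It assumes a toy loss and a deliberately buggy gradient (both hypothetical, invented for this example) whose error is proportional to θ³, so it is invisible when θ is tiny. `grad_check` compares the analytic gradient with a centered finite-difference estimate and returns their relative difference.

```python
import numpy as np

def loss(theta):
    # Toy loss: the quartic term only matters once theta is not close to 0.
    return 0.5 * np.sum(theta ** 2) + 0.25 * np.sum(theta ** 4)

def buggy_grad(theta):
    # Deliberately wrong: the correct gradient is theta + theta**3.
    return theta

def grad_check(f, grad_f, theta, eps=1e-5):
    """Relative difference between grad_f(theta) and a centered finite-difference estimate."""
    approx = np.zeros_like(theta)
    for i in range(theta.size):
        step = np.zeros_like(theta)
        step[i] = eps
        approx[i] = (f(theta + step) - f(theta - step)) / (2 * eps)
    analytic = grad_f(theta)
    return np.linalg.norm(analytic - approx) / (
        np.linalg.norm(analytic) + np.linalg.norm(approx)
    )

direction = np.array([0.5, -1.0, 0.8, 0.3, -0.6])

# Same bug, two operating points: parameters near 0 vs. larger parameters.
diff_small = grad_check(loss, buggy_grad, 1e-4 * direction)
diff_large = grad_check(loss, buggy_grad, 1.0 * direction)
print(f"near zero: {diff_small:.2e}   larger values: {diff_large:.2e}")
```

With parameters near zero the relative difference is tiny and the bug goes unnoticed. With larger parameters the same check reports a large difference. In a real network the second check would typically be run after the parameters have moved away from their initial values.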