Take all parameters (every W and b), reshape each into a vector, and concatenate them into one vector θ
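A minimal sketch of the flattening step, in plain Python with nested lists standing in for matrices. The names `params_to_vector`, `vector_to_params`, `W1`, and `b1` are hypothetical, chosen only for illustration; the recorded shapes let θ be turned back into the original parameters.

```python
def params_to_vector(params):
    """Flatten each parameter matrix row by row and concatenate into one list."""
    theta, shapes = [], []
    for name, matrix in params.items():
        rows, cols = len(matrix), len(matrix[0])
        shapes.append((name, rows, cols))  # remember the layout for the inverse step
        for row in matrix:
            theta.extend(row)
    return theta, shapes


def vector_to_params(theta, shapes):
    """Inverse of params_to_vector: rebuild each matrix from its slice of theta."""
    params, start = {}, 0
    for name, rows, cols in shapes:
        params[name] = [
            theta[start + r * cols: start + (r + 1) * cols] for r in range(rows)
        ]
        start += rows * cols
    return params


# Hypothetical toy parameters: a 2x2 weight matrix and a 2x1 bias.
params = {
    "W1": [[0.1, -0.2], [0.3, 0.4]],
    "b1": [[0.0], [0.5]],
}
theta, shapes = params_to_vector(params)
restored = vector_to_params(theta, shapes)
```

Here `theta` holds the six numbers of `W1` followed by `b1`, and `restored` equals the original `params`.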
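The gradient might be correct when W, b are close to 0 (at random initialization) but incorrect once W, b grow larger, so run grad check at both points: at initialization and again after training for a while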
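A small sketch of why checking at both points matters. It assumes a hypothetical toy cost J(θ) = Σ(θᵢ² + θᵢ⁴) and a deliberately buggy gradient that drops the θ³ term, so the bug is nearly invisible near 0 and obvious at larger values. The function names, the step `eps=1e-7`, and the 1e-7 / 1e-3 thresholds are assumptions (common rules of thumb), not part of the notes.

```python
import math


def cost(theta):
    # Hypothetical toy cost: J(theta) = sum(t^2 + t^4)
    return sum(t ** 2 + t ** 4 for t in theta)


def grad_correct(theta):
    # Exact gradient of the toy cost
    return [2 * t + 4 * t ** 3 for t in theta]


def grad_buggy(theta):
    # Deliberate bug: the 4*t^3 term is missing
    return [2 * t for t in theta]


def grad_check(cost_fn, grad_fn, theta, eps=1e-7):
    """Relative difference between grad_fn and a two-sided numerical gradient."""
    approx = []
    for i in range(len(theta)):
        plus, minus = list(theta), list(theta)
        plus[i] += eps
        minus[i] -= eps
        approx.append((cost_fn(plus) - cost_fn(minus)) / (2 * eps))
    grad = grad_fn(theta)
    num = math.sqrt(sum((g - a) ** 2 for g, a in zip(grad, approx)))
    den = math.sqrt(sum(g * g for g in grad)) + math.sqrt(sum(a * a for a in approx))
    return num / den if den > 0 else 0.0


theta_small = [1e-5, -2e-5, 1.5e-5]  # stands in for parameters right after initialization
theta_large = [1.0, -2.0, 1.5]       # stands in for parameters after some training

diff_small = grad_check(cost, grad_buggy, theta_small)  # tiny: the bug is hidden
diff_large = grad_check(cost, grad_buggy, theta_large)  # large: the bug shows up
```

With the buggy gradient, the check passes at `theta_small` and fails clearly at `theta_large`; `grad_correct` passes at both.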