Step-by-step intro
Do you like PyTorch?
Convert input to tensor
if not torch.is_tensor(input):
    self.input = torch.from_numpy(input)
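The check above can be sketched as a standalone function. This is a minimal example that assumes the input is either a NumPy array (or something `np.asarray` accepts) or already a tensor; `to_tensor` is a hypothetical helper name, not part of PyTorch.

```python
import numpy as np
import torch


def to_tensor(x):
    """Return x as a torch.Tensor (hypothetical helper, not a PyTorch API)."""
    if not torch.is_tensor(x):
        # from_numpy keeps the array's dtype and shares memory with it
        x = torch.from_numpy(np.asarray(x))
    return x


features = np.zeros((2, 3), dtype=np.float32)
tensor = to_tensor(features)
```

Because `torch.from_numpy` shares memory with the source array, in-place changes to `features` would also show up in `tensor`.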
Convert to TensorFlow:
Check whether the data type matches (NumPy array, tensor, etc.) and cast if it does not:
tf.cast(x, dtype)
Initialize parameters
np.random.randn
np.random.seed(3)
W1 = np.random.randn(n_h, n_x) * 0.01
b1 = np.zeros(shape = (n_h, 1))
W2 = np.random.randn(n_y, n_h) * 0.01
b2 = np.zeros(shape = (n_y, 1))
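The initialization above could be wrapped in a function along these lines. This is only a sketch: the function name, the dictionary layout, and the layer sizes in the usage line are assumptions chosen for illustration.

```python
import numpy as np


def initialize_parameters(n_x, n_h, n_y, seed=3):
    """Small random weights and zero biases for a two-layer network."""
    np.random.seed(seed)  # fixed seed so repeated runs give the same weights
    return {
        "W1": np.random.randn(n_h, n_x) * 0.01,  # hidden layer weights
        "b1": np.zeros((n_h, 1)),                # hidden layer biases
        "W2": np.random.randn(n_y, n_h) * 0.01,  # output layer weights
        "b2": np.zeros((n_y, 1)),                # output layer biases
    }


# Hypothetical sizes: 4 input features, 5 hidden units, 1 output
params = initialize_parameters(n_x=4, n_h=5, n_y=1)
```

The weights are scaled by 0.01 to keep the initial pre-activations small, and they are random rather than zero so that the hidden units do not all compute the same thing.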
Choose an optimization algorithm
Build a model
1. Forward propagate an input
torch.nn.Module.forward
Z = np.dot(W, A) + b
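As a rough illustration of the linear step above followed by a ReLU activation, here is a small NumPy sketch. The function names and the toy values are assumptions; the cache layout `(A, W, b)` is chosen so the backward pass can reuse it.

```python
import numpy as np


def linear_forward(A, W, b):
    """Linear part of a layer: Z = W.A + b, plus a cache for backprop."""
    Z = np.dot(W, A) + b
    cache = (A, W, b)  # assumed layout, reused by the backward step
    return Z, cache


def relu(Z):
    """Element-wise ReLU activation."""
    return np.maximum(0, Z)


# Toy data: 2 features (rows) for 2 examples (columns)
A_prev = np.array([[1.0, 2.0], [3.0, -4.0]])
W = np.array([[1.0, -1.0]])
b = np.array([[0.5]])

Z, cache = linear_forward(A_prev, W, b)
A = relu(Z)
```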
2. Compute the loss function
nn.CrossEntropyLoss
cost = -np.sum(Y*np.log(AL)+(1-Y)*np.log(1-AL)) / m
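The cost formula above is the binary cross-entropy averaged over `m` examples (in PyTorch, `nn.BCELoss` computes this on probabilities, while `nn.CrossEntropyLoss` is the multi-class version that takes raw scores). A minimal NumPy sketch follows; the clipping with `eps` is an added assumption to avoid `log(0)` and is not in the original formula.

```python
import numpy as np


def compute_cost(AL, Y, eps=1e-12):
    """Binary cross-entropy between predictions AL and labels Y, shape (1, m)."""
    m = Y.shape[1]
    AL = np.clip(AL, eps, 1 - eps)  # assumption: guard against log(0)
    cost = -np.sum(Y * np.log(AL) + (1 - Y) * np.log(1 - AL)) / m
    return float(cost)


# Toy labels and predicted probabilities for 3 examples
Y = np.array([[1, 0, 1]])
AL = np.array([[0.9, 0.2, 0.6]])
cost = compute_cost(AL, Y)
```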
Link to other loss functions
3. Compute the gradients of the cost with respect to parameters using backpropagation
dW = np.dot(dZ, cache[0].T) / m
db = np.sum(dZ, axis=1, keepdims=True) / m
dA_prev = np.dot(cache[1].T, dZ)
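The three gradient lines above can be put together as one function. This sketch assumes the cache is the tuple `(A_prev, W, b)` saved during the forward pass, so that `cache[0]` is `A_prev` and `cache[1]` is `W`; the toy values are made up for illustration.

```python
import numpy as np


def linear_backward(dZ, cache):
    """Gradients of the cost with respect to W, b and the previous activation."""
    A_prev, W, b = cache       # assumed layout: (A_prev, W, b)
    m = A_prev.shape[1]        # number of examples
    dW = np.dot(dZ, A_prev.T) / m
    db = np.sum(dZ, axis=1, keepdims=True) / m
    dA_prev = np.dot(W.T, dZ)
    return dA_prev, dW, db


# Toy layer: 2 inputs, 1 unit, 2 examples
A_prev = np.array([[1.0, 2.0], [3.0, 4.0]])
W = np.array([[0.5, -0.5]])
b = np.zeros((1, 1))
dZ = np.array([[1.0, -1.0]])

dA_prev, dW, db = linear_backward(dZ, (A_prev, W, b))
```

Each gradient has the same shape as the quantity it belongs to: `dW` matches `W`, `db` matches `b`, and `dA_prev` matches `A_prev`.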
dZ = relu_backward(dA, activation_cache)
dA_prev, dW, db = linear_backward(dZ, linear_cache)
4. Update each parameter using the gradients, according to the optimization algorithm
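For the update step itself, a minimal sketch using plain gradient descent is shown here. Gradient descent is an assumption, since the notes do not say which optimization algorithm was chosen; the function name, the `"d" + name` key convention, and the toy values are also illustrative.

```python
import numpy as np


def update_parameters(parameters, grads, learning_rate=0.1):
    """One gradient descent step: param = param - learning_rate * gradient."""
    updated = {}
    for name, value in parameters.items():
        # assumed convention: the gradient of "W1" is stored under "dW1"
        updated[name] = value - learning_rate * grads["d" + name]
    return updated


parameters = {"W1": np.array([[1.0, 2.0]]), "b1": np.array([[0.5]])}
grads = {"dW1": np.array([[0.5, -0.5]]), "db1": np.array([[1.0]])}
parameters = update_parameters(parameters, grads, learning_rate=0.1)
```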
Fine-tuning
for name, param in model.named_parameters():
    if name.startswith("2"):
        param.requires_grad = False
    else:
        param.requires_grad = True  # make it explicit
# 0.weight
# 0.bias
# 2.weight
# 2.bias
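To see where names such as `0.weight` and `2.bias` come from, here is a small self-contained sketch. The `nn.Sequential` architecture, the layer sizes, and the choice of SGD are assumptions made for illustration; the freezing rule is the same `startswith("2")` test as above.

```python
import torch
from torch import nn

# Hypothetical model: nn.Sequential names parameters by layer index
model = nn.Sequential(
    nn.Linear(4, 8),  # parameters "0.weight", "0.bias"
    nn.ReLU(),        # no parameters, so index 1 never appears
    nn.Linear(8, 2),  # parameters "2.weight", "2.bias"
)

# Freeze layer 2 and leave everything else trainable
for name, param in model.named_parameters():
    param.requires_grad = not name.startswith("2")

# Hand only the trainable parameters to the optimizer
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.SGD(trainable, lr=0.01)

names = [name for name, _ in model.named_parameters()]
frozen = [name for name, p in model.named_parameters() if not p.requires_grad]
```

In a model with 20 or more layers, `startswith("2")` would also match names like `20.weight`, so matching on `"2."` may be the safer test there.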