Step-by-step intro

Do you like PyTorch?

Convert input to tensor

if not torch.is_tensor(input):
    self.input = torch.from_numpy(input)
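A minimal sketch of the same check as a standalone helper. The name to_tensor is hypothetical, and the sketch assumes the input is either a tensor or a NumPy array. Note that torch.from_numpy shares memory with the source array rather than copying it.

```python
import numpy as np
import torch


def to_tensor(x):
    """Return x unchanged if it is already a tensor, else wrap the NumPy array."""
    if torch.is_tensor(x):
        return x
    # from_numpy shares memory with the array: no copy is made
    return torch.from_numpy(x)


arr = np.array([1.0, 2.0, 3.0])
t = to_tensor(arr)
arr[0] = 10.0  # the change is visible through the tensor as well
```

If an independent copy is needed, torch.tensor(arr) copies the data instead.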

Convert to TensorFlow

Cast so that the data types match; tf.cast accepts NumPy arrays as well as tensors.

tf.cast(x, dtype)

Initialize parameters

np.random.randn

np.random.seed(3)  # fix the seed so runs are reproducible

# small random weights break symmetry; biases can start at zero
W1 = np.random.randn(n_h, n_x) * 0.01
b1 = np.zeros((n_h, 1))
W2 = np.random.randn(n_y, n_h) * 0.01
b2 = np.zeros((n_y, 1))
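The initialization above can be wrapped in a function, sketched here under the assumption of a two-layer network with n_x inputs, n_h hidden units and n_y outputs. The function name and the dictionary layout are illustrative choices, not something these notes prescribe.

```python
import numpy as np


def initialize_parameters(n_x, n_h, n_y, seed=3):
    """Small random weights and zero biases for a two-layer network."""
    np.random.seed(seed)  # fixed seed so repeated calls give the same values
    return {
        "W1": np.random.randn(n_h, n_x) * 0.01,
        "b1": np.zeros((n_h, 1)),
        "W2": np.random.randn(n_y, n_h) * 0.01,
        "b2": np.zeros((n_y, 1)),
    }


params = initialize_parameters(n_x=4, n_h=5, n_y=1)
```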

Choose an optimization algorithm

torch.optim.Adam
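A short sketch of how the optimizer is typically wired up. The nn.Linear model, the random data, the MSE loss and the learning rate are stand-ins chosen only for illustration.

```python
import torch
from torch import nn

torch.manual_seed(0)
model = nn.Linear(3, 1)  # hypothetical stand-in model
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

x = torch.randn(8, 3)  # made-up batch of 8 examples
y = torch.randn(8, 1)

before = model.weight.detach().clone()
loss = nn.functional.mse_loss(model(x), y)
optimizer.zero_grad()  # clear gradients left over from a previous step
loss.backward()        # compute gradients
optimizer.step()       # apply the Adam update
```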

Build a model

1. Forward propagate an input

torch.nn.Module.forward

Z = np.dot(W, A) + b
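The linear step above as a small function, assuming examples are stored column-wise (one example per column) and that the inputs are cached for the backward pass. The function name and cache layout are assumptions chosen to match the backward-pass snippets further down.

```python
import numpy as np


def linear_forward(A, W, b):
    """Z = W A + b for a batch stored column-wise."""
    Z = np.dot(W, A) + b  # b of shape (n_out, 1) broadcasts over the batch
    cache = (A, W, b)     # kept for the backward pass
    return Z, cache


A = np.ones((3, 4))            # 3 features, 4 examples
W = np.full((2, 3), 0.5)       # 2 output units
b = np.array([[1.0], [-1.0]])
Z, cache = linear_forward(A, W, b)
```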

2. Compute the loss function

nn.CrossEntropyLoss (multi-class, takes raw logits); nn.BCELoss is the binary counterpart of the NumPy formula below

cost = -np.sum(Y*np.log(AL)+(1-Y)*np.log(1-AL)) / m
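A sketch of the cost formula above as a function. The clipping constant eps is an added assumption to keep log away from zero; it is not part of the original formula.

```python
import numpy as np


def compute_cost(AL, Y, eps=1e-12):
    """Binary cross-entropy averaged over the m examples (columns)."""
    m = Y.shape[1]
    AL = np.clip(AL, eps, 1 - eps)  # avoid log(0) for saturated predictions
    cost = -np.sum(Y * np.log(AL) + (1 - Y) * np.log(1 - AL)) / m
    return float(cost)


Y = np.array([[1, 0, 1, 0]])
AL = np.full((1, 4), 0.5)
cost = compute_cost(AL, Y)  # every prediction is 0.5, so the cost is ln 2
```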

Link to other loss functions

3. Compute the gradients of the cost with respect to parameters using backpropagation

dW = np.dot(dZ, cache[0].T) / m
db = np.sum(dZ, axis=1, keepdims=True) / m
dA_prev = np.dot(cache[1].T, dZ)
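The three gradient lines above, gathered into one function. This sketch assumes the cache is the tuple (A_prev, W, b), which is what indexing cache[0] and cache[1] suggests; the random inputs are made up for the example.

```python
import numpy as np


def linear_backward(dZ, cache):
    """Gradients with respect to W, b and the previous layer's activations."""
    A_prev, W, b = cache
    m = A_prev.shape[1]                         # number of examples
    dW = np.dot(dZ, A_prev.T) / m
    db = np.sum(dZ, axis=1, keepdims=True) / m
    dA_prev = np.dot(W.T, dZ)
    return dA_prev, dW, db


rng = np.random.RandomState(0)
A_prev = rng.randn(3, 5)   # 3 inputs, 5 examples
W = rng.randn(2, 3)
b = np.zeros((2, 1))
dZ = rng.randn(2, 5)
dA_prev, dW, db = linear_backward(dZ, (A_prev, W, b))
```

Each gradient has the same shape as the quantity it belongs to, which is a quick sanity check when debugging.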

4. Update each parameter using the gradients, according to the optimization algorithm

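loss.backward()  # PyTorch: computes the gradients; optimizer.step() then applies the update

# NumPy: backward pass through one ReLU layer
dZ = relu_backward(dA, activation_cache)
dA_prev, dW, db = linear_backward(dZ, linear_cache)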
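Once the gradients are available, the update itself is a single line per parameter. The sketch below uses plain gradient descent rather than Adam to keep it short. The function name, the "d" + name key convention and the learning rate are assumptions for illustration.

```python
import numpy as np


def update_parameters(params, grads, learning_rate=0.1):
    """Return new parameters after one gradient-descent step."""
    return {
        name: value - learning_rate * grads["d" + name]
        for name, value in params.items()
    }


params = {"W1": np.array([[1.0, 2.0]]), "b1": np.array([[0.5]])}
grads = {"dW1": np.array([[0.5, -1.0]]), "db1": np.array([[1.0]])}
new_params = update_parameters(params, grads, learning_rate=0.1)
```

In PyTorch, optimizer.step() plays this role after loss.backward() has filled in the gradients.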

Fine-tuning

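for name, param in model.named_parameters():
    if name.startswith("2."):  # trailing dot so that "20.weight" etc. do not match
        param.requires_grad = False
    else:
        param.requires_grad = True  # make it explicit

# parameter names yielded by model.named_parameters(), e.g.:
# 0.weight
# 0.bias
# 2.weight
# 2.bias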
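A self-contained sketch of the freezing loop. It assumes the model is an nn.Sequential with parameterized layers at indices 0 and 2, which would produce the names listed above; the layer sizes are made up. The prefix check uses "2." with a trailing dot so that names such as "20.weight" in a larger model would not match.

```python
import torch
from torch import nn

model = nn.Sequential(
    nn.Linear(4, 8),  # parameters "0.weight", "0.bias"
    nn.ReLU(),        # index 1 has no parameters
    nn.Linear(8, 2),  # parameters "2.weight", "2.bias"
)

for name, param in model.named_parameters():
    param.requires_grad = not name.startswith("2.")

# Hand only the trainable parameters to the optimizer
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.Adam(trainable, lr=1e-3)

names = [name for name, _ in model.named_parameters()]
```

Filtering the parameter list is optional, since frozen parameters receive no gradient and are skipped by the optimizer, but it makes the intent visible.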
