Step by step intro

Do you like PyTorch?

Convert input to tensor

import torch

# inside a module method, where `input` may arrive as a NumPy array
if not torch.is_tensor(input):
    self.input = torch.from_numpy(input)

Convert to tf:

tf.cast converts x to the given dtype; it accepts tensors, NumPy arrays, etc.

tf.cast(x, dtype)
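
A minimal usage sketch (TensorFlow 2.x assumed, with a NumPy input):

import numpy as np
import tensorflow as tf

x = np.array([1, 2, 3])     # int64 NumPy array
x = tf.cast(x, tf.float32)  # now a float32 tf.Tensor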

Initialize parameters

Draw small random weights with np.random.randn to break symmetry; biases can start at zero.

import numpy as np

np.random.seed(3)  # fixed seed for reproducibility

# n_x: input size, n_h: hidden layer size, n_y: output size
W1 = np.random.randn(n_h, n_x) * 0.01
b1 = np.zeros((n_h, 1))
W2 = np.random.randn(n_y, n_h) * 0.01
b2 = np.zeros((n_y, 1))

Choose an optimization algorithm

In PyTorch this is an Optimizer such as torch.optim.Adam.
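
A minimal sketch, assuming model is already defined:

import torch

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # lr is a hyperparameter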

Build a model

1. Forward propagate an input

torch.nn.Module.forward

Z = np.dot(W, A) + b  # linear step: A is the previous layer's activations
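
A minimal sketch of a module's forward pass (layer sizes here are hypothetical); each nn.Linear computes the Z = WA + b step above internally:

import torch
import torch.nn as nn

class TwoLayerNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.hidden = nn.Linear(4, 8)
        self.out = nn.Linear(8, 1)

    def forward(self, x):
        return self.out(torch.relu(self.hidden(x)))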

2. Compute the loss function

nn.CrossEntropyLoss (multiclass; the binary cost below corresponds to nn.BCELoss)

cost = -np.sum(Y * np.log(AL) + (1 - Y) * np.log(1 - AL)) / m  # binary cross-entropy averaged over m examples
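
A sketch of the same binary cost in PyTorch, assuming AL and Y are float tensors of the same shape:

import torch.nn as nn

loss_fn = nn.BCELoss()  # expects sigmoid outputs in (0, 1)
cost = loss_fn(AL, Y)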

3. Compute the gradients of the cost with respect to parameters using backpropagation

loss.backward()

In NumPy, go backward through the activation, then through the linear step:

dZ = relu_backward(dA, activation_cache)
dA_prev, dW, db = linear_backward(dZ, linear_cache)

Inside linear_backward, with linear_cache = (A_prev, W, b):

dW = np.dot(dZ, cache[0].T) / m             # cache[0] is A_prev
db = np.sum(dZ, axis=1, keepdims=True) / m  # average over the m examples
dA_prev = np.dot(cache[1].T, dZ)            # cache[1] is W

4. Update each parameter using the gradients, according to the optimization algorithm

optimizer.step()

With plain gradient descent in NumPy:

W1 = W1 - learning_rate * dW1
b1 = b1 - learning_rate * db1
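
Putting steps 1-4 together, a minimal training-loop sketch (model, loss_fn, optimizer, num_epochs, and data x, y assumed to exist):

for epoch in range(num_epochs):
    optimizer.zero_grad()      # clear gradients from the previous step
    output = model(x)          # 1. forward propagate
    loss = loss_fn(output, y)  # 2. compute the loss
    loss.backward()            # 3. backpropagate the gradients
    optimizer.step()           # 4. update the parameters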

Fine-tuning

Freeze one layer's weights and keep the rest trainable. For an nn.Sequential model, parameter names begin with the layer index:

for name, param in model.named_parameters():
    if name.startswith("2"):        # freeze layer 2
        param.requires_grad = False
    else:
        param.requires_grad = True  # make it explicit

# parameter names in this model:
# 0.weight
# 0.bias
# 2.weight
# 2.bias
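
For context, a hypothetical nn.Sequential model that produces those parameter names, and an optimizer built over only the still-trainable parameters:

import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(4, 8),  # "0.weight", "0.bias"
    nn.ReLU(),        # index 1, no parameters
    nn.Linear(8, 2),  # "2.weight", "2.bias" (frozen by the loop above)
)

optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-3
)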


Links to other pages

torch.optim.Adam
loss functions
loss.backward()
Wk4 - Building your Deep Neural Network: Step by Step