Neural Ordinary Differential Equations (ODE)

Notation: upper-case names denote matrices, lower-case names denote vectors (as in `return X, y`).

Traditional neural network:

$$x \rightarrow f(x) \rightarrow y$$
$$f(x) = ax + b$$
$$loss = (f(x) - y)^2 = (ax + b - y)^2$$

Gradient descent updates each parameter by the gradient of the loss, scaled by the learning rate $LR$:

$$a = a - \frac{\partial\, loss}{\partial a} \cdot LR = a - 2(ax + b - y) \cdot x \cdot LR$$
$$b = b - \frac{\partial\, loss}{\partial b} \cdot LR = b - 2(ax + b - y) \cdot LR$$
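
A minimal sketch of these update rules in plain NumPy (the synthetic data and the "true" parameters $a=3$, $b=1$ are made up for illustration):

```python
import numpy as np

# Synthetic data for a made-up true line y = 3x + 1
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=100)
y = 3.0 * x + 1.0

a, b, LR = 0.0, 0.0, 0.1
for epoch in range(200):
    pred = a * x + b                        # f(x) = ax + b
    grad_a = (2.0 * (pred - y) * x).mean()  # dloss/da, averaged over the batch
    grad_b = (2.0 * (pred - y)).mean()      # dloss/db
    a -= LR * grad_a                        # a = a - dloss/da * LR
    b -= LR * grad_b                        # b = b - dloss/db * LR

print(a, b)  # converges toward a = 3, b = 1
```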

Neural ODE:

The parameters are collected into $\theta$, and the network $f$ parameterizes the dynamics of a hidden state $z$ rather than the input-output map itself:

$$\theta = [a, b]$$
$$\frac{\partial z}{\partial t} = f(z, t, \theta)$$

In place of the discrete mapping $x \rightarrow f(x) \rightarrow y$, the input becomes the initial state $z_0$, and an ODE solver integrates $f$ forward to the predicted final state $\hat{z}_t$:

$$z_0 \rightarrow \text{ODESolve}(f, z_0) \rightarrow \hat{z}_t$$
$$loss = (z_t - \hat{z}_t)^2$$
$$\frac{\partial\, loss}{\partial \hat{z}_t} = 2\,(\hat{z}_t - z_t)$$
$$\theta = \theta - \frac{\partial\, loss}{\partial \theta} \cdot LR$$

The gradient at the final state is what seeds the backward (adjoint) pass that carries $\partial\, loss / \partial \theta$ back through the solver.
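
A minimal runnable sketch of this loop using the `torchdiffeq` package's `odeint` (an assumption: any differentiable ODE solver would do; the scalar dynamics $az + b$ and the target value are made up for illustration):

```python
import torch
from torchdiffeq import odeint  # pip install torchdiffeq

class Dynamics(torch.nn.Module):
    """dz/dt = f(z, t, theta) with theta = [a, b], i.e. dz/dt = a*z + b."""
    def __init__(self):
        super().__init__()
        self.a = torch.nn.Parameter(torch.tensor(0.0))
        self.b = torch.nn.Parameter(torch.tensor(0.0))

    def forward(self, t, z):   # torchdiffeq calls f with (t, z)
        return self.a * z + self.b

f = Dynamics()
opt = torch.optim.SGD(f.parameters(), lr=0.1)  # LR in the equations above
t = torch.tensor([0.0, 1.0])                   # integrate from t = 0 to t = 1
z0 = torch.tensor([1.0])                       # initial state
z_t = torch.tensor([2.0])                      # observed target (made up)

for step in range(200):
    z_hat = odeint(f, z0, t)[-1]       # ODESolve(f, z0): predicted final state
    loss = ((z_t - z_hat) ** 2).sum()  # (z_t - z_hat)^2
    opt.zero_grad()
    loss.backward()                    # backprop through the solver steps
    opt.step()                         # theta <- theta - dloss/dtheta * LR
```

Swapping `odeint` for `torchdiffeq.odeint_adjoint` computes the same gradient with the adjoint sensitivity method (constant memory in the number of solver steps) instead of backpropagating through each step.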


[Figure: ODE solver]