123 points by quantum_mind 7 months ago | 18 comments
theape 7 months ago next
This is really impressive work! I've been thinking about the connection between differential equations and neural networks for a while; it's great to see someone put it into practice.
julia_programmer 7 months ago next
Neat! I'd love more details on the implementation, and whether you plan to open-source it.
tom 7 months ago prev next
This brought me back to the days of learning ODEs in college. Haven't thought about them in years!
pytorch_wiz 7 months ago prev next
Really cool approach. I wonder how it compares to current techniques like regularization and dropout?
alexnet 7 months ago next
Regarding the comparison to current techniques, did you see any impact on training and test accuracy? Is there a chance of overfitting when using differential equations, the way we see with dropout when the drop probability is set too low?
fastai_dev 7 months ago prev next
This is really neat. I'm curious to know more about the mathematical intuition behind this method.
stanford_grad 7 months ago next
The intuition behind it is pretty simple once you understand ordinary differential equations (ODEs) and neural networks. In a nutshell, the authors model the continuous dynamics of the hidden state with an ODE: instead of stacking discrete layers, a neural network parameterizes the derivative dh/dt = f(h(t), t, θ), and an ODE solver integrates it. It's like taking a ResNet and letting the number of layers go to infinity while each layer's step size goes to zero.
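If it helps, here's a toy sketch of the idea in PyTorch (my own illustration, not the authors' code; ODEFunc and ODEBlock are names I made up): one continuous ODE block in place of a stack of residual layers, integrated with fixed-step Euler for clarity. The paper uses adaptive solvers, but the structure is the same.

    import torch
    import torch.nn as nn

    class ODEFunc(nn.Module):
        # Parameterizes the derivative dh/dt = f(h, t; theta).
        def __init__(self, dim):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(dim, 64), nn.Tanh(), nn.Linear(64, dim)
            )
        def forward(self, t, h):
            return self.net(h)

    class ODEBlock(nn.Module):
        # One continuous transform instead of many discrete layers,
        # integrated here with fixed-step Euler for simplicity.
        def __init__(self, func, t0=0.0, t1=1.0, steps=10):
            super().__init__()
            self.func, self.t0, self.t1, self.steps = func, t0, t1, steps
        def forward(self, h):
            t, dt = self.t0, (self.t1 - self.t0) / self.steps
            for _ in range(self.steps):
                h = h + dt * self.func(t, h)  # h_{n+1} = h_n + f(h_n, t_n) * dt
                t += dt
            return h

    block = ODEBlock(ODEFunc(dim=2))
    h1 = block(torch.randn(8, 2))  # final state h(t1) for a batch of 8 inputs

Note how each Euler step looks exactly like a ResNet residual connection; that correspondence is the whole intuition.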
cs_student 7 months ago next
I see, I think I'm starting to get it. But wouldn't you face performance issues using differential equation solvers for large-scale machine learning models?
stanford_grad 7 months ago next
That's a great question. From my understanding there can be a performance hit compared to the discrete methods ML mostly relies on, since an adaptive solver may take many function evaluations per forward pass, but the authors present approaches to tackling this, notably the adjoint sensitivity method, which backpropagates by solving a second ODE backwards in time and keeps memory cost constant. They don't bring it up in the post, though. It's worth checking out their code to see their thoughts on performance optimizations.
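For anyone who wants to poke at it, the authors released torchdiffeq, which implements the adjoint trick. A minimal sketch (the toy dynamics F here are my own, not their example):

    # pip install torchdiffeq
    import torch
    import torch.nn as nn
    from torchdiffeq import odeint_adjoint as odeint

    class F(nn.Module):
        def __init__(self):
            super().__init__()
            self.lin = nn.Linear(2, 2)
        def forward(self, t, h):
            return torch.tanh(self.lin(h))  # toy dynamics dh/dt = tanh(W h + b)

    func = F()
    h0 = torch.randn(8, 2)          # initial state h(t0)
    t = torch.tensor([0.0, 1.0])    # integrate from t0 to t1
    h1 = odeint(func, h0, t)[-1]    # adaptive solver; [-1] = state at t1
    h1.sum().backward()             # gradients via the adjoint ODE,
                                    # memory cost constant in solver steps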
deep_learning_fan 7 months ago prev next
I read this paper recently, and it gave me a lot of food for thought. Neural ODEs with adaptive solvers have an advantage over traditional discrete architectures in that they are continuous: you can trade compute for precision by adjusting the solver tolerance, even after training, rather than being locked into a fixed number of layers.
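To make that knob concrete, here's a small scipy demo on a toy scalar ODE (nothing to do with the paper's code, just standard adaptive-solver behavior): tightening rtol buys accuracy at the cost of more function evaluations.

    import numpy as np
    from scipy.integrate import solve_ivp

    def f(t, h):
        return np.tanh(h)  # toy scalar dynamics

    for rtol in (1e-3, 1e-6, 1e-9):
        sol = solve_ivp(f, (0.0, 5.0), [1.0], rtol=rtol, atol=rtol)
        # tighter tolerance -> more function evaluations, better solution
        print(f"rtol={rtol:.0e}  nfev={sol.nfev}  h(5)={sol.y[0, -1]:.9f}")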
topology_guy 7 months ago next
Interesting view! How do the models scale with datasets that change over time?
deep_learning_fan 7 months ago next
It's model-specific and would require more investigation. Based on the Neural ODEs paper, applications that require capturing long-term dependencies in the data, such as time series models, are well suited to this continuous approach. They also explain how the adaptive solver handles the forward and backward passes efficiently, which helps with performance and scalability across a wide range of applications.
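As a quick illustration of why the continuous view fits time series (again a toy example of my own, assuming torchdiffeq is installed): you can ask the solver for the latent state at arbitrary, unevenly spaced observation times, which a fixed stack of discrete layers can't do directly.

    import torch
    import torch.nn as nn
    from torchdiffeq import odeint  # pip install torchdiffeq

    class F(nn.Module):
        def __init__(self):
            super().__init__()
            self.lin = nn.Linear(2, 2)
        def forward(self, t, h):
            return torch.tanh(self.lin(h))  # toy latent dynamics

    h0 = torch.randn(1, 2)                          # latent state at t=0
    obs_times = torch.tensor([0.0, 0.3, 1.7, 4.2])  # irregular timestamps
    states = odeint(F(), h0, obs_times)             # state at each timestamp
    print(states.shape)                             # torch.Size([4, 1, 2])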
commentator1 7 months ago prev next
Thanks for sharing. I've also been thinking of ways to improve training for generative models, so this seems pretty relevant to my work.
ml_engineer 7 months ago next
Same here! I'm working on improving our reinforcement learning models, and I'm always on the lookout for new training paradigms. I can see applications for this in RL, particularly with continuous state-action spaces.
stanford_grad 7 months ago next
You'll be pleased to know that we've already been exploring Neural ODEs in a reinforcement learning context. There's a nice video of a talk by the author of the paper I mentioned; watching it could answer your questions better than I can!
technobuff 7 months ago prev next
Amazing work! It reminds me of how physics-informed machine learning is catching on in research lately. I'm glad to see that this area continues to grow.
professor_xyz 7 months ago prev next
Thanks for highlighting this. While I'm not working directly with neural networks at the moment, it's always exciting to see how and where they'll be used next.
research_enthusiast 7 months ago prev next
This seems highly related to the fields of optimal control and dynamical systems. It's neat to see this connection made, as I haven't been seeing enough of that lately.