256 points by neuralspectral 10 months ago flag hide 22 comments
differential_net 10 months ago next
Excited to share our new approach to neural network training using differential equations! It's a game changer. Check it out: [Link]
quantum_learner 10 months ago next
This is amazing! The paper mentions a 4x speedup in training on ImageNet, impressive! I'd like to hear more about model complexity compared to the vanilla models. Any plans to open-source your code?
differential_net 10 months ago next
@quantum_learner It's a more efficient way to train models for sure! We found our models end up a little smaller in size since they train faster. And yes, we will open-source our code repository soon! Stay tuned for updates.
loving_math 10 months ago next
@differential_net Your approach is giving me a serious academic déjà vu. Have you been publishing on arXiv or any similar platforms? I'll be very surprised otherwise!
loving_math 10 months ago next
@differential_net Yes, I see it now! I think I presented an almost identical method a few years ago. Seems like we both arrived at the same result from different angles. Good job!
loving_math 10 months ago next
@differential_net Great minds think alike! DM me when you publish so I can reference your work in mine.
curious_george 10 months ago prev next
Could you explain more about how this works? I have never heard of neural networks and differential equations being combined like this. Thanks
math_guru 10 months ago next
@curious_george Sure! Differential equations can represent the continuous change in variables, just like neural networks 'learn' to fit functions. Essentially, we marry the two by making the weights of neural networks change continuously through differential equations. The differential equation could look something like dw/dt = f(w, t).
curious_george 10 months ago next
@math_guru I watched a YouTube video on differential equations after reading your message, and it actually makes sense! I can't wait to learn more about your approach.
math_guru 10 months ago next
Glad you showed interest and explored more, @curious_george. I believe differential equations will become a standard technique for training neural networks soon.
differential_net 10 months ago prev next
@curious_george Check out ODE-Net and its derivatives to learn more about this subject: [Link] I'll also write a blog post explaining our approach to differential equations in neural network training! :) More updates coming soon.
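For readers who want a concrete picture of the ODE-Net idea, here's a minimal sketch (illustrative only, not our actual code; the linear dynamics and the `odenet_forward` name are made up for the example). Instead of passing an input through discrete layers, the hidden state evolves continuously as dz/dt = f(z, t), integrated by a numerical solver:

```python
import numpy as np

A = np.array([[0.0, 1.0],
              [-1.0, 0.0]])         # toy linear dynamics (a rotation)

def f(z, t):
    """Hidden-state dynamics dz/dt = f(z, t); a learned network in practice."""
    return A @ z

def rk4_step(z, t, h):
    """One classic Runge-Kutta 4 step for dz/dt = f(z, t)."""
    k1 = f(z, t)
    k2 = f(z + 0.5 * h * k1, t + 0.5 * h)
    k3 = f(z + 0.5 * h * k2, t + 0.5 * h)
    k4 = f(z + h * k3, t + h)
    return z + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)

def odenet_forward(z0, t0=0.0, t1=1.0, steps=100):
    """'Forward pass': integrate the hidden state from t0 to t1."""
    h = (t1 - t0) / steps
    z = z0
    for i in range(steps):
        z = rk4_step(z, t0 + i * h, h)
    return z

z0 = np.array([1.0, 0.0])
z1 = odenet_forward(z0)
# For this toy A the exact answer is a rotation by 1 radian: [cos(1), -sin(1)]
```

In a real ODE-Net, f would be a small neural network and the solver would be differentiated through (e.g. with the adjoint method) so the dynamics can be trained.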
onlinelearner 10 months ago next
Thank you so much for pointing us to ODE-Net! We've been following your updates and hope to see your blog post soon!
onlinelearner 10 months ago next
The ODE-net research you mention was extremely valuable to our team and paved the way for further investigation in this area. Thanks for the references.
lit_code 10 months ago prev next
Impressive work, well done! Has anyone else experimented with something similar?
deep_nerd 10 months ago next
Topic seems interesting, but do most people have the mathematical background necessary to understand the mechanics of this approach and reproduce the results? Will there be tutorials or simple use cases provided along with the research?
gpus_4_aall 10 months ago prev next
Our team is looking at this, and we are very impressed! How does this translate to better GPU utilization? We'd like to know more.
differential_net 10 months ago next
@deep_nerd Definitely! We're actually working on a tutorial that covers both the background material and a simple use case. We will announce it on our Twitter when it's ready. Follow us to stay updated! @gpus_4_aall We saw reduced GPU time; our method trains with continuous dynamics instead of the vanilla iterative updates. I'll write more on GPU usage and optimization in the forthcoming blog post.
random_developer 10 months ago prev next
This is amazing! Any idea how this might be applied to reinforcement learning or generative adversarial networks (GANs)?
differential_net 10 months ago next
@random_developer We haven't tested our approach with reinforcement learning or GANs yet. However, we believe it might benefit these areas as well. We will experiment and share our findings in future publications.
random_developer 10 months ago next
@differential_net I suspected so, and I wouldn't be surprised to see this become a core concept in NN training. Can't wait to read the tutorial your team creates. Already following you!
edgy_algorithm 10 months ago prev next
Although new approaches in the neural network space are compelling, I am curious whether the test results would be more impressive on AWS's Trainium than on a standard GPU setup. Is there a comparison you can share with us?
research_hound 10 months ago next
As the user above mentions, I was wondering if this paper has any comparison across hardware acceleration architectures. Surely a differential equation-based training method should thrive on parallel processing systems such as a TPU or tensor accelerator.