123 points by deeplearning_wiz 5 months ago | 15 comments
deeplearning_fan 5 months ago
This is fascinating! I've been waiting for a breakthrough in neural network training. I'm eager to try it out. Anyone tried it on their datasets yet?
trainable_expert 5 months ago
I'm trying this approach on my computer vision models, and so far I've seen a notable performance improvement. I'll be sharing details on my blog soon.
many_project 5 months ago
Congrats on the great work! I'm looking forward to incorporating this into my projects. I have seven projects on my to-do list, two of which are prime candidates for this.
just_a_visitor 5 months ago
Seven projects is an impressive number! How long do you expect testing and integration to take in your projects? I'm curious how much development time we're looking at before a large-scale rollout.
just_a_visitor 5 months ago
I've been thinking about that too. I believe it could be applied in reinforcement learning contexts; done properly, it could cut training time significantly. Exciting!
just_a_visitor 5 months ago
This sounds amazing. I wonder whether it will solve the exploding-gradient problem that many deep learning researchers are still dealing with.
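(For context: the usual stopgap today is gradient clipping, i.e. rescaling the gradient whenever its norm exceeds a threshold. A minimal plain-Python sketch; the `clip_gradient` helper is just a name I made up, not anything from the linked work:)

```python
import math

def clip_gradient(grad, max_norm=1.0):
    # Rescale the gradient vector if its L2 norm exceeds max_norm,
    # preserving its direction. This bounds the size of each update step.
    norm = math.sqrt(sum(g * g for g in grad))
    if norm > max_norm:
        scale = max_norm / norm
        grad = [g * scale for g in grad]
    return grad

g = [3.0, 4.0]                        # L2 norm 5.0, over the threshold
clipped = clip_gradient(g, max_norm=1.0)
# clipped ~ [0.6, 0.8]; its L2 norm is now 1.0
```

Frameworks ship equivalents of this (e.g. PyTorch's `clip_grad_norm_`), so you rarely write it by hand.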
deeplearning_fan 5 months ago
Do you have any resources you recommend for understanding this method from a mathematical/algorithms perspective? Would love to understand how it actually works under the hood.
trainable_expert 5 months ago
If you have a strong CS/math background, I think the 'Understanding LSTM Networks' ebook will certainly help. Currently it doesn't cover the newer approach mentioned here, but it will get you started.
datasciencedebate 5 months ago
While the approach itself sounds interesting, I want to play devil's advocate for a moment here. How do we know this isn't just a novel trick to better optimize existing methods, rather than a fundamentally different paradigm?
gimme_insight 5 months ago
That's a valid point. While the impact of this approach may be seen primarily in its optimization abilities, it's definitely a step forward from current methods.
arxiv_reader 5 months ago
Came across a paper on arXiv recently which I believe is relevant to the topic at hand: 'Adaptive methods for efficient backpropagation and stochastic optimization'.
adaptive_researcher 5 months ago
Yes, the paper mentioned by arxiv_reader (https://arxiv.org/abs/1605.08245) looks highly relevant. It discusses adaptive methods that could work synergistically with this new approach.
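(For anyone who hasn't met "adaptive methods": the phrase usually refers to optimizers in the Adam family, which keep per-parameter running statistics of the gradient and scale each update accordingly. A rough single-parameter sketch of textbook Adam, not the linked paper's specific algorithm; `adam_step` is a name I made up:)

```python
import math

def adam_step(param, grad, m, v, t, lr=0.01, b1=0.9, b2=0.999, eps=1e-8):
    # Running average of the gradient (first moment).
    m = b1 * m + (1 - b1) * grad
    # Running average of the squared gradient (second moment).
    v = b2 * v + (1 - b2) * grad * grad
    # Bias correction: early steps are dominated by the zero-initialized averages.
    m_hat = m / (1 - b1 ** t)
    v_hat = v / (1 - b2 ** t)
    # The effective step size adapts per parameter via the RMS of recent gradients.
    param -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v

# Minimizing f(x) = x^2 (gradient 2x) starting from x = 1.0:
x, m, v = 1.0, 0.0, 0.0
for t in range(1, 501):
    x, m, v = adam_step(x, 2 * x, m, v, t)
# x ends up close to 0
```

The appeal is that each parameter effectively gets its own learning rate, which is what makes these methods a natural complement to other training tricks.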
using_different 5 months ago
Great suggestion! I might try combining this with a reinforcement learning algorithm for my next shot at reinforcement-learned backgammon. Maybe we'll see some convincing results!
using_different 5 months ago
Has anyone attempted to combine this with other training methods like reinforcement learning or genetic algorithms? Could provide even more interesting results.
trainable_expert 5 months ago
I do like this idea, but I'm hesitant to mix too many approaches. With more methods tied to one another, the probability of something going wrong rises exponentially, and the codebase becomes a mess. Trade-offs we deal with every day.