123 points by alex_deepmind 6 months ago | 25 comments
hackerno1 6 months ago next
[WOW!] This is a game changer for implementing ML models in production. I'm curious whether anyone has real-world experience with this speedup. Any potential downsides you can think of?
deeplearning1 6 months ago next
In one of my projects, I experienced similar speedups. It allowed me to train much larger models which helped my predictive performance. Just be aware of overfitting.
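A cheap guard against that is early stopping on validation loss. Rough sketch of the logic (names and thresholds are just illustrative, adapt to your own training loop):

```python
# Minimal early-stopping guard: stop when validation loss hasn't
# improved by at least `min_delta` for `patience` consecutive epochs.

def early_stop_epoch(val_losses, patience=3, min_delta=1e-4):
    """Return the 1-based epoch to stop at, or None if no stop triggered."""
    best = float("inf")
    bad_epochs = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best - min_delta:
            best = loss          # real improvement: reset the counter
            bad_epochs = 0
        else:
            bad_epochs += 1      # plateau or regression
            if bad_epochs >= patience:
                return epoch
    return None

# Loss plateaus after epoch 4, so training stops at epoch 7.
losses = [0.9, 0.7, 0.5, 0.4, 0.41, 0.40, 0.42]
print(early_stop_epoch(losses))  # -> 7
```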
datasci6 6 months ago next
That's impressive. If the model retains the same level of performance, it's a clear win in terms of cost reduction for businesses.
stats3 6 months ago prev next
I would be worried about how it handles sampling biases or noisy data. It may not be robust to unclean datasets.
reproduc3 6 months ago next
Has this algorithm been thoroughly reproduced by the community? I saw a thread mentioning discrepancies between the article and its results.
reproduc5 6 months ago next
@reproduc3, I think the discrepancies might come from the fact that the community used different frameworks than those reported in the paper.
mlaware2 6 months ago prev next
Congratulations to the researchers for the discovery! Another exciting improvement in an already swiftly progressing field.
edward6 6 months ago next
Have they experimented with this algorithm on convolutional or recurrent neural networks? Would be interesting to see those results.
guest7 6 months ago next
They have indeed implemented a convolutional neural network example in the official codebase; the results are impressive.
open_source4 6 months ago prev next
The corresponding Github repo has very sparse code. Does anyone know if they're planning to release a better documented version soon?
codeb0t 6 months ago prev next
This algorithm seems perfect for automating model training and iterations; has anyone tried it for AutoML?
aut0m8 6 months ago next
I have tried it with AutoML; it works great and frees up time for more urgent tasks in the pipeline.
ai_sister5 6 months ago prev next
Incredible! Although, I'd like to know how comfortable businesses would be with adopting this approach for mission-critical applications.
hacking7 6 months ago prev next
Will have to catch up on the official paper tonight. Wonder if they explain the hyperparameter tuning for the algorithm; it's usually the bottleneck.
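Until the paper clarifies a tuning recipe, plain random search over learning rate and batch size is a reasonable default. Sketch below; `fake_score` is a stand-in for whatever evaluation you actually run:

```python
import math
import random

def random_search(score_fn, n_trials=20, seed=0):
    """Sample hyperparameter configs at random, keep the best-scoring one."""
    rng = random.Random(seed)
    best_cfg, best_score = None, float("-inf")
    for _ in range(n_trials):
        cfg = {
            "lr": 10 ** rng.uniform(-4, -1),        # log-uniform learning rate
            "batch_size": rng.choice([32, 64, 128, 256]),
        }
        score = score_fn(cfg)
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score

# Stand-in objective: pretend the score peaks around lr = 1e-2.
def fake_score(cfg):
    return -abs(math.log10(cfg["lr"]) + 2)

cfg, score = random_search(fake_score)
print(cfg["lr"], score)
```

Sampling the learning rate log-uniformly matters more than the trial count here; uniform sampling wastes most trials at the top of the range.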
hacking7 6 months ago next
It's stated to work with any NN architecture, very promising! Going to give it a try on my Generative Adversarial Network (GAN) model for image generation.
hacking8 6 months ago next
I've successfully applied the algorithm to my GAN model; awesome results, I must say. It converges faster and generates more realistic images!
cloud9_er 6 months ago prev next
Any idea how this compares with established optimizers like SGD or Adam? Would love to see a direct comparison.
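For anyone wanting to run their own head-to-head, here's a toy harness that implements the two baselines (plain SGD and Adam updates, not the paper's algorithm) on a 1-D quadratic; swap in the new method once its interface is clear:

```python
import math

def minimize(update, x0=5.0, lr=0.1, steps=100):
    """Minimize f(x) = x^2 (gradient 2x); return the final x."""
    x, state = x0, {}
    for t in range(1, steps + 1):
        grad = 2 * x
        x = update(x, grad, lr, t, state)
    return x

def sgd(x, grad, lr, t, state):
    return x - lr * grad                       # vanilla gradient step

def adam(x, grad, lr, t, state, b1=0.9, b2=0.999, eps=1e-8):
    m = state.get("m", 0.0) * b1 + (1 - b1) * grad          # 1st moment
    v = state.get("v", 0.0) * b2 + (1 - b2) * grad * grad   # 2nd moment
    state["m"], state["v"] = m, v
    m_hat = m / (1 - b1 ** t)                  # bias correction
    v_hat = v / (1 - b2 ** t)
    return x - lr * m_hat / (math.sqrt(v_hat) + eps)

# Both should end near the minimum at 0.
print(abs(minimize(sgd)), abs(minimize(adam)))
```

On a real model you'd compare wall-clock time to a target validation metric rather than steps on a toy function, but the harness shape is the same.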
learner12 6 months ago next
My findings were similar. If anybody is seeing better results than I did, please share your configurations.
learner10 6 months ago prev next
Experimenting with this today, will report back with my results here. I'm looking forward to seeing how this impacts real-world use cases.
learner11 6 months ago next
In the case of NLP, I noticed a slight decrease in performance with this new algorithm compared to SGD.
supporter8 6 months ago next
That's a helpful data point; thanks for sharing your early results. Keep up the good work!
learner13 6 months ago prev next
Just reached 70% model training accuracy in half the time using the new approach! Looking forward to fine-tuning the model and seeing the benefits.
evalexper6 6 months ago prev next
Has this algorithm been tested with various NLP tasks such as sentiment analysis, NER, and question-answering?
evalexper7 6 months ago next
Yes, it performs well across all tested tasks, with only slight variation between them.
dev_ops9 6 months ago prev next
This could be a GPU-intensive method; curious if anyone has tried running it on a server with multiple GPUs?