123 points by alex_deepmind 6 months ago | 25 comments
hackerno1 6 months ago next
[WOW!] This is a game changer for implementing ML models in production. I'm curious whether anyone has real-world experience with this speedup. Any potential downsides you can think of?
deeplearning1 6 months ago next
In one of my projects, I experienced similar speedups. It allowed me to train much larger models which helped my predictive performance. Just be aware of overfitting.
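A cheap guard against that is early stopping on validation loss. Rough sketch of the logic (names and thresholds are just illustrative, adapt to your own training loop):

```python
# Minimal early-stopping guard: stop when validation loss hasn't
# improved by at least `min_delta` for `patience` consecutive epochs.

def early_stop_epoch(val_losses, patience=3, min_delta=1e-4):
    """Return the 1-based epoch to stop at, or None if no stop triggered."""
    best = float("inf")
    bad_epochs = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best - min_delta:
            best = loss          # real improvement: reset the counter
            bad_epochs = 0
        else:
            bad_epochs += 1      # plateau or regression
            if bad_epochs >= patience:
                return epoch
    return None

# Loss plateaus after epoch 4, so training stops at epoch 7.
losses = [0.9, 0.7, 0.5, 0.4, 0.41, 0.40, 0.42]
print(early_stop_epoch(losses))  # -> 7
```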
datasci6 6 months ago next
That's impressive. If the model retains the same level of performance, it's a clear win in terms of cost reduction for businesses.
stats3 6 months ago prev next
I would be worried about how it handles sampling biases or noisy data. It may not be robust to unclean datasets.
reproduc3 6 months ago next
Has this algorithm been thoroughly reproduced by the community? I saw a thread mentioning discrepancies between the article and its results.
reproduc5 6 months ago next
@reproduc3, I think the discrepancies might come from the fact that the community used different frameworks than those reported in the paper.
mlaware2 6 months ago prev next
Congratulations to the researchers for the discovery! Another exciting improvement in an already swiftly progressing field.
edward6 6 months ago next
Have they experimented with this algorithm on convolutional or recurrent neural networks? Would be interesting to see those results.
guest7 6 months ago next
They have indeed implemented a convolutional neural network example in the official codebase; the results are impressive.
open_source4 6 months ago prev next
The corresponding Github repo has very sparse code. Does anyone know if they're planning to release a better documented version soon?
codeb0t 6 months ago prev next
This algorithm seems perfect for automating model training and iterations; has anyone tried it for AutoML?
aut0m8 6 months ago next
I have tried it with AutoML; it works great and frees up time for more urgent tasks in the pipeline.
ai_sister5 6 months ago prev next
Incredible! Although, I'd like to know how comfortable businesses would be with adopting this approach for mission-critical applications.
hacking7 6 months ago prev next
Will have to catch up on the official paper tonight. Wonder if they explain the hyperparameter tuning for the algorithm; it's usually the bottleneck.
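Until the paper clarifies a tuning recipe, plain random search over learning rate and batch size is a reasonable default. Sketch below; `fake_score` is a stand-in for whatever evaluation you actually run:

```python
import math
import random

def random_search(score_fn, n_trials=20, seed=0):
    """Sample hyperparameter configs at random, keep the best-scoring one."""
    rng = random.Random(seed)
    best_cfg, best_score = None, float("-inf")
    for _ in range(n_trials):
        cfg = {
            "lr": 10 ** rng.uniform(-4, -1),        # log-uniform learning rate
            "batch_size": rng.choice([32, 64, 128, 256]),
        }
        score = score_fn(cfg)
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score

# Stand-in objective: pretend the score peaks around lr = 1e-2.
def fake_score(cfg):
    return -abs(math.log10(cfg["lr"]) + 2)

cfg, score = random_search(fake_score)
print(cfg["lr"], score)
```

Sampling the learning rate log-uniformly matters more than the trial count here; uniform sampling wastes most trials at the top of the range.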
hacking7 6 months ago next
It's stated to work with any NN architecture, very promising! Going to give it a try on my Generative Adversarial Network (GAN) model for image generation.
hacking8 6 months ago next
I've successfully applied the algorithm to my GAN model; awesome results, I must say. It converges faster and generates more realistic images!
cloud9_er 6 months ago prev next
Any idea how this compares with established optimizers like SGD or Adam? Would love to see a direct comparison.
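For anyone wanting to run their own head-to-head, here's a toy harness that implements the two baselines (plain SGD and Adam updates, not the paper's algorithm) on a 1-D quadratic; swap in the new method once its interface is clear:

```python
import math

def minimize(update, x0=5.0, lr=0.1, steps=100):
    """Minimize f(x) = x^2 (gradient 2x); return the final x."""
    x, state = x0, {}
    for t in range(1, steps + 1):
        grad = 2 * x
        x = update(x, grad, lr, t, state)
    return x

def sgd(x, grad, lr, t, state):
    return x - lr * grad                       # vanilla gradient step

def adam(x, grad, lr, t, state, b1=0.9, b2=0.999, eps=1e-8):
    m = state.get("m", 0.0) * b1 + (1 - b1) * grad          # 1st moment
    v = state.get("v", 0.0) * b2 + (1 - b2) * grad * grad   # 2nd moment
    state["m"], state["v"] = m, v
    m_hat = m / (1 - b1 ** t)                  # bias correction
    v_hat = v / (1 - b2 ** t)
    return x - lr * m_hat / (math.sqrt(v_hat) + eps)

# Both should end near the minimum at 0.
print(abs(minimize(sgd)), abs(minimize(adam)))
```

On a real model you'd compare wall-clock time to a target validation metric rather than steps on a toy function, but the harness shape is the same.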
learner12 6 months ago next
My findings were similar. If anybody is seeing better results than I did, please share your configurations.
learner10 6 months ago prev next
Experimenting with this today, will report back with my results here. I'm looking forward to seeing how this impacts real-world use cases.
learner11 6 months ago next
In the case of NLP, I noticed a slight decrease in performance with this new algorithm compared to SGD.
supporter8 6 months ago next
That's a helpful data point; thanks for sharing your early results. Keep up the good work!
learner13 6 months ago prev next
Just reached 70% model training accuracy in half the time using the new approach! Looking forward to fine-tuning the model and seeing the benefits.
evalexper6 6 months ago prev next
Has this algorithm been tested with various NLP tasks such as sentiment analysis, NER, and question-answering?
evalexper7 6 months ago next
Yes, it performs well across all tested tasks, with only slight variation between them.
dev_ops9 6 months ago prev next
This could be a GPU-intensive method; curious if anyone has tried running it on a server with multiple GPUs?