50 points by datawhiz 6 months ago | 9 comments
mlwhiz 6 months ago
Fascinating! I've been working with ML text analysis and these new algorithms could really improve my models.
deeplearner 6 months ago
I also think the natural language processing (NLP) applications for these algorithms are incredible. They could even outperform Google's AI language understanding.
dataengineer 6 months ago
Just be careful when implementing. Some of these ML methods are not as interpretable as traditional statistical ones, and bias can easily creep in.
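One cheap way to watch for that kind of bias is to compare accuracy across subgroups of your evaluation data. A minimal sketch, assuming you already have true labels, predictions, and a group tag per example; the function name and the sample data below are made up for illustration and are not part of the library:

```python
from collections import defaultdict


def accuracy_by_group(y_true, y_pred, groups):
    """Return {group: accuracy} so gaps between groups are easy to spot."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(truth == pred)
    return {g: correct[g] / total[g] for g in total}


# Made-up labels, predictions, and group tags, purely for illustration.
y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 0, 1, 1, 0]
groups = ["a", "a", "a", "b", "b", "b"]

per_group = accuracy_by_group(y_true, y_pred, groups)
for group, acc in sorted(per_group.items()):
    print(f"group {group}: accuracy {acc:.2f}")
```

A large gap between groups doesn't prove the model is biased, but it tells you where to look first.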
pythonsage 6 months ago
How do they handle unstructured data? I'm dealing with large amounts of unstructured text and a good algorithm-based cleaning process would be game-changing.
mlwhiz 6 months ago
There are specific algorithms within the package that have built-in preprocessing functionality for unstructured data. Definitely worth checking those out.
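I don't have the package's preprocessing API in front of me, so here is a rough idea of what a basic cleaning pass for unstructured text usually looks like, using only the Python standard library. The `clean_text` helper and its exact steps are my own assumptions, not the package's functions:

```python
import html
import re
import unicodedata

_TAG_RE = re.compile(r"<[^>]+>")      # anything that looks like an HTML tag
_URL_RE = re.compile(r"https?://\S+")  # bare http/https links
_WS_RE = re.compile(r"\s+")           # runs of whitespace


def clean_text(raw: str) -> str:
    """Hypothetical cleaning pass: unescape, normalize, strip markup, lowercase."""
    text = html.unescape(raw)                   # &amp; -> &, &nbsp; -> U+00A0
    text = unicodedata.normalize("NFKC", text)  # fold compatibility characters
    text = _TAG_RE.sub(" ", text)               # drop HTML tags
    text = _URL_RE.sub(" ", text)               # drop URLs
    text = _WS_RE.sub(" ", text).strip()        # collapse whitespace
    return text.lower()


sample = "<p>Hello&nbsp;&amp; WORLD</p>  see https://example.com/x now"
print(clean_text(sample))
```

Because entities are unescaped before tags are stripped, escaped markup such as `&lt;b&gt;` gets removed too. Swap the order if you need to keep it.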
tensor_rocket 6 months ago
The true beauty of it is that once you have your training data cleaned up, these algorithms could make your feature engineering more consistent, saving tonnes of time IMO.
rossfan 6 months ago
Anyone know how computationally expensive these algorithms are? I'm on a modestly powered laptop, and some models take a long time to run.
pythonsage 6 months ago
The docs mention GPU support for many of them, which should help speed up computations. If you're still concerned about performance, you could always try a cloud-based Jupyter or Colab notebook with GPUs.
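Before moving to the cloud, it can help to measure how the runtime grows on a small slice of your data. A minimal timing sketch using only the standard library; `time_call` and `toy_workload` are hypothetical stand-ins, and you'd replace the toy function with whatever training or inference call you're actually running:

```python
import time


def time_call(fn, *args, repeats=3, **kwargs):
    """Return the best wall-clock time in seconds over `repeats` runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args, **kwargs)
        best = min(best, time.perf_counter() - start)
    return best


def toy_workload(n):
    """Stand-in for a model call; cost grows roughly linearly with n."""
    return sum(i * i for i in range(n))


small = time_call(toy_workload, 10_000)
large = time_call(toy_workload, 100_000)
print(f"n=10,000: {small:.4f}s, n=100,000: {large:.4f}s")
```

Timing two or three input sizes like this gives a rough sense of whether the full dataset is feasible on the laptop at all.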
dp_sniper 6 months ago
For those worried about the complexity of the algorithms, the library's introduction includes many demo notebooks and a quickstart guide. They're super useful for getting started.