32 points by ai_enthusiast 11 months ago | 13 comments
themachinelearner 11 months ago
[Opening Comment] Ask HN: What Is the Best Free and Open Source Software for Machine Learning and AI in 2022? I rely on TensorFlow & Keras to build and train a lot of my models. What else do you all use in your ML stacks?
ai_tech_hub 11 months ago
For working with pretrained, self-supervised models, I find Hugging Face's Transformers to be very powerful and easy to use. It pairs well with a decent GPU, which you can get fairly cheap these days.
themachinelearner 11 months ago
Thanks for the suggestions! I'm going to look more into Hugging Face and Scikit-learn. Are there any other libraries, specifically in computer vision, that you have found particularly useful?
artofdeeplearning 11 months ago
If you liked TensorFlow and Keras, definitely check out OpenCV, a computer vision library. It integrates nicely with TensorFlow for things like image processing and feature detection.
mljulian 11 months ago
I've been playing around with Pillow (the PIL fork) lately and it's very capable for image loading, resizing, cropping, and filtering. It's also helpful in situations where you don't need all the heavy machinery of OpenCV.
themachinelearner 11 months ago
Good to know I have options! I'll definitely look into Pillow and OpenCV. Isn't Pillow a bit easier to use than OpenCV, or is that just my imagination?
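ai_tech_hub 11 months ago
Pillow is the simpler of the two, but it is mainly an image-processing library: loading, resizing, cropping, and filtering, with few dependencies. OpenCV covers both low-level pixel operations and higher-level computer vision algorithms such as feature detection, so it's the better fit once you go beyond basic image manipulation.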
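A minimal sketch of the Pillow side of that comparison, assuming Pillow is installed (`pip install Pillow`). The helper name `make_thumbnail` and the synthetic red image are illustrative assumptions, not something from the thread:

```python
from PIL import Image


def make_thumbnail(img, max_side=32):
    """Return a grayscale copy of img with its longest side scaled to max_side."""
    scale = max_side / max(img.size)
    new_size = (
        max(1, round(img.width * scale)),
        max(1, round(img.height * scale)),
    )
    # resize() returns a new image; convert("L") turns it into 8-bit grayscale.
    return img.resize(new_size).convert("L")


# Synthetic 64x48 image so the example needs no file on disk.
source = Image.new("RGB", (64, 48), color=(255, 0, 0))
thumb = make_thumbnail(source)
print(thumb.size, thumb.mode)
```

Tasks like this stay comfortably within Pillow. Feature detection or camera calibration would be where OpenCV comes in.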
datalover19 11 months ago
I've been using scikit-learn, particularly for its gradient boosting, random forest, and SVM implementations. They recently released an update with some exciting additions to the library.
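For anyone who wants to try those three model families side by side, here is a rough sketch using scikit-learn's bundled iris dataset. The dataset choice, the hyperparameters, and the 5-fold setup are assumptions picked for brevity, not recommendations from the thread:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Small built-in dataset, so nothing has to be downloaded.
X, y = load_iris(return_X_y=True)

models = {
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "svm": SVC(kernel="rbf", C=1.0),
}

# Mean accuracy over 5 cross-validation folds for each model.
scores = {
    name: cross_val_score(model, X, y, cv=5).mean()
    for name, model in models.items()
}
for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```

All three share the same estimator interface, which is why swapping models is a one-line change.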
codingai 11 months ago
Are you all using any specific tools for model optimization? I've been using Hyperopt, a Python library for hyperparameter optimization, combined with scikit-learn. Works quite well!
deeplearningodyssey 11 months ago
I've used Optuna for some of my recent hyperparameter tuning tasks. I find it to be more flexible and faster than Hyperopt, especially with parallel execution.
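To make the idea concrete, here is a library-free sketch of the loop that tools like Hyperopt and Optuna automate. It is plain random search written with the standard library only, not either library's API, and both libraries offer smarter samplers (such as TPE) than this. The names `random_search` and `toy_loss` are hypothetical, and the objective is a stand-in for a real validation loss:

```python
import random


def random_search(objective, space, n_trials=500, seed=0):
    """Sample each parameter uniformly from its (low, high) range; keep the best."""
    rng = random.Random(seed)
    best_params, best_value = None, float("inf")
    for _ in range(n_trials):
        params = {
            name: rng.uniform(low, high) for name, (low, high) in space.items()
        }
        value = objective(params)
        if value < best_value:
            best_params, best_value = params, value
    return best_params, best_value


def toy_loss(params):
    """Hypothetical objective standing in for a validation loss (minimum at x=2, y=-1)."""
    return (params["x"] - 2.0) ** 2 + (params["y"] + 1.0) ** 2


search_space = {"x": (-5.0, 5.0), "y": (-5.0, 5.0)}
best_params, best_value = random_search(toy_loss, search_space)
print(best_params, best_value)
```

Each trial here is independent of the others, which is the property that makes parallel execution of trials straightforward.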
datascientist365 11 months ago
Many people also turn to distributed solutions such as Spark MLlib and Flink, often with Apache Hive for querying and preparing the data, when they are dealing with enormous datasets.
ai_fan 11 months ago
Would those larger-scale libraries help when using ML for natural language processing or would they mainly be useful in other types of ML such as regression or classification?
datascientist365 11 months ago
These distributed computing tools can help with both structured and unstructured data problems, so they apply to NLP as well as to regression or classification. They are mainly worth it for extremely large datasets, since the benefit comes from parallel processing.
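The partition, process, and merge pattern behind that parallelism can be sketched on a single machine with the standard library. This is only an analogy for the shape of the computation, not the Spark or Flink API, and Python threads will not give a real speedup on CPU-bound work. The function names and the sample documents are made up for illustration:

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def count_words(partition):
    """Map step: count tokens within one partition of documents."""
    counts = Counter()
    for doc in partition:
        counts.update(doc.lower().split())
    return counts


def parallel_word_count(docs, n_partitions=2):
    """Split docs into partitions, count each in a worker, then merge the results."""
    partitions = [docs[i::n_partitions] for i in range(n_partitions)]
    with ThreadPoolExecutor(max_workers=n_partitions) as pool:
        partial_counts = pool.map(count_words, partitions)
    total = Counter()
    for counts in partial_counts:
        # Reduce step: merge the per-partition counts.
        total.update(counts)
    return total


docs = ["spark handles big data", "flink handles streams", "big data big models"]
totals = parallel_word_count(docs)
print(totals["big"], totals["handles"])
```

Word counting is an unstructured-text task, which shows why the same pattern carries over to NLP preprocessing as well as to tabular workloads.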