156 points by ai_engineer 6 months ago | 16 comments
user1 6 months ago next
This is really interesting! I've been looking for a good solution for OCR and this seems to be it.
user3 6 months ago next
I was thinking the same thing! It would be awesome if this could handle handwritten text as well.
user2 6 months ago prev next
I'm curious, has anyone tried this with old handwritten texts? I have a bunch of letters from my grandparents that would be amazing to digitize.
user7 6 months ago next
There's a long-standing Python library called pytesseract, which is a wrapper around the Tesseract OCR engine. Tesseract is designed for printed text, so it tends to struggle with handwriting, but it might be worth a try if this new OCR solution doesn't work well for you.
user4 6 months ago prev next
I'm not a deep learning expert, but this looks like a well-designed implementation. Kudos to the creators!
user6 6 months ago next
From what I can tell, they've tested it with several non-Latin scripts and it seems to be working pretty well. Check out their GitHub repo for more details.
user5 6 months ago prev next
Does anyone know if this works well with non-Latin scripts? I'm from India and would love to be able to use this for OCR of Indian scripts.
user8 6 months ago prev next
I used to do OCR work a few years back and one thing I learned was that accuracy is heavily dependent on the quality of the image being scanned. Has anyone been able to test this with low-res images?
user9 6 months ago next
Yes, they've actually addressed this problem in their implementation by including several modules that enhance the quality of the scanned image. It's pretty impressive.
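I haven't checked exactly what their enhancement modules do, so treat this as a generic illustration rather than their method. One classic cleanup step for poor scans is binarization with Otsu's threshold, which picks the gray level that best separates ink from paper. A minimal pure-Python sketch, assuming the image is already a flat list of 8-bit grayscale values (the function names are my own, not from the repo):

```python
def otsu_threshold(pixels):
    """Return the gray level (0-255) that maximizes between-class variance."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg = 0      # number of pixels at or below the candidate threshold
    sum_bg = 0    # sum of gray levels of those pixels
    for t in range(256):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue          # nothing in the dark class yet
        w_fg = total - w_bg
        if w_fg == 0:
            break             # everything is in the dark class
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t


def binarize(pixels):
    """Map each pixel to 0 (dark) or 255 (light) using Otsu's threshold."""
    t = otsu_threshold(pixels)
    return [255 if p > t else 0 for p in pixels]
```

In practice you'd probably use an image library for this, and low-res scans usually also want upscaling and denoising before thresholding.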
user10 6 months ago prev next
I'm trying to wrap my head around why this implementation is so much better than other OCR solutions I've seen in the past. What's the secret sauce?
user11 6 months ago next
From what I understand, the key innovation is their use of convolutional neural networks for image preprocessing and the implementation of a novel mixture density network to generate more accurate probability estimates.
user12 6 months ago prev next
This is really exciting stuff! I'm looking forward to seeing how this evolves and what other innovative applications come out of it.
user13 6 months ago prev next
I'm definitely going to give this a try! I'm curious how it compares to Google's Cloud Vision API in terms of accuracy and cost. Has anyone done a side-by-side comparison?
user14 6 months ago next
I don't think there's a direct comparison, but the authors of this implementation have been benchmarking it against several popular OCR solutions. It seems to be doing pretty well.
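If you want to run your own side-by-side, the usual accuracy metric is character error rate (CER): the edit distance between the OCR output and a hand-typed reference, divided by the reference length. Here's a rough sketch; the sample strings and engine names are made up for illustration, and I don't know which metric the authors use in their own benchmarks.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b (row-by-row DP)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_error_rate(reference, hypothesis):
    """Edit distance normalized by reference length; lower is better."""
    if not reference:
        raise ValueError("reference text must be non-empty")
    return edit_distance(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Hypothetical data: a hand-typed ground truth and two invented OCR outputs.
    reference = "Dear Anna, the harvest was good."
    outputs = {
        "engine_a": "Dear Anna, the harvest was good.",
        "engine_b": "Dear Anma, the harvest was g0od.",
    }
    for name, text in outputs.items():
        print(name, round(character_error_rate(reference, text), 3))
```

Run each engine over the same set of scans, average the CER per engine, and compare that against what each one costs per page.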
user15 6 months ago prev next
Are there any instructions for installing and setting up this implementation? I'm new to deep learning and would really appreciate some guidance.
user16 6 months ago next
Absolutely, check out their GitHub repo for instructions on installation and setup. They've been very diligent about documenting their process, so you should have no trouble getting it up and running.