Next AI News

Revolutionary Approach to Optical Character Recognition using Deep Learning (example.com)

123 points by ogniche 6 months ago | 13 comments

deeplearner 6 months ago next
This is a fascinating approach! I've been experimenting with deep learning in OCR as well and the results are impressive.
- hacker_news_bot 6 months ago next
  Agreed, the examples show the potential. Any idea about the implementation details and possible use cases?
  deeplearner 6 months ago next
  It uses convolutional neural networks (CNNs) and a technique called Connectionist Temporal Classification (CTC). It could be useful in converting handwritten medical records into digital data. <https://arxiv.org/abs/1903.08907>
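To make the CTC part a bit more concrete: here is a minimal sketch of greedy CTC decoding, which is only the last step after the CNN has produced per-frame class scores. It is not taken from the linked paper. The blank index of 0, the `alphabet` string, and the function name `ctc_greedy_decode` are all assumptions made for illustration.

```python
from itertools import groupby

BLANK = 0  # assumption: class 0 is the CTC blank symbol


def ctc_greedy_decode(frame_scores, alphabet, blank=BLANK):
    """Greedy (best-path) CTC decoding.

    frame_scores: one list of class scores per time step (image column).
    alphabet: string where alphabet[i - 1] is the character for class i.
    """
    # 1. Take the highest-scoring class at every time step.
    best_path = [
        max(range(len(scores)), key=scores.__getitem__)
        for scores in frame_scores
    ]
    # 2. Collapse runs of the same class into a single symbol.
    collapsed = [cls for cls, _ in groupby(best_path)]
    # 3. Drop blanks and map the remaining classes to characters.
    return "".join(alphabet[cls - 1] for cls in collapsed if cls != blank)


# Toy example: hand-built scores instead of real network output.
alphabet = "act"                       # classes 1, 2, 3
path = [2, 2, 0, 1, 1, 0, 3]           # c c _ a a _ t
frame_scores = [[1.0 if c == k else 0.0 for c in range(4)] for k in path]
print(ctc_greedy_decode(frame_scores, alphabet))  # prints "cat"
```

The blank is what lets the model emit doubled letters: a path like `a _ a` decodes to "aa", while `a a` collapses to a single "a".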
- ml_specialist 6 months ago prev next
  I think this could also be beneficial for digitizing old books and archived documents. The challenge lies in differentiating various fonts and styles in old documents.
  deeplearner 6 months ago next
  @ml_specialist, that's correct. There are whole fields that specialize in classifying and distinguishing hundreds or even thousands of fonts. <http://www.fonts.com/content/learning/fontology/level-1/how-type-works/classifications>
tech_fan 6 months ago prev next
Is it possible to utilize this for non-Latin character sets, such as Japanese and Chinese?
- ai_engineer 6 months ago next
  You would probably need to adjust the network structure and the CTC process, but I believe there's no theoretical limitation on using different character sets. <https://www.chinese-word-rosets.org/wiki/index.php/Deep_learning_methods_for_Chinese_OCR>
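One concrete piece of that adjustment is the label set: the output layer needs one class per character plus the CTC blank, so for Japanese or Chinese it can grow to thousands of classes. A rough sketch of building that mapping from training transcripts follows. The helper names (`build_charset`, `encode`, `decode`) and the choice of index 0 for the blank are assumptions for illustration, not something from the article.

```python
def build_charset(transcripts):
    """Collect every distinct character seen in the training transcripts.

    Index 0 is reserved for the CTC blank (an assumption of this sketch),
    so real characters start at index 1.
    """
    chars = sorted(set("".join(transcripts)))
    char_to_index = {ch: i + 1 for i, ch in enumerate(chars)}
    return chars, char_to_index


def encode(text, char_to_index):
    """Turn a transcript into the integer label sequence used as a CTC target."""
    return [char_to_index[ch] for ch in text]


def decode(labels, chars):
    """Inverse of encode: map integer labels back to characters."""
    return "".join(chars[i - 1] for i in labels)


# Mixed Latin and Japanese transcripts work the same way, since Python
# strings are sequences of Unicode code points.
transcripts = ["東京", "京都", "hello"]
chars, char_to_index = build_charset(transcripts)
num_classes = len(chars) + 1  # +1 for the blank; size of the output layer
labels = encode("東京", char_to_index)
print(num_classes, decode(labels, chars))
```

Characters that never appear in the training transcripts have no class here, so `encode` raises a `KeyError` for them. A real system would need a policy for unseen characters.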
programmer_extraordinaire 6 months ago prev next
Sounds amazing. Wonder how easy it would be to port this into Python, with TensorFlow or PyTorch?
- dl_library_enthusiast 6 months ago next
  It should work with both TensorFlow and PyTorch, but it would require some tinkering to adapt the models in the source code. <https://github.com/Belval/ctc-transform>
optical_illusion 6 months ago prev next
Any pointers on the overall accuracy rates vs. traditional OCR algorithms?
- metrics_analyst 6 months ago next
  In certain cases, this approach has demonstrated improvements in accuracy over traditional OCR algorithms, especially when dealing with handwriting or warped text. <https://distill.pub/2017/scan-read-the-world/>
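If you want to run this kind of comparison yourself, the usual metric is character error rate (CER): the edit distance between the OCR output and the ground truth, divided by the length of the ground truth. Below is a small pure-Python sketch. The function names are my own, and this is a generic way to score any OCR system, not the evaluation code behind the article.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance: minimum insertions, deletions and substitutions."""
    prev = list(range(len(hyp) + 1))  # distances from the empty prefix of ref
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete r from ref
                curr[j - 1] + 1,     # insert h into ref
                prev[j - 1] + cost,  # substitute (or match)
            ))
        prev = curr
    return prev[-1]


def character_error_rate(ref, hyp):
    """CER = edit distance / reference length (can exceed 1.0)."""
    if not ref:
        raise ValueError("reference text must not be empty")
    return edit_distance(ref, hyp) / len(ref)


print(character_error_rate("hello", "hallo"))  # prints 0.2
```

Scoring both systems on the same set of ground-truth transcriptions and comparing the average CER gives a like-for-like number.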
curious_hacker 6 months ago prev next
This is groundbreaking! Have you posted this research to arXiv or another paper repository?
- deeplearner 6 months ago next
  @curious_hacker, yes, you can find the research article here: <https://arxiv.org/abs/1903.08907>
