Analysis of Methods for Classification and Aggregation of Textual Data From Images

Authors

Bohdan Popovych, Ganna Zavolodko

DOI:

https://doi.org/10.31861/sisiot2024.1.01008

Keywords:

text recognition, machine learning, data processing automation, multilingual texts, comparative analysis

Abstract

This study investigates modern methods for recognizing text in images, comparing optical character recognition (OCR) with intelligent character recognition (ICR). Machine learning techniques, including convolutional and recurrent neural networks, are compared on accuracy and efficiency in processing handwritten and printed text. The advantages and limitations of existing solutions for producing digital documents from images with varied handwriting styles or low-quality text are analyzed. Key challenges in processing multilingual texts are identified, and prospects for the further development of text recognition technologies are discussed.

Author Biographies

  • Bohdan Popovych, National Technical University "Kharkiv Polytechnic Institute"

    Bohdan, 23 years old, earned his bachelor's and master's degrees at NTU "KhPI." In 2023, he researched the use of AI in web applications for his master's thesis. In 2024, he joined ReMnemo, a high-potential startup that successfully completed an acceleration program, where he applies AI in various projects.

  • Ganna Zavolodko, National Technical University "Kharkiv Polytechnic Institute"

    Ganna, 46 years old, Ph.D., Associate Professor at NTU "KhPI," IEEE Senior Member; CEO and co-founder of ReMnemo.

Published

2024-08-30

Section

Articles

How to Cite

[1]
B. Popovych and G. Zavolodko, “Analysis of Methods for Classification and Aggregation of Textual Data From Images,” SISIOT, vol. 2, no. 1, p. 01008, Aug. 2024, doi: 10.31861/sisiot2024.1.01008.
