Technologies Overview for Typo Segregation
DOI:
https://doi.org/10.31861/sisiot2024.1.01009
Keywords:
typo, spelling error, typo detection methods, automation of typo correction
Abstract
This article focuses on the difference between typos (accidental mechanical errors) and spelling or conceptual errors that arise from insufficient knowledge of language rules. Modern typo detection methods are analyzed, with the advantages and disadvantages of each highlighted. The Levenshtein method is one of the most common algorithms for detecting and correcting errors in text. It effectively identifies and corrects errors in short words, where only a small number of operations is needed to convert the erroneous word into the correct one. However, it does not consider the context in which the word is used, which can lead to incorrect corrections. The keyboard layout-based method analyzes errors that are likely to occur because of the proximity of keys on the keyboard. It is simple to implement and to integrate into existing spell-checking systems, but it likewise ignores the context of word usage. The contextual analysis method uses contextual information to identify and correct errors in text; it requires significant computational resources and a large, diverse text corpus for effective model training. Deep models such as BERT or GPT consider the context of entire sentences or even larger blocks of text, which allows high accuracy in typo detection, but they require significant computational resources for training and inference as well as large volumes of high-quality training data. Machine learning methods such as n-grams and Bayesian classifiers show significant potential because of their simplicity and efficiency, but they may not capture complex dependencies between words and context, which reduces their accuracy. The study highlights the importance of accurate error detection in student assessment systems, where typos can affect final grades and the relevance of answers.
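To make the first two methods in the abstract concrete, the following is a minimal Python sketch, not the article's implementation. It computes the classic Levenshtein distance and then applies a simple keyboard-proximity check. The names `levenshtein`, `are_adjacent`, and `is_probable_slip` are hypothetical, the QWERTY layout is modeled as a plain grid that ignores the physical stagger of the rows, and the rule "one substituted character on a neighbouring key suggests a mechanical typo" is an illustrative assumption rather than a rule stated in the article.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn string a into string b."""
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute ca with cb (or match)
            ))
        previous = current
    return previous[-1]


# Assumption: QWERTY modeled as an unstaggered grid of letter keys only.
QWERTY_ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]
KEY_POSITION = {
    ch: (row_index, col_index)
    for row_index, row in enumerate(QWERTY_ROWS)
    for col_index, ch in enumerate(row)
}


def are_adjacent(x: str, y: str) -> bool:
    """True if two different letter keys touch on the simplified grid."""
    if x == y or x not in KEY_POSITION or y not in KEY_POSITION:
        return False
    (r1, c1), (r2, c2) = KEY_POSITION[x], KEY_POSITION[y]
    return abs(r1 - r2) <= 1 and abs(c1 - c2) <= 1


def is_probable_slip(written: str, expected: str) -> bool:
    """Hypothetical heuristic: exactly one substituted character, and the
    wrong key neighbours the intended one, suggests a mechanical typo."""
    if len(written) != len(expected):
        return False
    diffs = [(w, e) for w, e in zip(written.lower(), expected.lower()) if w != e]
    return len(diffs) == 1 and are_adjacent(*diffs[0])


print(levenshtein("kitten", "sitting"))      # 3
print(levenshtein("recieve", "receive"))     # 2
print(is_probable_slip("hwllo", "hello"))    # True: w is next to e
print(is_probable_slip("hallo", "hello"))    # False: a is not next to e
```

Neither function looks at the surrounding words, which is the limitation the abstract attributes to both methods: a small distance or a neighbouring key says nothing about whether the corrected word fits the sentence.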
License
Copyright (c) 2024 Security of Infocommunication Systems and Internet of Things
This work is licensed under a Creative Commons Attribution 4.0 International License.