OLS Digital Library
Automatic Text Error Detection and Correction Using Cross-Field Technique
Source
International Conference on Wireless Information Networks & Business Information System (WINBIS'10), Kathmandu, Nepal
Pages: 110-117
Year of Publication: 2010
ISSN: 2091-0266
Authors
Thakerng Wongsirichot and Sukgamon Sukpisit
Prince of Songkla University, Songkhla, Thailand
Sponsor: Open Learning Society (P) Ltd.
Abstract:
In quantitative analysis, data quality is regarded as one of the most important properties of a dataset. In real-world scenarios, a dataset file usually contains a number of text errors, and researchers have sought techniques, especially automatic ones, to reduce them. In our previous research, a technique combining simple clustering and similarity measures was selected for text error detection and correction; its overall performance reached 85%. This paper presents a cross-field analysis technique to improve text error detection and correction performance. With it, the overall performance reaches up to 98.98%.

Keywords: Text error detection; Text error correction; Misspelling words; Similarity measures; Cross-Field technique
References:
[1] K. Kukich, “Techniques for automatically correcting words in text,” ACM Computing Surveys, vol. 24, pp. 377–439, December 1992.
[2] W. Wong, W. Liu, and M. Bennamoun, “Integrated scoring for spelling error correction, abbreviation expansion and case restoration in dirty text,” Proceeding Fifth Australasia Data Mining Conference (AusDM2006), pp. 83–89, November 2006.
[3] S. W. Chan, B. He, and I. Ounis, “An in-depth study of the automatic detection and correction of spelling mistakes,” 5th Dutch-Belgian Information Retrieval Workshop, vol. 5, pp. 71–82, January 2005.
[4] G. Navarro, “A guided tour to approximate string matching,” ACM Computing Surveys, vol. 33, pp. 31–88, March 2001.
[5] D. Lin, “An information-theoretic definition of similarity,” Proceeding of the Fifteenth International Conference on Machine Learning, pp. 296–304, July 1998.
[6] J. S. Simonoff, Analyzing Categorical Data. NY: New York University, 2003.
[7] T. Wongsirichot, N. Singhakosit, and P. Chootiraka, “An automated error detection and correction tool for enhancing data preprocessing efficiency,” Proceeding 2008