Open Learning Society
  OLS Digital Library
  Automatic Text Error Detection and Correction Using Cross-Field Technique
  Source : International Conference on Wireless Information Networks & Business Information System (WINBIS'10)
    Kathmandu, Nepal
    Pages : 110-117
    Year of Publication : 2010
    ISSN : 2091-0266
  Authors : Thakerng Wongsirichot and Sukgamon Sukpisit
   

Prince of Songkla University, Songkhla, Thailand

     
  Sponsor : Open Learning Society (P) Ltd.
  Abstract :  
   

 In quantitative analysis, data quality is regarded as one of the most important properties of a dataset. In real-world scenarios, a dataset file usually contains a number of text errors, and researchers seek techniques, especially automatic ones, to reduce them. In our previous research, simple clustering and similarity measures were used for text error detection and correction, achieving an overall performance of 85%. This paper presents a cross-field analysis technique that improves text error detection and correction performance. The overall performance reaches up to 98.98%.

 Keywords : Text error detection; Text error correction; Misspelling words; Similarity measures; Cross-Field technique
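
 The abstract does not spell out how the cross-field technique works. As a rough, non-authoritative illustration of the general idea (using the value of one field to narrow the candidate corrections for another field, then choosing the closest candidate by a similarity measure), the Python sketch below may help. The reference table `VALID_CITIES`, the function names, the use of `difflib.SequenceMatcher` as the similarity measure, and the 0.7 threshold are all assumptions made for this example and are not taken from the paper.

```python
from difflib import SequenceMatcher

# Hypothetical reference data (illustrative only, not from the paper):
# valid values of the "city" field, grouped by the related "country" field.
VALID_CITIES = {
    "Thailand": ["Songkhla", "Bangkok", "Phuket"],
    "Nepal": ["Kathmandu", "Pokhara", "Lalitpur"],
}


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity ratio in [0, 1] (assumed measure)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def correct_city(city: str, country: str, threshold: float = 0.7) -> str:
    """Return the closest valid city for the given country.

    The country field restricts the candidate list (the cross-field step).
    If the country is unknown, every known city is considered instead.
    The input is returned unchanged when no candidate reaches the threshold.
    """
    candidates = VALID_CITIES.get(country)
    if candidates is None:
        candidates = [c for cities in VALID_CITIES.values() for c in cities]
    best = max(candidates, key=lambda c: similarity(city, c))
    return best if similarity(city, best) >= threshold else city


if __name__ == "__main__":
    print(correct_city("Songkla", "Thailand"))
    print(correct_city("Katmandu", "Nepal"))
```

 Restricting the candidates by a related field keeps the similarity search small and avoids matching a misspelled value against entries that are implausible given the rest of the record.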

  References :  
   
  1. K. Kukich, “Techniques for automatically correcting words in text,” ACM Computing Surveys, vol. 24, pp. 377–439, December 1992.

  2. W. Wong, W. Liu, and M. Bennamoun, “Integrated scoring for spelling error correction, abbreviation expansion and case restoration in dirty text,” Proceedings of the Fifth Australasian Data Mining Conference (AusDM2006), pp. 83–89, November 2006.

  3. S. W. Chan, B. He, and I. Ounis, “An in-depth study of the automatic detection and correction of spelling mistakes,” 5th Dutch-Belgian Information Retrieval Workshop, vol. 5, pp. 71–82, January 2005.

  4. G. Navarro, “A guided tour to approximate string matching,” ACM Computing Surveys, vol. 33, pp. 31–88, March 2001.

  5. D. Lin, “An information-theoretic definition of similarity,” Proceedings of the Fifteenth International Conference on Machine Learning, pp. 296–304, July 1998.

  6. J. S. Simonoff, Analyzing Categorical Data. NY: New York University, 2003.

  7. T. Wongsirichot, N. Singhakosit, and P. Chootiraka, “An automated error detection and correction tool for enhancing data preprocessing efficiency,” Proceeding 2008

     
     
     
     
© Copyright 2011 Open Learning Society – All Rights Reserved