As the digitization of historical documents, such as newspapers, becomes more common, the need of archive patrons for accurate digital text from those documents increases. Building on our earlier work, the contributions of this paper are: 1. demonstrating the applicability of novel methods for correcting optical character recognition (OCR) on disparate data sets, including a new synthetic training set, 2. enhancing the correction algorithm with novel features, and 3. assessing the data requirements of the correction learning method. First, we correct errors using conditional random fields (CRF) trained on synthetic training data sets in order to demonstrate the applicability of the methodology to unrelated test sets. Second, we show the strength of lexical features from the training sets on two unrelated test sets, yielding a relative reduction in word error rate (WER) on the test sets of 6.52%. New features capture the recurrence of hypothesis tokens and yield an additional relative reduction in WER of 2.30%. Further, we show that only 2.0% of the full training corpus of over 500,000 feature cases is needed to achieve correction results comparable to those using the entire training corpus, effectively reducing both the complexity of the training process and the learned correction model.

Optical character recognition (OCR) is the electronic transformation of images into computer-encoded text. OCR systems often produce poor accuracy for noisy images. Ensemble recognition techniques are used to improve OCR accuracy. The idea of ensemble recognition techniques is to produce N versions of an input image. These versions are similar but not identical. They are passed through the OCR engine to produce different OCR outputs, from which the best one is later selected. Existing ensemble techniques need to be more effective at reducing the OCR error rate. This research proposes an enhanced ensemble technique to overcome the drawbacks of existing techniques.
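The ensemble idea described above (produce N versions of an image, run each through the OCR engine, then pick the best output) can be sketched roughly as follows. This is only an illustration, not the method from the abstract: the abstract does not say how the best output is chosen, so the consensus rule here (keep the output most similar to all the others) is an assumption, and `ensemble_ocr`, `select_consensus`, `variants`, and `ocr` are hypothetical names. The OCR engine is passed in as a plain callable so that no particular library is implied.

```python
from difflib import SequenceMatcher
from typing import Any, Callable, Sequence


def select_consensus(outputs: Sequence[str]) -> str:
    """Return the output that agrees most with the other outputs.

    Assumption: "best" means highest total string similarity to the
    rest of the ensemble. The source does not specify the criterion.
    """
    if not outputs:
        raise ValueError("need at least one OCR output")

    def agreement(i: int) -> float:
        # Sum of similarity ratios between output i and every other output.
        return sum(
            SequenceMatcher(None, outputs[i], other).ratio()
            for j, other in enumerate(outputs)
            if j != i
        )

    best_index = max(range(len(outputs)), key=agreement)
    return outputs[best_index]


def ensemble_ocr(
    image: Any,
    variants: Sequence[Callable[[Any], Any]],
    ocr: Callable[[Any], str],
) -> str:
    """Run a (hypothetical) OCR callable on N versions of one image.

    `variants` are functions that each return a slightly different
    version of the image (for example, different binarization or
    denoising settings); `ocr` maps an image to recognized text.
    """
    outputs = [ocr(make_version(image)) for make_version in variants]
    return select_consensus(outputs)
```

With three outputs such as "The quick brown fox", "The quick brown fox", and "Tbe qu1ck brovvn f0x", the consensus rule keeps the first reading, since the two clean outputs agree with each other more than either agrees with the noisy one.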