Using a Haar wavelet transform, principal component analysis and neural networks for OCR in the presence of impulse noise

Vladimir Grigorievich Spitsyn, Yuliya Alexandrovna Bolotova, Ngoc Hoang Phan, Thi Thu Trang Bui

Research output: Contribution to journal › Article

17 Citations (Scopus)

Abstract

In this paper we propose a novel algorithm for optical character recognition in the presence of impulse noise that combines a wavelet transform, principal component analysis, and neural networks. In the proposed algorithm, the Haar wavelet transform is used to isolate low-frequency components, eliminate noise, and extract features. Principal component analysis is then used to reduce the dimension of the extracted features. A set of multi-layer neural networks serves as the classifiers, and the inputs to each network are the reduced set of features. One of the key features of the proposed approach is that a separate neural network is created for each type of character. The experimental results show that the proposed algorithm can effectively recognize characters in images in the presence of impulse noise; the results are comparable with those of ABBYY FineReader and Tesseract OCR.
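
The first two stages described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`haar_ll`, `pca_fit`, `pca_transform`), the single-level default, and the use of the orthonormal Haar scaling are assumptions made for illustration, and the per-character neural network classifiers are left out.

```python
import numpy as np


def haar_ll(image, levels=1):
    """Return the low-frequency (LL) subband after `levels` 2-D Haar steps.

    Hypothetical helper: keeping only LL discards the detail subbands,
    where isolated impulse pixels contribute most of their energy.
    """
    a = np.asarray(image, dtype=float)
    for _ in range(levels):
        h, w = a.shape
        if h % 2 or w % 2:
            raise ValueError("image sides must be even at every level")
        # Orthonormal 2-D Haar low-pass: sum of each 2x2 block divided by 2.
        a = (a[0::2, 0::2] + a[0::2, 1::2] + a[1::2, 0::2] + a[1::2, 1::2]) / 2.0
    return a


def pca_fit(features, n_components):
    """Fit PCA on row-wise feature vectors; return (mean, components)."""
    x = np.asarray(features, dtype=float)
    mean = x.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(x - mean, full_matrices=False)
    return mean, vt[:n_components]


def pca_transform(features, mean, components):
    """Project feature vectors onto the fitted principal directions."""
    return (np.asarray(features, dtype=float) - mean) @ components.T
```

In such a sketch, each character image would be passed through `haar_ll`, flattened, and projected with `pca_transform`; the resulting low-dimensional vectors would then be fed to the classifiers, one network per character type as the abstract describes.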

Original language: English
Pages (from-to): 249-257
Number of pages: 9
Journal: Computer Optics
Volume: 40
Issue number: 2
DOI: 10.18287/2412-6179-2016-40-2-249-257
Publication status: Published - 1 Mar 2016

Fingerprint

Optical character recognition
Impulse noise
principal components analysis
wavelet analysis
Principal component analysis
Wavelet transforms
impulses
Neural networks
character recognition
Multilayer neural networks
classifiers
noise reduction
pattern recognition
Feature extraction
Classifiers
low frequencies

Keywords

  • Neural networks
  • Optical character recognition
  • Principal component analysis
  • Wavelet transform

ASJC Scopus subject areas

  • Atomic and Molecular Physics, and Optics
  • Electrical and Electronic Engineering
  • Computer Science Applications

Cite this

Using a Haar wavelet transform, principal component analysis and neural networks for OCR in the presence of impulse noise. / Spitsyn, Vladimir Grigorievich; Bolotova, Yuliya Alexandrovna; Phan, Ngoc Hoang; Bui, Thi Thu Trang.

In: Computer Optics, Vol. 40, No. 2, 01.03.2016, p. 249-257.

Spitsyn, Vladimir Grigorievich ; Bolotova, Yuliya Alexandrovna ; Phan, Ngoc Hoang ; Bui, Thi Thu Trang. / Using a Haar wavelet transform, principal component analysis and neural networks for OCR in the presence of impulse noise. In: Computer Optics. 2016 ; Vol. 40, No. 2. pp. 249-257.
@article{d5df607b681542f689554d7f61642342,
title = "Using a {H}aar wavelet transform, principal component analysis and neural networks for OCR in the presence of impulse noise",
abstract = "In this paper we propose a novel algorithm for optical character recognition in the presence of impulse noise by applying a wavelet transform, principal component analysis, and neural networks. In the proposed algorithm, the Haar wavelet transform is used for low frequency components allocation, noise elimination and feature extraction. The principal component analysis is used to reduce the dimension of the extracted features. We use a set of different multi-layer neural networks as classifiers for each character; the inputs are represented by a reduced set of features. One of the key features of the proposed approach is creating a separate neural network for each type of character. The experimental results show that the proposed algorithm can effectively recognize the characters in images in the presence of impulse noise; the results are comparable with ABBYY FineReader and Tesseract OCR.",
keywords = "Neural networks, Optical character recognition, Principal component analysis, Wavelet transform",
author = "Spitsyn, {Vladimir Grigorievich} and Bolotova, {Yuliya Alexandrovna} and Phan, {Ngoc Hoang} and Bui, {Thi Thu Trang}",
year = "2016",
month = "3",
day = "1",
doi = "10.18287/2412-6179-2016-40-2-249-257",
language = "English",
volume = "40",
pages = "249--257",
journal = "Computer Optics",
issn = "0134-2452",
publisher = "Institution of Russian Academy of Sciences, Image Processing Systems Institute of RAS",
number = "2",

}

TY - JOUR
T1 - Using a Haar wavelet transform, principal component analysis and neural networks for OCR in the presence of impulse noise
AU - Spitsyn, Vladimir Grigorievich
AU - Bolotova, Yuliya Alexandrovna
AU - Phan, Ngoc Hoang
AU - Bui, Thi Thu Trang
PY - 2016/3/1
Y1 - 2016/3/1
AB - In this paper we propose a novel algorithm for optical character recognition in the presence of impulse noise by applying a wavelet transform, principal component analysis, and neural networks. In the proposed algorithm, the Haar wavelet transform is used for low frequency components allocation, noise elimination and feature extraction. The principal component analysis is used to reduce the dimension of the extracted features. We use a set of different multi-layer neural networks as classifiers for each character; the inputs are represented by a reduced set of features. One of the key features of the proposed approach is creating a separate neural network for each type of character. The experimental results show that the proposed algorithm can effectively recognize the characters in images in the presence of impulse noise; the results are comparable with ABBYY FineReader and Tesseract OCR.
KW - Neural networks
KW - Optical character recognition
KW - Principal component analysis
KW - Wavelet transform
UR - http://www.scopus.com/inward/record.url?scp=84967025622&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84967025622&partnerID=8YFLogxK
U2 - 10.18287/2412-6179-2016-40-2-249-257
DO - 10.18287/2412-6179-2016-40-2-249-257
M3 - Article
VL - 40
SP - 249
EP - 257
JO - Computer Optics
JF - Computer Optics
SN - 0134-2452
IS - 2
ER -