Project Seminar Dibakar

  • November 2019

TITLE: CHARACTER RECOGNITION

PLATFORM
  • Operating System: Windows XP/2000
  • Language: Visual Basic .NET

PROJECTEES
  • BOUDHAYAN MAITY
  • BRATIN BISWAS
  • DIBAKAR SINHA
  • GURUSHARAN SINGH
  • HIMANGSHU HAZARIKA
  • NITIN TOMAR
  • PARAG DHAR BARUAH
  • YASSER HAZARIKA

GUIDED BY: MR. AMOL G. MULEY.

INTRODUCTION
Character recognition is one of the earliest applications of artificial neural networks, which partially emulate human thinking in the domain of artificial intelligence.

WHY VB.NET
  • It is a fast and easy way to create applications for Microsoft Windows.
  • It provides a complete set of tools to simplify rapid application development.
  • It avoids writing numerous lines of code to describe the appearance and location of interface elements.
  • It provides a graphical environment in which you visually design the forms and controls.

PROBLEM DEFINITION
  • The same character differs in size, shape and style from person to person, and even from time to time for the same person.
  • Like any image, visual characters are subject to degradation by noise.
  • There is no hard-and-fast rule that defines the appearance of a visual character. Hence the rules need to be deduced heuristically from samples.

The human visual system excels in the following respects:
  • The human brain adapts to minor changes and errors in a visual pattern. Thus we are able to read the handwriting of many people despite their different styles of writing.
  • The human visual system learns from experience. Thus we are able to grasp newer styles and scripts easily.
  • The human visual system is immune to most variations in the size, aspect ratio, location and orientation of visual characters.

PROBLEM SOLUTION
We solve the problem using an artificial neural network (ANN), which has the following capabilities:
  • It adapts to a changing environment.
  • It learns from prior experience.
  • It makes use of image digitization, a learning mechanism and the employed network architecture.

Tasks Involved
  • Segmentation: given an input image, identify the individual glyphs.
  • Feature extraction: from each glyph image, extract the features to be used as input to the ANN. This is the most critical part of the approach.
  • Classification: train the ANN using training samples. Then, given a new glyph, classify it.

Single Discrete Perceptron Training Algorithm (SDPTA)
  • We begin by examining neural network classifiers that derive their weights during the learning cycle.
  • The sample pattern vectors X1, X2, ..., XP, called the training sequence, are presented to the machine along with the correct responses.
  • The algorithm is based on the perceptron learning rule seen earlier.

Given are P training pairs {(X1, d1), (X2, d2), ..., (XP, dP)}, where Xi is (n x 1), di is (1 x 1), i = 1, 2, ..., P.
Yi is the augmented input pattern (obtained by appending 1 to the input vector Xi), i = 1, 2, ..., P.
In the following, k denotes the training step and p denotes the step counter within the training cycle.

Step 1: c > 0 is chosen.
Step 2: The weights w are initialized at small values; w is (n+1) x 1. The counters and the error are initialized: k = 1, p = 1, E = 0.
Step 3: The training cycle begins here. The input is presented and the output is computed: Y = Yp, d = dp, o = sgn(w^T Y).
Step 4: The weights are updated: w = w + (1/2) c (d - o) Y.
Step 5: The cycle error is computed: E = (1/2)(d - o)^2 + E.
Step 6: If p < P, then p = p + 1, k = k + 1, and go to Step 3; otherwise go to Step 7.
Step 7: The training cycle is completed. If E = 0, terminate the training session and output the weights and k. If E > 0, then E = 0, p = 1, and enter the new training cycle by going to Step 3.
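The training loop above can be sketched in a few lines. This is only an illustrative sketch in Python rather than the project's VB.NET code; the names `sgn`, `train_sdpta` and `classify`, the all-zero initial weights, the choice sgn(0) = +1 and the `max_cycles` safety limit are assumptions made for the example, not details taken from the project.

```python
def sgn(x):
    # Bipolar hard limiter; treating sgn(0) as +1 is an assumption.
    return 1 if x >= 0 else -1


def classify(w, x):
    # Augment the input with a trailing 1 and apply the perceptron.
    y = list(x) + [1.0]
    return sgn(sum(wi * yi for wi, yi in zip(w, y)))


def train_sdpta(patterns, targets, c=1.0, max_cycles=100):
    """Single discrete perceptron training on bipolar targets (+1 / -1)."""
    n = len(patterns[0])
    w = [0.0] * (n + 1)              # Step 2: weights start small (here: zero)
    k = 0                            # number of training steps performed
    for _ in range(max_cycles):      # hypothetical guard against endless loops
        E = 0.0                      # cycle error
        for x, d in zip(patterns, targets):
            y = list(x) + [1.0]      # augmented pattern Y
            o = sgn(sum(wi * yi for wi, yi in zip(w, y)))                 # Step 3
            w = [wi + 0.5 * c * (d - o) * yi for wi, yi in zip(w, y)]    # Step 4
            E += 0.5 * (d - o) ** 2                                       # Step 5
            k += 1
        if E == 0:                   # Step 7: error-free cycle, stop
            return w, k
    raise RuntimeError("no error-free cycle within max_cycles")


# Usage: learn the bipolar AND function, which is linearly separable.
patterns = [(-1, -1), (-1, 1), (1, -1), (1, 1)]
targets = [-1, -1, -1, 1]
weights, steps = train_sdpta(patterns, targets)
```

For data that is not linearly separable the cycle error never reaches zero, which is why the sketch stops after `max_cycles` instead of looping forever.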

IMAGE DETECTION
The procedure for analyzing images to detect characters consists of two steps:
  • Determining character lines
  • Detecting individual symbols

DETERMINING CHARACTER LINES Enumeration of character lines in a character image (‘page’) is essential in delimiting the bounds within which the detection can proceed. Thus detecting the next character in an image does not necessarily involve scanning the whole image all over again.
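One common way to enumerate character lines is to look for runs of pixel rows that contain at least one black pixel. The sketch below is in Python rather than the project's VB.NET, and it assumes the page is already a binary matrix (list of rows, 1 = black); the function name `find_text_lines` is hypothetical, and the project's actual line-detection method is not described in these slides.

```python
def find_text_lines(image):
    """Return inclusive (top, bottom) row indices of each character line.

    `image` is a list of rows; each row is a list of 0/1 values (1 = black).
    """
    lines = []
    top = None                       # top row of the line being tracked
    for y, row in enumerate(image):
        has_black = any(row)
        if has_black and top is None:
            top = y                  # a new line starts here
        elif not has_black and top is not None:
            lines.append((top, y - 1))   # blank row closes the line
            top = None
    if top is not None:              # line touching the bottom edge
        lines.append((top, len(image) - 1))
    return lines


page = [
    [0, 0, 0],
    [0, 1, 0],
    [1, 1, 0],
    [0, 0, 0],
    [0, 0, 1],
]
line_bounds = find_text_lines(page)
```

Once the line bounds are known, later character searches can be restricted to those rows instead of rescanning the whole page.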

DETECTING INDIVIDUAL SYMBOLS Detection of individual symbols involves scanning character lines for orthogonally separable images composed of black pixels.
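Within a single character line, symbols that are separated by blank pixel columns can be found with a left-to-right column scan. The following Python sketch (not the project's VB.NET code) assumes a binary matrix with 1 = black and known line bounds; `find_symbols` is a hypothetical name, and touching or overlapping characters are not handled.

```python
def find_symbols(image, top, bottom):
    """Return inclusive (left, right) column spans of the symbols in one line.

    `image` is a list of rows of 0/1 values (1 = black); `top` and `bottom`
    are the inclusive row bounds of the character line.
    """
    spans = []
    left = None                      # left edge of the symbol being tracked
    width = len(image[0])
    for x in range(width):
        has_black = any(image[y][x] for y in range(top, bottom + 1))
        if has_black and left is None:
            left = x                 # a new symbol starts here
        elif not has_black and left is not None:
            spans.append((left, x - 1))  # blank column closes the symbol
            left = None
    if left is not None:             # symbol touching the right edge
        spans.append((left, width - 1))
    return spans


line = [
    [1, 1, 0, 0, 1],
    [1, 0, 0, 1, 1],
]
symbol_spans = find_symbols(line, 0, 1)
```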

Line and Character boundary detection The detected character bound might not be the actual bound for the character. This issue arises with the height and bottom alignment irregularity that exists with printed alphabetic symbols. Thus a line top does not necessarily mean top of all characters and a line bottom might not mean bottom of all characters as well. Hence a confirmation of top and bottom for the character is needed.

An optional confirmation algorithm implemented in the project is:
  • Start at the top of the current line and the left of the character.
  • Scan up to the right of the character:
    – if a black pixel is detected, register y as the confirmed top
    – if not, continue to the next pixel
    – if no black pixel is found, increment y and reset x to scan the next horizontal line
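The confirmation scan above can be sketched as follows. This is a Python illustration, not the project's VB.NET routine; the function name `confirm_top`, the binary-matrix representation (1 = black) and the choice to return `None` when no black pixel is found are assumptions.

```python
def confirm_top(image, line_top, line_bottom, left, right):
    """Return the first row within the line that has a black pixel
    between columns `left` and `right` (all bounds inclusive)."""
    for y in range(line_top, line_bottom + 1):   # move down one row at a time
        for x in range(left, right + 1):         # scan left to right
            if image[y][x]:
                return y                         # confirmed top of the character
    return None                                  # no black pixel in the region


line = [
    [1, 0, 0],
    [1, 0, 1],
    [1, 1, 1],
]
# The tall character in column 0 sets the line top to row 0, but the
# shorter character in columns 1..2 actually starts at row 1.
top_of_second = confirm_top(line, 0, 2, 1, 2)
```

The confirmed bottom can be found the same way by scanning rows from `line_bottom` upward.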

IMAGE DIGITIZATION
  • Need for it
  • Way of doing it

Need for Image Digitization
  • An image may contain pictures and colors that provide no useful information for character recognition.
  • Characters that need to be analyzed individually may exist as word clusters or may be located at various points in the document.

Way of Image Digitization
  • The input image is sampled into a binary window, which forms the input to the recognition system.
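One possible way to sample a character image into a fixed-size binary window is sketched below in Python (not the project's VB.NET). The grayscale input range 0..255, the threshold of 128, the rule that a cell becomes 1 if any pixel in it is dark, and the name `sample_to_window` are all assumptions for illustration; the slides do not specify how the project performs the sampling.

```python
def sample_to_window(gray, rows, cols, threshold=128):
    """Downsample a grayscale image to a rows x cols matrix of 0/1 values.

    `gray` is a list of rows of intensities (0 = black, 255 = white).
    A window cell is 1 when any source pixel it covers is darker than
    `threshold`.
    """
    height, width = len(gray), len(gray[0])
    window = []
    for r in range(rows):
        y0 = r * height // rows
        y1 = max(y0 + 1, (r + 1) * height // rows)   # cover at least one row
        out_row = []
        for c in range(cols):
            x0 = c * width // cols
            x1 = max(x0 + 1, (c + 1) * width // cols)  # at least one column
            dark = any(
                gray[y][x] < threshold
                for y in range(y0, y1)
                for x in range(x0, x1)
            )
            out_row.append(1 if dark else 0)
        window.append(out_row)
    return window


gray = [
    [255, 255, 0, 0],
    [255, 255, 0, 0],
    [255, 255, 255, 255],
    [0, 255, 255, 255],
]
window = sample_to_window(gray, 2, 2)
```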

Algorithm of Digitization
  • To feed the matrix data to the network (whose input is one-dimensional), the matrix must first be linearized into a single dimension. This is accomplished with a simple routine that uses the following algorithm:
  • Start with the first matrix element (0, 0).
  • Increment x, keeping y constant, up to the matrix width:
    – map each element to an element of a linear array (incrementing the array index)
    – if the matrix width is reached, reset x and increment y
  • Repeat up to the matrix height, (x, y) = (width, height).
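The linearization routine above amounts to a row-by-row copy into a one-dimensional array. A minimal Python sketch follows (the project itself was written in VB.NET); the name `linearize` and the list-of-rows matrix representation are assumptions.

```python
def linearize(matrix):
    """Map a height x width matrix to a linear array, row by row."""
    height = len(matrix)
    width = len(matrix[0])
    linear = [0] * (width * height)
    index = 0                        # position in the linear array
    for y in range(height):          # increment y once a row is finished
        for x in range(width):       # increment x while y stays constant
            linear[index] = matrix[y][x]
            index += 1
    return linear


glyph = [
    [1, 0],
    [0, 1],
    [1, 1],
]
inputs = linearize(glyph)
```

With this ordering, the element at (x, y) always lands at index y * width + x, so the network sees every glyph in the same layout.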

MATRIX MAPPING

TRAINING • Those patterns will be used for teaching the neural network to recognize the images. Basically, each training pattern consists of two single-dimensional arrays of float numbers – Inputs and Outputs arrays. The Inputs array contains your input data. In our case it is a digitized representation of the character's image. Output array will

Network Architecture
  • In this system the candidate pattern is the input.
  • Block M provides the input matrix M to the weight blocks Wk for each k.


Acknowledgement
We are grateful to our lecturers, and especially to our guide Mr. AMOL G. MULEY, who helped us in undertaking this project. Moreover, if we have inadvertently made any mistakes, we would highly appreciate it if our lecturers would rectify them.

