Proposal for “Optical Character Recognition”
By Isha Govind, Roll no. 47, Div. D
OBJECTIVE To study developments in optical character recognition methods.
INTRODUCTION Advances in scientific techniques continue to extend what technology can do in many fields. One such field is character recognition, commonly known as OCR (Optical Character Recognition). There is strong demand both for digitizing printed documents and for recording information directly in digital form, and a gap remains in this area even today. OCR techniques, and their continuous improvement, aim to fill this gap. Optical character recognition is a scheme that enables a computer to learn and interpret printed characters in its own internal representation and to present them in the form specified by the user. It uses image processing techniques to identify characters printed by a computer or typewriter. A lot of work has been done in this field, but OCR techniques are still being improved, because an algorithm should have higher recognition accuracy, greater consistency in making correct predictions, and reduced execution time.
PURPOSE OF RESEARCH The idea is to devise efficient algorithms that take input in digital image format. The image is first processed to allow better comparison. The processed image is then compared with an already available set of font images. The last step outputs the text in the digital image in an editable form that can be reused later. The system provides a user module in which only authorized, registered users can access the DMS (Document Management System) and use the OCR conversion technology. This provides security and authentication for the system. It also includes an admin module in which the admin monitors all tasks performed by users and has the power to remove any user found to be unauthorized. If a new user wants to access or download a file in the DMS, a request is sent to the admin, who checks the user's record and decides whether access to that file should be granted.
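The user and admin modules described above can be illustrated with a minimal in-memory sketch. It is only an assumption about how such a workflow might look, not the proposed system's design: the class `DocumentStore`, its method names, and the sample user and file names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DocumentStore:
    """Hypothetical stand-in for the DMS access rules; names are illustrative."""
    admin: str
    registered: set = field(default_factory=set)   # registered user names
    granted: set = field(default_factory=set)      # approved (user, file) pairs
    pending: list = field(default_factory=list)    # (user, file) pairs awaiting review

    def register(self, user):
        self.registered.add(user)

    def request_access(self, user, filename):
        # Only registered users may ask for a file; the request waits for the admin.
        if user not in self.registered:
            raise PermissionError(f"{user} is not a registered user")
        pair = (user, filename)
        if pair not in self.granted and pair not in self.pending:
            self.pending.append(pair)

    def review(self, actor, user, filename, approve):
        # Only the admin decides whether a pending request is granted.
        if actor != self.admin:
            raise PermissionError("only the admin can review requests")
        self.pending.remove((user, filename))
        if approve:
            self.granted.add((user, filename))

    def remove_user(self, actor, user):
        # The admin can discard a user and everything granted to or requested by them.
        if actor != self.admin:
            raise PermissionError("only the admin can remove users")
        self.registered.discard(user)
        self.granted = {pair for pair in self.granted if pair[0] != user}
        self.pending = [pair for pair in self.pending if pair[0] != user]

    def can_download(self, user, filename):
        return user in self.registered and (user, filename) in self.granted


dms = DocumentStore(admin="admin")
dms.register("asha")
dms.request_access("asha", "scan_001.png")
dms.review("admin", "asha", "scan_001.png", approve=True)
print(dms.can_download("asha", "scan_001.png"))  # True
```

A real system would persist this state and authenticate users before any of these calls; the sketch only shows the request, review and revoke sequence.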
RESEARCH METHODOLOGY The research methodology involves five steps:
a) Preprocessing of the image
b) Indexing and boxing of characters
c) Cropping, reshaping and resizing
d) Training and testing the network
e) Identification of characters
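The five steps can be sketched on a toy example in plain Python. This is a simplified illustration under stated assumptions, not the proposed implementation: step (d) is replaced by storing one normalized template per character and step (e) by nearest-template matching, in place of a trained network. All function names, the 5 x 5 glyph size and the three-letter sample font are hypothetical.

```python
def from_art(rows):
    """Helper for the demo: '#' becomes a dark pixel (0), anything else white (255)."""
    return [[0 if ch == "#" else 255 for ch in row] for row in rows]

def binarize(gray, threshold=128):
    """Step (a): preprocess a grayscale image into 1 = ink, 0 = background."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]

def box_characters(binary):
    """Step (b): return (left, right) column spans that contain ink, left to right."""
    width = len(binary[0])
    has_ink = [any(row[x] for row in binary) for x in range(width)]
    boxes, start = [], None
    for x, ink in enumerate(has_ink):
        if ink and start is None:
            start = x
        elif not ink and start is not None:
            boxes.append((start, x))
            start = None
    if start is not None:
        boxes.append((start, width))
    return boxes

def crop(binary, box):
    """Step (c), cropping: cut out one character and trim empty rows above and below."""
    left, right = box
    rows = [row[left:right] for row in binary]
    inked = [i for i, row in enumerate(rows) if any(row)]
    return rows[inked[0]:inked[-1] + 1]

def resize(glyph, size=5):
    """Step (c), resizing: nearest-neighbour resample to size x size."""
    h, w = len(glyph), len(glyph[0])
    return [[glyph[y * h // size][x * w // size] for x in range(size)]
            for y in range(size)]

def train(font_images, size=5):
    """Step (d), simplified: store one normalized template per label (no network)."""
    templates = {}
    for label, gray in font_images.items():
        binary = binarize(gray)
        templates[label] = resize(crop(binary, box_characters(binary)[0]), size)
    return templates

def identify(glyph, templates):
    """Step (e): pick the label whose template differs in the fewest pixels."""
    def distance(a, b):
        return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return min(templates, key=lambda label: distance(glyph, templates[label]))

def recognize(gray, templates, size=5):
    """Run steps (a) to (e) on one line of text and return the recognized string."""
    binary = binarize(gray)
    return "".join(identify(resize(crop(binary, box), size), templates)
                   for box in box_characters(binary))

# Hypothetical three-letter font, one small image per character.
FONT_ART = {
    "I": ["###", ".#.", ".#.", ".#.", "###"],
    "L": ["#..", "#..", "#..", "#..", "###"],
    "T": ["###", ".#.", ".#.", ".#.", ".#."],
}
# A one-line "page" containing the characters L, I, T with blank margins.
PAGE_ART = [
    ".............",
    ".#...###.###.",
    ".#....#...#..",
    ".#....#...#..",
    ".#....#...#..",
    ".###.###..#..",
    ".............",
]

templates = train({label: from_art(art) for label, art in FONT_ART.items()})
print(recognize(from_art(PAGE_ART), templates))  # LIT
```

Column-projection boxing only works when characters are separated by at least one blank column, and pixel-difference matching is sensitive to font and noise, which is why a trained network is the step named in the methodology.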
CONCLUSION Implementing Optical Character Recognition in a Document Management System can speed up the conversion of image-based documents into editable text, which is then easy to discover, process and search. OCR combined with a Document Management System is an attractive, labor-saving technology. Character recognition by this system is very quick. Extending the software beyond editing and searching is a topic for future work.
FUTURE SCOPE Training and recognition speed can be increased further, making the system more user friendly. Many applications exist where it would be desirable to read handwritten entries. Reading handwriting is a very difficult task, though progress is being made. In future, we will try to improve the accuracy of the Optical Character Recognition system through better preprocessing of images taken at any angle and better feature extraction methods, and to add features that make the Document Management System more useful to its users.