Voronoi Skelton

  • Uploaded by: mami
  • 0
  • 0
  • May 2020
  • PDF

This document was uploaded by user and they confirmed that they have the permission to share it. If you are author or own the copyright of this book, please report to us by using this DMCA report form. Report DMCA


Overview

Download & View Voronoi Skelton as PDF for free.

More details

  • Words: 1,707
  • Pages: 32
Document Analysis: philozophyly of Voronoi skeletization

spring semester 2008

Prénom Nom

Outline    

 

Objectives Typology Processing chain Methodology  Character recognition  Word recognition An OCR experiment Conclusion

2

© Prof. Rolf Ingold

Objectives  

Text recognition is the most advanced domain of document analysis It aims at analyzing images to extracting text, i.e. sequences of character codes (ASCII/Unicode)

3

© Prof. Rolf Ingold

Character recognition typology   



Machine printed vs. handwritten text  on-line vs. off-line handwriting Isolated characters, connected characters, cursive text Various Alphabet  Western languages (Roman, Greek,  Asian (Chinese, Japanese, Korean, Thai, ...  Arabic alphabets Limited vocabulary  only numbers, only uppercase text  with/without diacritics/punctuation  restricted vocabulary (city names, street names, ...)  language  contextual knowledge

4

© Prof. Rolf Ingold

Factor influencing performance 







Variability in style  single scriber, omni-scriber  mono-font, multi-font, omni-font Geometrical variability  in size  in orientation (rotated)  in transformations (slanted, perspective view, ...) Image resolution  binary images, starting at 200 dpi  grey-level images starting at 150 dpi Image quality  degraded support (historical documents)  acquisition conditions (bad illumination, optical aberration, noise, ...) 5

© Prof. Rolf Ingold

European languages 

European text is characterized by  limited set of characters (26 to ~100 classes)  diacritics and punctuation  isolated characters and cursive scripts  left-to right and top-to-bottom writing  large variety of fonts  different handwriting styles

6

© Prof. Rolf Ingold

Arabic 

Arabic text is characterized  right-to-left writing  limited set of characters  context dependent glyphs  connected characters  diacritics  justification by word stretching

7

© Prof. Rolf Ingold

Asian scripts 

Asian text is characterized  numerous scripts (hanzi, kanji, hanja, han tu, hiragana, katakana, ...)  horizontal and vertical writing  very large alphabets  structured characters  grid based layout

8

© Prof. Rolf Ingold

OCR Methodologies 

OCR systems are very complex and combine several steps  image segmentation into characters or isolated shapes  preprocessing of hypothetical segmented characters  size normalization, morphological filtering, thinning, ...  feature extraction of isolated shapes  global measures (width to height ratio, density, center of gravity, moments, ...)  local properties (stroke thickness, ...)  shape identification using  a single classifier (neural network, support vector machine, ...)  multiple classifiers and fusion methods  word validation using contextual information

9

© Prof. Rolf Ingold

Character vs. word recognition 

There is a paradox with cursive text or connected characters:  character recognition supposes prior character segmentation  character segmentation requires prior character recognition



Several approaches to bypass this paradox  entire word modeling and recognition  multiple hypothesis generation and testing  combined character segmentation and recognition (HMM)

10

© Prof. Rolf Ingold

Processing chain line segmentation word segmentation character segmentation isolated character rec.

word recognition

normalization

normalization

feature/primitive extr.

feature/primitive extr.

identification

identification

post-analysis recognized word

11

© Prof. Rolf Ingold

Isolated character recognition 





Isolated character recognition is applicable  on high quality printed text  on constrained handwriting (forms) The challenge is to take into account the variability of the class

Performance depends on  size of alphabet (number of classes)  image quality

12

© Prof. Rolf Ingold

Several classification strategies 

Direct comparison with class model



Statistical pattern recognition  using features



Structural pattern recognition  using primitives



Hybrid approaches combining statistical and structural approaches



Use of multiple classifiers and fusion of their results

13

© Prof. Rolf Ingold

Comparison with class model or class samples 

The unknown pattern is compared with one representative of each class (model)  a similarity measure is returned  decision is determined by most similar sample  rejection may occur if similarity is above a threshold

14

© Prof. Rolf Ingold

Similarity measures 

Hamming distance



Warping distance

15

© Prof. Rolf Ingold

Features for statistical approaches 



Someimes preprocessing at image level is required  smoothing  size normalization  stroke normalization  skeletization  ... Features are extracted  horizontal and vertical projection profile  central moments  intersections with lines  global transforms (Hough, Fourier, ...)  local features (densities, moments, ...)  ...

16

© Prof. Rolf Ingold

Central moments 

Central moments are shift invariant properties defined as

µ pq = ∑x ∑y ( x − x) p ( y − y) q f ( x, y) with 

x=

m10 m00

y=

m01 m00

mpq = ∑x ∑y x p y q f ( x, y)

They can be computed using the following formulas

µ00 = m00 µ10 = 0 µ01 = 0 µ20 = m20 − m10 x µ02 = m02 − m01 x

µ11 = m11 + m10 y 2 µ12 = m12 − m02 x − 2m11 y + 2m10 y 2 µ21 = m21 − m20 y − 2m11 x + 2m01 x 2 µ30 = m30 − 3m20 x + 2m10 x 2 µ03 = m03 − 3m02 y + 2m01 y 17

© Prof. Rolf Ingold

Primitives for structural approaches 



Shapes are decomposed in strokes and several properties are extracted  number of connected components  number of holes  number and relative position of singular points  extremities  connections  crossings  concavities, convexities  ... These primitives are represented as  strings  trees  graphs and used for comparison 18

© Prof. Rolf Ingold

Identification 

For statistical approaches, different classifier are used  discriminant functions  kNN classifier  multi-layer perceptron  support  ...



For statistical approaches use  hierarchical classification  string distances  graph matching

19

© Prof. Rolf Ingold

Multilayer perceptron 

Information is propagated throw a layered network of "neurons"  decision is given by the highest activation on the output layer  weights of connections are computed in a training phase

20

© Prof. Rolf Ingold

Hierarchical classification 

Structural pattern recognition can be performed hierarchically

21

© Prof. Rolf Ingold

OCR Difficulties 

The main sources of errors are  variability of character shapes (special fonts, handwriting)  image defects : noise and distortions  broken or touching characters

 shape similarity ("0" and "O", "1", "I" and "l", "5" and "S", ...)  small shapes : punctuation, accents, superscripts ("er", "ème")  special characters ("©", "½", "±", ...) or bullets

22

© Prof. Rolf Ingold

Word recognition 

Text recognition at word level makes sense  in case of restricted vocabulary  for language driven approaches  for knowledge based approaches  for keyword spotting



Word recognition is typically used  cursive scripts  handwriting  noisy text, difficult to segment  low resolution text

23

© Prof. Rolf Ingold

Word Recognition specificities 





Word recognition is more complex than character recognition  usually the number of classes is much higher  more features are needed Word recognition can take into account external information  language based knowledge  dictionary  character frequencies, bigrams, trigrams, etc.  structural constraints (security number, dates, ...)  restricted vocabulary (e.g. city names, street names, ...)  redundancy (e.g. zip codes and city names) Hidden Markov Models (HH) are suited for word recognition

24

© Prof. Rolf Ingold

Hidden Markov models (1) 

Each class is modeled by a two stage stochastic process using hidden and visible states



A model λ=(A,B, π) is composed of  A, the matrix of transition probabilities

aij = P( qt = x j | qt −1 = xi )  B, the matrix of observation probabilities

b j ( k ) = P( yk | qt = x j )  π, the vector of the initial state probabilities

π i = P(q0 = xi )

25

© Prof. Rolf Ingold

Hidden Markov models (2) 

The probability of an observation can be computed using

P (O | λ ) = ∑ P (O, Q | λ ) = ∑ P (O | λ , Q) P (Q | λ ) Q

=

∑π

q

Q

bq1 (o1 )aq1q2 bq2 (o2 ) ... aqT −1qT bqT (oT )

q1 ,... qT



A pattern is assigned to the model with highest posterior probability (i.e, the model that best explains the pattern)



The parameters of the model (probabilities) are determined in a training phase using training samples

26

© Prof. Rolf Ingold

Post-analysis 

Post-analysis is performed with the aim of validating / correcting character recognition  based on dictionaries  bigrams, trigrams  confidence of character recognition (if available)

27

© Prof. Rolf Ingold

Character Recognition Performance 

OCR (Optical Character Recognition) is the most mature technique of document analysis



For most applications, very high accuracy is required  99% recognition rate would generate 30-50 errors per page  99,9% – 99,99% is often requested



OCR systems may be designed for  standard OCR-A, OCR-B fonts, specially designed for OCR  mono-font recognition, specialized for typewriting  omni-font or multi-font text recognition (the most popular)  trainable systems, being tuned for specific fonts and styles

28

© Prof. Rolf Ingold

OCR experiment 

One magazine page [Hebdo 18, 2007, Editorial]  good quality printed text  medium layout complexity

29

© Prof. Rolf Ingold

First Experiment (Standard MS-Office OCR)   

Layout not understood Many OCR errors => Unusable !

30

© Prof. Rolf Ingold

Second Experiment (ReadIris)     

Page layout correctly recognized Italic correctly detected  including one false detection A few word segmentation errors Correct de-hyphenation A few OCR errors  often as consequence of segmentation errors !

31

© Prof. Rolf Ingold

Conclusion on OCR technologies   



Imperfect results in printed character recognition Recognition of uncontrolled handwriting not mature Practical problems with  mathematics (symbols and formulas)  special fonts or scripts  logos => perfect document recognition is not achievable !

  

Some applications can deal with approximate results Recognition algorithms should be tuned to prefer rejections to errors Include manual correction in the processing step

32

© Prof. Rolf Ingold

Related Documents


More Documents from ""

Voronoi Skelton
May 2020 3
Original Perfume
May 2020 4
Forum Payudara
May 2020 11
Charlie Chaplin
June 2020 11
16904.pdf
June 2020 3
Home.docx
June 2020 3