Introduction to Digital Image Processing (DIP)

Image Processing and its Applications

Rajashekhara


What is an Image?

  • A representation, likeness, or imitation of an object or thing.
  • A vivid or graphic description.
  • Something introduced to represent something else.

Where Are We?

Digital image processing sits at the intersection of several related fields: imaging, display/printing, computer vision, computer graphics, and biological vision.

Computer Vision vs. Computer Graphics

Computer graphics transforms 3D information into an image or display; computer vision extracts 3D information from an image or display.

Computer Vision, Computer Graphics, and Image Processing

Computer vision estimates 3D information from one or more 2D images. Computer graphics generates 2D/3D images from a 3D (mathematical) description of an object. Computer vision and computer graphics are therefore inverse operations of each other. Both rely on image processing, which is regarded as a low-level (basic) operation for computer vision and computer graphics. Note that computer vision, computer graphics, and image processing are normally considered three overlapping areas, but none of them is a subset of the others.

Computer Vision Means

  • Machine Vision
  • Robot Vision
  • Scene Analysis
  • Image Understanding
  • Image Analysis

Image Processing Means

Image processing refers to a set of computational techniques that accept images as input. The results of the processing can be new images or information extracted from the input images. Video is simply a time sequence of images called frames, so all image processing techniques can also be applied to frames. Image processing has many applications.

Why Image Processing?

  • Coding/compression
  • Enhancement, restoration, reconstruction
  • Analysis, detection, recognition, understanding
  • Visualization

What Do We Do?

Digital image processing covers three broad activities: image processing/manipulation, image analysis/interpretation, and image coding/communication.

Digital Image

What is an Image?

  • A visual representation of objects, their components, properties, and relationships.
  • A mapping of a 3D scene onto a 2D plane.

What is a Digital Image?

A digital image contains a fixed number of rows and columns of integers. Each integer is called a pixel (picture element) and represents the brightness at that point of the image.

Digital Image

A digital image is a multidimensional array of numbers (such as an intensity image) or vectors (such as a color image). Each component in the image is called a pixel and is associated with a pixel value: a single number in the case of intensity images, or a vector in the case of color images.
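To make the array view concrete, here is a minimal sketch (Python with NumPy, assumed available; the pixel values are the illustrative ones from the gray-image slide, not a real photograph) showing an intensity image as a 2D array and a color image as a 3D array of per-pixel vectors.

```python
import numpy as np

# Intensity image: a 2D array of numbers, one gray value per pixel.
gray = np.array([[10, 10, 16, 28],
                 [ 9,  6, 26, 37],
                 [15, 25, 13, 22],
                 [32, 15, 87, 39]], dtype=np.uint8)

# Color image: a 3D array; each pixel holds a vector (R, G, B).
color = np.zeros((4, 4, 3), dtype=np.uint8)
color[..., 0] = gray          # reuse the gray values as the red channel
color[1, 2] = (255, 0, 0)     # set one pixel to pure red

print(gray.shape)             # (4, 4)    -> rows, columns
print(color.shape)            # (4, 4, 3) -> rows, columns, channels
print(gray[0, 3])             # pixel value at row 0, column 3 -> 28
```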

Digital Image Types: Binary Image

A binary (black-and-white) image: each pixel contains one bit, where 1 represents white and 0 represents black.

Binary data (example 4x4 block):
    0 0 0 0
    0 0 0 0
    1 1 1 1
    1 1 1 1

Digital Image Types: Intensity (Gray) Image

An intensity (monochrome) image: each pixel corresponds to light intensity, normally represented on a gray scale (gray levels).

Gray-scale values (example 4x4 block):
    10 10 16 28
     9  6 26 37
    15 25 13 22
    32 15 87 39

Digital Image Types: RGB Image

A color (RGB) image: each pixel contains a vector representing the red, green, and blue components (three planes of values, one per channel).

Digital Image Types: Index Image

An index image: each pixel contains an index number pointing to a color in a color table.

Index values (example 3x3 block):
    1 4 9
    6 4 7
    6 5 2

Color table (first entries shown):

    Index No. | Red component | Green component | Blue component
    ----------|---------------|-----------------|----------------
    1         | 0.1           | 0.5             | 0.3
    2         | 1.0           | 0.0             | 0.0
    3         | 0.0           | 1.0             | 0.0
    4         | 0.5           | 0.5             | 0.5
    5         | 0.2           | 0.8             | 0.9
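A minimal sketch of the index-image idea (Python with NumPy, assumed available). The color table rows are the five entries shown above; the small index block is hypothetical and chosen so every index stays within that partial table.

```python
import numpy as np

# Color table from the slide: rows are (R, G, B), one per index value (1-based).
color_table = np.array([
    [0.1, 0.5, 0.3],   # index 1
    [1.0, 0.0, 0.0],   # index 2
    [0.0, 1.0, 0.0],   # index 3
    [0.5, 0.5, 0.5],   # index 4
    [0.2, 0.8, 0.9],   # index 5
])

# A small index image (values chosen to stay within the 5-entry table above).
index_image = np.array([[1, 4, 5],
                        [2, 4, 3],
                        [5, 5, 2]])

# Expanding an index image to an RGB image is a simple table lookup.
rgb_image = color_table[index_image - 1]   # subtract 1: table rows are 0-based
print(rgb_image.shape)                      # (3, 3, 3)
print(rgb_image[0, 1])                      # color of index 4 -> [0.5 0.5 0.5]
```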

Human Vision & Image Visualization

In the beginning, we'll have a look at the human eye.

Cross Section of the Human Eye

[Figure: cross section of the human eye.]

Visual Perception: The Human Eye

1. The lens contains 60-70% water and about 6% fat.
2. The iris diaphragm controls the amount of light that enters the eye.
3. Light receptors in the retina:
   - About 6-7 million cones for bright-light (photopic) vision.
   - The density of cones is about 150,000 elements/mm2.
   - Cones are involved in color vision.
   - Cones are concentrated in the fovea, an area of about 1.5 x 1.5 mm2.
   - About 75-150 million rods for dim-light (scotopic) vision.
   - Rods are sensitive to low levels of light and are not involved in color vision.
4. The blind spot is the region where the optic nerve emerges from the eye.

Electromagnetic Spectrum

The whole electromagnetic spectrum is used by "imagers". The human eye is sensitive only to electromagnetic waves in the visible spectrum.

[Spectrum, in order of increasing wavelength (roughly 10^-4 to 10^12 Angstroms, where 1 Å = 10^-10 m): cosmic rays, gamma rays, X-rays, UV, visible, IR, microwave (SAR), radio frequency.]

The visible spectrum lies around a wavelength of 0.0000005 m (0.0005 mm), i.e. roughly 400-700 nm.

The Human Eye

  • Is able to perceive electromagnetic waves in a certain spectrum.
  • Is able to distinguish between wavelengths in this spectrum (colors).
  • Has a higher density of receptors in the center.
  • Maps our 3D reality to a 2-dimensional image.

The retinal model is mathematically hard to handle (e.g., what is a neighborhood?). It is easier to use a 2D array of cells modelling the cones/rods, where each cell contains a numerical value (e.g., between 0 and 255).

  • The position of each cell defines the position of the receptor.
  • The numerical value of the cell represents the illumination received by the receptor (e.g., 5, 7, 1, 0, 12, 4, ...).
  • With this model, we can create GRAY-VALUE images.
  • Value = 0: BLACK (no illumination / energy); value = 255: WHITE (maximum illumination / energy).

What is Light?

  • The visible portion of the electromagnetic (EM) spectrum.
  • It occurs between wavelengths of approximately 400 and 700 nanometers.

Short Wavelengths

  • Different wavelengths of radiation have different properties.
  • In the X-ray region of the spectrum, radiation carries sufficient energy to penetrate a significant volume of material.

Long Wavelengths

  • Copious quantities of infrared (IR) radiation are emitted from warm objects (e.g., this can be used to locate people in total darkness).
  • "Synthetic aperture radar" (SAR) imaging techniques use an artificially generated source of microwaves to probe a scene. SAR is unaffected by weather conditions and clouds (e.g., it has provided us images of the surface of Venus).

Range Images

  • An array of distances to the objects in the scene.
  • They can be produced by sonar or by using laser rangefinders.

Sonic Images

  • Produced by the reflection of sound waves off an object.
  • High sound frequencies are used to improve resolution.

Image Formation

An image is a two-dimensional pattern of brightness. What must we do to get information about the 3D world?
  • Study the image formation process.
  • Understand how the brightness pattern is produced.

Two important tasks:
  • Where will the image of some point appear?
  • How bright will the image of some surface be?

A Simple Model of Image Formation

  • The scene is illuminated by a single source.
  • The scene reflects radiation towards the camera.
  • The camera senses it via chemicals on film.
  • Light reaches surfaces in 3D; surfaces reflect; a sensor element receives light energy.
  • Intensity, angles, and material are all important.

Geometry and Physics

  • The geometry of image formation determines where in the image plane the projection of a point in the scene will be located.
  • The physics of light determines the brightness of a point in the image plane as a function of illumination and surface properties.

Image Formation

Digital image generation is the first step in any image processing or computer vision method. The generated image is a function of many parameters: the reflection characteristics of the object surface, the sensor characteristics of the camera, the optical characteristics of the lens, the analog-to-digital converter, the characteristics of the light source, and the geometric laws under which the image is acquired.

  • The first task is related to the camera projection, which can be either a perspective projection or an orthographic projection. The perspective projection is more general than the orthographic projection, but requires more calculation.
  • The second task is related to surface reflection properties, illumination conditions, and surface orientation with respect to the camera and light sources.

Geometric Camera Models

The projection of a surface point of a 3-dimensional scene onto the 2-dimensional image plane can be described by a perspective or an orthographic projection.

Pinhole camera: a camera with zero aperture size; all rays from the 3D scene points pass through the optical center C of the lens.

Coordinate Systems

In computer vision, we deal with three kinds of coordinate systems: the image coordinate system, the camera coordinate system, and the world coordinate system. The image coordinate system is basically the two-dimensional image plane. The camera coordinate system is attached to the camera; it can be either camera-centered or image-centered. In the camera-centered coordinate system the origin is the focal point and the optical axis is the Z axis; in the image-centered system the origin is positioned in the XY image plane. The world coordinate system is a general coordinate system with some reference axes.

Perspective Projection Setup

Consider the pinhole camera model:
  • The projection of a scene point P of the XYZ space onto the image point P' in the xy image plane is a perspective projection.
  • The optical axis is defined as the perpendicular from the pinhole C to the image plane.
  • The distance f between C and the image plane is the focal length.
  • The coordinate system of the XYZ space is defined such that the XY plane is parallel to the image plane and the origin is at the pinhole C; the Z axis then lies along the optical axis.

Perspective Projection Equations

From the similar triangles (CA'P') and (CAP), and from the two similar triangles (A'B'P') and (ABP), the perspective projection equations are obtained:

    x = f X / Z,   y = f Y / Z
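A minimal sketch of these projection equations (Python with NumPy, assumed available). It follows the pinhole setup above, with the origin at the pinhole C, Z along the optical axis, and the image plane at distance f; the point values are illustrative.

```python
import numpy as np

def perspective_project(points_xyz, f):
    """Project 3D points (X, Y, Z), given in the camera frame described above,
    onto the image plane at distance f using x = f*X/Z, y = f*Y/Z."""
    P = np.asarray(points_xyz, dtype=float)
    X, Y, Z = P[:, 0], P[:, 1], P[:, 2]
    return np.stack([f * X / Z, f * Y / Z], axis=1)

pts = np.array([[1.0, 0.5, 2.0],
                [1.0, 0.5, 4.0]])     # same (X, Y), but twice as far away
print(perspective_project(pts, f=0.05))
# The more distant point projects closer to the image centre: farther objects look smaller.
```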

Perspective Projection

  • Points map to points; lines map to lines; planes map to the whole image or to half-planes; polygons map to polygons.
  • Long focal length -> narrow field of view; small focal length -> large (wide) field of view (wide-angle cameras).
  • Perspective projection produces a view in which an object's size depends on its distance from the viewer: an object farther away appears smaller.

  • Horizon - the observer's eye level.
  • Ground line - the plane on which the object rests.
  • Vanishing point - the position on the horizon where depth projectors converge.
  • Projection plane - the plane upon which the object is projected.

Vanishing Points

  • Object edges parallel to the projection plane remain parallel in a perspective projection.
  • Object edges not parallel to the projection plane converge to a single point in a perspective projection: the vanishing point (vp).

Camera with Aperture

  • In practice, the aperture must be larger to admit more light.
  • Lenses are placed in the aperture to focus the bundle of rays from each scene point onto the corresponding point in the image plane.

Orthographic Projection

Orthographic projection is modeled by rays parallel to the optical axis rather than rays passing through the optical center. Suppose the image of a plane lying at Z = Z0, parallel to the image plane, is formed. The magnification m is defined as the ratio of the distance between two points in the image to the distance between their corresponding points on the object plane.

For an object located at an average distance -Z0, if the variation in Z over its visible surface is not significant compared to -Z0 (i.e., the distance between the camera and the object is very large relative to the variation in object depth), then the image of this object is magnified by a constant factor m. For all visible points of the object, the projection equations are

    x = m X,   y = m Y

The scaling factor m is usually set to 1 or -1 for convenience, giving the simple projection equations x = X, y = Y.

Radiometry Basics

What determines the brightness of an image pixel?
  • Light source properties
  • Surface shape
  • Surface reflectance properties
  • Optics
  • Exposure

Topics:
  • Foreshortening and solid angle
  • Measuring light: radiance (incoming and outgoing)
  • Light at a surface: the interaction between light and surface
     - irradiance = light arriving at the surface
     - BRDF
     - outgoing radiance
  • Special cases and simplifications: Lambertian, specular, parametric and non-parametric models

Foreshortening

Two sources that look the same to a receiver must have the same effect on the receiver; two receivers that look the same to a source must receive the same energy.

Solid Angle

  • By analogy with angle (in radians), the solid angle subtended by a region at a point is the area of its projection onto a unit sphere centered at that point.
  • The solid angle dω subtended by a patch of area dA at distance r is

      dω = dA cos θ / r²

    and is measured in steradians (sr).
  • Foreshortening: patches that look the same subtend the same solid angle.

Definitions

  • Radiometry is the branch of physics that deals with the measurement of the flow and transfer of radiant energy.
  • Radiance is the power of light emitted from a unit surface area into a solid angle; the corresponding photometric term is brightness.
  • Irradiance is the amount of energy that an image-capturing device receives per unit of effective sensitive area of the camera. Quantizing it gives the image gray levels.

Radiometry Basics

  • Radiance (L): energy carried by a ray.
     - Power per unit area perpendicular to the direction of travel, per unit solid angle.
     - Units: watts per square metre per steradian (W m^-2 sr^-1).
     - Radiance is constant along a ray:

         L(x, θ, φ) = P / ((dA cos θ) dω)

  • Irradiance (E): energy arriving at a surface.
     - Incident power from a given direction, per unit area. Units: W m^-2.
     - For a surface receiving radiance L(x, θ, φ) from a solid angle dω, the corresponding irradiance is

         E(θ, φ) = L(θ, φ) cos θ dω

       (the cos θ factor accounts for the foreshortened area dA cos θ).

Radiance transfer: the power received by a patch dA2 at distance r from an emitting patch dA1 is

    P(1→2) = L dA1 cos θ1 dω21,   with   dω21 = dA2 cos θ2 / r²

and, by the symmetry of the geometry, P(1→2) = P(2→1).

Light at a Surface: Irradiance

Irradiance is the unit for light arriving at the surface:

    dE(x) = L(x, θ, φ) cos θ dω

The total power is obtained by integrating the irradiance over all incoming angles:

    E(x) = ∫ (φ = 0..2π) ∫ (θ = 0..π/2) L(x, θ, φ) cos θ sin θ dθ dφ

Bidirectional Reflectance Distribution Function (BRDF)

  • A model of local reflection that tells how bright a surface appears when viewed from one direction while light falls on it from another.
  • Definition: the ratio of the radiance in the outgoing direction to the irradiance in the incident direction (angles measured relative to the surface normal):

      ρ(θi, φi, θe, φe) = Le(θe, φe) / Ei(θi, φi) = Le(θe, φe) / (Li(θi, φi) cos θi dω)

  • The radiance leaving a surface in a particular direction is obtained by adding contributions from every incoming direction:

      Le(θe, φe) = ∫Ω ρ(θi, φi, θe, φe) Li(θi, φi) cos θi dωi

Light Leaving a Surface: BRDF

Many effects can occur when light reaches a surface:
  • transmitted (glass)
  • reflected (mirror)
  • scattered (marble, skin)
  • travelling along the surface and leaving at some other point
  • absorbed (sweaty skin)

Assumptions:
  • surfaces do not fluoresce
  • surfaces are cool
  • light leaving a surface is due only to light arriving at it

BRDF = Bidirectional Reflectance Distribution Function. It measures, for a given wavelength, the fraction of incoming irradiance from a direction ωi that leaves in the outgoing direction ωo [Nicodemus 70]:

    ρ(x, θi, φi, θe, φe) = Le(x, θe, φe) / (Li(x, θi, φi) cos θi dω)

Reflectance equation (measured radiance; radiosity = power per unit area leaving the surface):

    Lo(x, θo, φo) = ∫Ω ρ(x, θi, φi, θo, φo) L(θi, φi) cos θi dωi

Reflection as Convolution

    Lo(x, θe, φe) = ∫Ω' ρ(x, θi', φi', θe', φe') L(θi, φi) cos θi dωi
                  = ∫Ω ρ(x, θi', φi', θe', φe') L(Rα,β(θi', φi')) cos θi dωi

Reflection behaves like a convolution in the angular domain: the BRDF acts as the filter and the incident light as the signal.

Lambertian BRDF

  • Emitted radiance is constant in all directions.
  • Models perfectly diffuse surfaces: clay, matte paper, ...
  • BRDF = constant = albedo.
  • For a single light source, the outgoing radiance reduces to a dot product between the surface normal and the light direction:

      Lo(x) = ρ Li(x, θi, φi) cos θi = ρ (N · L)

    where ρ is the albedo, N the surface normal, and L the light direction.

  • Diffuse reflectance acts like a low-pass filter on the incident illumination:

      Lo(x, θo, φo) = ∫Ω' ρ L(θi, φi) cos θi dωi
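A minimal sketch of the single-source Lambertian formula above (Python with NumPy, assumed available). It assumes one distant light and clamps negative cosines to zero, since points facing away from the light receive no illumination; the numbers are illustrative.

```python
import numpy as np

def lambertian_radiance(albedo, normal, light_dir, incoming_radiance=1.0):
    """Outgoing radiance of a Lambertian surface under one distant light:
    Lo = albedo * Li * cos(theta_i) = albedo * Li * max(N . L, 0)."""
    n = np.asarray(normal, dtype=float)
    l = np.asarray(light_dir, dtype=float)
    n = n / np.linalg.norm(n)
    l = l / np.linalg.norm(l)
    return albedo * incoming_radiance * max(np.dot(n, l), 0.0)

# Surface facing straight up, light 60 degrees from the normal -> cos = 0.5
print(lambertian_radiance(0.8, normal=[0, 0, 1],
                          light_dir=[np.sin(np.pi / 3), 0, np.cos(np.pi / 3)]))
# -> 0.8 * 1.0 * 0.5 = 0.4
```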

BRDF for a Lambertian Surface

For a Lambertian surface, image irradiance = (1/π) × scene radiance.

How to Represent a Surface?

  • The surface normal n is a direction vector with unit magnitude.
  • With the camera looking along the z direction, we see a hemisphere of surface orientations.
  • A depth representation is z = z(x, y).

Equation of a sphere (surface): x² + y² + z² = a², so z = +sqrt(a² - x² - y²). If the surface is well behaved, it can be expanded about a point (x0, y0):

    z = z(x, y) = z(x0, y0) + (x - x0) ∂z/∂x + (y - y0) ∂z/∂y + ... higher-order terms

If the surface is smooth, we simply neglect the higher-order terms; in a small neighborhood the surface can be treated as a plane (the planar approximation):

    z - z(x0, y0) ≈ (x - x0) ∂z/∂x + (y - y0) ∂z/∂y = p δx + q δy

This is the first-order approximation of the surface; p and q are the components of the surface gradient.

Surface Normal

Take two tangent vectors in the surface: OA, proportional to (1, 0, p), and OB, proportional to (0, 1, q).

  • The surface normal is perpendicular to the tangent plane.
  • p = slope of the surface in the x direction; q = slope of the surface in the y direction.
  • The cross product of OA and OB gives the surface normal:

      n = OA × OB

      n̂ = (-p, -q, 1) / sqrt(1 + p² + q²)

If (p, q) is known, n̂ is known, and hence the surface normal.
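A minimal sketch of this normal computation (Python with NumPy, assumed available). The hemisphere example reuses the sphere equation from the previous slide; all numbers are illustrative.

```python
import numpy as np

def surface_normal(p, q):
    """Unit surface normal from the surface gradient (p, q) = (dz/dx, dz/dy):
    n = (-p, -q, 1) / sqrt(1 + p^2 + q^2)."""
    n = np.array([-p, -q, 1.0])
    return n / np.sqrt(1.0 + p * p + q * q)

# For the hemisphere z = sqrt(a^2 - x^2 - y^2): dz/dx = -x/z and dz/dy = -y/z.
a, x, y = 1.0, 0.3, 0.2
z = np.sqrt(a**2 - x**2 - y**2)
n = surface_normal(-x / z, -y / z)
print(n)                     # proportional to (x, y, z): points radially outward
print(np.linalg.norm(n))     # 1.0 (unit length)
```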

Specular Reflection

Smooth specular surfaces:
  • Mirror-like surfaces.
  • Light is reflected along the specular direction; some part is absorbed.

Rough specular surfaces:
  • A lobe of directions around the specular direction, caused by microfacets.
  • Lobe size: very small - mirror; small - blurry mirror; bigger - only light sources are visible; very big - faint specularities.

Diffuse Reflection

  • Dull, matte surfaces like chalk or latex paint.
  • Microfacets scatter incoming light randomly.
  • Light is reflected equally in all directions: the BRDF is constant.
  • Albedo: the fraction of incident irradiance reflected by the surface.
  • Radiosity: the total power leaving the surface per unit area (regardless of direction).

Radiosity: Summary

    Quantity                Meaning                        Expression
    ----------------------  -----------------------------  ------------------------------------------------------------
    Radiance                Light energy along a ray       L(θ, φ) = P / ((dA cos θ) dω)
    Irradiance              Unit incoming light            dE(x) = L(x, θ, φ) cos θ dω
    Total incoming energy   Energy at the surface          Ei(x) = ∫Ω L(x, θ, φ) cos θ dω
    Outgoing radiance       Unit outgoing light            Lo(x, θe, φe) = ∫Ω ρ(x, θi, φi, θe, φe) L(θi, φi) cos θi dωi
    Radiosity               Total energy leaving surface   Eo = ∫Ωo [ ∫Ωi ρ(x, θi, φi, θe, φe) L(θi, φi) cos θi dωi ] cos θe dωe

Interaction of Light and Matter

What happens when a light ray hits a point on an object?
  • Some of the light gets absorbed (converted to other forms of energy, e.g. heat).
  • Some gets transmitted through the object (possibly bent, through "refraction").
  • Some gets reflected (possibly in multiple directions at once).
  • Really complicated things can happen (e.g. fluorescence).

Let's consider the case of reflection in detail. In the most general case, a single incoming ray could be reflected in all directions. How can we describe the amount of light reflected in each direction?

Image Formation System

The relation between what the camera captures and what the surface reflects.

Image Formation System

  • The system consists of a thin lens and an image plane. The diameter of the lens is d and the focal length is fp.
  • The system is assumed to be focused: rays originating from a particular point on the object meet at a single point in the image plane. Rays originating from an infinitesimal area dAo on the object are projected onto some area dAp in the image plane, and no rays from outside dAo reach dAp.
  • When a camera captures the image of an object, the measured gray value is proportional to the image irradiance, which is related to the reflection properties of the object surface.

How to calculate the image irradiance in an image-forming system:
  • The radiant flux dΦ emitted from the surface patch dAo that passes through the entrance aperture is obtained by integrating the radiance over the solid angle occupied by the entrance aperture as seen from the surface patch.
  • Assuming there is no power loss in the medium, the image area dAp receives the same flux dΦ that is emitted from dAo. By definition, the image irradiance is the incident flux per unit area.
  • Let θr' be the angle between the surface normal and the line to the entrance aperture, and let α be the angle between this line and the optical axis. The solid angle occupied by the surface patch dAo as seen from the entrance aperture equals the solid angle occupied by the image area dAp.
  • If the size of the lens is small relative to the distance between the lens and the object, then the angle θr' in the integral can be treated as constant, and the radiance Lr tends to be constant and can be taken out of the integral.
  • The solid angle occupied by the lens, as seen from the surface patch, is approximately the foreshortened lens area π (d/2)² cos α divided by the squared distance (fo / cos α)².
  • Combining these results gives the expression for the image irradiance (next slide): the image irradiance is proportional to the scene radiance, and the factor of proportionality is a function of the off-axis angle α. The ratio fp / d is the F-stop number of the camera.

Image Formation

    E = [ (π/4) (d / fp)² cos⁴ α ] L

  • Image irradiance E is linearly related to scene radiance L.
  • Irradiance is proportional to the area of the lens and inversely proportional to the squared distance between the lens and the image plane.
  • The irradiance falls off as the angle between the viewing ray and the optical axis increases (the cos⁴ α term).
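A small numeric sketch of this irradiance formula (Python with NumPy, assumed available). The lens diameter, focal length, and scene radiance are illustrative values, not taken from the slides.

```python
import numpy as np

def image_irradiance(L, d, f_p, alpha):
    """Image irradiance for the thin-lens system on the slide:
    E = (pi/4) * (d / f_p)**2 * cos(alpha)**4 * L."""
    return (np.pi / 4.0) * (d / f_p) ** 2 * np.cos(alpha) ** 4 * L

# Illustrative numbers: scene radiance 100 W m^-2 sr^-1,
# a 25 mm lens at f/2 (so d = 12.5 mm), on-axis and 20 degrees off-axis.
L, f_p, d = 100.0, 0.025, 0.0125
print(image_irradiance(L, d, f_p, alpha=0.0))              # on-axis irradiance
print(image_irradiance(L, d, f_p, alpha=np.radians(20)))   # smaller: falls off as cos^4
```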

What Happens on the Image Plane? (CCD Camera Plane)

  • The lens collects light rays.
  • An array of small fixed elements replaces the chemicals of film.
  • Each element generates a voltage signal based on the irradiance value it receives.

Digitization

  • Analog images are continuous representations of color.
  • This is somewhat of a problem for computers, which prefer discrete measurements.

Digitization Pipeline

object → imaging system (observe) → sample and quantize (digitize) → digital storage, e.g. disk (store) → digital computer (process) → on-line buffer (refresh/store) → display output / record

Digital Image Acquisition Process

[Figure: the digital image acquisition process.]

Image Sampling & Quantization

Grayscale image:
  • A grayscale image is a function I(x, y) of the two spatial coordinates of the image plane.
  • I(x, y) is the intensity of the image at the point (x, y) on the image plane.
  • I(x, y) takes non-negative values; we assume the image is bounded by a rectangle [0, a] × [0, b]:

      I: [0, a] × [0, b] → [0, ∞)

Color image:
  • Can be represented by three functions, R(x, y) for red, G(x, y) for green, and B(x, y) for blue.

[Figure: intensity surface plot of I(x, y) over the rows and columns of an image.]

Image Sampling & Quantization

  • The analog signal representing a continuous image is sampled to produce discrete values which can be stored by a computer.
  • The frequency of the digital samples greatly affects the quality of the digital image.
  • To create a digital image, we need to convert continuous sensed data into digital form. This involves two processes: sampling and quantization.
  • The basic idea behind sampling and quantization is illustrated in Fig. 3.1.

Image Sampling & Quantization

  • Computers handle "discrete" data.
  • Sampling:
     - Sample the value of the image at the nodes of a regular grid on the image plane.
     - A pixel (picture element) at (i, j) is the image intensity value at the grid point indexed by the integer coordinates (i, j).
  • Quantization:
     - The process of transforming a real-valued sampled image into one taking only a finite number of distinct values (e.g. from 0 = black to 255 = white).
     - Each sampled value in a 256-level grayscale image is represented by 8 bits.
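A minimal sketch of both steps together (Python with NumPy, assumed available). The continuous image f below is a hypothetical test function, and the grid size and bit depth are illustrative parameters.

```python
import numpy as np

def sample_and_quantize(f, a, b, M, N, k):
    """Sample a continuous image f(x, y) on an M x N grid over [0, a] x [0, b]
    and quantize the samples to L = 2**k gray levels (0 .. L-1)."""
    x = np.linspace(0.0, a, M)
    y = np.linspace(0.0, b, N)
    samples = f(x[:, None], y[None, :])          # sampling on a regular grid
    samples = np.clip(samples, 0.0, 1.0)          # assume f returns values in [0, 1]
    L = 2 ** k
    return np.round(samples * (L - 1)).astype(np.uint8)   # quantization

# A hypothetical continuous image: a smooth sinusoidal pattern in [0, 1].
f = lambda x, y: 0.5 + 0.5 * np.sin(2 * np.pi * x) * np.cos(2 * np.pi * y)
img = sample_and_quantize(f, a=1.0, b=1.0, M=64, N=64, k=3)   # 8 gray levels
print(img.shape, img.min(), img.max())    # (64, 64), values between 0 and 7
```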

How Sampling Works

The original analog representation → measurements are made at equal intervals → discrete samples are taken from the measurements.

Image Sampling & Quantization

  • Figure 3.1(a) shows a continuous image, f(x, y), that we want to convert to digital form. An image may be continuous with respect to the x- and y-coordinates and also in amplitude.
  • To convert it to digital form, we have to sample the function in both coordinates and in amplitude. Digitizing the coordinate values is called sampling; digitizing the amplitude values is called quantization.
  • Fig. 3.1 Generating a digital image: (a) continuous image; (b) a scan line from A to B in the continuous image; (c) sampling and quantization; (d) digital scan line.
  • The one-dimensional function shown in Fig. 3.1(b) is a plot of amplitude (gray-level) values of the continuous image along the line segment AB in Fig. 3.1(a).
  • To sample this function, we take equally spaced samples along line AB, as shown in Fig. 3.1(c). The location of each sample is given by a vertical tick mark in the bottom part of the figure. The samples are shown as small white squares superimposed on the function; the set of these discrete locations gives the sampled function.
  • However, the values of the samples still span (vertically) a continuous range of gray-level values. In order to form a digital function, the gray-level values also must be converted (quantized) into discrete quantities.
  • The right side of Fig. 3.1(c) shows the gray-level scale divided into eight discrete levels, ranging from black to white. The vertical tick marks indicate the specific value assigned to each of the eight gray levels.
  • The continuous gray levels are quantized simply by assigning one of the eight discrete gray levels to each sample; the assignment is made depending on the vertical proximity of a sample to a vertical tick mark.
  • The digital samples resulting from both sampling and quantization are shown in Fig. 3.1(d) and Fig. 3.2(b).

How to Choose the Spatial Resolution: The Nyquist Rate

Example: an original image whose finest detail has a minimum period of 2 mm, sampled with sampling locations 1 mm apart - no detail is lost.

Nyquist rate: the sampling interval must be less than or equal to half of the minimum period of the image; equivalently, the sampling frequency must be greater than or equal to twice the maximum frequency present in the image.

Aliased Frequencies

Consider two sinusoids: x1(t) = sin(2πt), with f = 1 Hz, and x2(t) = sin(12πt), with f = 6 Hz.

Sampling rate: 5 samples/sec.

Two different frequencies, but the same sampled results: at a 5 Hz sampling rate the 6 Hz sinusoid is aliased onto the 1 Hz sinusoid.
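A small sketch reproducing this aliasing example numerically (Python with NumPy, assumed available); the signals and sampling rate are the ones given on the slide.

```python
import numpy as np

# Two sinusoids from the slide: x1(t) = sin(2*pi*t) at 1 Hz and
# x2(t) = sin(12*pi*t) at 6 Hz, both sampled at 5 samples/sec.
fs = 5.0
t = np.arange(0.0, 2.0, 1.0 / fs)      # sample instants over two seconds
x1 = np.sin(2 * np.pi * t)             # 1 Hz
x2 = np.sin(12 * np.pi * t)            # 6 Hz

# The 6 Hz signal is above fs/2 = 2.5 Hz, so it aliases onto 6 - 5 = 1 Hz:
# the two signals produce (numerically) identical samples.
print(np.allclose(x1, x2))             # True
```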

Image Sampling & Quantization

Fig. 3.2 (a) Continuous image projected onto a sensor array. (b) Result of image sampling and quantization.

Image Digitization

  • Sampling means measuring the value of an image at a finite number of points.
  • Quantization is the representation of the measured value at the sampled point by an integer.

Image Sampling & Quantization

  • Fig. 3.3 Coordinate convention used to represent digital images.
  • Fig. 3.4 A digital image of size M x N.
  • It is advantageous to use traditional matrix notation to denote a digital image and its elements (Fig. 3.5 A digital image).
  • The number of bits required to store a digitised image is b = M x N x k, where M and N are the numbers of rows and columns, respectively, and the number of gray levels is an integer power of 2: L = 2^k, with k = 1, 2, ..., 24.
  • It is common practice to refer to the image as a "k-bit image".
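A quick sketch of the storage formula b = M × N × k (Python, assumed runnable as-is; the image size is an illustrative example, not from the slides).

```python
def image_storage_bits(M, N, k):
    """Number of bits needed to store an M x N image with 2**k gray levels:
    b = M * N * k (formula from the slide)."""
    return M * N * k

# Example: a 1024 x 1024 8-bit image.
b = image_storage_bits(1024, 1024, 8)
print(b, "bits =", b // 8, "bytes =", b // (8 * 1024 * 1024), "MB")   # 1 MB
```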

Image Sampling & Quantization

  • The spatial resolution of an image is the physical size of a pixel in that image, i.e., the area in the scene represented by a single pixel; it is the smallest discernible detail in an image. Sampling is the principal factor determining spatial resolution.
  • Gray-level resolution refers to the smallest discernible change in gray level (often a power of 2).
  • Dense sampling produces a high-resolution image with many pixels, each representing a small part of the scene; coarse sampling produces a low-resolution image with few pixels, each representing a relatively large part of the scene.
  • Fig. 3.6 Effect of resolution on image interpretation. (a) 8x8 image. (b) 32x32 image. (c) 256x256 image.

Effect of Sampling

Example images at 256x256, 64x64, and 16x16 pixels.

Examples of Sampling

Example images at 256x256, 128x128, 64x64, and 32x32 pixels.

Effect of Spatial Resolution

[Example images showing the effect of decreasing spatial resolution.]

Can we increase spatial resolution by interpolation?

No: downsampling is an irreversible process.

Image Sampling

[Example: an original image, and versions downsampled by factors of 2, 4, and 8.]

Image Sampling & Quantization

Fig. 3.7 Effect of quantization on image interpretation. (a) 4 levels. (b) 16 levels. (c) 256 levels.

Effect of Quantization

Example images at 8 bits/pixel, 4 bits/pixel, and 2 bits/pixel.

Effect of Quantization Levels

Example images at 256, 128, 64, 32, 16, 8, 4, and 2 levels. In the images with few levels it is easy to see false contouring.

Image Quantization

Example images at 256 gray levels (8 bits/pixel), 32 gray levels (5 bits/pixel), 16 gray levels (4 bits/pixel), 8 gray levels (3 bits/pixel), 4 gray levels (2 bits/pixel), and 2 gray levels (1 bit/pixel).

Image Representation

  • The result of sampling and quantization is a matrix of integer numbers, as shown in Fig. 3.3, Fig. 3.4, and Fig. 3.5.
  • The values of the coordinates at the origin are (x, y) = (0, 0). The next coordinate values along the first row are (x, y) = (0, 1); the notation (0, 1) is used to signify the 2nd sample along the 1st row.
  • Images can be represented by 2D functions of the form f(x, y). The physical meaning of the value of f at spatial coordinates (x, y) is determined by the source of the image.

Image Representation

  • In a digital image, both the coordinates and the image value become discrete quantities.
  • Images can now be represented as 2D arrays (matrices) of integer values: I[i, j] (or I[r, c]).
  • The term gray level is used to describe monochromatic intensity.

Example: an 8x8 block of gray-level values:

     62  79  23 119 120 105   4   0
     10  10   9  62  12  78  34   0
     10  58 197  46  46   0   0  48
    176 135   5 188 191  68   0  49
      2   1   1  29  26  37   0  77
      0  89 144 147 187 102  62 208
    255 252   0 166 123  62   0  31
    166  63 127  17   1   0  99  30

How to Select a Suitable Size and Pixel Depth for Images

The word "suitable" is subjective: it depends on the subject.

  • Low-detail image: e.g. the Lena image.
  • Medium- and high-detail images: e.g. the Cameraman image.

To satisfy the human visual system:
  1. For images of the same size, a low-detail image may need more pixel depth.
  2. As the image size increases, fewer gray levels may be needed.

The Pixel

  • Sample location and sample values combine to make the picture element, or pixel.
  • 3 color samples per pixel: 1 RED sample, 1 GREEN sample, 1 BLUE sample.
  • Information about pixels is stored in a rectangular pattern and displayed to the screen in rows called rasters (from Spalter).
  • Monitor pixels are actually circular light representations of red, green and blue phosphors.
  • Pixel density is measured using Dots Per Inch (DPI); pixel size is measured using Dot Pitch.
  • DPI and Dot Pitch have an inverse relationship (DPI ∝ 1 / Dot Pitch).

Image Characteristics

Each pixel is assigned a numeric value (according to the bit depth) that represents a shade of gray based on the attenuation characteristics of the volume of tissue imaged.

Pixel Depth

  • The number of bits determines the number of shades of gray the system is capable of displaying in the digital images.
  • 10- and 12-bit pixels can display 1024 and 4096 shades of gray, respectively.
  • Increasing pixel bit depth improves image quality.

Bit-Depth

The number of bits used to represent a pixel's color determines the number of available colors:

    Expression | Name                | Colors
    -----------|---------------------|------------------
    2^1        | 1-bit               | 2
    2^4        | 4-bit               | 16
    2^6        | 6-bit               | 64
    2^8        | 8-bit               | 256
    2^16       | 16-bit              | 65,536
    2^24       | 24-bit (True Color) | About 16 million

Digital Image Characteristics

A digital image is:
  • Displayed as a combination of rows and columns known as a matrix.
  • The smallest component of the matrix is the pixel (picture element).
  • The location of a pixel within the image matrix corresponds to an area within the patient or volume of tissue, referred to as a voxel.

Matrix Size

For a given field of view, a larger matrix size includes a greater number of smaller pixels.

Color Fundamentals

  • Color is used heavily in human vision.
  • The visible spectrum for humans is 400 nm (blue) to 700 nm (red).
  • Machines can "see" much more: e.g., X-rays, infrared, radio waves.

The Human Visual System (HVS)

  • Color perception: light hits the retina, which contains photosensitive cells. These cells convert the spectrum into a few discrete values.
  • There are two types of photosensitive cells:
     - Cones: sensitive to colored light, but not very sensitive to dim light.
     - Rods: sensitive to achromatic light.
  • Color is perceived using three different types of cones, each sensitive in a different region of the spectrum: 445 nm (blue), 535 nm (green), 575 nm (red).
  • The cone types have different sensitivities; we are more sensitive to green than to red.

Color Fundamentals

  • Humans can discern thousands of color shades and intensities, compared to only about two dozen shades of gray.
  • When a beam of sunlight passes through a glass prism, the emerging beam is a continuous spectrum of colors ranging from violet at one end to red at the other.
  • If light is achromatic, its only attribute is its intensity, or amount: this is what can be seen on a black-and-white television set. Gray level refers to a scalar measure of intensity that ranges from black, through grays, to white.

Chromatic light spans the electromagnetic spectrum from about 400 to 700 nm. Three quantities describe the quality of a chromatic light source: radiance, luminance, and brightness. Radiance is the total amount of energy that flows from the light source. Luminance is the amount of energy perceived by an observer. Brightness is a subjective measure that is practically impossible to quantify; it embodies the achromatic notion of intensity.

The human eye contains three types of cones: red, green, and blue. Due to the absorption characteristics of the human eye, colors are seen as variable combinations of the so-called primary colors: red (R), green (G), and blue (B). The wavelengths of these primaries are 700 nm, 546.1 nm, and 435.8 nm, respectively (per the CIE standard). The primary colors can be added to produce the secondary colors of light: magenta (red plus blue), cyan (green plus blue), and yellow (red plus green).

Color Fundamentals

The characteristics that distinguish one color from another are brightness, hue, and saturation. Brightness embodies the notion of achromatic intensity. Hue is an attribute associated with the dominant wavelength in a mixture of light waves; it is the dominant color perceived by an observer. When we call an object red, blue, orange, or yellow, we are referring to its hue. Saturation means the relative purity, or the amount of white light mixed with a hue; the degree of saturation is inversely proportional to the amount of white light added. Hue and saturation taken together are called chromaticity, so a color may be characterized by its brightness and chromaticity.

The amounts of red, green, and blue needed to form any particular color are called the tristimulus values and are denoted X, Y, and Z, respectively. A color is then characterized by its trichromatic coefficients, defined as

    x = X / (X + Y + Z),   y = Y / (X + Y + Z),   z = Z / (X + Y + Z)

Color Fundamentals

It follows from these equations that x + y + z = 1. Another approach for specifying colors is the CIE chromaticity diagram, which shows color composition as a function of x (red) and y (green). For any values of x and y, the corresponding value of z (blue) is obtained as z = 1 - x - y. The point marked in the figure has approximately 62% green and 25% red content; the composition of blue is 13%. The positions of the various spectrum colors, from violet at 380 nm to red at 780 nm, are indicated around the boundary of the tongue-shaped chromaticity diagram. Any point within the boundary represents some mixture of spectrum colors. The point of equal energy corresponds to equal fractions of the three primary colors; it represents the CIE standard for white light. Any point on the boundary of the chromaticity chart is fully saturated. As a point progresses towards the point of equal energy, more and more white is added to the color and it becomes less saturated; saturation is zero at the point of equal energy.

CIE Chromaticity Model

  • The Commission Internationale de l'Eclairage (CIE) defined 3 standard primaries, X, Y, Z, that can be added to form all visible colors.
  • Y was chosen so that its color-matching function matches the sum of the 3 human cone responses.

    [X]   [0.6067  0.1736  0.2001] [R]
    [Y] = [0.2988  0.5868  0.1143] [G]
    [Z]   [0.0000  0.0661  1.1149] [B]

    [R]   [ 1.9107  -0.5326  -0.2883] [X]
    [G] = [-0.9843   1.9984  -0.0283] [Y]
    [B]   [ 0.0583  -0.1185   0.8986] [Z]
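A minimal sketch using the two matrices above and the trichromatic coefficients defined earlier (Python with NumPy, assumed available). The matrix values are copied from the slide; the sample RGB triple is illustrative.

```python
import numpy as np

# CIE conversion matrices as given on the slide.
RGB_TO_XYZ = np.array([[0.6067, 0.1736, 0.2001],
                       [0.2988, 0.5868, 0.1143],
                       [0.0000, 0.0661, 1.1149]])
XYZ_TO_RGB = np.array([[ 1.9107, -0.5326, -0.2883],
                       [-0.9843,  1.9984, -0.0283],
                       [ 0.0583, -0.1185,  0.8986]])

def chromaticity(rgb):
    """Trichromatic coefficients (x, y, z) = (X, Y, Z) / (X + Y + Z)."""
    xyz = RGB_TO_XYZ @ np.asarray(rgb, dtype=float)
    return xyz / xyz.sum()

print(chromaticity([1.0, 1.0, 1.0]))   # chromaticity of RGB white; x + y + z = 1
print(XYZ_TO_RGB @ RGB_TO_XYZ @ np.array([0.2, 0.5, 0.7]))
# The two matrices are (approximate) inverses, so this roughly recovers the original RGB.
```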

CIE Chromaticity Model

  • x, y, z normalize X, Y, Z such that x + y + z = 1.
  • Actually only x and y are needed, because z = 1 - x - y.
  • Pure colors are at the curved boundary.
  • White is (1/3, 1/3, 1/3).

Color Models

  • Color models provide a standard way of specifying a particular color using a 3D coordinate system.
  • Hardware-oriented models:
     - RGB: additive system (colors are added to black); used for displays.
     - CMY: subtractive system; used for printing.
     - YIQ: used for TV and good for compression.
  • Image-processing-oriented models:
     - HSI: a good perceptual space for art, psychology, and recognition.

Color Fundamentals: Primary Colors

[Figure: the primary colors of light.]

Color Fundamentals: The RGB Model

In the RGB model, each color appears in its primary spectral components of red, green, and blue. This model is based on a Cartesian coordinate system; the color subspace of interest is the unit cube. R, G, and B are the three primary colors, and the secondary colors cyan, magenta, and yellow are located at other corners of the cube. In this model the gray scale extends from black to white along the line that joins the origin to (1, 1, 1). All RGB values are assumed to be in the range [0, 1]. An image represented in RGB color consists of three component images, one for each primary color; when fed to an RGB monitor, these three images combine to produce a composite color image. The number of bits used to represent each pixel in RGB space is called the pixel depth; each RGB pixel has a depth of 24 bits.

Color Fundamentals

Secondary colors (additive synthesis), obtained by adding primary colors:

    (none)    = black
    B         = blue
    G         = green
    G + B     = cyan
    R         = red
    R + B     = magenta
    R + G     = yellow
    R + G + B = white

Weighted adding of primary colors:

    0.5 R + 0.5 G + 0.5 B = grey
    1.0 R + 0.2 G + 0.2 B = brown
    0.5 R + 1.0 G + 0.0 B = lime
    1.0 R + 0.5 G + 0.0 B = orange

Color images can be represented by 3D arrays (e.g. 320 x 240 x 3).

Color Fundamentals - RGB

  • An additive model.
  • An image consists of 3 bands, one for each primary color.
  • Appropriate for image displays.

Color Fundamentals - CMY

Primary colors of pigment (subtractive synthesis):

Cyan, magenta, and yellow are the secondary colors of light, or alternatively the primary colors of pigments. For example, when a surface coated with cyan pigment is illuminated with white light, no red light is reflected from the surface: cyan subtracts red light from reflected white light, which itself is composed of equal amounts of red, green, and blue light. Most devices that deposit colored pigments on paper, such as color printers and copiers, require CMY data input or perform an RGB-to-CMY conversion internally. This conversion is performed using the simple operation shown below.

The CMY Model

  • Cyan-Magenta-Yellow is a subtractive model, good for modelling the absorption of colors.
  • Appropriate for paper printing.
  • The assumption here is that all color values are normalized to the range [0, 1]:

      [C]   [1]   [R]
      [M] = [1] - [G]
      [Y]   [1]   [B]
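A one-line sketch of this conversion (Python with NumPy, assumed available), following the formula above for values normalized to [0, 1].

```python
import numpy as np

def rgb_to_cmy(rgb):
    """RGB -> CMY for values in [0, 1]: [C, M, Y] = [1, 1, 1] - [R, G, B]."""
    return 1.0 - np.asarray(rgb, dtype=float)

print(rgb_to_cmy([1.0, 0.0, 0.0]))   # pure red -> [0, 1, 1]: no cyan, since cyan subtracts red
```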

Color Fundamentals - CMYK

Equal amounts of the pigment primaries cyan, magenta, and yellow should produce black. In practice, combining these colors for printing produces a muddy-looking black, so in order to produce true black a fourth color, black (K), is added, giving rise to the CMYK color model. When publishers talk about "four-color printing" they are referring to the three colors of the CMY color model plus black.

Color Fundamentals - HSI

The RGB and CMY color models are ideally suited for hardware implementations, and RGB matches the fact that the human eye is strongly perceptive to red, green, and blue. Unfortunately, these and similar models are not well suited for describing colors in terms that are practical for human interpretation. For example, one does not refer to the color of an object by giving the percentage of each of the primaries composing its color; in other words, we do not think of color images as being composed of three primary images that combine to form a single image.

Color Fundamentals - HSI

When humans view a color object, we describe it by its hue, saturation, and brightness. Hue is a color attribute that describes pure color (pure red, orange, or yellow). Saturation gives a measure of the degree to which a pure color is diluted by white light. Brightness is a subjective descriptor that is practically impossible to measure; it embodies the achromatic notion of intensity and is one of the key factors in color sensation. We do know that intensity (gray level) is the most useful descriptor of monochromatic images, and this quantity is easily measurable and interpretable. A model that decouples the intensity component from the color-carrying information (hue and saturation) in a color image is HSI. As a result, the HSI model is an ideal tool for developing image processing algorithms based on color descriptions that are natural and intuitive to humans.

To summarize: RGB is ideal for image generation (image capture by a color camera or image display on a monitor screen), but its use for color description is much more limited. An RGB color image can be viewed as three monochromatic intensity images. In the RGB cube, the line joining the black and white vertices represents the intensity axis. To determine the intensity of any color point, pass a plane perpendicular to the intensity axis through that point; this gives an intensity value in the range [0, 1]. The saturation of a color increases as a function of distance from the intensity axis, and the saturation of points on the intensity axis is zero; saturation is the length of the vector from the origin to the color point, where the origin is defined by the intersection of the color plane with the intensity axis. Hue can also be determined from an RGB point. Consider the plane formed by three points (black, white, and cyan): all the colors generated by those three colors lie in the triangle defined by them. The hue of a point is determined by an angle from some reference direction; usually an angle of 0 from the red axis designates zero hue, and the hue increases counterclockwise from there.

Color Fundamentals - HSI

  • Uniform: equal (small) steps give the same perceived color changes.
  • Hue is encoded as an angle (0 to 2π).
  • Saturation is the distance to the vertical axis (0 to 1).
  • Intensity is the height along the vertical axis (0 to 1).

Color Fundamentals - HSI

The three important components of the HSI color space are the vertical intensity axis, the length of the vector to the color point, and the angle this vector makes with the red axis.

To summarize HSI: hue, saturation, and intensity are non-linear functions of RGB, and hue relations are naturally expressed on a circle:

    I = (R + G + B) / 3

    S = 1 - min(R, G, B) / I

    H = cos⁻¹ { (1/2)[(R - G) + (R - B)] / sqrt[(R - G)² + (R - B)(G - B)] }

    (H is replaced by 360° - H if B > G)
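A minimal sketch of these HSI formulas (Python with NumPy, assumed available). It handles the degenerate cases (zero intensity, zero denominator) in a simple way for illustration; inputs are assumed normalized to [0, 1].

```python
import numpy as np

def rgb_to_hsi(r, g, b):
    """RGB -> HSI using the slide's formulas; r, g, b are normalized to [0, 1]."""
    i = (r + g + b) / 3.0
    s = 0.0 if i == 0 else 1.0 - min(r, g, b) / i
    num = 0.5 * ((r - g) + (r - b))
    den = np.sqrt((r - g) ** 2 + (r - b) * (g - b))
    h = np.degrees(np.arccos(np.clip(num / den, -1.0, 1.0))) if den > 0 else 0.0
    if b > g:                      # angle measured from the red axis, 0..360 degrees
        h = 360.0 - h
    return h, s, i

print(rgb_to_hsi(1.0, 0.0, 0.0))   # pure red  -> hue 0,   full saturation
print(rgb_to_hsi(0.0, 0.0, 1.0))   # pure blue -> hue 240, full saturation
```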

Color Fundamentals - HSI

(Left) An image of food from a digital camera. (Center) The saturation value of each pixel decreased by 20%. (Right) The saturation value of each pixel increased by 40%.

Color Fundamentals - YIQ

  • YIQ has better compression properties.
  • Luminance Y is encoded using more bits than the chrominance values I and Q (humans are more sensitive to Y than to I and Q).
  • Luminance alone is used by black-and-white TVs; all three values are used by color TVs.

    [Y]   [0.299   0.587   0.114] [R]
    [I] = [0.596  -0.275  -0.321] [G]
    [Q]   [0.212  -0.532   0.311] [B]

Color Fundamentals: Summary

  • To print: RGB → CMY or grayscale.
  • To compress images: RGB → YUV. Color information (U, V) can be compressed 4 times without significant degradation in perceptual quality.
  • For color description: RGB → HSI.
  • To compare images: RGB → CIE Lab. The CIE Lab space is more perceptually uniform, so Euclidean distance in Lab space is meaningful.

Storing Images

With traditional cameras, the film is used both to record and to store the image. With digital cameras, separate devices perform these two functions: the image is captured by the image sensor, then stored in the camera on a storage device of some kind. We look at many of the storage devices currently used.

Removable vs. Fixed Storage

Older and less expensive cameras have built-in fixed storage that can't be removed or increased. This greatly reduces the number of photos you can take before having to erase some to make room for new ones. Almost all newer digital cameras use some form of removable storage media, usually flash memory cards, but occasionally small hard disks, CDs, and even variations of the floppy disk. Whatever its form, removable media lets you remove one storage device when it is full and insert another.

The number of images you can take is limited only by the number of storage devices you have and the capacity of each. The number of images that you can store in a camera depends on a variety of factors, including:
  (1) The capacity of the storage device (expressed in megabytes)
  (2) The resolution at which pictures are taken
  (3) The amount of compression used

The number you can store is important because once you reach the limit you have no choice but to quit taking pictures or erase some existing ones to make room for new ones. How much storage capacity you need depends partly on what you use the camera for.

Storing Images

The advantages of removable storage are many. They include the following:
  (1) They are erasable and reusable.
  (2) They are removable, so you can take one out and plug in another; storage is limited only by the number of devices you have.
  (3) They can be removed from the camera and plugged into the computer or printer to transfer the images.

Flash Card Storage

As the popularity of digital cameras and other hand-held devices has increased, so has the need for small, inexpensive memory devices. The type that has caught on is flash memory, which uses solid-state chips to store your image files. Although flash memory chips are similar to the RAM chips used inside your computer, there is one important difference: they don't require batteries and don't lose images when power is turned off. Your photographs are retained indefinitely without any power to the flash memory components. The chips are packaged inside a case equipped with electrical connectors, and the sealed unit is called a card. Flash memory cards consume little power, take up little space, and are very rugged. They are also very convenient: you can carry lots of them with you and change them as needed.

Until recently, most flash cards have been in the standard PC Card format that is widely used in network computers. However, with the growth of the digital camera and other markets, a number of smaller formats have been introduced. As a result of this competition, cameras support a confusing variety of incompatible flash memory cards, including the following:
  • PC Cards
  • CompactFlash
  • SmartMedia
  • Memory Sticks
  • xD-Picture Cards
Each of these formats is supported by its own group of companies and has its own following.

Storing Images

PC Cards: PC Cards have the highest storage capacities, but their large size has led to their being used mainly in professional cameras.

CompactFlash cards: These are generally the most advanced flash storage devices for consumer-level digital cameras. CompactFlash cards and slots that are 3.3 mm thick are called CompactFlash (abbreviated CF) or CompactFlash Type I (abbreviated CF-I); those that are 5 mm thick are called Type II.

SmartMedia cards: These are smaller than CompactFlash cards and generally don't come with storage capacities quite as high.

Sony Memory Sticks, shaped something like a stick of gum, are currently used mainly in Sony products.

xD-Picture Cards: These are the smallest of the memory cards and are used in very small cameras. The format was developed jointly by Fuji and Olympus.

Memory card storage cases: Cards are easy to misplace, and the smaller they are, the easier they are to lose if you don't find a way to store them safely. One way to keep them safe is to use an inexpensive storage case.

Storing Images

Hard disk storage: One of the current drawbacks of compact flash memory cards is their limited storage capacity; for high-resolution cameras this is a real drawback. One solution is high-speed, high-capacity hard disk drives. Until recently these drives were too large and expensive to be mounted inside cameras, but that changed with IBM's introduction of the Microdrive hard disk drives. These drives, now owned by Hitachi, are smaller and lighter than a roll of film. In fact, they are so small that they can be plugged into a Type II CompactFlash slot in a digital camera or flash card reader. The Hitachi Microdrive fits a CF-II slot and is a marvel of engineering.

Optical storage disks: CDs are used in a few cameras and have the advantage that they can be read in any system with a CD drive. The disks are write-once, with archival quality, so there is no danger of important files being deleted or written over. Sony's line of Mavicas uses CD discs for storage.

Temporary storage: Portable digital image storage and viewing devices are advancing rapidly, which is good because they meet a real need. When you are out photographing and your storage device becomes filled with images, you need a place to temporarily store the images until they are transferred to your main system. One device used for this is a notebook computer: not only do many photographers already have one, but its large screen and ability to run any software are advantages. However, a notebook computer is not always the ideal temporary storage device, because of its weight, short battery life, and long startup time. Hence the introduction of the portable hard drive.

Storing Images

FlashTrax from SmartDisk is one of the new multimedia storage/viewer devices. To use one of these devices you insert your memory card into a slot, often using an adapter, and quickly transfer your images. You can then erase your camera's storage device to make room for new images and resume shooting. When you get back to your permanent setup, you copy or move your images from the intermediate storage device to the system you use for editing, printing, and distributing them. The speed with which you transfer depends on the connections supported by the device; most support USB 2 and some support FireWire. The latest trend is to incorporate image storage into multipurpose devices. Many of these devices let you review the stored images on the device itself or on a connected TV; some also let you print the images directly on a printer without using a computer.

The trend is to go even farther and combine digital photos, digital videos, and MP3 music in the same device. With a device like this, one will be able to create slide shows with special transitions, pans, and accompanying music, and play them back anywhere. One way to eliminate or reduce the need for intermediate storage is to use a higher-capacity storage device in the camera; for example, some devices store many gigabytes of data, enough to hold hundreds of large photos.

The key questions to ask when considering an intermediate storage device are:
  (1) What is its storage capacity? What is the cost per megabyte of storage?
  (2) Does it have slots or adapters for the storage devices you use?
  (3) Does it support the image formats you use? Many support common formats like JPEG but not proprietary formats such as Canon's RAW and Nikon's NEF.
  (4) Does it support video and MP3 music playback? Does it support your camera's movie format, if it has one?
  (5) What is the transfer rate, and how long does it take to transfer images from a card to the device?
  (6) Can it display images on a TV set or be connected directly to a printer?

Storing Images (7) If it connects to a TV, does it have a remote control? (8) Can you view images on the device's own screen? (9) Are there ways to rotate, zoom in/out and scroll?

2/22/2009

169

Introduction to double buffering

BitBlt - > It Stands for Bit-Block Transfer. It means that a “block” of bits, describing a rectangle in an image, is copied in one operation. Usually the graphics card supports this command in hardware. There is a function in the Win32 API of this name, which also occurs in MFC, but the FCL does not provide this function to you directly. Nevertheless it is essentially packaged for use as the Graphics.DrawImage method

2/22/2009

170

Introduction to double buffering Memory DC -> DC means "device context". This is represented in the FCL as a Graphics object. So far, the Graphics objects we have used in our programs usually corresponded to the screen; and in one lecture, we used a Graphics object that corresponded to the printer. But it is possible to create a Graphics object that does not correspond to a physical device. Instead, it just has an area of RAM (called a buffer) that it writes to instead of writing to video RAM on the graphics card. When you (for example) draw a line or fill a rectangle in this Graphics object, nothing changes on the screen (even if you call Invalidate), since the memory area being changed by the graphics calls is not actually video RAM and has no connection with the monitor. This "virtual Graphics object" is loosely referred to as a memory DC. It is a "device" that exists only in memory; but usually its pixel format does correspond to a physical device such as the screen, so that when data is copied from this buffer to video memory, it is correctly formatted.

171

Double buffering

• We can use images as offscreen drawing surfaces by storing them as pictures.
• This allows us to render any image, including text and graphics, to an offscreen buffer that we can display at a later time.
• The advantage of doing this is that the image is seen only when it is complete.
• Drawing a complicated image could take several milliseconds or more, which can be seen by the user as flashing and flickering.
• This flashing is distracting and causes the user to perceive the rendering as slower than it actually is.
• Usage of an offscreen image to reduce flicker is called double buffering, because the screen is considered a buffer for pixels, and the offscreen image is the second buffer, where we can prepare pixels for display.

What is double buffering ? Double buffering - > “Double buffering” refers to the technique of writing into a memory DC and then BitBlt-ing the memory DC to the screen. This works as follows: your program can take its own sweet time writing to a memory DC, without producing any delay or flicker on the screen. When the picture is finally complete, the program can call BitBlt and bang! Suddenly (at the next vertical retrace interval) the entire contents of the memory DC’s buffer are copied to the appropriate part of video RAM, and at the next sweep of the electron gun, the picture appears on the screen. This technique is known as double buffering. The name is appropriate because there are two buffers involved: one on the graphics card (video RAM) and one that is not video RAM, and the second one is a “double” of the first in the sense that it has the same pixel format. 2/22/2009

173
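The slides describe double buffering in terms of the Win32/.NET Graphics objects; purely as an illustration of the same pattern, here is a minimal sketch in Python using the pygame package (an assumption, not the library discussed above): everything is drawn into an off-screen surface and then copied to the screen in one step.

```python
# Minimal double-buffering sketch (pygame assumed; not the Win32/MFC/FCL code above).
import pygame

pygame.init()
screen = pygame.display.set_mode((640, 480))       # the on-screen buffer
back_buffer = pygame.Surface(screen.get_size())    # the off-screen "memory DC"

running = True
while running:
    for event in pygame.event.get():
        if event.type == pygame.QUIT:
            running = False

    # Take as long as needed drawing into the off-screen buffer;
    # nothing is visible on the monitor during this phase.
    back_buffer.fill((0, 0, 0))
    pygame.draw.circle(back_buffer, (255, 0, 0), (320, 240), 100)

    # Copy the finished picture to the screen in one block transfer,
    # so the user only ever sees a complete frame (no flicker).
    screen.blit(back_buffer, (0, 0))
    pygame.display.flip()

pygame.quit()
```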

What is double buffering ? [Some books reserve this term for a special case, in which the graphics card has two buffers that are alternately used to refresh the monitor, eliminating the copying phase. But most books use the term double buffering for what we have described.] Whatever is stored in the memory DC will not be visible, unless and until it gets copied to the DC that corresponds to the screen. This is done with BitBlt, so that the display happens without flicker.

2/22/2009

174

Why use double buffering ? Double buffering can be used whenever the computations needed to draw the window are time-consuming. Of course, you could always use space to replace time, by storing the results of those computations. That is, in essence, what double-buffering does. The end result of the computations is an array of pixel information telling what colors to paint the pixels. That’s what the memory DC stores. This situation arises all the time in graphics programming. All three-dimensional graphics programs use double-buffering. MathXpert uses it for two-dimensional graphics. We will soon examine a computer graphics program to illustrate the technique. Another common use of double-buffering is to support animation. If you want to animate an image, you need to “set a timer” and then at regular time intervals, use BitBlt to update the screen to show the image in the next position. 2/22/2009 175

Why use double buffering? The BitBlt will take place during the vertical retrace interval. Meantime, between "ticks" of the timer, the next image is being computed and drawn into the memory DC. In the next lecture, this technique will be illustrated. When should the memory DC be created? Obviously it has to be created when the view is first created. Less obviously, it has to be created again when the view window is resized. But the memory DC is the same size as the screen DC, which is the same size as the client area of the view window. When this changes, you must change your memory DC accordingly. When you create a new memory DC, you must also destroy the old one, or else memory will soon be used up by all the old memory DCs (provided the evil user plays around resizing his window many times).

What is double buffering ? It turns out that every window receives a Resize event when it is being resized, and it also receives a Resize event soon after its original creation (when its size changes from zero by zero to the initial size). Therefore, we add a handler for the Resize event and put the code for creating the memory DC in the Resize message handler.

2/22/2009

177

Cathode Ray Tube

Adobe Acrobat 7.0 Document

2/22/2009

178

Liquid crystal display LCD (Liquid Crystal Display) panels are "transmissive" displays, meaning they aren't their own light source but instead rely on a separate light source and then let that light pass through the display itself to your eye. We can start to describe how an LCD panel works by starting with that light source. The light source is a very thin lamp called a "back light" that sits directly behind the LCD panel as shown in Figure 1. Figure 1

2/22/2009

179

LCD The light from the backlighting then passes through a polarizing filter (a filter

that aligns the light waves in a single direction). From there the now polarized light then passes through the actual LCD panel itself. The liquid crystal portion of the panel either allows the polarized light to pass through or blocks the light from passing through depending on how the liquid crystals are aligned at the time the light tries to pass through. See Figure 2. Figure 2

2/22/2009

180

LCD The liquid crystal portion of the panel is split up into tiny individual cells that

are each controlled by a tiny transistor to supply current. Three cells side by side each represent one "pixel" (individual picture element) of the image. An 800 x 600 resolution LCD panel would have 480,000 pixels and each pixel would have three cells for a total of 1,440,000 individual cells. Red, green and blue are the primary colors of light. All other colors are made up of a combination of the primary colors. An LCD panel uses these three colors to produce color which is why there are three cells per pixel — one cell each for red, green, and blue. Once the light is passed through the liquid crystal layer and the final polarizing filter it then passes through a color filter so that each cell will then represent one of the three primary colors of light. See Figure 3. Figure 3

2/22/2009

181

LCD The three cells per pixel then work in conjunction to produce color. For example, if a

pixel needs to be white, each transistor that controls the three color cells in the pixel would remain off, thus allowing red, green and blue to pass through. Your eye sees the combination of the three primary colors, so close in proximity to each other, as white light. If the pixel needed to be blue, for and area of an image that was going to be sky, the two transistors for the red and green cells would turn on, and the transistor for the blue cell would remain off, thus allowing only blue light to pass through in that pixel. Pros: 1.LCD displays are very thin. They can be mounted in places traditional CRT televisions and monitors cannot. 2.Color reproduction is excellent. 3.Contrast is good, although not great. 4.Pixel structure is very small, which creates a very smooth image. 5.Durable technology. 6.No burn-in issues. Cons: 1.Very expensive technology per square inch of viewing area. 2.Black levels and details in dark scenes are not as strong as those in competing technologies. 3.Dead pixels can be an issue, although quality has improved as the technology has matured. 2/22/2009 182 4.Sizes above 40" are cost prohibitive.

LCD Is an LCD Panel right for you? It depends on your needs. Below is a list of common scenarios where an LCD panel provides the best performance, followed by a list of scenarios that might suggest the need to use a different technology. Scenarios where an LCD flat panel will perform well: 1.

Any application that will require a screen of less than 42" diagonal.

2.

Installations that require the monitor/television to be built into a wall or cabinetry, and require a diagonal image of less than a 42".

3.

Pre-made entertainment centers and bedroom armoires.

4.

Any application that requires wall mounting and requires a diagonal image of less than 42".

Scenarios where another technology might be more effective: 1.

Any application that requires a large screen — larger than 40" diagonal. LCD displays get cost prohibitive for sizes above 40". If you opt to select an LCD panel of over 40", be prepared to pay.

2.

Applications where the best possible image quality is needed. A CRT is still going to give the best shadow detail and color.

3.

Tight budgets; CRT technology will be much less expensive per viewing area.

Printers

Inch

Type of measurement equal to 25.4 millimeters or 2.54 centimeters.

Measurement When referring to computers, a measurement is the process of determining a dimension, capacity, or quantity of an object or the duration of a task. By using measurements an individual can help ensure that an object is capable of fitting within an area, a storage medium is capable of storing the necessary files, a task will complete in the required time, or how fast a product is when compared to another product. Below is a listing of different types of computer measurements you may encounter while working with a computer or in the computer field.

Printers PPI

Short for Pixels Per Inch, PPI is the number of pixels per inch that a pixel image is made up of. The more pixels per inch the image contains, the higher quality the image will be. "Pixel" is a term that comes from the words Picture Element (PEL). A pixel is the smallest portion of an image or display that a computer is capable of printing or displaying. You can get a better understanding of what a pixel is when zooming into an image, as seen in the example to the right. As you can see in this example, the character image in this picture has been zoomed into at 1600%. Each of the blocks seen in this example is a single pixel of this image. Everything on the computer display looks similar to this when zoomed in upon; the same is true with printed images, which are created by several little dots that are measured in DPI.

Pixel image A type of computer graphic that is composed entirely of pixels. 2/22/2009

185

Printers There seems to be a lot of confusion about what PPI means (apart from the fact that it means Pixels Per Inch of course). This article is for beginners in computer graphics and digital photography. Dots Per Inch usually means the maximum dots a printer can print per inch. Roughly speaking the more DPI the higher quality the print will be. DPI is for printers, PPI is for printed images. But I don't think there is an official definition of the difference. PPI and DPI are sometimes but not always the same, I'll assume for simplicity in this article that they are. However, until you print an image the PPI number is meaningless. 2/22/2009

186

Printers

Until you print an image the PPI number is meaningless. Imagine, for simplicity's sake, that the image below, when printed on your printer is one inch (or 2.54 cm) square:

If you count the pixels (blocks, dots) you'll find that there are 10 across the width of an image. If this was printed at the size of 1 inch it would be a 10 PPI image. Here is a 50 PPI version of the same image:

2/22/2009

187

Printers You need not count the pixels; believe me, the square above is a 50 by 50 pixel image, and if printed so that it covered exactly 1 square inch it would be a 50 PPI image. Now let's look at a 150 PPI image:

PPI is simply how many pixels are printed per inch of paper. You may not be able to see the pixels (because your eyes or your printer is not of that high a quality). The above images are approximations but you get the idea. I don't care what the "image information" on your camera says, or what the PPI reading on your paint program says, only when you print can you really say what the PPI is. And the same image will have different PPI when printed at different sizes.
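As a small sketch of the arithmetic behind this point, the following assumes a made-up 3000-pixel-wide photo and a few hypothetical print widths; the PPI only becomes defined once the print size is chosen.

```python
# PPI depends on the chosen print size, not on the image alone.
# The pixel width and print widths below are made-up example values.
def ppi(pixels_across: int, print_inches: float) -> float:
    """Pixels per inch when `pixels_across` pixels span `print_inches` of paper."""
    return pixels_across / print_inches

image_width_px = 3000
for width_in in (4.0, 10.0, 20.0):
    print(f"printed {width_in:4.1f} in wide -> {ppi(image_width_px, width_in):6.1f} PPI")
# The same image comes out at 750, 300, or 150 PPI depending on the print width.
```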

Printers

DPI (dots per inch) is a measurement of printer resolution, though it is commonly applied, somewhat inappropriately, to monitors, scanners and even digital cameras. For printers, the DPI specification indicates the number of dots per inch that the printer is capable of achieving to form text or graphics on the printed page. The higher the DPI, the more refined the text or image will appear. To save ink, a low DPI is often used for draft copies or routine paperwork. This setting might be 300 or even 150 DPI. High resolution starts at 600 DPI for standard printers, and can far exceed that for color printers designed for turning out digital photography or other high-resolution images. 2/22/2009

189

Printers

In the case of monitors, DPI refers to the number of pixels present per inch of display screen. The technically correct term is "PPI" or pixels per inch, but DPI is commonly used instead. A display setting of 1280 x 1024 has about 1.3 million pixels, while a setting of 800 x 600 has 480,000, or less than half the pixels of the higher setting. With fewer dots per inch, the picture will not have the clarity that can be achieved with a higher DPI saturation. This is because displays create images by using pixels. Each dot or pixel reflects a certain color and brightness. The greater the DPI, the more detailed the picture can be. Higher DPI also requires more memory and can take longer to 'paint' images, depending on the system's video card, processor and other components.

190

Printers Scanners also operate at different resolutions. Scan time will increase with higher DPI settings, as the scanner must collect and store more data. However, the greater the DPI, or requested resolution, the richer the resulting image. A high DPI setting mimics the original image in a truer fashion than lower DPI settings are capable of doing. If the image is to be enlarged, a high DPI setting is necessary. Otherwise the enlarged picture will look "blocky" or blurry because the software lacks information to fill in the extra space when the image is enlarged. Instead it "blows up" each pixel to "smear" it over a wider area. Technically again, the more correct term in this application is sampled PPI, but DPI is more often used. 2/22/2009

191

Printers Digital cameras have their own specifications in terms of megapixels and resolution, but DPI is often mentioned in this context as well. Since DPI in all cases refers to the output image, a digital camera capable of the most basic current standards of resolution —- 3.0 megapixels and better —- will output an image capable of taking advantage of a very high DPI setting on the printer. However, if your printer is only capable of 600 DPI, the extra resolution of the camera will be lost in the printing process. When buying or upgrading components it is therefore critical that each product is capable of supporting the highest standards of any interfacing product. 2/22/2009

192

Printers Print quality The quality of the hard copy produced by a computer printer. Below is a listing of some of the more common reasons why the print quality may differ. 1. Type of printer - Each type of printer has its own capabilities of printing. With standard printers, dot matrix is commonly the lowest quality printer, ink jet printers are commonly average quality, and laser printers are commonly the best quality. 2.

Low DPI - Printer has a low DPI.

3. Print mode - The mode that the hard copy was produced may also affect the overall quality of the print. For example, if the mode was draft quality, the printer will print faster, but will be a lower quality. 4. Available toner or ink - If the printer is low on toner or ink the quality can be dramatically decreased. 5. Dirty or malfunctioning printer - If the printer is dirty or is malfunctioning this can also affect the quality of the print. 6. Image quality - It is important to realize that when printing a computer graphic, the quality may not be what you expect because of any of the below reasons. •Printer does not have enough colors to produce the colors in the image. For example, some printers may only have four available inks where others may have six or more available inks. See process color. •The image is a low quality or low resolution image. 2/22/2009 •Image is too small and/or has too many colors in a small area.

193

Printers Most people have used printers at some stage for printing documents but few are aware of how they work. Printed documents are arguably the best way to save data. There are two basic types of printers: Impact and Non-impact. Impact printers, as the very name implies, are those whose printing mechanism touches the paper to create an image. Impact printers were used in the early 70s and 80s. In Dot Matrix printers a series of small pins is used to strike a ribbon coated with ink to transfer the image to the paper. Other Impact Printers like Character printers are basically computerized typewriters. They have a series of bars or a ball with actual characters on them, which strike the ink ribbon to transfer the characters to the paper. At a time only one character can be printed. Daisy Wheel printers use a plastic or metal wheel.

Printers

These types of printers have limited usage though, because they are limited to printing only characters or one type of font and not graphics. There are Line printers where a chain of characters or pins prints an entire line, which makes them pretty fast, but the print quality is not so good. Thermal printers are the printers used in calculators and fax machines. They are inexpensive to use. Thermal printers work by pushing heated pins against special heat sensitive paper. More efficient and advanced printers have come out now which use new Non-impact technology. Non-impact printers are those where the printing mechanism does not come into contact with the paper at all. This makes them quieter in operation in comparison to impact printers.

Printers In mid 1980s Inkjet printers were introduced. These have been the most widely used and popular printers so far. Colour printing got revolutionized after inkjet printers were invented. An Inkjet printer's head has tiny nozzles, which place extremely tiny droplets of ink on the paper to create an image. These dots are so small that even the diameter of human hair is bigger. These dots are placed precisely and can be up to the resolution of 1440 x 720 per inch. Different combinations of ink cartridges can be used for these printers.

2/22/2009

196

Printers How an Inkjet printer works The print head in this printer scans the page horizontally back and forth and another motor assembly rolls the paper vertically in strips and thus a strip is printed at a time. Only half a second is taken to print a strip. Inkjet printers were very popular because of their ability to colour print. Most inkjets use Thermal Technology. Plain copier paper can be used in these printers unlike thermal paper used for fax machines. Heat is used to fire ink onto the paper through the print head. Some print heads can have up to 300 nozzles. Heat resistant and water based ink is used for these printers.

2/22/2009

197

Printers The latest and fastest printers are Laser Printers. They use the principle of static electricity for printing, as in photocopiers. The principle of static electricity is that it can be built up on an insulated object. Oppositely charged atoms of objects (positive and negative) are attracted to each other and cling together. For example, pieces of nylon material clinging to your body, or the static you get after brushing hair. A laser printer uses this same principle to glue ink onto the paper.

2/22/2009

198

Printers How Laser Printer works Unlike the printers before, Laser printers use toner, static electricity and heat to create an image on the paper. Toner is dry ink. It contains colour and plastic particles. The toner passes through the fuser in the computer and the resulting heat binds it to any type of paper. Printing with laser printers is fast and non-smudge and the quality is excellent because of the high resolution that it can achieve with 300 dots per inch to almost 1200 dpi at the higher end.

2/22/2009

199

Printers Basic components of a laser printer are fuser, photoreceptor drum assembly, developer roller, laser scanning unit, toner hopper, corona wire and a discharge lamp. The laser beam creates an image on the drum and wherever it hits, it changes the electrical charge like positive or negative. The drum then is rolled on the toner. Toner is picked up by charged portion of the drum and gets transferred to the paper after passing through the fuser. Fuser heats up the paper to amalgamate ink and plastic in toner to create an image. Laser printers are called "page printers" because entire page is transferred to the drum before printing. Any type of paper can be used in these printers. Laser printers popularized DTP or Desk Top Publishing for it can print any number of fonts and any graphics.. 2/22/2009

200

Printers This is how the computer and printer operate to print When we want to print something we simply press the command "Print". This information is sent to either RAM of the printer or the RAM of the computer depending upon the type of printer we have. The process of printing then starts. While the printing is going on, our computer can still perform a variety of operations. Jobs are put in a buffer or a special area in RAM or Random Access Memory and the printer pulls them off at its own pace. We can also line up our printing jobs this way. This way of simultaneously performing functions is called spooling. Our computer and the printer are thus in constant communication.

2/22/2009

201

Printing Images In image processing, there are overlapping terms that tend to get interchanged. Especially for image and print resolution: dpi (dots per inch), ppi (pixel or points per inch), lpi (lines per inch). In addition to this, the resolution of an image is stated by its dimensions in pixels or in inches (at a certain ppi or dpi resolution). Yes, we can understand if your head is swimming. Let’s understand this: When an image is captured using either a camera or a scanner, the result is a digital image consisting of rows – known as arrays – of different picture elements that are called pixels. This array has a horizontal and vertical dimension. The horizontal size of the array is defined by the number of pixels in one single row (say 1,280) and the number of rows (say 1,024), giving the image a horizontal orientation. That picture would have a “resolution” of “1,024 x 1,280 pixels”. 2/22/2009

202

Printing images The size of the image displayed is dependent on the number of pixels the monitor displays per inch. The "pixels per inch" resolutions (ppi) of monitors vary, and are usually in the range of 72 ppi to 120 ppi (the latter on larger 21.4" monitors). In most cases, however, with monitors the resolution is given as the number of pixels horizontally and vertically (e.g. 1,024 x 1,280 or 1,280 x 1,600). So the "size" of an image very much depends on how many pixels are displayed per inch. Thus, we come to a resolution given in 'pixels per inch', or ppi for short. With LCD monitors, the ppi resolution is fixed and can't be adjusted (at least not without a loss of display quality). With CRT monitors you have more flexibility (we won't go into this further). When an image is printed, its physical size depends upon how many image pixels we put down on paper, but also on how an individual image pixel is laid down on the paper.

How image pixels produced by printer dots? There are only a few printing technologies where a printer can directly produce a continuous color range within an individual image pixel printed. Most other types of printers reproduce the color of a pixel in an image by approximating the color with an n x n matrix of fine dots using a specific pattern and a certain combination of the basic colors available to the printer. If we want to reproduce a pixel of an image on paper, we not only have to place a physical printer's 'dot' on paper, but also have to give that 'dot' the tonal value of the original pixel. With bitonal images, that is easy. If the pixel value is 0, you lay down a black printed dot, and if the pixel is 1, you omit the dot. However, if the pixel has a gray value (say 128 out of 256), and you print with a black-and-white laser printer (just to make the explanation a bit simpler), we must find a different way. This technique is called rasterization or dithering. To simulate different tonal values (let's just stick to black-and-white for the moment), a number of printed dots are placed in a certain pattern on the paper to reproduce a single pixel of the image. In a low-resolution solution, we could use a matrix of 3 printed dots by 3 printed dots per pixel.

204

How image pixels produced by printer dots? Using more printed dots per image pixel allows for more different tonal values. With a pattern of 6 x 6 dots, you get 37 tonal grades (which is sufficient). For a better differentiation let's call the matrix of printer dots representing a pixel of the image a raster cell. Now we see why a printer's "dots per inch" (dpi) resolution has to be much higher than the resolution of a display (where a single dot on a screen may be used to reproduce a single pixel of an image), as the individual screen dot (also called a pixel) may have different tonal (or brightness) values. When you print with a device using relatively low resolution for grayscale or colored images, you must make a trade-off between a high resolution image (having as many "raster cells per inch" as possible) and larger raster cells providing greater tonal range per cell.

205

How image pixels produced by printer dots? The image impression may be improved when the printer is able to vary the size of its dots. This is done on some laser printers, as well as with some of today’s photo inkjet printers. If the dot size can be varied (also called modulated), fewer numbers of dots (n x n) are needed to create a certain number of different tonal values, (which results in a finer raster). You may achieve more tonal values from a fixed raster cell size. There are several different ways (patterns) to place single printed dots in a raster cell, and the pattern for this dithering is partly a secret of the printer driver. The dithering dot pattern is less visible and more photo-like, when the pattern is not the same for all raster cells having the same tonal values, but is modified from raster cell to raster cell in some random way. 2/22/2009

206

Linear Systems

2/22/2009

207

Linear Space Invariant System

2/22/2009

208

Linear Space Invariant System

2/22/2009

209

This Property holds

2/22/2009

210

Convolution in 1 Dimension

Let's look at some examples of convolution integrals:

f(x) = g(x) ⊗ h(x) = ∫ g(x′) h(x − x′) dx′   (the integral taken from −∞ to ∞)

So there are four steps in calculating a convolution integral:
#1. Fold h(x′) about the line x′ = 0
#2. Displace h(x′) by x
#3. Multiply h(x − x′) · g(x′)
#4. Integrate

211
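A small numerical sketch of these four steps for the discrete case follows; the signal and kernel values are the ones used on the next slides, and numpy is assumed.

```python
# Discrete 1-D convolution written out as fold / shift / multiply / sum,
# checked against numpy.convolve.
import numpy as np

f = np.array([1, 1, 2, 2, 1, 1, 2, 2, 1, 1], dtype=float)   # input row
h = np.array([1, 2, 1], dtype=float)                         # kernel

def convolve_1d(f, h):
    n = len(h) // 2
    g = np.zeros_like(f)
    for x in range(n, len(f) - n):
        # fold h, displace it by x, multiply with f, and sum
        g[x] = sum(h[k + n] * f[x - k] for k in range(-n, n + 1))
    return g

print(convolve_1d(f, h))               # interior values: 5 7 7 5 5 7 7 5
print(np.convolve(f, h, mode="same"))  # interior values match; the ends differ (zero padding)
```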

Math of Convolution

g(x) = (h ∗ f)(x) = Σ h(n) f(x − n),  the sum running over n = −N … N

Example kernel 1 2 1, i.e. h(−1) = 1, h(0) = 2, h(1) = 1

212

Convolution (1D)

Filter coefficients (mask, kernel, template, window): 1 2 1
Input Signal/Image-row: 1 1 2 2 1 1 2 2 1 1
Output Signal/Image-row (Filter Response): 5/4, 7/4, … (each sum of products divided by the kernel sum of 4)

213

Math of 2D Convolution/Correlation

Convolution:  g(x, y) = (h ∗ f)(x, y) = Σm Σn h(m, n) f(x − m, y − n)

Correlation:  g(x, y) = (h ∘ f)(x, y) = Σm Σn h(m, n) f(x + m, y + n)

(both double sums running over m = −M … M and n = −N … N)

214
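As an illustrative sketch (using scipy.signal, an assumption), the following compares the two operations on a small made-up image and shows that convolution is correlation with the mask rotated by 180 degrees.

```python
# 2-D correlation vs. convolution on a toy image with an asymmetric mask.
import numpy as np
from scipy.signal import correlate2d, convolve2d

f = np.array([[1, 2, 0, 1],
              [2, 1, 4, 2],
              [1, 0, 1, 0],
              [3, 2, 1, 2]], dtype=float)      # made-up 4x4 image
h = np.array([[0, 1, 0],
              [0, 2, 0],
              [0, 0, 1]], dtype=float)         # asymmetric 3x3 mask

corr = correlate2d(f, h, mode="valid")
conv = convolve2d(f, h, mode="valid")
conv_via_corr = correlate2d(f, np.rot90(h, 2), mode="valid")  # rotate mask by 180 degrees

print(corr)
print(conv)
print(np.allclose(conv, conv_via_corr))   # True: convolution = correlation with rotated mask
```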

Correlation (1D)

This process is called Correlation!!
Filter: 1 2 1
Input Signal/Image-row: 1 1 2 2 1 1 2 2 1 1
Output: 5/4 7/4 7/4 5/4 5/4 7/4 7/4 5/4  (sums of products divided by the kernel sum of 4)

Correlation Vs Convolution

Correlation:  g(x) = (h ∘ f)(x) = Σ h(n) f(x + n)
Convolution:  g(x) = (h ∗ f)(x) = Σ h(n) f(x − n)
Kernel: 1 2 1    Input: 1 1 2 2 1 1 2 2 1 1

In image processing we use CORRELATION but (nearly) always call it CONVOLUTION!!!!!
Note: When the filter is symmetric: correlation = convolution!

216

Correlation on images

Correlation is the process of moving a filter mask over the image and computing the sum of products at each location. In convolution the filter is first rotated by 180 degrees.

Example input (5 x 5 image):
1 2 0 1 3
2 1 4 2 2
1 0 1 0 1
1 2 1 0 2
2 5 3 1 2

Filter mask (averaging), (1/9) ×
1 1 1
1 1 1
1 1 1

Sliding the mask over the image gives outputs such as 12/9 at the first position and 11/9 at the next.

218

Applications of Convolution/correlation – Blurring – Edge detection –Template matching

2/22/2009

219

Blurring (smoothing)

• Also known as: smoothing kernel, mean filter, low pass filter
• The simplest filter, the spatial low pass (mean) filter: (1/3) [1 1 1] in 1-D, or in 2-D
  (1/9) ×
  1 1 1
  1 1 1
  1 1 1
• Another mask, the Gaussian filter: (1/4) [1 2 1] in 1-D, or in 2-D
  (1/16) ×
  1 2 1
  2 4 2
  1 2 1

220
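A short sketch applying both smoothing masks with scipy (an assumption); the input image is a random stand-in, and the larger mean kernel illustrates that the degree of blurring grows with kernel size.

```python
# Mean and Gaussian-style smoothing masks applied by 2-D convolution.
import numpy as np
from scipy.signal import convolve2d

mean_kernel = np.ones((3, 3)) / 9.0
gauss_kernel = np.array([[1, 2, 1],
                         [2, 4, 2],
                         [1, 2, 1]]) / 16.0

image = np.random.default_rng(0).random((64, 64))   # stand-in for a real grayscale image

blur_mean = convolve2d(image, mean_kernel, mode="same", boundary="symm")
blur_gauss = convolve2d(image, gauss_kernel, mode="same", boundary="symm")
blur_big = convolve2d(image, np.ones((9, 9)) / 81.0, mode="same", boundary="symm")  # stronger blur

print(blur_mean.shape, blur_gauss.shape, blur_big.shape)
```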

Applications of smoothing

„

„

Blurring to remove identity or other details Degree of blurring = kernel size

Show: camera, mean, convolution 2/22/2009

221

Applications of smoothing „ „

Preprocessing: enhance objects Smooth + Thresholding

2/22/2009

222

Uneven illumination „ „

Improve segmentation Uneven illumination – Within an image – Between images

„

Solution – "Remove background" – Algorithm: g(x,y) = f(x,y) − mean{f}(x,y), where mean{f} is the local mean image – Use a big kernel for the mean, e.g., 10-50 pixels (IJ: mean=50, sub, TH); a code sketch follows below

223

Uneven illumination Input f(x,y)

2/22/2009

Mean f(x,y)

f(x,y) – f(x,y)

Edges

224
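The following is a minimal sketch of this background-removal idea, assuming scipy.ndimage and a synthetic unevenly lit test image (both assumptions, not the ImageJ workflow named above).

```python
# Remove slowly varying illumination: subtract a large-kernel local mean, then threshold.
import numpy as np
from scipy.ndimage import uniform_filter

rng = np.random.default_rng(1)
objects = (rng.random((200, 200)) > 0.995).astype(float)            # a few bright spots
illumination = np.linspace(0.2, 0.8, 200)[None, :] * np.ones((200, 1))
f = objects + illumination                                          # unevenly lit image

background = uniform_filter(f, size=51)     # local mean with a big kernel
g = f - background                          # background-subtracted image
segmented = g > 0.1                         # a single global threshold now works

print(int(segmented.sum()), "pixels above threshold")
```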

Application of smoothing „

Remove noise

2/22/2009

225

Correlation application: Template Matching

2/22/2009

226

Template Matching „

The filter is called a template or a mask

Input image

Output

Output as 3D

Template

„ „

The brighter the value in the output, the better the match Implemented as the correlation coefficient 2/22/2009

227

Template Matching Output

2/22/2009

228
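As an illustrative sketch of template matching by the correlation coefficient mentioned above (pure numpy, with a synthetic image and a template cut out of it), the brightest score marks the best match.

```python
# Template matching: normalized cross-correlation score at every position.
import numpy as np

def match_template(image, template):
    th, tw = template.shape
    t = template - template.mean()
    scores = np.zeros((image.shape[0] - th + 1, image.shape[1] - tw + 1))
    for y in range(scores.shape[0]):
        for x in range(scores.shape[1]):
            patch = image[y:y + th, x:x + tw]
            p = patch - patch.mean()
            denom = np.sqrt((p * p).sum() * (t * t).sum()) + 1e-12
            scores[y, x] = (p * t).sum() / denom      # correlation coefficient in [-1, 1]
    return scores

rng = np.random.default_rng(2)
image = rng.random((60, 60))
template = image[20:28, 35:43].copy()                 # template taken from the image itself

scores = match_template(image, template)
print(np.unravel_index(scores.argmax(), scores.shape))   # (20, 35): the true location
```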

Correlation application: Edge detection

2/22/2009

229

Edge detection Edge detection

2/22/2009

230

Edge detection

gx(x, y) ≈ f(x + 1, y) − f(x − 1, y)
gy(x, y) ≈ f(x, y + 1) − f(x, y − 1)

231

Edge detection g x ( x, y ) ≈ f ( x + 1, y ) − f ( x − 1, y ) g y ( x, y ) ≈ f ( x, y + 1) − f ( x, y − 1)

2/22/2009

232
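A brief sketch of these central-difference gradients using scipy.ndimage (an assumption); the gradient magnitude responds at the step edge of a synthetic image.

```python
# Simple gradient-based edge detection with [-1 0 1] difference masks.
import numpy as np
from scipy.ndimage import correlate

f = np.zeros((64, 64))
f[:, 32:] = 1.0                              # a vertical step edge

kx = np.array([[-1, 0, 1]])                  # horizontal difference: f(x+1, y) - f(x-1, y)
ky = kx.T                                    # vertical difference:   f(x, y+1) - f(x, y-1)

gx = correlate(f, kx, mode="nearest")
gy = correlate(f, ky, mode="nearest")
magnitude = np.hypot(gx, gy)

print(np.argwhere(magnitude > 0.5)[:3])      # responses cluster around the edge at column 32
```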

Properties of convolution commutative:

associative:

f ⊗g=g⊗ f

f ⊗ (g ⊗ h) = ( f ⊗ g) ⊗ h

multiple convolutions can be carried out in any order. distributive: 2/22/2009

f ⊗ (g + h) = f ⊗ g + f ⊗ h

233

Convolution Theorem ℑ{ f ⊗ g} = F(k)⋅ G(k)

In other words, convolution in real space is equivalent to multiplication in Frequency space.

2/22/2009

234
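A quick numerical check of this theorem for discrete signals is sketched below (numpy assumed): circular convolution in the signal domain equals pointwise multiplication of the DFTs.

```python
# Convolution theorem, discrete version: DFT(f * g) = DFT(f) . DFT(g)
import numpy as np

rng = np.random.default_rng(4)
f = rng.random(256)
g = rng.random(256)

# Circular convolution computed directly from its definition
direct = np.array([np.sum(f * np.roll(g[::-1], k + 1)) for k in range(256)])

# The same result via the frequency domain
via_fft = np.real(np.fft.ifft(np.fft.fft(f) * np.fft.fft(g)))

print(np.allclose(direct, via_fft))   # True
```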

Proof of convolution Theorem

So we can rewrite the convolution integral,

f ⊗ g = ∫ f(x) g(x′ − x) dx   (from −∞ to ∞)

as

f ⊗ g = (1/4π²) ∫ dx ∫ F(k) exp[ikx] dk ∫ G(k′) exp[ik′(x′ − x)] dk′

Change the order of integration and extract a delta function:

f ⊗ g = (1/2π) ∫ dk F(k) (1/2π) ∫ dk′ G(k′) exp[ik′x′] ∫ exp[ix(k − k′)] dx,  where the last integral equals 2π δ(k − k′)

235

Proof of convolution theorem

f ⊗ g = (1/2π) ∫ dk F(k) (1/2π) ∫ dk′ G(k′) exp[ik′x′] ∫ exp[ix(k − k′)] dx,  the last factor being 2π δ(k − k′)

or

f ⊗ g = (1/2π) ∫ dk F(k) ∫ dk′ G(k′) exp[ik′x′] δ(k − k′)

Integration over the delta function selects out the k′ = k value:

f ⊗ g = (1/2π) ∫ dk F(k) G(k) exp[ikx′]

236

Proof of convolution theorem

f ⊗ g = (1/2π) ∫ dk F(k) G(k) exp[ikx′]

This is written as an inverse Fourier transformation. A Fourier transform of both sides yields the desired result:

ℑ{f ⊗ g} = F(k) ⋅ G(k)

237

Convolution in 2-D For such a system the output h(x,y) is the convolution of f(x,y) with the impulse response g(x,y)

2/22/2009

238

Convolution in 2-D

2/22/2009

239

Example of 3x3 convolution mask

2/22/2009

240

In Plain Words Convolution is essentially equivalent to computing a weighted sum of image pixels where filter is rotated 180 degree.

Convolution is Linear operation

2/22/2009

241

Why Mathematical transformations? „

Why

– To obtain a further information from the signal that is not readily available in the raw signal.

„

Raw Signal

– Normally the time-domain signal

„

Processed Signal

– A signal that has been "transformed" by any of the available mathematical transformations

„

Fourier Transformation

– The most popular transformation

2/22/2009

242

What is a Transform and Why do we need one ? „ „

Transform: A mathematical operation that takes a function or sequence and maps it into another one Transforms are good things because… – The transform of a function may give additional /hidden information about the original function, which may not be available /obvious otherwise – The transform of an equation may be easier to solve than the original equation (recall your fond memories of Laplace transforms in DFQs) – The transform of a function/sequence may require less storage, hence provide data compression / reduction – An operation may be easier to apply on the transformed function, rather than the original function (recall other 243 2/22/2009 fond memories on convolution).

Why transform ?

2/22/2009

244

Introduction to Fourier Transform

• f(x): continuous function of a real variable x
• Fourier transform of f(x):

ℑ{f(x)} = F(u) = ∫ f(x) exp[−j2πux] dx   (from −∞ to ∞)        (Eq. 1)

where j = √(−1)

Introduction to Fourier transform „ „

„

(u) is the frequency variable. The integral of Eq. 1 shows that F(u) is composed of an infinite sum of sine and cosine terms and… Each value of u determines the frequency of its corresponding sine-cosine pair.

2/22/2009

246

Introduction to Fourier transform

• Given F(u), f(x) can be obtained by the inverse Fourier transform:

ℑ⁻¹{F(u)} = f(x) = ∫ F(u) exp[j2πux] du   (from −∞ to ∞)

• The above two equations are the Fourier transform pair.

247

Introduction to Fourier transform

exp[jθ] = cos θ + j sin θ,   cos(−θ) = cos(θ)

F(u) = (1/M) Σ f(x) [cos(2πux/M) − j sin(2πux/M)],  the sum running over x = 0 … M−1

2/22/2009

Each term of the FT (F(u) for every u) is composed of the sum of all values of f(x) 248

Introduction to Fourier transform

• The Fourier transform of a real function is generally complex and we use polar coordinates:

F(u) = R(u) + jI(u)
F(u) = |F(u)| exp[jφ(u)]
|F(u)| = [R²(u) + I²(u)]^(1/2)
φ(u) = tan⁻¹[ I(u) / R(u) ]

249

Introduction to Fourier transform

• |F(u)| (magnitude function) is the Fourier spectrum of f(x) and φ(u) its phase angle.
• The square of the spectrum

P(u) = |F(u)|² = R²(u) + I²(u)

is referred to as the power spectrum of f(x) (spectral density).

250

Introduction to Fourier transform

• Fourier spectrum:  |F(u,v)| = [R²(u,v) + I²(u,v)]^(1/2)
• Phase:  φ(u,v) = tan⁻¹[ I(u,v) / R(u,v) ]
• Power spectrum:  P(u,v) = |F(u,v)|² = R²(u,v) + I²(u,v)

251

Spatial Frequency decomposition (0.25 µm myelin)

• Any image can be decomposed into a series of sines and cosines added together to give the image:

I(x) = Σ [ aᵢ cos(kᵢ x) + i bᵢ sin(kᵢ x) ]

[Figure: the myelin image, the amplitudes and phases of its components plotted against pixel position, and its Fourier transform]

FT

Fourier Transform of the Myelin Image

Low frequency

High frequency 2/22/2009

253

FT is reversible: applying the inverse transform ℑ⁻¹ to the Fourier transform of the myelin image gives back the original image.

254

2-D Image Transform (General Transform)

F(u,v) = Σx Σy T(x, y, u, v) f(x, y),   the sums running over x, y = 0 … N−1

f(x,y) = Σu Σv I(x, y, u, v) F(u, v),   the sums running over u, v = 0 … N−1

255

Discrete Fourier Transform

2/22/2009

256

Discrete Fourier Transform „

A continuous function f(x) is discretized into a sequence:

{f (x0), f (x0 +Δx), f (x0 +2Δx),...,f (x0 +[N −1]Δx)} by taking N or M samples Δx units apart.

2/22/2009

257

Discrete Fourier Transform „

Where x assumes the discrete values (0,1,2,3,…,M-1) then

f ( x) = f ( x0 + xΔx) • The sequence {f(0),f(1),f(2),…f(M-1)} denotes any M uniformly spaced samples from a corresponding continuous function. 2/22/2009

258

Discrete Fourier Transform

• The discrete Fourier transform pair that applies to sampled functions is given by:

F(u) = (1/M) Σ f(x) exp[−j2πux/M],  summed over x = 0 … M−1,  for u = 0, 1, 2, …, M−1

and

f(x) = Σ F(u) exp[j2πux/M],  summed over u = 0 … M−1,  for x = 0, 1, 2, …, M−1

2/22/2009

259

Discrete Fourier Transform

• To compute F(u) we substitute u = 0 in the exponential term and sum for all values of x
• We repeat for all M values of u
• It takes M × M summations and multiplications:

F(u) = (1/M) Σ f(x) exp[−j2πux/M],  for u = 0, 1, 2, …, M−1

• The Fourier transform and its inverse always exist!

2/22/2009

260
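A small sketch of this O(M²) computation in numpy follows, compared against the library FFT; note the 1/M factor is placed on the forward transform here to match the slide's convention, whereas numpy's fft leaves it off.

```python
# Direct O(M^2) DFT versus numpy's FFT.
import numpy as np

def dft_direct(f):
    M = len(f)
    x = np.arange(M)
    F = np.zeros(M, dtype=complex)
    for u in range(M):                                        # one pass per value of u ...
        F[u] = np.sum(f * np.exp(-2j * np.pi * u * x / M))    # ... each a sum over M samples
    return F / M                                              # slide's 1/M normalization

f = np.cos(2 * np.pi * 5 * np.arange(64) / 64)                # 5 cycles across 64 samples
print(np.allclose(dft_direct(f), np.fft.fft(f) / 64))         # True
print(int(np.argmax(np.abs(dft_direct(f)[:32]))))             # 5, the frequency present
```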

Discrete Fourier Transform

• The values u = 0, 1, 2, …, M−1 correspond to samples of the continuous transform at values 0, Δu, 2Δu, …, (M−1)Δu; i.e. F(u) represents F(uΔu), where Δu = 1/(MΔx).

2/22/2009

261

Discrete Fourier Transform

• In the 2-variable case, the discrete FT pair is:

F(u,v) = (1/MN) Σx Σy f(x,y) exp[−j2π(ux/M + vy/N)],  summed over x = 0 … M−1 and y = 0 … N−1,  for u = 0 … M−1 and v = 0 … N−1

and

f(x,y) = Σu Σv F(u,v) exp[j2π(ux/M + vy/N)],  summed over u = 0 … M−1 and v = 0 … N−1,  for x = 0 … M−1 and y = 0 … N−1

262

Discrete Fourier Transform

• Sampling of a continuous function is now on a 2-D grid (Δx, Δy divisions).
• The discrete function f(x,y) represents samples of the function f(x0 + xΔx, y0 + yΔy) for x = 0, 1, 2, …, M−1 and y = 0, 1, 2, …, N−1.

Δu = 1/(MΔx),   Δv = 1/(NΔy)

Discrete Fourier Transform

• When images are sampled in a square array, M = N and the FT pair becomes:

F(u,v) = (1/N) Σx Σy f(x,y) exp[−j2π(ux + vy)/N],  summed over x, y = 0 … N−1,  for u, v = 0, 1, 2, …, N−1

and

f(x,y) = (1/N) Σu Σv F(u,v) exp[j2π(ux + vy)/N],  summed over u, v = 0 … N−1,  for x, y = 0, 1, 2, …, N−1

264
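The 2-D transform is computed in practice with the FFT; the sketch below (numpy assumed) transforms a synthetic striped image and divides by N to match the 1/N convention above.

```python
# 2-D DFT of a striped test image with numpy's fft2.
import numpy as np

N = 64
x = np.arange(N)
image = np.cos(2 * np.pi * 4 * x / N)[None, :] * np.ones((N, 1))   # 4 cycles along each row

F = np.fft.fft2(image) / N              # match the slide's 1/N normalization
magnitude = np.abs(F)

# The spectrum peaks at (0, 4) and (0, N-4): the 4-cycle variation along the rows.
print(np.argwhere(magnitude > 0.1 * magnitude.max()))
```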

Properties of 2-D Fourier transform Translation Distributivity and Scaling Rotation Periodicity and Conjugate Symmetry Separability Convolution and Correlation 2/22/2009

265

Translation

f (x,y)exp[j2π (u0 x /M + v0 y /N)]⇔F(u − u0,v −v0 ) and

f (x − x0,y − y0 ) ⇔F(u,v)exp[−j2π (ux0 /M + vy0 /N)]

2/22/2009

266

Translation „

The previous equations mean: – Multiplying f(x,y) by the indicated exponential term and taking the transform of the product results in a shift of the origin of the frequency plane to the point (u0,v0). – Multiplying F(u,v) by the exponential term shown and taking the inverse transform moves the origin of the spatial plane to (x0,y0). – A shift in f(x,y) doesn’t affect the magnitude of its 2/22/2009 267 Fourier transform

Distributivity & Scaling ℑ{ f1 ( x, y ) + f 2 ( x, y )} = ℑ{ f1 ( x, y )} + ℑ{ f 2 ( x, y )} ℑ{ f1 ( x, y ) ⋅ f 2 ( x, y )} ≠ ℑ{ f1 ( x, y )} ⋅ ℑ{ f 2 ( x, y )}

„

Distributive over addition but not over multiplication.

2/22/2009

268

Distributivity and Scaling

„

For two scalars a and b, af ( x, y ) ⇔ aF (u, v)

1 f (ax, by ) ⇔ F (u / a, v / b) ab

2/22/2009

269

Rotation „

Polar coordinates: x = r cosθ ,

y = r sin θ ,

u = ω cos ϕ ,

v = ω sin ϕ

Which means that: f ( x, y ), F (u , v) become f (r ,θ ), F (ω , ϕ ) 2/22/2009

270

Rotation f (r ,θ + θ 0 ) ⇔ F (ω , ϕ + θ 0 ) „

Which means that rotating f(x,y) by an angle θ0 rotates F(u,v) by the same angle (and vice versa).

2/22/2009

271

Periodicity & Conjugate Symmetry „

The discrete FT and its inverse are periodic with period N:

F(u,v)=F(u+M,v)=F(u,v+N)=F(u+M,v+N)

2/22/2009

272

Periodicity & conjugate symmetry „

„

Although F(u,v) repeats itself for infinitely many values of u and v, only the M,N values of each variable in any one period are required to obtain f(x,y) from F(u,v). This means that only one period of the transform is necessary to specify F(u,v) completely in the frequency domain (and similarly f(x,y) in the spatial domain).

2/22/2009

273

Periodicity & Conjugate Symmetry

(shifted spectrum): multiplying f(x) by (−1)^x before transforming moves the origin of the transform to u = N/2.

2/22/2009

274
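A short numerical check of this centering trick in 2-D (numpy assumed): multiplying by (−1)^(x+y) before the transform gives the same result as shifting the spectrum afterwards with fftshift.

```python
# Centering the spectrum: (-1)^(x+y) pre-multiplication equals np.fft.fftshift.
import numpy as np

N = 8
x, y = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
f = np.random.default_rng(5).random((N, N))

centered_by_shift = np.fft.fftshift(np.fft.fft2(f))
centered_by_sign = np.fft.fft2(f * (-1.0) ** (x + y))

print(np.allclose(centered_by_shift, centered_by_sign))   # True
```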

Periodicity & Conjugate Symmetry

• For real f(x,y), the FT also exhibits conjugate symmetry:

F(u,v) = F*(−u,−v),   |F(u,v)| = |F(−u,−v)|

or, in one dimension,

F(u) = F(u + N),   |F(u)| = |F(−u)|

• i.e. F(u) has a period of length N and the magnitude of the transform is centered on the origin.

275

Separability

• The discrete FT pair can be expressed in separable forms which (after some manipulations) can be written as:

F(u,v) = (1/M) Σx F(x,v) exp[−j2πux/M],  summed over x = 0 … M−1

where

F(x,v) = (1/N) Σy f(x,y) exp[−j2πvy/N],  summed over y = 0 … N−1

Separability in Specific forms

• Separable:  T(x, y, u, v) = T1(x, u) T2(y, v)
• Symmetric:  T(x, y, u, v) = T1(x, u) T1(y, v)

Separability „

„

For each value of x, the expression inside the brackets is a 1-D transform, with frequency values v=0,1,…,N-1. Thus, the 2-D function F(x,v) is obtained by taking a transform along each row of f(x,y) and multiplying the result by N.

2/22/2009

278

Separability „

The desired result F(u,v) is then obtained by making a transform along each column of F(x,v).

2/22/2009

279

Energy preservation:  ‖g‖² = ‖f‖²

Σx Σy |f(x, y)|² = Σu Σv |g(u, v)|²,   both double sums running over 0 … N−1

280

Energy Compaction !

2/22/2009

281

An Atom „ „

Both functions have circular symmetry. The atom is a sharp feature, whereas its transform is a broad smooth function. This illustrates the reciprocal relationship between a function and its Fourier transform.

2/22/2009

282

Original Image – FA-FP

2/22/2009

283

A Molecule

2/22/2009

284

Fourier duck

2/22/2009

285

Reconstruction from Phase of cat & Amplitude of duck

2/22/2009

286

Reconstruction from Phase of duck & Amplitude of cat

2/22/2009

287

Original Image-Fourier Amplitude Keep Part of the Amplitude Around the Origin and Reconstruct Original Image (LOW PASS filtering)

2/22/2009

288

Keep Part of the Amplitude Far from the Origin and Reconstruct Original Image (HIGH PASS filtering)

2/22/2009

289

Example

2/22/2009

290

Reconstruction from phase of one image and amplitude of the other

2/22/2009

291

Example

2/22/2009

292

Reconstruction from phase of one image and amplitude of the other

2/22/2009

293

Reconstruction Example

Cheetah Image Fourier Magnitude (above) Fourier Phase (below) 2/22/2009

294

Reconstruction example

Zebra Image Fourier Magnitude (above) Fourier Phase (below) 2/22/2009

295

Reconstruction Reconstruction with Zebra phase, Cheetah Magnitude

2/22/2009

296

Reconstruction Reconstruction with Cheetah phase, Zebra Magnitude

2/22/2009

297
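The swap demonstrated in these slides is easy to reproduce; the sketch below (numpy assumed, with random arrays standing in for the cheetah and zebra photos) combines the magnitude of one image with the phase of the other and inverts the transform.

```python
# Reconstruction from the magnitude of one image and the phase of another.
import numpy as np

rng = np.random.default_rng(6)
img_a = rng.random((128, 128))     # stand-in for the cheetah image
img_b = rng.random((128, 128))     # stand-in for the zebra image

Fa = np.fft.fft2(img_a)
Fb = np.fft.fft2(img_b)

hybrid = np.abs(Fa) * np.exp(1j * np.angle(Fb))   # magnitude of A, phase of B
reconstructed = np.real(np.fft.ifft2(hybrid))

print(reconstructed.shape)   # with real photos, this mostly resembles the phase donor
```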

2/22/2009

298

Optical illusion

2/22/2009

299

Optical illusion

2/22/2009

300

Optical illusion

2/22/2009

301

Optical illusion

2/22/2009

302

Optical illusion

2/22/2009

303

Optical illusion

2/22/2009

304

Optical illusion

2/22/2009

305

Optical illusion

2/22/2009

306

Optical illusion

2/22/2009

307

Optical illusion

2/22/2009

308

Optical illusion

2/22/2009

309

2/22/2009

310

Optical illusion

2/22/2009

311

Optical illusion

2/22/2009

312

2/22/2009

313

Optical illusion

2/22/2009

314

Optical illusion

2/22/2009

315

Discrete Cosine Transform 1-D

C(u) = a(u) Σ f(x) cos[ (2x + 1)uπ / 2N ],  summed over x = 0 … N−1,  u = 0, 1, …, N−1

where a(u) = √(1/N) for u = 0, and a(u) = √(2/N) for u = 1, …, N−1

316

IDCT – 1D

f(x) = Σ a(u) C(u) cos[ (2x + 1)uπ / 2N ],  summed over u = 0 … N−1

2/22/2009

317
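The following sketch uses scipy's DCT routines (an assumption); with norm="ortho" the scaling corresponds to the a(u) factors in the definition above, and the inverse recovers the signal exactly.

```python
# 1-D DCT / IDCT pair with scipy.
import numpy as np
from scipy.fft import dct, idct

f = np.cos(2 * np.pi * 3 * np.arange(32) / 32) + 0.5   # a smooth test signal

C = dct(f, type=2, norm="ortho")        # forward DCT-II
f_back = idct(C, type=2, norm="ortho")  # inverse

print(np.allclose(f, f_back))              # True: the pair is exact
print(int(np.argmax(np.abs(C[1:]))) + 1)   # index of the dominant non-DC coefficient
```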

1D Basis functions, N=8

[Figure: the eight 1-D DCT basis functions for u = 0 … 7, each plotted on a vertical axis from −1.0 to 1.0]

2/22/2009

318

1D Basis functions N=16

2/22/2009

319

Example : 1D signal

2/22/2009

320

DCT

2/22/2009

321

2-D DCT

C(u,v) = a(u) a(v) Σx Σy f(x,y) cos[ (2x + 1)uπ / 2N ] cos[ (2y + 1)vπ / 2N ]

f(x,y) = Σu Σv a(u) a(v) C(u,v) cos[ (2x + 1)uπ / 2N ] cos[ (2y + 1)vπ / 2N ]

with both double sums running over 0 … N−1 and u, v = 0, 1, …, N−1

322

Advantages „

„

„

Notice that the DCT is a real transform. The DCT has excellent energy compaction properties. There are fast algorithms to compute the DCT similar to the FFT. 2/22/2009

323
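A brief sketch of the energy compaction property on a synthetic smooth 8×8 block, using scipy's n-dimensional DCT (an assumption): a handful of low-frequency coefficients carry almost all of the energy.

```python
# Energy compaction of the 2-D DCT on a smooth block.
import numpy as np
from scipy.fft import dctn, idctn

x = np.linspace(0, 1, 8)
block = np.outer(x, x) + 0.3                       # smooth synthetic 8x8 block

C = dctn(block, type=2, norm="ortho")
energy = C ** 2
print(np.sort(energy.ravel())[::-1][:4].sum() / energy.sum())   # close to 1

# Keep only the 4 largest coefficients and invert: the block is still well approximated.
mask = energy >= np.sort(energy.ravel())[::-1][3]
approx = idctn(C * mask, type=2, norm="ortho")
print(float(np.max(np.abs(approx - block))))       # modest reconstruction error
```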

2-D Basis functions, N=4

[Figure: the 4 × 4 array of 2-D DCT basis images, indexed by u = 0 … 3 and v = 0 … 3]

324

2-D Basis functions N=8

2/22/2009

325

Separable

2/22/2009

326

Example: Energy Compaction

2/22/2009

327

Relation between DCT & DFT

g(x) = f(x) + f(2N − 1 − x) = { f(x) for 0 ≤ x ≤ N−1;  f(2N − 1 − x) for N ≤ x ≤ 2N−1 }

i.e. the N-point sequence f(x) is mirrored into the 2N-point sequence g(x); the 2N-point DFT G(u) of g(x) then yields the N-point DCT Cf(u).

328

DFT to DCT (Cont.) The DCT has a higher compression ratio than the DFT: the DCT avoids the generation of spurious spectral components.

2/22/2009

329

December 21, 1807: "An arbitrary function, continuous or with discontinuities, defined in a finite interval by an arbitrarily capricious graph can always be expressed as a sum of sinusoids" - J.B.J. Fourier. Jean B. Joseph Fourier (1768-1830)

330

Frequency analysis

• Frequency Spectrum
  – Basically the frequency components (spectral components) of that signal
  – Shows what frequencies exist in the signal
• Fourier Transform (FT)
  – One way to find the frequency content
  – Tells how much of each frequency exists in a signal

Discrete form:
X(k+1) = Σ x(n+1) · W_N^(kn),  summed over n = 0 … N−1
x(n+1) = (1/N) Σ X(k+1) · W_N^(−kn),  summed over k = 0 … N−1,  where W_N = exp[−j(2π/N)]

Continuous form:
X(f) = ∫ x(t) · exp[−2jπft] dt   (from −∞ to ∞)
x(t) = ∫ X(f) · exp[2jπft] df   (from −∞ to ∞)

Complex Function = ∑ (weight )i • (Simple Function )i „

i

Complex function representation through simple building blocks – Basis functions

„

„

Using only a few blocks Î Compressed representation Using sinusoids as building blocks Î Fourier transform – Frequency domain representation of the function

− jωt

F(ω) = ∫ f (t)e 2/22/2009

dt

1 jωt f (t) = ∫ F(ω)e dω 2π

332

How does it work Anyway?

• Recall that FT uses complex exponentials (sinusoids) as building blocks:  exp[jωt] = cos(ωt) + j sin(ωt)
• For each frequency of complex exponential, the sinusoid at that frequency is compared to the signal.
• If the signal consists of that frequency, the correlation is high → large FT coefficients.
• If the signal does not have any spectral component at a frequency, the correlation at that frequency is low / zero → small / zero FT coefficient.

F(ω) = ∫ f(t) exp[−jωt] dt,   f(t) = (1/2π) ∫ F(ω) exp[jωt] dω

333

FT At work x1 (t ) = cos(2π ⋅ 5 ⋅ t )

x2 (t ) = cos(2π ⋅ 25 ⋅ t )

x3 (t ) = cos(2π ⋅ 50 ⋅ t ) 2/22/2009

334

FT At work x1 (t )

F

X 1 (ω )

x2 (t )

F

X 2 (ω )

x3 (t )

F

X 3 (ω )

2/22/2009

335

FT At work x4 (t ) = cos(2π ⋅ 5 ⋅ t ) + cos(2π ⋅ 25 ⋅ t ) + cos(2π ⋅ 50 ⋅ t )

x4 (t )

F

2/22/2009

X 4 (ω ) 336

FT At work Complex exponentials (sinusoids) as basis functions: F (ω ) =



∫ f (t ) ⋅ e

− jωt

dt

−∞

1 f (t ) = 2π



∫ F (ω ) ⋅ e

−∞

F

2/22/2009

337

jωt

dt

Stationarity of Signal „

Stationary Signal – Signals with frequency content unchanged in time – All frequency components exist at all times

„

Non-stationary Signal – Frequency changes in time – One example: the “Chirp Signal”

2/22/2009

338

Stationary & Non Stationary Signals „

„

„

FT identifies all spectral components present in the signal, however it does not provide any information regarding the temporal (time) localization of these components. Why? Stationary signals consist of spectral components that do not change in time – all spectral components exist at all times – no need to know any time information – FT works well for stationary signals However, non-stationary signals consists of time varying spectral components – How do we find out which spectral component appears when? – FT only provides what spectral components exist , not where in time they are located. – Need some other ways to determine time localization

components 2/22/2009

of spectral 339

Stationary & Non Stationary Signals

„

Stationary signals’ spectral characteristics do not change with time x4 (t ) = cos(2π ⋅ 5 ⋅ t ) + cos(2π ⋅ 25 ⋅ t ) + cos(2π ⋅ 50 ⋅ t )

„

Non-stationary signals have time varying spectra x5 (t ) = [ x1 ⊕ x2 ⊕ x3 ] ⊕ Concatenation 2/22/2009

340

Stationary & Nonstationary signals

[Figure: a stationary signal containing 2 Hz + 10 Hz + 20 Hz at all times, and a non-stationary signal with 2 Hz during 0.0-0.4, 10 Hz during 0.4-0.7 and 20 Hz during 0.7-1.0, each shown with its magnitude spectrum (time and frequency axes)]

Chirp signals

[Figure: two chirp signals, one sweeping 2 Hz to 20 Hz and the other 20 Hz to 2 Hz; they are different in the time domain but the same in the frequency domain]

At what time do the frequency components occur? The FT cannot tell!

Stationary & Non Stationary Signals

Perfect knowledge of what frequencies exist, but no information about where these frequencies are located in time

2/22/2009

343

FFT Vs Wavelet „ „

„

„

„

FFT, basis functions: sinusoids Wavelet transforms: small waves, called wavelet FFT can only offer frequency information Wavelet: frequency + temporal information Fourier analysis doesn’t work well on discontinuous, “bursty” data 2/22/2009

344

Fourier Vs. Wavelet „

Fourier – Loses time (location) coordinate completely – Analyses the whole signal – Short pieces lose “frequency” meaning

„

Wavelets – Localized time-frequency analysis – Short signal pieces also have significance – Scale = Frequency band

2/22/2009

345

Shortcomings of FT

• Sinusoids and exponentials
  – Stretch into infinity in time: no time localization
  – Instantaneous in frequency: perfect spectral localization
  – Global analysis does not allow analysis of non-stationary signals
• Need a local analysis scheme for a time-frequency representation (TFR) of non-stationary signals
  – Windowed F.T. or Short Time F.T. (STFT): segmenting the signal into narrow time intervals, narrow enough to be considered stationary, and then taking the Fourier transform of each segment (Gabor, 1946).
  – Followed by other TFRs, which differed from each other by the selection of the windowing function

Nothing More Nothing less „ „ „

FT Only Gives what Frequency Components Exist in the Signal The Time and Frequency Information can not be Seen at the Same Time Time-frequency Representation of the Signal is Needed

Most transportation signals are non-stationary. (We need to know not only whether but also when an incident happened.) ONE EARLIER SOLUTION: SHORT-TIME FOURIER TRANSFORM (STFT)

Short Time Fourier Transform - STFT
1. Choose a window function of finite length
2. Place the window on top of the signal at t = 0
3. Truncate the signal using this window
4. Compute the FT of the truncated signal, save.
5. Incrementally slide the window to the right
6. Go to step 3, until the window reaches the end of the signal
• For each time location where the window is centered, we obtain a different FT
  – Hence, each FT provides the spectral information of a separate time-slice of the signal, providing simultaneous time and frequency information

STFT

STFTx(t′, ω) = ∫ [x(t) · W(t − t′)] · exp[−jωt] dt

where x(t) is the signal to be analyzed, W(t − t′) is the windowing function centered at t = t′, exp[−jωt] is the FT kernel (basis function), t′ is the time parameter and ω the frequency parameter. The STFT of the signal x(t) is computed for each window centered at t = t′.
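A short sketch of the STFT of a concatenated (non-stationary) signal using scipy.signal (an assumption); the window length sets the trade-off between time and frequency resolution discussed in the following slides.

```python
# STFT of a signal whose frequency content changes over time.
import numpy as np
from scipy.signal import stft

fs = 1000                                    # assumed sampling rate (Hz)
t = np.arange(0, 1.0, 1 / fs)
third = len(t) // 3
x = np.concatenate([np.cos(2 * np.pi * 5 * t[:third]),
                    np.cos(2 * np.pi * 25 * t[:third]),
                    np.cos(2 * np.pi * 50 * t[:third])])

f, tau, Z = stft(x, fs=fs, nperseg=128)      # 128-sample window (Hann by default)
print(Z.shape)                               # (frequencies, time segments)

# The dominant frequency bin changes from segment to segment,
# revealing roughly when each component occurs.
print(f[np.abs(Z).argmax(axis=0)])
```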

STFT

[Figure: the window function slid along the signal, centered at t′ = −8, −2, 4 and 8]

350

STFT „

STFT provides the time information by computing a different FTs for consecutive time intervals, and then putting them together – –

„

„

Time-Frequency Representation (TFR) Maps 1-D time domain signals to 2-D time-frequency signals

Consecutive time intervals of the signal are obtained by truncating the signal using a sliding windowing function How to choose the windowing function? – What shape? Rectangular, Gaussian, Elliptic…? – How wide? Wider window require less time steps Æ low time resolution „ Also, window should be narrow enough to make sure that the portion of the signal falling within the window is stationary „ Can we choose an arbitrarily narrow window…? 2/22/2009 351 „

Selection of STFT Window STFTxω (t ′, ω ) = ∫ [x(t ) ⋅ W (t − t ′)]⋅ e − jωt dt t

Two extreme cases: Î „ W(t) infinitely long: W (t ) = 1 STFT turns into FT, providing excellent frequency information (good frequency resolution), but no time information „ W(t) infinitely short: W (t ) = δ (t ) STFTxω (t ′, ω ) = ∫ [x(t ) ⋅ δ (t − t ′)]⋅ e − jωt dt = x(t ′) ⋅ e − jωt ′ t

Î STFT then gives the time signal back, with a phase factor. Excellent time information (good time resolution), but no frequency information

2/22/2009

352

Drawbacks of STFT

2/22/2009

353

Drawbacks of STFT „ „

„

Unchanged Window Dilemma of Resolution – Narrow window -> poor frequency resolution – Wide window -> poor time resolution Heisenberg Uncertainty Principle – Cannot know what frequency exists at what time intervals

2/22/2009

354

Heisenberg principle

Δt · Δf ≥ 1/(4π)

Time resolution: how well two spikes in time can be separated from each other in the transform domain.
Frequency resolution: how well two spectral components can be separated from each other in the transform domain.

Both time and frequency resolutions cannot be arbitrarily high! We cannot precisely know at what time instance a frequency component is located; we can only know what interval of frequencies is present in which time intervals.

355

Drawbacks of STFT

F

2/22/2009

T

356

Multiresolution analysis „

Wavelet Transform

– An alternative approach to the short time Fourier transform to overcome the resolution problem – Similar to STFT: signal is multiplied with a function

„

Multiresolution Analysis

– Analyze the signal at different frequencies with different resolutions – Good time resolution and poor frequency resolution at high frequencies – Good frequency resolution and poor time resolution at low frequencies – More suitable for short duration of higher frequency; and longer duration of lower frequency components

2/22/2009

357

Wavelet Definition „

“The wavelet transform is a tool that cuts up data, functions or operators into different frequency components, and then studies each component with a resolution matched to its scale”

2/22/2009

358

Principles of the wavelet transform
• Split up the signal into a bunch of signals.
• These represent the same signal, but each corresponds to a different frequency band.
• The transform only tells us which frequency bands exist at which time intervals.

The wavelet transform
• Overcomes the fixed-resolution problem of the STFT by using a variable-length window.
• Analysis windows of different lengths are used for different frequencies:
  – Analysis of high frequencies → use narrower windows for better time resolution.
  – Analysis of low frequencies → use wider windows for better frequency resolution.
• This works well if the signal to be analyzed consists mainly of slowly varying characteristics with occasional short high-frequency bursts.
• The Heisenberg principle still holds.
• The function used to window the signal is called the wavelet.

Wavelet transform
• Scale and shift the original waveform.
• Compare it to a wavelet.
• Assign a coefficient of similarity.

Definition of the continuous wavelet transform

$$\mathrm{CWT}_x^{\,\psi}(\tau,s)=\Psi_x^{\,\psi}(\tau,s)=\frac{1}{\sqrt{|s|}}\int_t x(t)\,\psi^{*}\!\left(\frac{t-\tau}{s}\right)dt$$

This is the continuous wavelet transform of the signal x(t) using the analysis wavelet ψ(·), where τ is the translation parameter (location of the window), s is the scale parameter (a measure of frequency, with scale = 1/frequency), 1/√|s| is a normalization constant, and ψ*(·) is the complex conjugate of the wavelet.
• Wavelet
  – "Small wave": the window function is of finite length.
• Mother wavelet
  – A prototype for generating the other window functions.
  – All kernels (windows) used are dilated or compressed and shifted versions of the mother wavelet.

CWT (pseudocode)

  for each scale S
      for each position P
          Coefficient(S, P) = sum over all time of Signal × Wavelet(S, P)
      end
  end

The result is one coefficient per (scale, position) pair.

Scaling: the value of "stretch"
• Scaling a wavelet simply means stretching (or compressing) it.
• Example: f(t) = sin(t), scale factor 1; f(t) = sin(2t), scale factor 2; f(t) = sin(3t), scale factor 3.

Scale
• Scale lets you either narrow down the frequency band of interest, or determine the frequency content in a narrower time interval.
  – Scaling corresponds to choosing a frequency band; good for non-stationary data.
• s > 1: dilates the signal; s < 1: compresses the signal.
• High scale → a stretched wavelet → non-detailed, global view of the signal → spans the entire signal → low frequency → slowly changing, coarse features.
• Low scale → a compressed wavelet → rapidly changing details → high frequency → a detailed view that lasts only a short time.
• Only a limited interval of scales is necessary.

Scale is (sort of) like frequency
• Small scale: rapidly changing details, like high frequency.
• Large scale: slowly changing details, like low frequency.

Scale is (sort of) like frequency
• The scale factor works exactly the same way with wavelets: the smaller the scale factor, the more "compressed" the wavelet.

Shifting
• Shifting a wavelet simply means delaying (or hastening) its onset. Mathematically, delaying a function f(t) by k is represented by f(t − k).

Shifting (figure): example similarity coefficients C = 0.0004 and C = 0.0034 at two different shift positions.

Computation of the CWT

$$\mathrm{CWT}_x^{\,\psi}(\tau,s)=\Psi_x^{\,\psi}(\tau,s)=\frac{1}{\sqrt{|s|}}\int_t x(t)\,\psi^{*}\!\left(\frac{t-\tau}{s}\right)dt$$

Step 1: The wavelet is placed at the beginning of the signal, and s is set to 1 (the most compressed wavelet).
Step 2: The wavelet function at scale 1 is multiplied by the signal and integrated over all times; the result is then multiplied by the constant 1/√|s|.
Step 3: The wavelet is shifted to t = τ, and the transform value at t = τ and s = 1 is obtained.
Step 4: Repeat the procedure until the wavelet reaches the end of the signal.
Step 5: Increase the scale s by a sufficiently small value and repeat the above procedure for all s.
Step 6: Each computation for a given s fills a single row of the time-scale plane.
Step 7: The CWT is obtained once all scales s have been covered.

Simple steps for the CWT
1. Take a wavelet and compare it to a section at the start of the original signal.
2. Calculate a correlation coefficient C.

Simple steps for the CWT (continued)
3. Shift the wavelet to the right and repeat steps 1 and 2 until you have covered the whole signal.
4. Scale (stretch) the wavelet and repeat steps 1 through 3.
5. Repeat steps 1 through 4 for all scales.
A minimal code sketch of this procedure follows.
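The sketch below is a direct (brute-force) implementation of the steps above in NumPy. The Mexican-hat analysis wavelet, the scale grid and the two-tone test signal are assumptions made for the illustration only.

```python
# Direct (slow) CWT following the steps above.
# The wavelet choice, scale grid and test signal are illustrative assumptions.
import numpy as np

def mexican_hat(t):
    """Mexican-hat (Ricker) mother wavelet."""
    return (1 - t**2) * np.exp(-t**2 / 2)

def cwt(x, t, scales):
    dt = t[1] - t[0]
    coeffs = np.zeros((len(scales), len(t)))
    for i, s in enumerate(scales):                # outer loop: every scale
        for j, tau in enumerate(t):               # inner loop: slide the wavelet
            psi = mexican_hat((t - tau) / s)      # shifted, scaled wavelet
            coeffs[i, j] = np.sum(x * psi) * dt / np.sqrt(s)   # integrate + normalise
    return coeffs

t = np.linspace(0, 1, 500)
x = np.sin(2 * np.pi * 10 * t) + np.sin(2 * np.pi * 40 * t)
scales = np.geomspace(0.005, 0.1, 20)
C = cwt(x, t, scales)
print(C.shape)   # one row of the time-scale plane per scale
```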

$$\mathrm{CWT}_x^{\,\psi}(\tau,s)=\Psi_x^{\,\psi}(\tau,s)=\frac{1}{\sqrt{|s|}}\int_t x(t)\,\psi^{*}\!\left(\frac{t-\tau}{s}\right)dt$$

WT at work (figure): low frequency (large scale).

WT at work (further illustrative figures).

Resolution of time and frequency (time-frequency tiling figure)
• At high frequencies the cells give better time resolution and poorer frequency resolution; at low frequencies they give better frequency resolution and poorer time resolution.
• Each box represents an equal portion of the time-frequency plane.
• In the STFT, by contrast, the resolution is selected once for the entire analysis.

Comparison of transformations (figure)
From http://www.cerm.unifi.it/EUcourse2001/Gunther_lecturenotes.pdf, p. 10.

Discretization of the CWT
• It is necessary to sample the time-frequency (scale) plane.
• At high scale s (low frequency f), the sampling rate N can be decreased.
• The scale parameter s is normally discretized on a logarithmic grid; the most common base is 2.
• The discretized CWT is not a true discrete transform.
• Discrete Wavelet Transform (DWT)
  – Provides sufficient information both for analysis and for synthesis.
  – Reduces the computation time significantly.
  – Is easier to implement.
  – Analyzes the signal in different frequency bands with different resolutions.
  – Decomposes the signal into a coarse approximation and detail information.

Discrete wavelet transforms
• The CWT computed by computers is not really the CWT; it is a discretized version of the CWT.
• The resolution of the time-frequency grid can be controlled (within Heisenberg's inequality) by the time and scale step sizes.
• This often results in a very redundant representation.
• How should the continuous time-frequency plane be discretized so that the representation is non-redundant?
  – Sample the time-frequency plane on a dyadic (octave) grid:

$$\mathrm{CWT}_x^{\,\psi}(\tau,s)=\Psi_x^{\,\psi}(\tau,s)=\frac{1}{\sqrt{|s|}}\int_t x(t)\,\psi^{*}\!\left(\frac{t-\tau}{s}\right)dt,
\qquad
\psi_{k,n}(t)=2^{-k/2}\,\psi\!\left(2^{-k}t-n\right),\quad k,n\in\mathbb{Z}$$

Multiresolution analysis
• Analyzing a signal in both the time domain and the frequency domain is often needed.
  – But the resolution in both domains is limited by the Heisenberg uncertainty principle.
• Multiresolution analysis (MRA) works around this. How?
  – It gives good time resolution and poor frequency resolution at high frequencies, and good frequency resolution and poor time resolution at low frequencies.
  – This helps because most natural signals have low-frequency content spread over long durations and high-frequency content only for short durations.

Discrete wavelet transform (filter-bank figure): the signal is passed through a lowpass filter, giving the approximation (a), and through a highpass filter, giving the details (d).

Discrete wavelet transform
• Dyadic sampling of the time-frequency plane results in a very efficient algorithm for computing the DWT:
  – Subband coding using multiresolution analysis.
  – Dyadic sampling and multiresolution are achieved through a series of filtering and up/down-sampling operations.
• The filtering step is ordinary discrete convolution of the signal x[n] with a filter h[n]:

$$y[n]=x[n]*h[n]=h[n]*x[n]=\sum_{k=1}^{N} x[k]\,h[n-k]=\sum_{k=1}^{N} h[k]\,x[n-k]$$
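As a quick illustration of the convolution sum above, the snippet below checks the explicit sum against NumPy's built-in convolution; the filter taps are arbitrary and chosen only for the example.

```python
# The filtering step above is ordinary discrete convolution; this tiny check
# compares the explicit sum with NumPy's built-in convolution (arbitrary taps).
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
h = np.array([0.25, 0.5, 0.25])               # arbitrary lowpass-like taps

y_builtin = np.convolve(x, h)                  # full convolution, length N + M - 1
y_manual = np.array([
    sum(x[k] * h[n - k] for k in range(len(x)) if 0 <= n - k < len(h))
    for n in range(len(x) + len(h) - 1)
])
print(np.allclose(y_builtin, y_manual))        # True
```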

DWT implementation (analysis/synthesis filter-bank figure)

Decomposition (analysis): at each level the signal x[n] is filtered by the half-band highpass filter g̃[n] and the half-band lowpass filter h̃[n], and each output is downsampled by 2:

$$y_{\mathrm{high}}[k]=\sum_{n} x[n]\,g[-n+2k], \qquad y_{\mathrm{low}}[k]=\sum_{n} x[n]\,h[-n+2k]$$

The lowpass output is fed to the next decomposition level.

Reconstruction (synthesis): the subband signals are upsampled by 2, filtered by the synthesis filters g[n] (highpass) and h[n] (lowpass), and added:

$$x[n]=\sum_{k}\big(y_{\mathrm{high}}[k]\,g[-n+2k]+y_{\mathrm{low}}[k]\,h[-n+2k]\big)$$

Legend: G = half-band highpass filter, H = half-band lowpass filter, ↓2 = downsampling, ↑2 = upsampling (the tilde marks the analysis filters).

The figure shows a 2-level DWT decomposition; the decomposition can be continued as long as there are enough samples for downsampling. (A one-level sketch in code follows.)
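A minimal sketch of one analysis/synthesis level, assuming Haar filters for brevity (the slides do not fix a particular filter pair here). For Haar, the filter-then-downsample structure above reduces to pairwise sums and differences, which keeps the code short and exactly invertible.

```python
# One analysis/synthesis level of the DWT, using Haar filters as an assumed example.
# With Haar, "filter then downsample by 2" reduces to pairwise sums and differences.
import numpy as np

def haar_analysis(x):
    x = np.asarray(x, dtype=float)
    if len(x) % 2:                          # need an even length to downsample by 2
        x = np.append(x, 0.0)
    a = (x[0::2] + x[1::2]) / np.sqrt(2)    # lowpass branch -> approximation
    d = (x[0::2] - x[1::2]) / np.sqrt(2)    # highpass branch -> detail
    return a, d

def haar_synthesis(a, d):
    x = np.empty(2 * len(a))
    x[0::2] = (a + d) / np.sqrt(2)          # upsample + filter + add, merged for Haar
    x[1::2] = (a - d) / np.sqrt(2)
    return x

x = np.arange(8, dtype=float)
a, d = haar_analysis(x)
print(np.allclose(haar_synthesis(a, d), x))   # perfect reconstruction: True
```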

DWT demystified (filter-bank figure; |G(jω)| sketched over −π/2, π/2, π)

For a 512-sample input, three g[n]/h[n] plus downsample-by-2 stages produce:
  d1 (level 1 DWT coefficients): 256 samples, band π/2 ~ π
  a1 (level 1 approximation): 256 samples, band 0 ~ π/2
  d2 (level 2 DWT coefficients): 128 samples, band π/4 ~ π/2
  a2 (level 2 approximation): 128 samples, band 0 ~ π/4
  d3 (level 3 DWT coefficients): 64 samples, band π/8 ~ π/4
  a3 (level 3 approximation coefficients): 64 samples, band 0 ~ π/8
(The short loop below reproduces these lengths and bands.)
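The lengths and normalized bands in the diagram can be reproduced with a short loop; the 512-sample input length is inferred from the 256-sample level-1 outputs.

```python
# Lengths and normalised frequency bands per level, reproducing the diagram above
# for a 512-sample input and a 3-level decomposition.
n, top = 512, 1.0            # top = 1.0 means a band edge at pi (normalised frequency)
for level in range(1, 4):
    n //= 2                  # each stage halves the number of samples
    print(f"d{level}: {n} samples, band {top/2:g}*pi ~ {top:g}*pi")
    top /= 2                 # the remaining lowpass band is halved as well
print(f"a3: {n} samples, band 0 ~ {top:g}*pi")
```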

2D DWT
• Generalization of the concept to 2D: 2D functions ↔ images, f(x, y) ↔ I[m, n], the intensity function.
• Why take the 2D DWT of an image at all?
  – Compression
  – Denoising
  – Feature extraction
• Mathematical form (separable scaling and wavelet kernels):

$$f_o(x,y)=\sum_{i=-\infty}^{\infty}\sum_{j=-\infty}^{\infty} a_o(i,j)\, s_{\phi\phi}(x-i,\,y-j), \qquad a_o(i,j)=\langle f(x,y),\, s_{\phi\phi}(x-i,\,y-j)\rangle$$

$$s_{\phi\phi}(x,y)=\phi(x)\,\phi(y), \qquad s_{\psi\psi}(x,y)=\psi(x)\,\psi(y)$$

Implementation of the 2D DWT (separable filter-bank figure)

The input image is filtered along its rows with the lowpass filter H̃ and the highpass filter G̃, and each result is downsampled by 2 along the rows. Each of the two outputs is then filtered along its columns with H̃ and G̃ and downsampled by 2 along the columns. One level of this row/column filtering produces four subbands:
  LL → the approximation A(k+1)
  LH → the horizontal detail D(h)(k+1)
  HL → the vertical detail D(v)(k+1)
  HH → the diagonal detail D(d)(k+1)
The LL subband is decomposed again at the next level, giving the familiar nested layout (LL split into LLL, LLH, LHL, LHH, and so on). (A one-level code sketch follows.)
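A minimal sketch of one separable 2D level, again assuming Haar filters for brevity (any half-band pair works), filtering along rows and then columns as in the figure. The random image is only a stand-in, and the subband naming follows the slide, though conventions vary between texts.

```python
# One level of the separable 2-D DWT (Haar assumed), rows first and then columns,
# producing the four subbands LL, LH, HL, HH.  The random image is only a stand-in.
import numpy as np

def haar_1d(x, axis):
    """Pairwise Haar low/high-pass along the given axis (filter + downsample by 2)."""
    even = np.take(x, np.arange(0, x.shape[axis], 2), axis=axis)
    odd  = np.take(x, np.arange(1, x.shape[axis], 2), axis=axis)
    return (even + odd) / np.sqrt(2), (even - odd) / np.sqrt(2)

def dwt2_level(img):
    low, high = haar_1d(img, axis=1)     # filter + downsample along the rows
    LL, LH = haar_1d(low, axis=0)        # then along the columns of each output
    HL, HH = haar_1d(high, axis=0)
    return LL, LH, HL, HH                # approximation + h/v/d detail subbands

img = np.random.rand(8, 8)
LL, LH, HL, HH = dwt2_level(img)
print(LL.shape)   # (4, 4): each subband is a quarter of the image
```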

Up and down … up and down
• Downsample the columns (along the rows): for each row, keep the even-indexed columns and discard the odd-indexed columns.
• Downsample the rows (along the columns): for each column, keep the even-indexed rows and discard the odd-indexed rows.
• Upsample the columns (along the rows): for each row, insert zeros between every other sample (column).
• Upsample the rows (along the columns): for each column, insert zeros between every other sample (row).
(These four operations are written out in the short sketch below.)
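For reference, the four sampling operations above can be written as small NumPy helpers; the function names are just for this illustration.

```python
# The four sampling operations above as small NumPy helpers (illustrative names).
import numpy as np

def down_cols(x):
    return x[:, 0::2]              # for each row, keep the even-indexed columns

def down_rows(x):
    return x[0::2, :]              # for each column, keep the even-indexed rows

def up_cols(x):
    y = np.zeros((x.shape[0], 2 * x.shape[1]))
    y[:, 0::2] = x                 # insert a zero column after every sample
    return y

def up_rows(x):
    y = np.zeros((2 * x.shape[0], x.shape[1]))
    y[0::2, :] = x                 # insert a zero row after every sample
    return y

a = np.arange(16).reshape(4, 4)
print(down_cols(a).shape, up_rows(a).shape)   # (4, 2) (8, 4)
```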

Reconstruction (synthesis filter-bank figure)
Each subband (LL = A(k+1), LH = D(h)(k+1), HL = D(v)(k+1), HH = D(d)(k+1)) is upsampled by 2 and filtered with the synthesis filters, H on the lowpass branches and G on the highpass branches, first along one direction and then, after the branches are summed, along the other, until the original image is recovered.

Subband coding algorithm
• Each stage halves the time resolution: only half the number of samples remains after downsampling.
• Each stage doubles the frequency resolution: the spanned frequency band is halved.
Example tree for x[n] with 512 samples covering 0-1000 Hz:
  Filter 1: A1 = 256 samples, 0-500 Hz; D1 = 256 samples, 500-1000 Hz
  Filter 2 (applied to A1): A2 = 128 samples, 0-250 Hz; D2 = 128 samples, 250-500 Hz
  Filter 3 (applied to A2): A3 = 64 samples, 0-125 Hz; D3 = 64 samples, 125-250 Hz
(A short code sketch of this tree follows.)
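Iterating a one-level split reproduces the A1/D1 … A3/D3 tree above. The Haar split is repeated from the earlier 1-D sketch so this snippet stays self-contained, and the random test signal merely stands in for the 0-1000 Hz signal in the diagram.

```python
# Iterating the one-level split gives the A1/D1 ... A3/D3 tree described above.
# haar_analysis repeats the earlier 1-D sketch so this snippet is self-contained.
import numpy as np

def haar_analysis(x):
    a = (x[0::2] + x[1::2]) / np.sqrt(2)   # approximation (lowpass half-band)
    d = (x[0::2] - x[1::2]) / np.sqrt(2)   # detail (highpass half-band)
    return a, d

x = np.random.rand(512)                    # stands in for a 512-sample, 0-1000 Hz signal
a = x
for level in range(1, 4):
    a, d = haar_analysis(a)                # A_k feeds the next stage, D_k is kept
    print(f"level {level}: A has {len(a)} samples, D has {len(d)} samples")
# prints 256/256, 128/128, 64/64, matching the sample counts in the tree above
```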

Applications of wavelets
• Compression
• De-noising
• Feature extraction
• Discontinuity detection
• Distribution estimation
• Data analysis
  – Biological data
  – NDE data
  – Financial data

Fingerprint compression (figure): wavelet = Haar, level = 3.

Image denoising using the wavelet transform
Image de-noising with the wavelet transform uses the same principles as signal decomposition and de-noising. Each column of the image matrix is convolved with a high-pass and a low-pass filter followed by downsampling, and the same process is applied to the image rows. A threshold limit δ is chosen for each decomposition level and the coefficients are modified accordingly: a coefficient c(k), k = 0, 1, …, N−1, is kept if |c(k)| > δ. The image is then reconstructed by the backward transform from the modified wavelet coefficients. (A small sketch of the thresholding step follows.)
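A hedged sketch of the thresholding rule, shown on a 1-D signal with one Haar level for brevity; the universal threshold and the hard-thresholding rule are assumptions of this example, not prescribed by the slides, and the 2-D case applies the same rule per subband.

```python
# Sketch of wavelet denoising by thresholding detail coefficients (1-D, one Haar level).
# The universal threshold and the hard-thresholding rule are assumptions of this sketch.
import numpy as np

def haar_analysis(x):
    a = (x[0::2] + x[1::2]) / np.sqrt(2)
    d = (x[0::2] - x[1::2]) / np.sqrt(2)
    return a, d

def haar_synthesis(a, d):
    x = np.empty(2 * len(a))
    x[0::2] = (a + d) / np.sqrt(2)
    x[1::2] = (a - d) / np.sqrt(2)
    return x

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 512)
clean = np.sin(2 * np.pi * 5 * t)
noisy = clean + 0.3 * rng.standard_normal(512)

a, d = haar_analysis(noisy)
delta = 0.3 * np.sqrt(2 * np.log(len(noisy)))     # "universal" threshold (assumed choice)
d_hat = np.where(np.abs(d) > delta, d, 0.0)       # keep c(k) only if |c(k)| > delta
denoised = haar_synthesis(a, d_hat)

# Compare mean squared error before and after thresholding.
print(np.mean((noisy - clean)**2), np.mean((denoised - clean)**2))
```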

Image enhancement using the wavelet transform (example figures).
