Image Processing and its Applications
Rajashekhara
What is an Image?
A representation, likeness, or imitation of an object or thing; a vivid or graphic description; something introduced to represent something else.
Where are we?
Digital Image Processing sits among several neighbouring fields: Imaging, Computer Vision, Computer Graphics, Biological Vision, and Display/Printing.
Computer vision vs. computer graphics
Computer graphics: transformation from 3D information to an image or display.
Computer vision: extraction of 3D information from an image.
Computer vision, Computer graphics, Image processing
Computer vision estimates 3D data from one or more 2D images. Computer graphics generates 2D/3D images from a 3D (mathematical) description of an object. Computer vision and computer graphics are therefore inverse operations of each other. Both use image processing, which is consequently regarded as a low-level (or basic) operation for computer vision and computer graphics. Note that computer vision, computer graphics, and image processing are normally considered three overlapping areas, but none of them is a subset of another.
Computer Vision Means
Machine Vision
Robot Vision
Scene Analysis
Image Understanding
Image Analysis
Image processing Means
Image processing refers to a set of computational techniques that accept images as input. The results of the processing can be new images or information extracted from the input images. Video is just a time sequence of images, called frames, so all image processing techniques can be applied to frames. Image processing has many applications.
Why Image processing?
– Coding/compression
– Enhancement, restoration, reconstruction
– Analysis, detection, recognition, understanding
– Visualization
What do we do?
Digital Image Processing covers Image Processing/Manipulation, Image Analysis/Interpretation, and Image Coding/Communication.
Digital Image
What is an Image?
A visual representation of objects, their components, properties, and relationships; a mapping of a 3D scene onto a 2D plane.
What is a Digital Image?
A digital image contains a fixed number of rows and columns of integer numbers. Each integer is called a pixel (picture element) and represents the brightness at that point of the image.
[Figure: an image region and a grid of its sample pixel values.]
Digital Image
A digital image is a multidimensional array of numbers (an intensity image) or of vectors (a color image). Each component of the array is called a pixel and carries a pixel value: a single number in the case of intensity images, or a vector in the case of color images.
[Example: a 4x4 intensity matrix, and the three 4x4 component matrices of a color image.]
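A minimal sketch of this idea in Python (assuming NumPy is available): an intensity image is a 2D array of numbers, a color image a 3D array of vectors. The values below are illustrative only.

import numpy as np

# A 4x4 intensity image: one number (gray level) per pixel.
gray = np.array([[10, 10, 16, 28],
                 [ 9,  6, 26, 37],
                 [15, 25, 13, 22],
                 [32, 15, 87, 39]], dtype=np.uint8)

# A 4x4 color image: one 3-vector (R, G, B) per pixel.
color = np.zeros((4, 4, 3), dtype=np.uint8)
color[..., 0] = gray          # red component
color[..., 1] = 255 - gray    # green component (illustrative)
color[..., 2] = 128           # blue component (illustrative)

print(gray[2, 1])      # scalar pixel value at row 2, column 1
print(color[2, 1, :])  # (R, G, B) vector at the same pixel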
Digital Image Types: Binary Image
A binary (black-and-white) image: each pixel contains one bit; 1 represents white, 0 represents black.
Binary data
0 0 0 0
0 0 0 0
1 1 1 1
1 1 1 1
Digital Image Types: Intensity or Gray Image
An intensity (monochrome) image: each pixel corresponds to light intensity, normally represented on a gray scale (gray levels).
Gray scale values
10 10 16 28
 9  6 26 37
15 25 13 22
32 15 87 39
Digital Image Types: RGB Image
A color (RGB) image: each pixel contains a vector representing red, green and blue components.
RGB components
[Example: three 4x4 matrices of pixel values, one for each of the R, G and B components.]
Digital Image Types: Index Image
An index image: each pixel contains an index number pointing to a color in a color table.

Index values (example):
1 4 9
6 4 7
6 5 2

Color table:
Index No.   Red component   Green component   Blue component
1           0.1             0.5               0.3
2           1.0             0.0               0.0
3           0.0             1.0               0.0
4           0.5             0.5               0.5
5           0.2             0.8               0.9
…           …               …                 …
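The lookup that an index image implies can be sketched as follows (NumPy assumed; the index values and table rows here are hypothetical, chosen so that every index falls inside the small table shown above).

import numpy as np

# Color table: row i holds the (R, G, B) components of color index i+1.
color_table = np.array([[0.1, 0.5, 0.3],   # index 1
                        [1.0, 0.0, 0.0],   # index 2
                        [0.0, 1.0, 0.0],   # index 3
                        [0.5, 0.5, 0.5],   # index 4
                        [0.2, 0.8, 0.9]])  # index 5

# Index image: each pixel stores an index into the color table.
index_image = np.array([[1, 4, 5],
                        [3, 4, 2],
                        [3, 5, 2]])

# Expand to an RGB image by looking up each index (indices are 1-based).
rgb_image = color_table[index_image - 1]
print(rgb_image.shape)   # (3, 3, 3): rows x columns x (R, G, B)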
Human Vision & Image Visualization In the beginning…
we’ll have a look at the human eye
Cross section of Human Eye
Visual perception: the human eye
1. The lens contains 60-70% water and about 6% fat.
2. The iris diaphragm controls the amount of light that enters the eye.
3. Light receptors in the retina:
   - About 6-7 million cones for bright-light (photopic) vision.
   - The density of cones is about 150,000 elements/mm2.
   - Cones are involved in color vision and are concentrated in the fovea, an area of about 1.5 x 1.5 mm2.
   - About 75-150 million rods for dim-light (scotopic) vision.
   - Rods are sensitive to low levels of light and are not involved in color vision.
4. The blind spot is the region where the optic nerve emerges from the eye.
Electromagnetic Spectrum
The whole electromagnetic spectrum is used by "imagers". The human eye is sensitive to electromagnetic waves in the visible spectrum.
[Figure: the electromagnetic spectrum on a wavelength scale in Angstroms (1 Å = 10^-10 m), running from cosmic rays and gamma rays through X-rays, UV, the visible band, IR, and microwave (SAR) to radio frequencies.]
The human eye is sensitive to electromagnetic waves in the 'visible spectrum', which lies around a wavelength of
0.0000005 m = 0.0005 mm (roughly 400-700 nm)
The human eye
• Is able to perceive electromagnetic waves in a certain spectrum
• Is able to distinguish between wavelengths in this spectrum (colors)
• Has a higher density of receptors in the center
• Maps our 3D reality to a 2-dimensional image!
The retinal model is mathematically hard to handle (e.g., what is a neighborhood?). Easier: a 2D array of cells, modelling the cones/rods. Each cell contains a numerical value (e.g., between 0 and 255).
• The position of each cell defines the position of the receptor
• The numerical value of the cell represents the illumination received by the receptor
[Figure: a grid of cells holding sample values, e.g. 5, 7, 1, 0, 12, 4, …]
• With this model, we can create GRAYVALUE images
• Value = 0: BLACK (no illumination / energy)
• Value = 255: WHITE (max. illumination / energy)
What is light?
• The visible portion of the electromagnetic (EM) spectrum.
• It occurs between wavelengths of approximately 400 and 700 nanometers.
Short wavelengths
• Different wavelengths of radiation have different properties.
• In the X-ray region of the spectrum, radiation carries sufficient energy to penetrate a significant volume of material.
Long wavelengths
• Copious quantities of infrared (IR) radiation are emitted from warm objects (e.g., this can be used to locate people in total darkness).
Long wavelengths
• “Synthetic aperture radar” (SAR) imaging techniques use an artificially generated source of microwaves to probe a scene. • SAR is unaffected by weather conditions and clouds (e.g., has provided us images of the surface of Venus).
Range images
• An array of distances to the objects in the scene.
• They can be produced by sonar or by using laser rangefinders.
Sonic images
• Produced by the reflection of sound waves off an object.
• High sound frequencies are used to improve resolution.
Image formation
An image is a two-dimensional pattern of brightness.
What do we do to get information about the 3D world?
- Study the image formation process.
- Understand how the brightness pattern is produced.
Two important tasks:
- Where will the image of some point appear?
- How bright will the image of some surface be?
A simple model of image formation
The scene is illuminated by a single source. The scene reflects radiation towards the camera. The camera senses it via chemicals on film.
Light reaches surfaces in 3D. Surfaces reflect. A sensor element receives the light energy. Intensity, angles, and material are all important.
Geometry and physics
– The geometry of image formation, which determines where in the image plane the projection of a point in the scene will be located.
– The physics of light, which determines the brightness of a point in the image plane as a function of illumination and surface properties.
Image Formation
Digital image generation is the first step in any image processing or computer vision method. The generated image is a function of many parameters: the reflection characteristics of the object surface, the sensor characteristics of the camera, the optical characteristics of the lens, the analog-to-digital converter, the characteristics of the light source, and the geometric laws under which the image is acquired.
Image formation
– The first task is related to the camera projection, which can be either a perspective projection or an orthographic projection. The perspective projection is more general than the orthographic projection, but requires more calculation.
– The second task is related to surface reflection properties, illumination conditions, and surface orientation with respect to the camera and light sources.
Geometric camera models
The projection of a surface point of a 3-dimensional scene onto the 2-dimensional image plane can be described by a perspective or an orthographic projection.
Pinhole camera: a camera with zero aperture size. All rays from the 3D scene points pass through the optical center of the lens.
Coordinate system
In computer vision, we deal with three kinds of coordinate systems: the image coordinate system, the camera coordinate system, and the world coordinate system. The image coordinate system is basically the two-dimensional image plane. The camera coordinate system is one that is attached to the camera; it can be either camera-centered or image-centered. In the camera-centered coordinate system the origin is the focal point and the optical axis is the Z axis. In the image-centered system the origin lies in the XY image plane. The world coordinate system is a general coordinate system with some reference axes.
Perspective projection set-up (pinhole camera model)
The projection of a scene point P of the XYZ space onto the image point P' in the xy image plane is a perspective projection. The optical axis is defined as the perpendicular from the pinhole C to the image plane. The distance f between C and the image plane is the focal length. The coordinate system of the XYZ space is defined such that the XY plane is parallel to the image plane and the origin is at the pinhole C; the Z axis then lies along the optical axis.
Perspective projection equations
Perspective projection equations
From the similar triangles (CA'P') and (CAP), and from the similar triangles (A'B'P') and (ABP), the perspective projection equations are obtained:
x = f X / Z,   y = f Y / Z
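A minimal sketch of the pinhole projection just derived (coordinates follow the setup above: origin at the pinhole C, Z axis along the optical axis, focal length f; NumPy assumed, and the scene points are made up):

import numpy as np

def perspective_project(points_xyz, f):
    """Project 3D points (X, Y, Z) onto the image plane: x = f*X/Z, y = f*Y/Z."""
    X, Y, Z = points_xyz[:, 0], points_xyz[:, 1], points_xyz[:, 2]
    return np.stack([f * X / Z, f * Y / Z], axis=1)

# Two scene points at different depths: the farther one projects closer to the center.
P = np.array([[1.0, 0.5, 2.0],
              [1.0, 0.5, 8.0]])
print(perspective_project(P, f=0.05))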
Perspective projection
Points go to points. Lines go to lines. Planes go to the whole image or to half-planes. Polygons go to polygons.
Long focal length -> narrow field of view. Short focal length -> large (wide) field of view (wide-angle cameras).
Perspective projection • Produces a view where the object’s size depends on the distance from the viewer • An object farther away becomes smaller
2/22/2009
44
Perspective projection
Horizon – observer's eye level
Ground line – plane on which the object rests
Vanishing point – position on the horizon where depth projectors converge
Projection plane – plane upon which the object is projected
Vanishing points
Object edges parallel to the projection plane remain parallel in a perspective projection. Object edges not parallel to the projection plane converge to a single point in a perspective projection → the vanishing point (vp).
Camera with aperture
In practice, the aperture must be larger to admit more light. Lenses are placed in the aperture to focus the bundle of rays from each scene point onto the corresponding point in the image plane.
Orthographic projection
Orthographic projection is modeled by rays parallel to the optical axis rather than passing through the optical center. Suppose that the image of a plane lying at Z = Z0, parallel to the image plane, is formed. The magnification m can be defined as the ratio of the distance between two points in the image to the distance between their corresponding points in the scene plane.
Orthographic projection
Orthographic projection
For an object located at an average distance -Z0 whose variations in Z over its visible surface are not significant compared to -Z0 (i.e., the distance between camera and object is very large relative to the variations in object depth), the image of this object is magnified by a factor m. For all visible points of the object, the projection equations are
x = m X,   y = m Y
The scaling factor m is usually set to 1 or -1 for convenience, giving the simple projection equations x = X, y = Y.
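For comparison, a sketch of the orthographic (scaled) projection under the same assumptions as the perspective sketch above: X and Y are simply multiplied by the magnification m and depth is dropped.

import numpy as np

def orthographic_project(points_xyz, m=1.0):
    """Orthographic projection with magnification m: x = m*X, y = m*Y (Z is dropped)."""
    return m * points_xyz[:, :2]

P = np.array([[1.0, 0.5, 2.0],
              [1.0, 0.5, 8.0]])
print(orthographic_project(P))   # depth no longer affects the image position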
Radiometry basics
What determines the brightness of an image pixel?
- Light source properties
- Surface shape
- Surface reflectance properties
- Optics
- Exposure
Radiometry basics
Foreshortening and solid angle.
Measuring light: radiance.
Light at a surface: the interaction between incoming light and the surface
– irradiance = light arriving at the surface
– BRDF
– outgoing radiance
Special cases and simplifications: Lambertian, specular, parametric and non-parametric models.
Foreshortening
Two sources that look the same to a receiver must have the same effect on the receiver; two receivers that look the same to a source must receive the same energy.
Solid Angle
By analogy with angle (in radians), the solid angle subtended by a region at a point is the area projected on a unit sphere centered at that point. The solid angle dω subtended by a patch of area dA at distance r is given by:
dω = dA cos θ / r²
Measured in steradians (sr).
Foreshortening: patches that look the same subtend the same solid angle.
Radiometry basics
Radiometry is a branch of physics that deals with the measurement of the flow and transfer of radiant energy. Radiance is the power of light that is emitted from a unit surface area into some solid angle; the corresponding photometric term is brightness. Irradiance is the amount of energy that an image-capturing device receives per unit of effective sensitive area of the camera. Quantizing it gives the image gray tones.
Radiometry basics
Radiance (L): energy carried by a ray
– Power per unit area perpendicular to the direction of travel, per unit solid angle
– Units: Watts per square meter per steradian (W m⁻² sr⁻¹)
Irradiance (E): energy arriving at a surface
– Incident power in a given direction per unit area
– Units: W m⁻²
– For a surface receiving radiance L(x, θ, φ) coming in from dω, the corresponding irradiance is
E(θ, φ) = L(θ, φ) cos θ dω
Radiance – emitted light
Radiance = power traveling at some point in a given direction, per unit area perpendicular to the direction of travel, per unit solid angle.
Units: W/(m² sr). Radiance is constant along a ray.
L(x, θ, φ) = P / ((dA cos θ) dω)
Radiance transfer: the power received at dA2 at distance r from an emitting area dA1 is
P(1→2) = L dA1 cos θ1 (dA2 cos θ2 / r²)
and P(1→2) = P(2→1).
Light at a surface: irradiance
Irradiance = unit of light arriving at the surface:
dE(x) = L(x, θ, φ) cos θ dω
Total power = integral of irradiance over all incoming angles:
E(x) = ∫₀^{2π} ∫₀^{π/2} L(x, θ, φ) cos θ sin θ dθ dφ
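The irradiance integral above can be checked numerically; the sketch below (NumPy assumed) uses a constant radiance L0 over the hemisphere, for which the closed form is E = π·L0.

import numpy as np

def irradiance(L, n_theta=200, n_phi=400):
    """Numerically integrate E = int_0^{2pi} int_0^{pi/2} L(theta, phi) cos(theta) sin(theta) dtheta dphi."""
    theta = np.linspace(0.0, np.pi / 2, n_theta)
    phi = np.linspace(0.0, 2.0 * np.pi, n_phi)
    T, P = np.meshgrid(theta, phi, indexing="ij")
    integrand = L(T, P) * np.cos(T) * np.sin(T)
    return np.trapz(np.trapz(integrand, phi, axis=1), theta)

L0 = 5.0
E = irradiance(lambda t, p: np.full_like(t, L0))
print(E, np.pi * L0)   # numerical result vs. closed form pi * L0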
Bidirectional reflectance distribution
A model of local reflection that tells how bright a surface appears when viewed from one direction while light falls on it from another.
Definition: the ratio of the radiance in the outgoing direction to the irradiance in the incident direction:
ρ(θi, φi, θe, φe) = Le(θe, φe) / Ei(θi, φi) = Le(θe, φe) / (Li(θi, φi) cos θi dω)
The radiance leaving a surface in a particular direction adds contributions from every incoming direction:
∫_Ω ρ(θi, φi, θe, φe) Li(θi, φi) cos θi dωi
Light leaving a surface: BRDF
Many effects occur when light reaches a surface:
- transmitted (glass)
- reflected (mirror)
- scattered (marble, skin)
- absorbed (sweaty skin)
- light may travel along the surface and leave at some other point
Assume: surfaces don't fluoresce, surfaces are cool (no emission), and light leaving a surface is due only to light arriving at it.
BRDF = bidirectional reflectance distribution function. It measures, for a given wavelength, the fraction of incoming irradiance from a direction ωi that leaves in the outgoing direction ωo [Nicodemus 70]:
ρ(x, θi, φi, θe, φe) = Le(x, θe, φe) / (Li(x, θi, φi) cos θi dω)
Reflectance equation (measured radiance; radiosity = power per unit area leaving the surface):
Lo(x, θo, φo) = ∫_Ω ρ(x, θi, φi, θo, φo) L(θi, φi) cos θi dωi
Reflection as convolution
Reflectance equation:
Lo(x, θe, φe) = ∫_{Ω'} ρ(x, θi', φi', θe', φe') L(θi, φi) cos θi dωi = ∫_Ω ρ(x, θi', φi', θe', φe') L(R_{α,β}(θi', φi')) cos θi dωi
Reflection behaves like a convolution in the angular domain: the BRDF acts as the filter, the incident light as the signal.
Lambertian BRDF
Emitted radiance is constant in all directions. This models perfectly diffuse surfaces: clay, matte paper, etc. The BRDF is a constant, the albedo. For one light source, the outgoing radiance is given by the dot product of the surface normal and the light direction:
Lo(x) = ρ Li(x, θi, φi) cos θi = ρ (N · L)
where ρ is the albedo, N the surface normal, and L the light direction.
Diffuse reflectance acts like a low-pass filter on the incident illumination:
Lo(x, θo, φo) = ∫_{Ω'} ρ L(θi, φi) cos θi dωi
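A sketch of the Lambertian model Lo = ρ (N · L) for a single distant light source (the albedo, normal, and light direction below are made-up values; NumPy assumed):

import numpy as np

def lambertian_radiance(albedo, normal, light_dir, light_radiance=1.0):
    """Outgoing radiance of a Lambertian surface: rho * L_i * max(cos(theta_i), 0)."""
    n = normal / np.linalg.norm(normal)
    l = light_dir / np.linalg.norm(light_dir)
    cos_theta = max(np.dot(n, l), 0.0)   # surfaces facing away receive no light
    return albedo * light_radiance * cos_theta

print(lambertian_radiance(albedo=0.6,
                          normal=np.array([0.0, 0.0, 1.0]),
                          light_dir=np.array([1.0, 0.0, 1.0])))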
BRDF for a Lambertian surface
Image irradiance = (1/π) × scene radiance
How to represent a surface?
The surface normal n̂ is a direction vector of unit magnitude. With the camera looking along the z direction we see a hemisphere of directions. The depth representation is z = z(x, y).
[Figure: unit sphere of directions with angles θ, φ in an XYZ frame.]
How to represent a surface?
Equation of a sphere (surface) of radius a: x² + y² + z² = a², i.e. z = +sqrt(a² − x² − y²).
If the surface is well behaved, a Taylor expansion gives
z = z(x, y) = z(x0, y0) + (x − x0) ∂z/∂x + (y − y0) ∂z/∂y + … higher-order terms
If the surface is smooth we can neglect the higher-order terms; in a small neighborhood the surface can be treated as a plane (planar approximation):
z − z(x0, y0) = (x − x0) ∂z/∂x + (y − y0) ∂z/∂y = p ∂x + q ∂y
This is the first-order approximation of the surface; p and q are the components of the gradient of the surface.
Surface normal
The surface normal is perpendicular to the tangent plane. p is the slope of the surface in the x direction, q the slope in the y direction. Two vectors in the tangent plane are
OA = (∂x, 0, p ∂x) ∝ (1, 0, p)
OB = (0, ∂y, q ∂y) ∝ (0, 1, q)
The cross product of OA and OB gives the surface normal:
n̂ = OA × OB,    n̂ = (−p, −q, 1) / sqrt(1 + p² + q²)
If (p, q) is known, n̂ is known, and hence the surface normal.
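The relation n̂ = (−p, −q, 1)/√(1 + p² + q²) can be written out directly; p and q below are hypothetical slope values (NumPy assumed):

import numpy as np

def surface_normal(p, q):
    """Unit surface normal from the gradient (p, q) = (dz/dx, dz/dy)."""
    n = np.array([-p, -q, 1.0])
    return n / np.sqrt(1.0 + p * p + q * q)

print(surface_normal(p=0.5, q=-0.25))   # normal of a plane z = 0.5*x - 0.25*y + c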
Specular reflection
Smooth specular surfaces are mirror-like: light is reflected along the specular direction, with some part absorbed.
Rough specular surfaces reflect into a lobe of directions around the specular direction (microfacets). Lobe size:
- very small: mirror
- small: blurry mirror
- bigger: only light sources are visible
- very big: faint specularities
Diffuse reflection
Dull, matte surfaces like chalk or latex paint Microfacets scatter incoming light randomly Light is reflected equally in all directions: BRDF is constant Albedo: fraction of incident irradiance reflected by the surface Radiosity: total power leaving the surface per unit area (regardless of direction) 2/22/2009
68
Radiosity – summary
Radiance (light energy): L(θ, φ) = P / ((dA cos θ) dω)
Irradiance (unit incoming light): dE(x) = L(x, θ, φ) cos θ dω
Total incoming energy at the surface: Ei(x) = ∫_Ω L(x, θ, φ) cos θ dω
Unit outgoing radiance: Lo(x, θe, φe) = ∫_Ω ρ(x, θi, φi, θe, φe) L(θi, φi) cos θi dωi
Total energy leaving the surface (radiosity):
Eo = ∫_{Ωo} [ ∫_{Ωi} ρ(x, θi, φi, θe, φe) L(θi, φi) cos θi dωi ] cos θe dωe
Interaction of light and matter
What happens when a light ray hits a point on an object?
– Some of the light gets absorbed, converted to other forms of energy (e.g., heat).
– Some gets transmitted through the object, possibly bent through "refraction".
– Some gets reflected, possibly in multiple directions at once.
– Really complicated things can happen (fluorescence).
Let's consider the case of reflection in detail. In the most general case, a single incoming ray could be reflected in all directions. How can we describe the amount of light reflected in each direction?
Image formation system Relation between what camera captures and what the surface reflects
Image formation system
The system consists of a thin lens and an image plane. The diameter of the lens is d and the focal length is fp. The system is assumed to be focused: rays originating from a particular point on the object meet at a single point in the image plane, rays originating from an infinitesimal area dAo on the object are projected onto some area dAp in the image plane, and no rays from outside dAo reach dAp. When a camera captures the image of an object, the measured gray value is proportional to the image irradiance, which is related to the reflection properties of the object surface.
Image formation system
How is image irradiance calculated in an image-forming system? The radiant flux dΦ that is emitted from the surface patch dAo and passes through the entrance aperture can be calculated by integrating over the solid angle occupied by the entrance aperture as seen from the surface patch. Assuming there is no power loss in the medium, the image area dAp receives the same flux dΦ that is emitted from dAo. By definition, the image irradiance is the incident flux per unit area.
Image formation system
From the previous equations: let θ'r be the angle between the surface normal and the line to the entrance aperture, and let α be the angle between this line and the optical axis. The solid angle occupied by the surface patch dAo as seen from the entrance aperture equals the solid angle occupied by the image area dAp. From these, the previous equations can be combined.
Image formation system
If the size of the lens is small relative to the distance between the lens and the object, the angle in the previous integral can be approximated by θ'r, and the radiance Lr is then approximately constant and can be removed from the integral. The solid angle occupied by the lens as seen from the surface patch is then approximately the foreshortened lens area π(d/2)² cos α divided by the squared distance (fo / cos α)².
Image formation system
Finally the expression for the image irradiance is obtained (next slide): the image irradiance is proportional to the scene radiance, and the factor of proportionality is a function of the off-axis angle. The ratio fp/d is the F-stop number of the camera.
Image formation
E = [ (π/4) (d/fp)² cos⁴ α ] L
Image irradiance is linearly related to scene radiance. Irradiance is proportional to the area of the lens and inversely proportional to the squared distance between the lens and the image plane. The irradiance falls off as the angle between the viewing ray and the optical axis increases.
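A sketch that evaluates E = (π/4)(d/fp)² cos⁴(α)·L for illustrative lens parameters (the numeric values are made up; NumPy assumed):

import numpy as np

def image_irradiance(L, d, f_p, alpha):
    """Image irradiance from scene radiance L, lens diameter d, focal length f_p, off-axis angle alpha (radians)."""
    return (np.pi / 4.0) * (d / f_p) ** 2 * np.cos(alpha) ** 4 * L

# Irradiance falls off with the off-axis angle (cos^4 falloff).
for alpha_deg in (0, 10, 20, 30):
    print(alpha_deg, image_irradiance(L=100.0, d=0.01, f_p=0.05, alpha=np.radians(alpha_deg)))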
What happens on Image plane ? (CCD camera plane)
The lens collects light rays. An array of small fixed elements replaces the chemicals of film. Each element generates a voltage signal based on the irradiance value it receives.
Digitization
Analog images are continuous representations of color This is somewhat of a problem for computers, which like discrete measurements
2/22/2009
79
Digitization
[Block diagram: object → imaging system (observe) → sample and quantize (digitize) → digital storage/disk (store) → digital computer (process) → on-line buffer (refresh/store) → display output / record.]
Digital image acquisition process
Image sampling & quantization
Grayscale image
– A grayscale image is a function I(x,y) of the two spatial coordinates of the image plane.
– I(x,y) is the intensity of the image at the point (x,y) on the image plane.
– I(x,y) takes non-negative values; assume the image is bounded by a rectangle [0,a] x [0,b]:
  I: [0,a] x [0,b] → [0, inf)
Color image
– Can be represented by three functions, R(x,y) for red, G(x,y) for green, and B(x,y) for blue.
[Figure: the intensity surface I(x,y) plotted over the rows and columns of the image plane.]
Image sampling & Quantization
The analog signal representing a continuous image is sampled to produce discrete values which can be stored by a computer The frequency of digital samples greatly affects the quality of the digital image
2/22/2009
83
Image sampling & Quantization
To create a digital image, we need to convert continuous sensed data into digital form. This involves two processes: sampling and quantisation The basic idea behind sampling and quantization is illustrated in Fig. 3.1.
2/22/2009
84
Image sampling & quantization
Computers handle "discrete" data.
Sampling
– Sample the value of the image at the nodes of a regular grid on the image plane.
– A pixel (picture element) at (i, j) is the image intensity value at the grid point indexed by the integer coordinates (i, j).
Quantization
– The process of transforming a real-valued sampled image into one taking only a finite number of distinct values, e.g. from 0 (black) to 255 (white).
– Each sampled value in a 256-level grayscale image is represented by 8 bits.
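A sketch of sampling a continuous image function on a regular grid and quantizing it to 256 gray levels (the analytic function f below is made up for illustration; NumPy assumed):

import numpy as np

# A "continuous" image defined analytically on [0, 1] x [0, 1].
def f(x, y):
    return 0.5 + 0.5 * np.sin(2 * np.pi * 3 * x) * np.cos(2 * np.pi * 2 * y)

# Sampling: evaluate f at the nodes of a regular N x N grid.
N = 64
xs = np.linspace(0.0, 1.0, N)
ys = np.linspace(0.0, 1.0, N)
samples = f(xs[None, :], ys[:, None])          # real-valued samples in [0, 1]

# Quantization: map each sample to one of 256 discrete gray levels (8 bits).
quantized = np.clip(np.round(samples * 255), 0, 255).astype(np.uint8)
print(quantized.shape, quantized.dtype, quantized.min(), quantized.max())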
How sampling works ?
The original analog representation → measurements are made at equal intervals → discrete samples are taken from the measurements.
Image sampling & quantization
Figure 3.1(a) shows a continuous image, f (x, y), that we want to convert to digital form. To convert it to digital form, we have to sample the function in both coordinates and in amplitude. An image may be continuous with respect to the x- and y-coordinates and also in amplitude. 2/22/2009
87
Image sampling & quantization
Digitizing the coordinate values is called sampling.
Digitizing the amplitude values is called quantization.
Image sampling & quantization
Fig 3.1 Generating a digital image. (a) Continuous image. (b) A scan line from A to B in the continuous image. (c) Sampling & quantisation. (d) Digital scan line.
Image sampling & quantization
The one-dimensional function shown in Fig. 3.1(b) is a plot of amplitude (gray level) values of the continuous image along the line segment AB in Fig. 3.1(a). To sample this function, we take equally spaced samples along line AB, as shown in Fig. 3.1(c). The location of each sample is given by a vertical tick mark in the bottom part of the figure.
Image sampling & quantization
The samples are shown as small white squares superimposed on the function. The set of these discrete locations gives the sampled function. However, the values of the samples still span (vertically) a continuous range of gray-level values. In order to form a digital function, the gray-level values also must be converted (quantized) into discrete quantities. 2/22/2009
91
Image sampling & quantization
The right side of Fig. 3.1(c) shows the gray-level scale divided into eight discrete levels, ranging from black to white. The vertical tick marks indicate the specific value assigned to each of eight gray levels. The continuous gray levels are quantized simply by assigning one of the eight discrete gray levels to each sample. 2/22/2009
92
Image sampling & quantization
The assignment is made depending on the vertical proximity of a sample to a vertical tick mark. The digital samples resulting from both sampling and quantization are shown in Fig. 3.1(d) and Fig 3.2 (b).
2/22/2009
93
How to choose the spatial resolution: the Nyquist rate.
[Figure: an original image whose smallest detail has a minimum period of 2 mm, sampled at 1 mm spacing; no detail is lost.]
Nyquist rate: the sampling interval (spatial resolution) must be less than or equal to half of the minimum period of the image; equivalently, the sampling frequency must be greater than or equal to twice the maximum frequency present in the image.
Aliased frequency
x1(t) = sin(2πt), f = 1 Hz
x2(t) = sin(12πt), f = 6 Hz
Sampling rate: 5 samples/sec.
Two different frequencies, but the same sampled results!
[Figure: the two sinusoids over 0-2 s and their identical samples.]
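The aliasing example above can be reproduced numerically: sampling sin(2πt) (1 Hz) and sin(12πt) (6 Hz) at 5 samples/sec yields identical sample sequences, since 6 Hz exceeds the 2.5 Hz Nyquist limit (a small sketch, NumPy assumed):

import numpy as np

fs = 5.0                                  # sampling rate: 5 samples/sec
t = np.arange(0.0, 2.0, 1.0 / fs)         # sample instants over 2 seconds

x1 = np.sin(2 * np.pi * 1 * t)            # 1 Hz signal
x2 = np.sin(2 * np.pi * 6 * t)            # 6 Hz signal (above the 2.5 Hz Nyquist limit)

# The two sampled sequences coincide: 6 Hz aliases to 6 - 5 = 1 Hz.
print(np.allclose(x1, x2, atol=1e-9))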
Image sampling & quantization
Fig. 3.2 (a) Continuous image projected onto a sensor array. (b) Result of image sampling and quantisation.
Image digitization
• Sampling means measuring the value of an image at a finite number of points.
• Quantization is the representation of the measured value at the sampled point by an integer.
Image digitization
2/22/2009
98
Image sampling & quantization
Fig. 3.3. Coordinate convention used to represent digital images.
Image sampling & quantization
Fig. 3.4. A digital image of size M x N.
Image sampling & quantization
It is advantageous to use a more traditional matrix notation to denote a digital image and its elements.
2/22/2009
Fig. 3.5 A digital image
101
Image sampling & quantization
The number of bits required to store a digitised image is
b = M × N × k
where M and N are the number of rows and columns, respectively. The number of gray levels is an integer power of 2: L = 2^k, where k = 1, 2, …, 24. It is common practice to refer to the image as a "k-bit image".
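A quick check of b = M × N × k and L = 2^k (the image size used below is illustrative):

def image_storage_bits(M, N, k):
    """Bits needed for an M x N image with k bits per pixel; gray levels L = 2**k."""
    return M * N * k, 2 ** k

bits, levels = image_storage_bits(M=1024, N=1024, k=8)
print(levels, "gray levels;", bits, "bits =", bits // 8 / 2**20, "MiB")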
Image sampling & quantization
The spatial resolution of an image is the physical size of a pixel in that image, i.e., the area in the scene that is represented by a single pixel; it is the smallest discernible detail in an image. Sampling is the principal factor determining spatial resolution. Gray-level resolution refers to the smallest discernible change in gray level (often a power of 2). Dense sampling produces a high-resolution image with many pixels, each of which represents a small part of the scene. Coarse sampling produces a low-resolution image with few pixels, each of which represents a relatively large part of the scene.
Image sampling & quantization
Fig. 3.6 Effect of resolution on image interpretation. (a) 8x8 image. (b) 32x32 image. (c) 256x256 image.
Effect of sampling
[Figure: the same image at 256x256, 64x64, and 16x16 resolution.]
Examples of Sampling
[Figure: the same image at 256x256, 128x128, 64x64, and 32x32 pixels.]
Effect of spatial resolution
2/22/2009
107
Effect of spatial resolution
2/22/2009
108
Can we increase spatial resolution by interpolation ?
2/22/2009
109
Down-sampling is an irreversible process.
Image Sampling
[Figure: an original image and versions sampled by factors of 2, 4, and 8.]
Image sampling & quantization
Fig. 3.7 Effect of quantisation on image interpretation. (a) 4 levels. (b) 16 levels. (c) 256 levels.
Effect of Quantization
[Figure: the same image at 8, 4, and 2 bits/pixel.]
Effect of quantization levels
[Figure: the same image at 256, 128, 64, 32, 16, 8, 4, and 2 levels.]
In the coarsely quantized images it is easy to see false contouring.
Image quantization
[Figure: the same image at 256 (8 bits/pixel), 32 (5 bits/pixel), 16 (4 bits/pixel), 8 (3 bits/pixel), 4 (2 bits/pixel), and 2 (1 bit/pixel) gray levels.]
Image representation
The result of sampling and quantisation is a matrix of integer numbers as shown in Fig.3.3, Fig.3.4. and Fig 3.5. The values of the coordinates at the origin are (x,y) = (0,0). The next coordinate values along the first row are (x,y) = (0,1). The notation (0,1) is used to signify the 2nd sample along the 1st row. 2/22/2009
116
Image representation
Images can be represented by 2D functions of the form f(x,y). The physical meaning of the value of f at spatial coordinates (x,y) is determined by the source of the image.
Image representation
In a digital image, both the coordinates and the image value become discrete quantities. Images can now be represented as 2D arrays (matrices) of integer values: I[i,j] (or I[r,c]). The term gray level is used to describe monochromatic intensity.
[Example: an image region and its 8x8 array of gray-level values:
 62  79  23 119 120 105   4   0
 10  10   9  62  12  78  34   0
 10  58 197  46  46   0   0  48
176 135   5 188 191  68   0  49
  2   1   1  29  26  37   0  77
  0  89 144 147 187 102  62 208
255 252   0 166 123  62   0  31
166  63 127  17   1   0  99  30 ]
How to select the suitable size and pixel depth of images
The word "suitable" is subjective: it depends on the "subject".
[Examples: a low-detail image (Lena), a medium-detail image, and a high-detail image (cameraman).]
To satisfy the human visual system:
1. For images of the same size, a low-detail image may need more pixel depth.
2. As the image size increases, fewer gray levels may be needed.
The pixel
Sample location and sample values combine to make the picture element, or pixel.
3 color samples per pixel:
– 1 RED sample
– 1 GREEN sample
– 1 BLUE sample
Information about pixels is stored in a rectangular pattern and displayed on the screen in rows called rasters (from Spalter).
The pixel
Monitor pixels are actually circular light representations of red, green and blue phosphors. Pixel density is measured in Dots Per Inch (DPI). Pixel size is measured as the dot pitch. DPI and dot pitch have an inverse relationship: higher DPI means smaller dot pitch.
Image characteristics Each pixel is assigned a numeric value (bit depth) that represents a shade of gray based on the attenuation characteristics of the volume of tissue imaged
2/22/2009
122
Pixel depth
The number of bits determines the number of shades of gray the system is capable of displaying in the digital images. 10- and 12-bit pixels can display 1024 and 4096 shades of gray, respectively. Increasing pixel bit depth improves image quality.
Bit-Depth
Number of bits used to represent a pixel's color:
Expression   Name    Colors
2^1          1-bit   2
2^4          4-bit   16
2^6          6-bit   64
Bit-Depth
Number of bits used to represent a pixel's color:
Expression   Name                   Colors
2^8          8-bit                  256
2^16         16-bit                 65,536
2^24         24-bit (True Color)    about 16 million
Digital image characteristics
A digital image is displayed as a combination of rows and columns known as a matrix. The smallest component of the matrix is the pixel (picture element). The location of a pixel within the image matrix corresponds to an area within the patient or volume of tissue, referred to as a voxel.
Matrix size For a given field of view, a larger matrix size includes a greater number of smaller pixels. 2/22/2009
127
Color Fundamentals
Color is used heavily in human vision. The visible spectrum for humans is 400 nm (blue) to 700 nm (red). Machines can "see" much more: e.g., X-rays, infrared, radio waves.
HVS
Color perception: light hits the retina, which contains photosensitive cells. These cells convert the spectrum into a few discrete values.
HVS
There are two types of photosensitive cells:
Cones: sensitive to colored light, but not very sensitive to dim light.
Rods: sensitive to achromatic light.
We perceive color using three different types of cones. Each type is sensitive in a different region of the spectrum: 445 nm (blue), 535 nm (green), 575 nm (red). They have different sensitivities; we are more sensitive to green than to red.
Color Fundamentals
Humans discern thousands of color shades and intensities, compared to only about two dozen shades of gray. When a beam of sunlight passes through a glass prism, the emerging beam is a continuous spectrum of colors ranging from violet at one end to red at the other. If light is achromatic, its only attribute is its intensity, or amount: what can be seen on a black-and-white television set. Gray level refers to a scalar measure of intensity that ranges from black through grays to white.
Color fundamentals
Chromatic light spans the electromagnetic spectrum from 400 to 700 nm. Three quantities describe the quality of a chromatic light source: radiance, luminance, and brightness. Radiance is the total amount of energy that flows from the light source. Luminance is the amount of energy perceived by the observer. Brightness is a subjective measure that is practically impossible to quantify; it embodies the achromatic notion of intensity. The human eye contains three types of cones: red, green and blue. Due to the absorption characteristics of the human eye, colors are seen as variable combinations of the so-called primary colors red (R), green (G), and blue (B). The wavelengths of these colors are 700 nm, 546.1 nm, and 435.8 nm, respectively (as per the CIE standard). The primary colors of light can be added to produce the secondary colors: magenta (red plus blue), cyan (green plus blue), and yellow (red plus green).
Color fundamentals
The characteristics that distinguish one color from another are brightness, hue, and saturation. Brightness embodies the notion of achromatic intensity. Hue is an attribute associated with the dominant wavelength in a mixture of light waves; it is the dominant color perceived by an observer. When we call an object red, blue, orange or yellow we are referring to its hue. Saturation means the relative purity, or the amount of white light mixed with a hue; the degree of saturation is inversely proportional to the amount of white light added. Hue and saturation taken together are called chromaticity, so a color may be characterized by its brightness and chromaticity. The amounts of red, green, and blue needed to form any particular color are called the tristimulus values, denoted X, Y, and Z, respectively. A color is characterized by its trichromatic coefficients, defined as
x = X / (X + Y + Z),   y = Y / (X + Y + Z),   z = Z / (X + Y + Z)
Color fundamentals
It follows from these equations that x + y + z = 1. Another approach for specifying colors is the CIE chromaticity diagram (see figure), which shows color composition as a function of x (red) and y (green); for any value of x and y, the corresponding value of z (blue) is obtained as z = 1 − x − y. The point marked in the figure has approximately 62% green and 25% red content; the composition of blue is 13%. The positions of the various spectrum colors, from violet at 380 nm to red at 780 nm, are indicated around the boundary of the tongue-shaped chromaticity diagram. Any point within the boundary represents some mixture of spectrum colors. The point of equal energy corresponds to equal fractions of the three primary colors; it represents the CIE standard for white light. Any point on the boundary of the chromaticity chart is fully saturated. As a point moves toward the point of equal energy, more and more white is added to the color and it becomes less saturated; saturation at the point of equal energy is zero.
CIE Chromaticity model
The Commission Internationale de l'Eclairage defined 3 standard primaries X, Y, Z that can be added to form all visible colors. Y was chosen so that its color-matching function matches the summed response of the three human cone types.
[X; Y; Z] = [0.6067 0.1736 0.2001; 0.2988 0.5868 0.1143; 0.0000 0.0661 1.1149] [R; G; B]
[R; G; B] = [1.9107 −0.5326 −0.2883; −0.9843 1.9984 −0.0283; 0.0583 −0.1185 0.8986] [X; Y; Z]
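A sketch applying the RGB→XYZ matrix above and then the trichromatic coefficients x = X/(X+Y+Z), etc. (NumPy assumed):

import numpy as np

RGB_TO_XYZ = np.array([[0.6067, 0.1736, 0.2001],
                       [0.2988, 0.5868, 0.1143],
                       [0.0000, 0.0661, 1.1149]])

def chromaticity(rgb):
    """Convert an (R, G, B) triple to CIE XYZ and then to chromaticity coordinates (x, y, z)."""
    xyz = RGB_TO_XYZ @ np.asarray(rgb, dtype=float)
    return xyz / xyz.sum()

print(chromaticity([1.0, 1.0, 1.0]))   # equal RGB maps near the center of the diagram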
CIE Chromaticity model
x, y, z normalize X, Y, Z such that x + y + z = 1; actually only x and y are needed because z = 1 − x − y. Pure colors are at the curved boundary. White is (1/3, 1/3, 1/3).
Color fundamentals
Color models provide a standard way of specifying a particular color using a 3D coordinate system.
Hardware oriented:
- RGB: additive system (add colors to black), used for displays.
- CMY: subtractive system, used for printing.
- YIQ: used for TV, and good for compression.
Image-processing oriented:
- HSI: a good perceptual space for art, psychology and recognition.
Color fundamentals • Primary Colors
2/22/2009
138
Color fundamentals
In the RGB model, each color appears in its primary spectral components of red, green, and blue. The model is based on a Cartesian coordinate system; the color subspace of interest is the unit cube. R, G and B are the three primary colors, and the secondary colors cyan, magenta, and yellow are located at corners of the cube. In this model the gray scale extends from black to white along the line that joins the origin to (1,1,1). All RGB values are assumed to be in the range [0, 1]. An image represented in RGB color consists of three component images, one for each primary color; when fed to an RGB monitor, these three images combine to produce a composite color image. The number of bits used to represent each pixel in RGB space is called the pixel depth; each RGB pixel has a depth of 24 bits.
Color fundamentals
Secondary colors (additive synthesis):
2/22/2009
140
Color fundamentals
Secondary colors (additive synthesis) – adding primary colors:
(none)      = black
B           = blue
G           = green
G + B       = cyan
R           = red
R + B       = magenta
R + G       = yellow
R + G + B   = white
Color fundamentals
Secondary colors (additive synthesis) – weighted adding of primary colors:
0.5·R + 0.5·G + 0.5·B = grey
1.0·R + 0.2·G + 0.2·B = brown
0.5·R + 1.0·G + 0.0·B = lime
1.0·R + 0.5·G + 0.0·B = orange
Color fundamentals Color images can be represented by 3D Arrays (e.g. 320 x 240 x 3)
2/22/2009
143
Color fundamentals - RGB
An additive model. An image consists of 3 bands, one for each primary color. Appropriate for image displays.
Color fundamentals - CMY
Primary colors (subtractive synthesis):
2/22/2009
145
Color fundamentals
Cyan, magenta, and yellow are the secondary colors of light, or alternatively the primary colors of pigments. For example, when a surface coated with cyan pigment is illuminated with white light, no red light is reflected from the surface: cyan subtracts red light from the reflected white light, which itself is composed of equal amounts of red, green, and blue light. Most devices that deposit colored pigments on paper, such as color printers and copiers, require CMY data input or perform an RGB-to-CMY conversion internally. This conversion is performed using the simple operation shown below.
CMY model
Cyan-Magenta-Yellow is a subtractive model, good for modelling the absorption of colors; it is appropriate for printing on paper. The assumption here is that all color values are normalized to the range [0, 1]:
[C; M; Y] = [1; 1; 1] − [R; G; B]
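The CMY conversion above written as code (RGB values assumed normalized to [0, 1]; NumPy assumed):

import numpy as np

def rgb_to_cmy(rgb):
    """CMY = 1 - RGB, with all components normalized to the range [0, 1]."""
    return 1.0 - np.asarray(rgb, dtype=float)

print(rgb_to_cmy([1.0, 0.0, 0.0]))   # pure red -> (C, M, Y) = (0, 1, 1)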
Color fundamentals - CMYK
Equal amounts of the pigment primaries cyan, magenta, and yellow should produce black. In practice, combining these colors for printing produces a muddy-looking black, so in order to produce true black a fourth color, black, is added, giving rise to the CMYK color model. When publishers talk about "four-color printing" they are referring to the three colors of the CMY model plus black.
Color fundamentals - HSI
The RGB and CMY color models are ideally suited to hardware implementations, and RGB matches the fact that the human eye is strongly perceptive to red, green, and blue components. Unfortunately, these and similar color models are not well suited to describing colors in terms that are practical for human interpretation. For example, one does not refer to the color of an object by giving the percentage of each of the primaries composing its color; we do not think of color images as being composed of three primary images that combine to form a single image.
Color fundamentals - HSI
When humans view a color object, we describe it by its hue, saturation, and brightness. Hue is a color attribute that describes pure color (pure red, orange, or yellow). Saturation gives a measure of the degree to which a pure color is diluted by white light. Brightness is a subjective descriptor that is practically impossible to measure; it embodies the achromatic notion of intensity and is one of the key factors in the sensation of color. We do know that intensity (gray level) is the most useful descriptor of monochromatic images: this quantity is easily measurable and interpretable. A model that decouples the intensity component from the color-carrying information (hue and saturation) in a color image is HSI. As a result, the HSI model is an ideal tool for developing image processing algorithms based on color descriptions that are natural and intuitive to humans.
Color fundamentals
To summarize: RGB is ideal for image generation (image capture by a color camera or image display on a monitor screen), but its use for color description is much more limited. An RGB color image can be viewed as three monochromatic intensity images. In the RGB model, the line joining the black and white vertices represents the intensity axis. To determine the intensity of any color point, pass a plane perpendicular to the intensity axis through it; this gives an intensity value in the range [0, 1]. The saturation of a color increases as a function of distance from the intensity axis: the saturation of points on the intensity axis is zero, and saturation is the length of the vector from the origin to the point, where the origin is defined by the intersection of the color plane with the intensity axis. Hue can also be determined from an RGB point: consider the plane formed by three points (black, white, and a color such as cyan); all colors generated by three colors lie in the triangle defined by those colors. The hue of a point is usually determined by an angle from some reference direction: an angle of 0 from the red axis designates zero hue, and hue increases counterclockwise from there.
Color fundamentals
Uniform: equal (small) steps give the same perceived color changes. Hue is encoded as an angle (0 to 2π). Saturation is the distance to the vertical axis (0 to 1). Intensity is the height along the vertical axis (0 to 1).
Color fundamentals - HSI
The three important components of the HSI color space are the vertical intensity axis, the length of the vector to the color point, and the angle this vector makes with the red axis. To summarize, HSI (hue, saturation, intensity) values are non-linear functions of RGB, and hue relations are naturally expressed on a circle:
I = (R + G + B) / 3
S = 1 − min(R, G, B) / I
H = cos⁻¹ { (1/2)[(R − G) + (R − B)] / sqrt((R − G)² + (R − B)(G − B)) },   with H = 360° − H if B > G
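A sketch of the HSI formulas above (R, G, B assumed normalized to [0, 1]; the B > G case flips the hue angle; NumPy assumed):

import numpy as np

def rgb_to_hsi(r, g, b, eps=1e-10):
    """Convert normalized (r, g, b) to (hue in degrees, saturation, intensity)."""
    i = (r + g + b) / 3.0
    s = 1.0 - min(r, g, b) / (i + eps)
    num = 0.5 * ((r - g) + (r - b))
    den = np.sqrt((r - g) ** 2 + (r - b) * (g - b)) + eps
    h = np.degrees(np.arccos(np.clip(num / den, -1.0, 1.0)))
    if b > g:
        h = 360.0 - h
    return h, s, i

print(rgb_to_hsi(1.0, 0.0, 0.0))   # pure red: hue 0, saturation 1, intensity 1/3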
Color fundamentals - HSI
(Left) Image of food originating from a digital camera. (Center) Saturation value of each pixel decreased 20%. (Right) Saturation value of each pixel increased 40%. 2/22/2009
154
Color fundamentals - YIQ
YIQ has better compression properties: luminance Y is encoded using more bits than the chrominance values I and Q (humans are more sensitive to Y than to I and Q). Luminance alone is used by black-and-white TVs; all 3 values are used by color TVs.
[Y; I; Q] = [0.299 0.587 0.114; 0.596 −0.275 −0.321; 0.212 −0.532 0.311] [R; G; B]
Color fundamentals - Summary
To print: RGB → CMY or grayscale.
To compress images: RGB → YUV; color information (U, V) can be compressed 4 times without significant degradation in perceptual quality.
Color description: RGB → HSI.
To compare images: RGB → CIE Lab; the CIE Lab space is more perceptually uniform, hence Euclidean distance in Lab space is meaningful.
Storing Images
With traditional cameras, the film is used both to record and to store the image. With digital cameras, separate devices perform these two functions: the image is captured by the image sensor, then stored in the camera on a storage device of some kind. We look at many of the storage devices currently used.
Removable vs. fixed storage. Older and less expensive cameras have built-in fixed storage that can't be removed or increased. This greatly reduces the number of photos you can take before having to erase some to make room for new ones. Almost all newer digital cameras use some form of removable storage media, usually flash memory cards, but occasionally small hard disks, CDs, or variations of the floppy disk. Whatever its form, removable media lets you remove one storage device when it is full and insert another.
Storing Images
The number of images you can take is limited only by the number of storage devices you have and the capacity of each. The number of images that you can store in a camera depends on a variety of factors, including (1) the capacity of the storage device (expressed in megabytes), (2) the resolution at which pictures are taken, and (3) the amount of compression used. The number you can store matters because once you reach the limit you have no choice but to stop taking pictures or erase some existing ones to make room for new ones. How much storage capacity you need depends partly on what you use the camera for.
Storing images
The advantages of removable storage are many. They include the following: (1) They are erasable and reusable (2) They are usually removable, so you can remove one and plug in another so storage is limited only by the number of devices you have. (3) They can be removed from the camera and plugged into the computer or printer to transfer the images.
2/22/2009
159
Storing Images
Flash Card Storage:
As the popularity of digital cameras and other handheld devices has increased, so has the need for small, inexpensive memory devices. The type that has caught on is flash memory, which uses solid-state chips to store your image files. Although flash memory chips are similar to the RAM chips used inside your computer, there is one important difference: they don't require batteries and don't lose images when the power is turned off. Your photographs are retained indefinitely without any power to the flash memory components. These chips are packaged inside a case equipped with electrical connectors, and the sealed unit is called a card. Flash memory cards consume little power, take up little space, and are very rugged. They are also very convenient: you can carry lots of them with you and change them as needed.
Storing images
Until recently, most flash cards have been in the standard PC Card format that is widely used in network computers. However, with the growth of the digital camera and other markets, a number of smaller formats have been introduced. As a result of this competition, cameras support a confusing variety of incompatible flash memory cards, including the following: PC Cards, CompactFlash, SmartMedia, Memory Sticks, and xD-Picture Cards. Each of these formats is supported by its own group of companies and has its own following.
Storing images
PC Cards: PC Cards have the highest storage capacities, but their large size has led to their being used mainly in professional cameras.
CompactFlash cards: they are generally the most advanced flash storage devices for consumer-level digital cameras. CompactFlash cards and slots that are 3.3 mm thick are called CompactFlash (abbreviated CF) or CompactFlash Type I (abbreviated CF-I); those that are 5 mm thick are called Type II.
SmartMedia cards: they are smaller than CompactFlash cards and generally don't come with storage capacities quite as high.
Storing Images
Sony Memory Sticks, shaped something like a stick of gum, are currently used mainly in Sony products.
xD-Picture Cards: the xD-Picture Cards are the smallest of the memory cards and are used in very small cameras; the format was developed jointly by Fuji and Olympus.
Memory card storage cases: cards are easy to misplace, and the smaller they are, the easier they are to lose if you don't find a way to store them safely. One way to keep them safe is to use an inexpensive storage case.
Storing Images
Hard disk storage: one of the current drawbacks of compact flash memory cards is their limited storage capacity. For high-resolution cameras this is a real drawback. One solution is high-speed, high-capacity hard disk drives. Until recently these drives were too large and expensive to be mounted inside cameras, but that changed with IBM's introduction of the Microdrive hard disk drive. These drives, now owned by Hitachi, are smaller and lighter than a roll of film. In fact, they are so small that they can be plugged into a Type II CompactFlash slot in a digital camera or flash card reader. The Hitachi Microdrive fits a CF-II slot and is a marvel of engineering.
Storing Images
Optical storage disks
CDs are used in a few cameras and have the advantage that they can be read in any system with a CD drive. The disks are write-once, with archival quality and no danger of important files being deleted or written over. Sony's line of Mavicas uses CD discs for storage.
Temporary storage: portable digital image storage and viewing devices are advancing rapidly, which is good because they meet a real need. When you are photographing and your storage device becomes filled with images, you need a place to temporarily store them until they are transferred to your main system. One device used for this is a notebook computer: many of us have one, and its large screen and ability to run any software are advantages. However, a notebook computer is not always the ideal temporary storage device, because of its weight, short battery life, and long startup time. Hence the introduction of the portable hard drive.
Storing Images
FlashTrax from SmartDisk is one of the new multimedia storage/viewer devices. To use one of these devices you insert your memory card into a slot, often using an adapter, and quickly transfer your images. You can then erase your camera's storage device to make room for new images and resume shooting. When you get back to your permanent setup, you copy or move your images from the intermediate storage device to the system you use for editing, printing, and distributing them. The speed with which you transfer depends on the connections supported by the device: most support USB 2 and some support FireWire. The latest trend is to incorporate image storage into multipurpose devices. Many of these devices let you review the stored images on the device itself or on a connected TV; some also let you print the images directly on a printer without using a computer.
Storing Images
The trend is to go even further and combine digital photos, digital videos, and MP3 music in the same device. With a device like this, one will be able to create slide shows with special transitions, pans, and accompanying music and play them back anywhere. One way to eliminate or reduce the need for intermediate storage is to use a higher-capacity storage device in the camera. For example, some devices store many gigabytes of data, enough to hold hundreds of large photos.
Storing Images
The key questions to ask when considering an intermediate storage device are:
(1) What is its storage capacity? What is the cost per megabyte of storage?
(2) Does it have slots or adapters for the storage devices you use?
(3) Does it support the image formats you use? Many support common image formats like JPEG but not proprietary formats such as Canon's RAW and Nikon's NEF.
(4) Does it support video and MP3 music playback? Does it support your camera's movie format, if it has one?
(5) What is the transfer rate, and how long does it take to transfer the images from a card to the device?
(6) Can it display images on a TV set or be connected directly to a printer?
Storing Images
(7) If it connects to a TV, does it have a remote control?
(8) Can you view images on the device's own screen?
(9) Are there ways to rotate, zoom in/out and scroll?
Introduction to double buffering
BitBlt -> stands for Bit-Block Transfer. It means that a "block" of bits, describing a rectangle in an image, is copied in one operation. Usually the graphics card supports this command in hardware. There is a function of this name in the Win32 API, which also occurs in MFC, but the FCL does not provide this function to you directly. Nevertheless it is essentially packaged for use as the Graphics.DrawImage method.
Introduction to double buffering
Memory DC -> DC means "device context". This is represented in the FCL as a Graphics object. So far, the Graphics objects we have used in our programs usually corresponded to the screen, and in one lecture we used a Graphics object that corresponded to the printer. But it is possible to create a Graphics object that does not correspond to a physical device. Instead, it just has an area of RAM (called a buffer) that it writes to instead of writing to video RAM on the graphics card. When you (for example) draw a line or fill a rectangle in this Graphics object, nothing changes on the screen (even if you call Invalidate), since the memory area being changed by the graphics calls is not actually video RAM and has no connection with the monitor. This "virtual Graphics object" is loosely referred to as a memory DC. It is a "device" that exists only in memory; but usually its pixel format does correspond to a physical device such as the screen, so that when data is copied from this buffer to video memory, it is correctly formatted.
Double buffering
We can use images as offscreen drawing surfaces by storing them as pictures. This allows us to render any image, including text and graphics, to an offscreen buffer that we can display at a later time. The advantage of doing this is that the image is seen only when it is complete. Drawing a complicated image could take several milliseconds or more, which can be seen by the user as flashing and flickering. This flashing is distracting and causes the user to perceive the rendering as slower than it actually is. Using an offscreen image to reduce flicker is called double buffering, because the screen is considered a buffer for pixels, and the offscreen image is the second buffer, where we can prepare pixels for display.
What is double buffering ? Double buffering - > “Double buffering” refers to the technique of writing into a memory DC and then BitBlt-ing the memory DC to the screen. This works as follows: your program can take its own sweet time writing to a memory DC, without producing any delay or flicker on the screen. When the picture is finally complete, the program can call BitBlt and bang! Suddenly (at the next vertical retrace interval) the entire contents of the memory DC’s buffer are copied to the appropriate part of video RAM, and at the next sweep of the electron gun, the picture appears on the screen. This technique is known as double buffering. The name is appropriate because there are two buffers involved: one on the graphics card (video RAM) and one that is not video RAM, and the second one is a “double” of the first in the sense that it has the same pixel format. 2/22/2009
What is double buffering ? [Some books reserve this term for a special case, in which the graphics card has two buffers that are alternately used to refresh the monitor, eliminating the copying phase. But most books use the term double buffering for what we have described.] Whatever is stored in the memory DC will not be visible, unless and until it gets copied to the DC that corresponds to the screen. This is done with BitBlt, so that the display happens without flicker.
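The following is a minimal sketch of the same idea in Python using Pillow and Tkinter rather than the Win32/FCL objects discussed above (the library choice, the window size, and the drawing itself are illustrative assumptions, not the lecture's code): everything is drawn into an offscreen buffer first, and only the finished picture is copied to the window in one step.

```python
import tkinter as tk
from PIL import Image, ImageDraw, ImageTk  # Pillow is assumed to be installed

WIDTH, HEIGHT = 400, 300

root = tk.Tk()
canvas = tk.Canvas(root, width=WIDTH, height=HEIGHT)
canvas.pack()

# The "memory DC": an image buffer that exists only in RAM.
buffer = Image.new("RGB", (WIDTH, HEIGHT), "white")
draw = ImageDraw.Draw(buffer)

# Slow, complicated drawing happens offscreen; nothing flickers on screen.
for i in range(0, WIDTH, 10):
    draw.line([(i, 0), (WIDTH - i, HEIGHT)], fill="blue")

# The "BitBlt" step: copy the finished buffer to the visible window at once.
photo = ImageTk.PhotoImage(buffer)
canvas.create_image(0, 0, image=photo, anchor="nw")

root.mainloop()
```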
Why use double buffering ? Double buffering can be used whenever the computations needed to draw the window are time-consuming. Of course, you could always use space to replace time, by storing the results of those computations. That is, in essence, what double-buffering does. The end result of the computations is an array of pixel information telling what colors to paint the pixels. That’s what the memory DC stores. This situation arises all the time in graphics programming. All three-dimensional graphics programs use double-buffering. MathXpert uses it for two-dimensional graphics. We will soon examine a computer graphics program to illustrate the technique. Another common use of double-buffering is to support animation. If you want to animate an image, you need to “set a timer” and then at regular time intervals, use BitBlt to update the screen to show the image in the next position. 2/22/2009 175
Why use double buffering?
The BitBlt will take place during the vertical retrace interval. Meanwhile, between "ticks" of the timer, the next image is being computed and drawn into the memory DC. In the next lecture, this technique will be illustrated. When should the memory DC be created? Obviously it has to be created when the view is first created. Less obviously, it has to be created again when the view window is resized, because the memory DC is the same size as the screen DC, which is the same size as the client area of the view window. When this changes, you must change your memory DC accordingly. When you create a new memory DC, you must also destroy the old one, or else memory will soon be used up by all the old memory DCs (if the user plays around resizing the window many times).
What is double buffering ? It turns out that every window receives a Resize event when it is being resized, and it also receives a Resize event soon after its original creation (when its size changes from zero by zero to the initial size). Therefore, we add a handler for the Resize event and put the code for creating the memory DC in the Resize message handler.
Cathode Ray Tube
Liquid crystal display LCD (Liquid Crystal Display) panels are "transmissive" displays, meaning they aren't their own light source but instead rely on a separate light source and then let that light pass through the display itself to your eye. We can start to describe how an LCD panel works by starting with that light source. The light source is a very thin lamp called a "back light" that sits directly behind the LCD panel as shown in Figure 1. Figure 1
LCD The light from the backlighting then passes through a polarizing filter (a filter
that aligns the light waves in a single direction). From there the now polarized light then passes through the actual LCD panel itself. The liquid crystal portion of the panel either allows the polarized light to pass through or blocks the light from passing through depending on how the liquid crystals are aligned at the time the light tries to pass through. See Figure 2. Figure 2
LCD
The liquid crystal portion of the panel is split up into tiny individual cells that are each controlled by a tiny transistor to supply current. Three cells side by side represent one "pixel" (individual picture element) of the image. An 800 x 600 resolution LCD panel would have 480,000 pixels, and each pixel would have three cells, for a total of 1,440,000 individual cells. Red, green and blue are the primary colors of light; all other colors are made up of a combination of the primary colors. An LCD panel uses these three colors to produce color, which is why there are three cells per pixel: one cell each for red, green, and blue. Once the light has passed through the liquid crystal layer and the final polarizing filter, it passes through a color filter so that each cell represents one of the three primary colors of light. See Figure 3.
LCD
The three cells per pixel then work in conjunction to produce color. For example, if a pixel needs to be white, each transistor that controls the three color cells in that pixel would remain off, thus allowing red, green and blue to pass through. Your eye sees the combination of the three primary colors, so close in proximity to each other, as white light. If the pixel needed to be blue, for an area of an image that was going to be sky, the two transistors for the red and green cells would turn on, and the transistor for the blue cell would remain off, thus allowing only blue light to pass through in that pixel.
Pros:
1. LCD displays are very thin. They can be mounted in places traditional CRT televisions and monitors cannot.
2. Color reproduction is excellent.
3. Contrast is good, although not great.
4. Pixel structure is very small, which creates a very smooth image.
5. Durable technology.
6. No burn-in issues.
Cons:
1. Very expensive technology per square inch of viewing area.
2. Black levels and details in dark scenes are not as strong as those in competing technologies.
3. Dead pixels can be an issue, although quality has improved as the technology has matured.
4. Sizes above 40" are cost prohibitive.
LCD
Is an LCD panel right for you? It depends on your needs. Below is a list of common scenarios where an LCD panel provides the best performance, followed by a list of scenarios that might suggest the need to use a different technology.
Scenarios where an LCD flat panel will perform well:
1. Any application that will require a screen of less than 42" diagonal.
2. Installations that require the monitor/television to be built into a wall or cabinetry, and require a diagonal image of less than 42".
3. Pre-made entertainment centers and bedroom armoires.
4. Any application that requires wall mounting and requires a diagonal image of less than 42".
Scenarios where another technology might be more effective:
1. Any application that requires a large screen, larger than 40" diagonal. LCD displays get cost prohibitive for sizes above 40". If you opt to select an LCD panel of over 40", be prepared to pay.
2. Applications where the best possible image quality is needed. A CRT is still going to give the best shadow detail and color.
3. Tight budgets; CRT technology will be much less expensive per viewing area.
Printers
Inch
Type of measurement equal to 25.4 millimeters or 2.54 centimeters.
Measurement: when referring to computers, a measurement is the process of determining a dimension, capacity, or quantity of an object or the duration of a task. By using measurements, an individual can help ensure that an object is capable of fitting within an area, that a storage medium is capable of storing the necessary files, that a task will complete in the required time, or how fast a product is when compared to another product. Below is a listing of different types of computer measurements you may encounter while working with a computer or in the computer field.
Printers: PPI
Short for Pixels Per Inch, PPI is the number of pixels per inch that a pixel image is made up of. The more pixels per inch the image contains, the higher the quality of the image.
Pixel: a term that comes from the words Picture Element (PEL). A pixel is the smallest portion of an image or display that a computer is capable of printing or displaying. You can get a better understanding of what a pixel is by zooming into an image: in the example, a character image has been zoomed into at 1600%, and each of the blocks seen is a single pixel of the image. Everything on the computer display looks similar to this when zoomed in upon; the same is true with printed images, which are created by several little dots that are measured in DPI.
Pixel image: a type of computer graphic that is composed entirely of pixels.
Printers There seems to be a lot of confusion about what PPI means (apart from the fact that it means Pixels Per Inch of course). This article is for beginners in computer graphics and digital photography. Dots Per Inch usually means the maximum dots a printer can print per inch. Roughly speaking the more DPI the higher quality the print will be. DPI is for printers, PPI is for printed images. But I don't think there is an official definition of the difference. PPI and DPI are sometimes but not always the same, I'll assume for simplicity in this article that they are. However, until you print an image the PPI number is meaningless. 2/22/2009
Printers
Until you print an image the PPI number is meaningless. Imagine, for simplicity's sake, that the image below, when printed on your printer is one inch (or 2.54 cm) square:
If you count the pixels (blocks, dots) you'll find that there are 10 across the width of an image. If this was printed at the size of 1 inch it would be a 10 PPI image. Here is a 50 PPI version of the same image:
Printers
You need not count the pixels: the square above is a 50 by 50 pixel image, and if printed so that it covered exactly 1 square inch it would be a 50 PPI image. Now let's look at a 150 PPI image. PPI is simply how many pixels are printed per inch of paper. You may not be able to see the pixels (because your eyes or your printer are not of high enough quality). The above images are approximations, but you get the idea. It does not matter what the "image information" on your camera says, or what the PPI reading in your paint program says; only when you print can you really say what the PPI is. And the same image will have a different PPI when printed at different sizes.
Printers
DPI (dots per inch) is a measurement of printer resolution, though it is commonly applied, somewhat inappropriately, to monitors, scanners and even digital cameras. For printers, the DPI specification indicates the number of dots per inch that the printer is capable of achieving to form text or graphics on the printed page. The higher the DPI, the more refined the text or image will appear. To save ink, a low DPI is often used for draft copies or routine paperwork. This setting might be 300 or even 150 DPI. High resolution starts at 600 DPI for standard printers, and can far exceed that for color printers designed for turning out digital photography or other high-resolution images. 2/22/2009
Printers
In the case of monitors, DPI refers to the number of pixels present per inch of display screen. The technically correct term is "PPI", or pixels per inch, but DPI is commonly used instead. A display setting of 1280 x 1024 has about 1.3 million pixels, while a setting of 800 x 600 has 480,000, or less than half the resolution of the higher setting. With fewer pixels per inch, the picture will not have the clarity that can be achieved at a higher pixel density. This is because displays create images by using pixels: each dot or pixel reflects a certain color and brightness. The greater the pixel density, the more detailed the picture can be. A higher resolution also requires more memory and can take longer to 'paint' images, depending on the system's video card, processor and other components.
Printers Scanners also operate at different resolutions. Scan time will increase with higher DPI settings, as the scanner must collect and store more data. However, the greater the DPI, or requested resolution, the richer the resulting image. A high DPI setting mimics the original image in a truer fashion than lower DPI settings are capable of doing. If the image is to be enlarged, a high DPI setting is necessary. Otherwise the enlarged picture will look "blocky" or blurry because the software lacks information to fill in the extra space when the image is enlarged. Instead it "blows up" each pixel to "smear" it over a wider area. Technically again, the more correct term in this application is sampled PPI, but DPI is more often used. 2/22/2009
Printers
Digital cameras have their own specifications in terms of megapixels and resolution, but DPI is often mentioned in this context as well. Since DPI in all cases refers to the output image, a digital camera capable of the most basic current standards of resolution (3.0 megapixels and better) will output an image capable of taking advantage of a very high DPI setting on the printer. However, if your printer is only capable of 600 DPI, the extra resolution of the camera will be lost in the printing process. When buying or upgrading components it is therefore critical that each product is capable of supporting the highest standards of any interfacing product.
Printers
Print quality: the quality of the hard copy produced by a computer printer. Below is a listing of some of the more common reasons why print quality may differ.
1. Type of printer - Each type of printer has its own printing capabilities. With standard printers, dot matrix is commonly the lowest quality, ink jet printers are commonly average quality, and laser printers are commonly the best quality.
2. Low DPI - The printer has a low DPI.
3. Print mode - The mode in which the hard copy was produced may also affect the overall quality of the print. For example, if the mode was draft quality, the printer will print faster, but at a lower quality.
4. Available toner or ink - If the printer is low on toner or ink, the quality can be dramatically decreased.
5. Dirty or malfunctioning printer - If the printer is dirty or malfunctioning, this can also affect the quality of the print.
6. Image quality - It is important to realize that when printing a computer graphic, the quality may not be what you expect because of any of the reasons below.
• The printer does not have enough colors to produce the colors in the image. For example, some printers may only have four available inks where others may have six or more available inks. See process color.
• The image is a low quality or low resolution image.
• The image is too small and/or has too many colors in a small area.
Printers
Most people have used printers at some stage for printing documents, but few are aware of how they work. Printed documents are arguably the best way to save data. There are two basic types of printers: impact and non-impact. Impact printers, as the name implies, are those whose printing mechanism touches the paper to create an image; they were used in the early '70s and '80s. In dot matrix printers, a series of small pins strikes a ribbon coated with ink to transfer the image to the paper. Other impact printers, like character printers, are basically computerized typewriters: they have a series of bars or a ball with actual characters on them, which strike the ink ribbon to transfer the characters to the paper, one character at a time. Daisy wheel printers use a plastic or metal wheel.
Printers
These types of printers have limited usage, though, because they are limited to printing only characters or one type of font, and not graphics. There are line printers, where a chain of characters or pins prints an entire line, which makes them pretty fast, but the print quality is not as good. Thermal printers are the printers used in calculators and fax machines; they are inexpensive to use and work by pushing heated pins against special heat-sensitive paper. More efficient and advanced printers have now come out which use non-impact technology. Non-impact printers are those where the printing mechanism does not come into contact with the paper at all. This makes them quieter in operation in comparison to impact printers.
Printers In mid 1980s Inkjet printers were introduced. These have been the most widely used and popular printers so far. Colour printing got revolutionized after inkjet printers were invented. An Inkjet printer's head has tiny nozzles, which place extremely tiny droplets of ink on the paper to create an image. These dots are so small that even the diameter of human hair is bigger. These dots are placed precisely and can be up to the resolution of 1440 x 720 per inch. Different combinations of ink cartridges can be used for these printers.
Printers How an Inkjet printer works The print head in this printer scans the page horizontally back and forth and another motor assembly rolls the paper vertically in strips and thus a strip is printed at a time. Only half a second is taken to print a strip. Inkjet printers were very popular because of their ability to colour print. Most inkjets use Thermal Technology. Plain copier paper can be used in these printers unlike thermal paper used for fax machines. Heat is used to fire ink onto the paper through the print head. Some print heads can have up to 300 nozzles. Heat resistant and water based ink is used for these printers.
Printers
The latest and fastest printers are laser printers. They use the principle of static electricity for printing, as in photocopiers. The principle of static electricity is that it can be built up on an insulated object: oppositely charged atoms of objects (positive and negative) are attracted to each other and cling together. For example, pieces of nylon material cling to your body, or you get static after brushing your hair. A laser printer uses this same principle to glue ink onto the paper.
Printers
How a laser printer works: unlike earlier printers, laser printers use toner, static electricity and heat to create an image on the paper. Toner is dry ink; it contains colour and plastic particles. The toner passes through the fuser in the printer, and the resulting heat binds it to any type of paper. Printing with laser printers is fast and smudge-free, and the quality is excellent because of the high resolution they can achieve, from 300 dots per inch to almost 1200 dpi at the higher end.
Printers Basic components of a laser printer are fuser, photoreceptor drum assembly, developer roller, laser scanning unit, toner hopper, corona wire and a discharge lamp. The laser beam creates an image on the drum and wherever it hits, it changes the electrical charge like positive or negative. The drum then is rolled on the toner. Toner is picked up by charged portion of the drum and gets transferred to the paper after passing through the fuser. Fuser heats up the paper to amalgamate ink and plastic in toner to create an image. Laser printers are called "page printers" because entire page is transferred to the drum before printing. Any type of paper can be used in these printers. Laser printers popularized DTP or Desk Top Publishing for it can print any number of fonts and any graphics.. 2/22/2009
Printers This is how the computer and printer operate to print When we want to print something we simply press the command "Print". This information is sent to either RAM of the printer or the RAM of the computer depending upon the type of printer we have. The process of printing then starts. While the printing is going on, our computer can still perform a variety of operations. Jobs are put in a buffer or a special area in RAM or Random Access Memory and the printer pulls them off at its own pace. We can also line up our printing jobs this way. This way of simultaneously performing functions is called spooling. Our computer and the printer are thus in constant communication.
Printing Images In image processing, there are overlapping terms that tend to get interchanged. Especially for image and print resolution: dpi (dots per inch), ppi (pixel or points per inch), lpi (lines per inch). In addition to this, the resolution of an image is stated by its dimensions in pixels or in inches (at a certain ppi or dpi resolution). Yes, we can understand if your head is swimming. Let’s understand this: When an image is captured using either a camera or a scanner, the result is a digital image consisting of rows – known as arrays – of different picture elements that are called pixels. This array has a horizontal and vertical dimension. The horizontal size of the array is defined by the number of pixels in one single row (say 1,280) and the number of rows (say 1,024), giving the image a horizontal orientation. That picture would have a “resolution” of “1,024 x 1,280 pixels”. 2/22/2009
Printing images
The size of the image displayed depends on the number of pixels the monitor displays per inch. The "pixels per inch" resolutions (ppi) of monitors vary, and are usually in the range of 72 ppi to 120 ppi (the latter on larger 21.4" monitors). In most cases, however, with monitors the resolution is given as the number of pixels horizontally and vertically (e.g. 1,024 x 1,280 or 1,280 x 1,600). So the "size" of an image very much depends on how many pixels are displayed per inch. Thus, we come to a resolution given in 'pixels per inch', or ppi for short. With LCD monitors, the ppi resolution is fixed and can't be adjusted (at least not without a loss of display quality). With CRT monitors you have more flexibility (we won't go into this further). When an image is printed, its physical size depends upon how many image pixels we put down on paper, but also on how an individual image pixel is laid down on the paper.
How are image pixels produced by printer dots?
There are only a few printing technologies where a printer can directly produce a continuous color range within an individual printed image pixel. Most other types of printers reproduce the color of a pixel in an image by approximating the color with an n x n matrix of fine dots, using a specific pattern and a certain combination of the basic colors available to the printer. If we want to reproduce a pixel of an image on paper, we not only have to place a physical printer's 'dot' on paper, but also have to give that 'dot' the tonal value of the original pixel. With bitonal images, that is easy: if the pixel value is 0, you lay down a black printed dot, and if the pixel is 1, you omit the dot. However, if the pixel has a gray value (say 128 out of 256), and you print with a black-and-white laser printer (just to make the explanation a bit simpler), we must find a different way. This technique is called rasterization or dithering. To simulate different tonal values (let's just stick to black-and-white for the moment), a number of printed dots are placed in a certain pattern on the paper to reproduce a single pixel of the image. In a low-resolution solution, we could use a matrix of 3 printed dots by 3 printed dots per pixel.
How are image pixels produced by printer dots?
Using more printed dots per image pixel allows for more tonal values. With a pattern of 6 x 6 dots, you get 37 tonal grades (which is sufficient). For better differentiation, let's call the matrix of printer dots representing a pixel of the image a raster cell. Now we see why a printer's "dots per inch" (dpi) resolution has to be much higher than the resolution of a display (where a single dot on the screen may be used to reproduce a single pixel of an image), since the individual screen dot (also called a pixel) may take on different tonal or brightness values directly. When you print with a device using a relatively low resolution for grayscale or colored images, you must make a trade-off between a high-resolution image (having as many "raster cells per inch" as possible) and larger raster cells providing more tonal values per cell.
How are image pixels produced by printer dots?
The image impression may be improved when the printer is able to vary the size of its dots. This is done on some laser printers, as well as on some of today's photo inkjet printers. If the dot size can be varied (also called modulated), fewer dots (n x n) are needed to create a certain number of different tonal values, which results in a finer raster; you may also achieve more tonal values from a fixed raster cell size. There are several different ways (patterns) to place single printed dots in a raster cell, and the pattern for this dithering is partly a secret of the printer driver. The dithering dot pattern is less visible and more photo-like when the pattern is not the same for all raster cells having the same tonal value, but is modified from raster cell to raster cell in some random way.
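Below is a minimal sketch of the raster-cell idea described above, assuming a 3 x 3 cell and an illustrative threshold matrix (the matrix values and the function name are not from the lecture): a gray value is mapped to the number of black dots switched on inside the cell, giving 10 tonal levels.

```python
import numpy as np

# Illustrative 3x3 threshold matrix: the order in which dots are switched on.
THRESHOLDS = np.array([[1, 6, 3],
                       [8, 5, 9],
                       [4, 2, 7]])

def raster_cell(gray_value, levels=256):
    """Map one pixel (0 = black .. levels-1 = white) to a 3x3 pattern of dots.
    Returns a 3x3 array where 1 = print a black dot, 0 = leave paper white."""
    # Darker pixels need more printed dots: scale darkness to 0..9 dots.
    dots_needed = round((1 - gray_value / (levels - 1)) * 9)
    return (THRESHOLDS <= dots_needed).astype(int)

print(raster_cell(128))   # mid-gray: about half of the 9 dots are printed
print(raster_cell(255))   # white: no dots printed
```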
Linear Systems
Linear Space Invariant System
This Property holds
Convolution in 1 Dimension
Let's look at some examples of convolution integrals:
$$f(x) = g(x) \otimes h(x) = \int_{-\infty}^{\infty} g(x')\,h(x - x')\,dx'$$
So there are four steps in calculating a convolution integral:
#1. Fold h(x') about the line x' = 0
#2. Displace h(x') by x
#3. Multiply h(x - x') by g(x')
#4. Integrate
Math of Convolution
$$g(x) = h * f(x) = \sum_{n} h(n)\, f(x - n)$$
Example filter h = [1, 2, 1]:  h(-1) = 1,  h(0) = 2,  h(1) = 1
Convolution (1D)
Filter coefficients (mask, kernel, template, window): 1 2 1
Input signal / image row: 1 1 2 2 1 1 2 2 1 1
Output signal / image row (filter response, at positions where the filter fits): 5 7 7 5 5 7 7 5
Math of 2D Convolution/Correlation
Convolution:
$$g(x, y) = h * f(x, y) = \sum_{m}\sum_{n} h(m, n)\, f(x - m,\, y - n)$$
Correlation:
$$g(x, y) = h \circ f(x, y) = \sum_{m}\sum_{n} h(m, n)\, f(x + m,\, y + n)$$
Correlation (1D)
This process is called correlation!
Filter: 1 2 1
Input signal: 1 1 2 2 1 1 2 2 1 1
Output (filter response): 5 7 7 5 5 7 7 5
Correlation Vs Convolution (filter 1 2 1, signal 1 1 2 2 1 1 2 2 1 1)
Correlation:
$$g(x) = h \circ f(x) = \sum_{n} h(n)\, f(x + n)$$
Convolution:
$$g(x) = h * f(x) = \sum_{n} h(n)\, f(x - n)$$
In image processing we use CORRELATION but (nearly) always call it CONVOLUTION! Note: when the filter is symmetric, correlation = convolution.
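A quick numerical sketch of the 1-D example above using NumPy (the filter and signal are the ones from the slide; the use of NumPy itself is an assumption). Because the filter is symmetric, correlation and convolution give the same response.

```python
import numpy as np

h = np.array([1, 2, 1])                        # filter (mask, kernel)
f = np.array([1, 1, 2, 2, 1, 1, 2, 2, 1, 1])   # input signal / image row

conv = np.convolve(f, h, mode='valid')         # convolution (filter flipped)
corr = np.correlate(f, h, mode='valid')        # correlation (no flip)
print(conv)                                    # [5 7 7 5 5 7 7 5]
print(corr)                                    # identical, since h is symmetric
```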
Correlation on images
Correlation is the process of moving a filter mask over the image and computing the sum of products at each location. In convolution, the filter is first rotated by 180 degrees.
Filter (3 x 3 averaging mask), (1/9) times:
1 1 1
1 1 1
1 1 1
Input image:
1 2 1 1 2
2 1 0 2 5
0 4 1 1 3
1 2 0 0 1
3 2 1 2 2
Output: at the first mask position the response is 12/9; moving the mask one row down gives 11/9, and so on.
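The same 2-D example can be checked numerically; the sketch below uses SciPy's correlate2d (an assumption; any direct sum-of-products loop would do) with the input matrix and averaging mask from the slide.

```python
import numpy as np
from scipy.signal import correlate2d

f = np.array([[1, 2, 1, 1, 2],
              [2, 1, 0, 2, 5],
              [0, 4, 1, 1, 3],
              [1, 2, 0, 0, 1],
              [3, 2, 1, 2, 2]], dtype=float)
h = np.ones((3, 3)) / 9.0              # 3x3 mean (box) filter

g = correlate2d(f, h, mode='valid')    # only positions where the mask fits
print(np.round(g * 9))                 # 3x3 neighbourhood sums: 12, 13, ... / 11, ...
print(g[0, 0], g[1, 0])                # 12/9 and 11/9, as on the slide
```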
Applications of Convolution/Correlation
– Blurring
– Edge detection
– Template matching
Blurring (smoothing)
Also known as: smoothing kernel, mean filter, low pass filter.
The simplest filter (spatial low pass filter), (1/9) times:
1 1 1
1 1 1
1 1 1
Another mask (Gaussian filter), (1/16) times:
1 2 1
2 4 2
1 2 1
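As a sketch of how these two masks are applied in practice (the random test image and the SciPy call are assumptions, not part of the lecture):

```python
import numpy as np
from scipy.ndimage import convolve

mean_kernel = np.ones((3, 3)) / 9.0
gauss_kernel = np.array([[1, 2, 1],
                         [2, 4, 2],
                         [1, 2, 1]]) / 16.0

image = np.random.rand(64, 64)                          # stand-in grayscale image
blurred_mean = convolve(image, mean_kernel, mode='reflect')
blurred_gauss = convolve(image, gauss_kernel, mode='reflect')
```

A larger kernel gives a stronger blur, which matches the "degree of blurring = kernel size" remark on the next slide.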
Applications of smoothing
Blurring to remove identity or other details Degree of blurring = kernel size
Show: camera, mean, convolution 2/22/2009
221
Applications of smoothing
Preprocessing: enhance objects Smooth + Thresholding
2/22/2009
222
Uneven illumination
Goal: improve segmentation. Uneven illumination can occur within an image or between images.
Solution: "remove the background".
Algorithm: g(x,y) = f(x,y) − f̄(x,y), where f̄(x,y) is the local mean. Use a big kernel for f̄(x,y), e.g., 10-50 pixels (in ImageJ: mean filter of 50, subtract, threshold). A minimal sketch follows below.
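The sketch below implements the background-removal idea just described, estimating the uneven illumination with a large mean filter and subtracting it (the random test image and the kernel size of 50 are assumptions following the slide's hint):

```python
import numpy as np
from scipy.ndimage import uniform_filter

image = np.random.rand(256, 256)              # stand-in for an unevenly lit image
background = uniform_filter(image, size=50)   # big-kernel local mean, f_bar(x, y)
corrected = image - background                # g(x, y) = f(x, y) - f_bar(x, y)
```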
Uneven illumination (figure): input f(x,y), its local mean f̄(x,y), the difference f(x,y) − f̄(x,y), and the resulting edges.
Application of smoothing
Remove noise
Correlation application: Template Matching
Template Matching
The filter is called a template or a mask.
(Figure: input image, template, output, and the output rendered as a 3D surface.)
The brighter the value in the output, the better the match. Implemented as the correlation coefficient.
Template Matching Output (figure).
Correlation application: Edge detection
Edge detection
$$g_x(x, y) \approx f(x+1, y) - f(x-1, y)$$
$$g_y(x, y) \approx f(x, y+1) - f(x, y-1)$$
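A minimal sketch of these central-difference gradients, assuming the usual NumPy convention that array axis 1 runs along x (columns) and axis 0 along y (rows); the random test image is only for illustration.

```python
import numpy as np
from scipy.ndimage import correlate

gx_kernel = np.array([[-1, 0, 1]])    # g_x(x,y) ~ f(x+1,y) - f(x-1,y)
gy_kernel = gx_kernel.T               # g_y(x,y) ~ f(x,y+1) - f(x,y-1)

image = np.random.rand(128, 128)
gx = correlate(image, gx_kernel, mode='reflect')
gy = correlate(image, gy_kernel, mode='reflect')
edge_magnitude = np.hypot(gx, gy)     # combine the two gradient components
```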
Properties of convolution
Commutative: $f \otimes g = g \otimes f$
Associative: $f \otimes (g \otimes h) = (f \otimes g) \otimes h$ (multiple convolutions can be carried out in any order)
Distributive: $f \otimes (g + h) = f \otimes g + f \otimes h$
Convolution Theorem
$$\Im\{ f \otimes g \} = F(k)\cdot G(k)$$
In other words, convolution in real space is equivalent to multiplication in frequency space.
Proof of convolution Theorem
We can rewrite the convolution integral as
$$f \otimes g = \int_{-\infty}^{\infty} f(x)\, g(x' - x)\, dx$$
Substituting the inverse Fourier transforms of f and g,
$$f \otimes g = \frac{1}{4\pi^2} \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} F(k)\, e^{ikx}\, dk \int_{-\infty}^{\infty} G(k')\, e^{ik'(x' - x)}\, dk'$$
Change the order of integration and extract a delta function:
$$f \otimes g = \frac{1}{2\pi} \int_{-\infty}^{\infty} dk\, F(k) \int_{-\infty}^{\infty} dk'\, G(k')\, e^{ik'x'} \underbrace{\frac{1}{2\pi} \int_{-\infty}^{\infty} e^{ix(k - k')}\, dx}_{\delta(k - k')}$$
or,
$$f \otimes g = \frac{1}{2\pi} \int_{-\infty}^{\infty} dk\, F(k) \int_{-\infty}^{\infty} dk'\, G(k')\, e^{ik'x'}\, \delta(k - k')$$
Integration over the delta function selects out the k' = k value:
$$f \otimes g = \frac{1}{2\pi} \int_{-\infty}^{\infty} dk\, F(k)\, G(k)\, e^{ikx'}$$
This is written as an inverse Fourier transformation. A Fourier transform of both sides yields the desired result:
$$\Im\{ f \otimes g \} = F(k)\cdot G(k)$$
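The theorem can also be checked numerically for discrete signals, where it relates circular convolution to the product of DFTs. This is only a sketch under that assumption (random test signals, NumPy FFT), not part of the original proof:

```python
import numpy as np

rng = np.random.default_rng(0)
f = rng.standard_normal(64)
g = rng.standard_normal(64)

# Circular convolution via the convolution theorem: IFFT(FFT(f) * FFT(g)).
circular_conv = np.real(np.fft.ifft(np.fft.fft(f) * np.fft.fft(g)))

# Direct circular convolution: sum_n f[n] * g[(k - n) mod N] for each k.
direct = np.array([np.sum(f * np.roll(g[::-1], k + 1)) for k in range(64)])

print(np.allclose(circular_conv, direct))   # True
```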
Convolution in 2-D
For such a system the output h(x,y) is the convolution of f(x,y) with the impulse response g(x,y).
Example of 3x3 convolution mask
In Plain Words
Convolution is essentially equivalent to computing a weighted sum of image pixels, where the filter is rotated by 180 degrees.
Convolution is a linear operation.
Why Mathematical transformations?
Why
– To obtain further information from the signal that is not readily available in the raw signal.
Raw Signal
– Normally the time-domain signal
Processed Signal
– A signal that has been "transformed" by any of the available mathematical transformations
Fourier Transformation
– The most popular transformation
What is a Transform and Why do we need one?
Transform: a mathematical operation that takes a function or sequence and maps it into another one. Transforms are good things because:
– The transform of a function may give additional/hidden information about the original function, which may not be available/obvious otherwise.
– The transform of an equation may be easier to solve than the original equation (recall your fond memories of Laplace transforms in DFQs).
– The transform of a function/sequence may require less storage, hence providing data compression/reduction.
– An operation may be easier to apply to the transformed function than to the original function (recall other fond memories of convolution).
Why transform ?
Introduction to Fourier Transform
f(x): continuous function of a real variable x.
Fourier transform of f(x):
$$\Im\{ f(x) \} = F(u) = \int_{-\infty}^{\infty} f(x) \exp[-j 2\pi u x]\, dx \qquad \text{(Eq. 1)}$$
where $j = \sqrt{-1}$.
Introduction to Fourier transform
u is the frequency variable. The integral of Eq. 1 shows that F(u) is composed of an infinite sum of sine and cosine terms, and each value of u determines the frequency of its corresponding sine-cosine pair.
Introduction to Fourier transform
Given F(u), f(x) can be obtained by the inverse Fourier transform:
$$\Im^{-1}\{ F(u) \} = f(x) = \int_{-\infty}^{\infty} F(u) \exp[j 2\pi u x]\, du$$
The above two equations are the Fourier transform pair.
Introduction to Fourier transform
$$e^{j\theta} = \cos\theta + j\sin\theta, \qquad \cos(-\theta) = \cos(\theta)$$
$$F(u) = \frac{1}{M} \sum_{x=0}^{M-1} f(x)\left[\cos(2\pi u x / M) - j\sin(2\pi u x / M)\right]$$
Each term of the FT (F(u) for every u) is composed of the sum of all values of f(x).
Introduction to Fourier transform
The Fourier transform of a real function is generally complex and we use polar coordinates:
$$F(u) = R(u) + jI(u)$$
$$F(u) = |F(u)|\, e^{j\phi(u)}$$
$$|F(u)| = \left[R^2(u) + I^2(u)\right]^{1/2}$$
$$\phi(u) = \tan^{-1}\!\left[\frac{I(u)}{R(u)}\right]$$
Introduction to Fourier transform
|F(u)| (the magnitude function) is the Fourier spectrum of f(x) and φ(u) its phase angle. The square of the spectrum,
$$P(u) = |F(u)|^2 = R^2(u) + I^2(u),$$
is referred to as the power spectrum of f(x) (spectral density).
Introduction to Fourier transform
Fourier spectrum: $|F(u,v)| = \left[R^2(u,v) + I^2(u,v)\right]^{1/2}$
Phase: $\phi(u,v) = \tan^{-1}\!\left[\dfrac{I(u,v)}{R(u,v)}\right]$
Power spectrum: $P(u,v) = |F(u,v)|^2 = R^2(u,v) + I^2(u,v)$
Spatial Frequency decomposition (0.25 µm myelin image)
Any image can be decomposed into a series of sines and cosines added together to give the image:
$$I(x) = \sum_i a_i \cos(k_i x) + i\, b_i \sin(k_i x)$$
(Figure: the myelin image and, after the Fourier transform, its amplitudes and phase plotted against pixel position.)
Fourier Transform of the Myelin Image: low frequencies lie near the centre of the transform, high frequencies towards the edges.
The FT is reversible: applying the inverse Fourier transform to the transform of the myelin image recovers the original image.
2-D Image Transform (General Transform)
$$F(u, v) = \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} T(x, y, u, v)\, f(x, y)$$
$$f(x, y) = \sum_{u=0}^{N-1} \sum_{v=0}^{N-1} I(x, y, u, v)\, F(u, v)$$
Discrete Fourier Transform
A continuous function f(x) is discretized into a sequence
$$\{ f(x_0),\, f(x_0 + \Delta x),\, f(x_0 + 2\Delta x),\, \ldots,\, f(x_0 + [N-1]\Delta x) \}$$
by taking N (or M) samples Δx units apart.
Discrete Fourier Transform
Where x assumes the discrete values 0, 1, 2, 3, ..., M-1, then
$$f(x) = f(x_0 + x\,\Delta x)$$
The sequence {f(0), f(1), f(2), ..., f(M-1)} denotes any M uniformly spaced samples from a corresponding continuous function.
Discrete Fourier Transform
The discrete Fourier transform pair that applies to sampled functions is given by:
$$F(u) = \frac{1}{M} \sum_{x=0}^{M-1} f(x) \exp[-j 2\pi u x / M], \qquad u = 0, 1, 2, \ldots, M-1$$
and
$$f(x) = \sum_{u=0}^{M-1} F(u) \exp[j 2\pi u x / M], \qquad x = 0, 1, 2, \ldots, M-1$$
Discrete Fourier Transform
To compute F(u) we substitute u = 0 in the exponential term and sum over all values of x; we repeat for all M values of u. It takes M x M summations and multiplications.
$$F(u) = \frac{1}{M} \sum_{x=0}^{M-1} f(x) \exp[-j 2\pi u x / M], \qquad u = 0, 1, 2, \ldots, M-1$$
The Fourier transform and its inverse always exist!
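A minimal sketch of this O(M x M) computation, checked against NumPy's FFT (an assumption for the comparison; note that this course's convention puts the 1/M factor on the forward transform, while np.fft.fft does not):

```python
import numpy as np

def dft(f):
    """Direct DFT: for each u, sum f(x) * exp(-j 2 pi u x / M), scaled by 1/M."""
    M = len(f)
    x = np.arange(M)
    return np.array([np.sum(f * np.exp(-2j * np.pi * u * x / M))
                     for u in range(M)]) / M

f = np.random.rand(16)
print(np.allclose(dft(f), np.fft.fft(f) / 16))   # True: same up to the 1/M scaling
```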
Discrete Fourier Transform
The values u = 0, 1, 2, ..., M-1 correspond to samples of the continuous transform at values 0, Δu, 2Δu, ..., (M-1)Δu; i.e. F(u) represents F(uΔu), where
$$\Delta u = \frac{1}{M\,\Delta x}$$
Discrete Fourier Transform
In the 2-variable case, the discrete FT pair is:
$$F(u, v) = \frac{1}{MN} \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f(x, y) \exp[-j 2\pi (ux/M + vy/N)]$$
for u = 0, 1, 2, ..., M-1 and v = 0, 1, 2, ..., N-1, and
$$f(x, y) = \sum_{u=0}^{M-1} \sum_{v=0}^{N-1} F(u, v) \exp[j 2\pi (ux/M + vy/N)]$$
for x = 0, 1, 2, ..., M-1 and y = 0, 1, 2, ..., N-1.
Discrete Fourier Transform
Sampling of a continuous function is now on a 2-D grid (with divisions Δx, Δy). The discrete function f(x,y) represents samples of the function f(x₀ + xΔx, y₀ + yΔy) for x = 0, 1, 2, ..., M-1 and y = 0, 1, 2, ..., N-1, with
$$\Delta u = \frac{1}{M\,\Delta x}, \qquad \Delta v = \frac{1}{N\,\Delta y}$$
Discrete Fourier Transform
When images are sampled in a square array, M = N and the FT pair becomes:
$$F(u, v) = \frac{1}{N} \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f(x, y) \exp[-j 2\pi (ux + vy)/N], \qquad u, v = 0, 1, 2, \ldots, N-1$$
and
$$f(x, y) = \frac{1}{N} \sum_{u=0}^{N-1} \sum_{v=0}^{N-1} F(u, v) \exp[j 2\pi (ux + vy)/N], \qquad x, y = 0, 1, 2, \ldots, N-1$$
Properties of 2-D Fourier transform
– Translation
– Distributivity and Scaling
– Rotation
– Periodicity and Conjugate Symmetry
– Separability
– Convolution and Correlation
Translation
$$f(x, y)\, \exp[j 2\pi (u_0 x / M + v_0 y / N)] \Leftrightarrow F(u - u_0,\, v - v_0)$$
and
$$f(x - x_0,\, y - y_0) \Leftrightarrow F(u, v)\, \exp[-j 2\pi (u x_0 / M + v y_0 / N)]$$
The previous equations mean:
– Multiplying f(x,y) by the indicated exponential term and taking the transform of the product results in a shift of the origin of the frequency plane to the point (u₀, v₀).
– Multiplying F(u,v) by the exponential term shown and taking the inverse transform moves the origin of the spatial plane to (x₀, y₀).
– A shift in f(x,y) doesn't affect the magnitude of its Fourier transform.
Distributivity & Scaling
$$\Im\{ f_1(x, y) + f_2(x, y) \} = \Im\{ f_1(x, y) \} + \Im\{ f_2(x, y) \}$$
$$\Im\{ f_1(x, y) \cdot f_2(x, y) \} \neq \Im\{ f_1(x, y) \} \cdot \Im\{ f_2(x, y) \}$$
The Fourier transform is distributive over addition but not over multiplication.
Distributivity and Scaling
For two scalars a and b,
$$a f(x, y) \Leftrightarrow a F(u, v)$$
$$f(ax, by) \Leftrightarrow \frac{1}{|ab|} F(u/a,\, v/b)$$
Rotation
In polar coordinates:
$$x = r\cos\theta, \quad y = r\sin\theta, \quad u = \omega\cos\varphi, \quad v = \omega\sin\varphi$$
which means that f(x, y) and F(u, v) become f(r, θ) and F(ω, φ).
$$f(r,\, \theta + \theta_0) \Leftrightarrow F(\omega,\, \varphi + \theta_0)$$
This means that rotating f(x,y) by an angle θ₀ rotates F(u,v) by the same angle (and vice versa).
Periodicity & Conjugate Symmetry
The discrete FT and its inverse are periodic (with period M in u and N in v):
$$F(u, v) = F(u + M,\, v) = F(u,\, v + N) = F(u + M,\, v + N)$$
Periodicity & conjugate symmetry
Although F(u,v) repeats itself for infinitely many values of u and v, only the M,N values of each variable in any one period are required to obtain f(x,y) from F(u,v). This means that only one period of the transform is necessary to specify F(u,v) completely in the frequency domain (and similarly f(x,y) in the spatial domain).
Periodicity & Conjugate Symmetry
The shifted spectrum: move the origin of the transform to u = N/2.
Periodicity & Conjugate Symmetry
For real f(x,y), the FT also exhibits conjugate symmetry:
$$F(u, v) = F^{*}(-u, -v), \qquad |F(u, v)| = |F(-u, -v)|$$
or, in one dimension,
$$F(u) = F(u + N), \qquad |F(u)| = |F(-u)|$$
i.e. F(u) has a period of length N and the magnitude of the transform is centered on the origin.
Separability
The discrete FT pair can be expressed in separable forms which (after some manipulation) can be written as:
$$F(u, v) = \frac{1}{M} \sum_{x=0}^{M-1} F(x, v) \exp[-j 2\pi u x / M]$$
where
$$F(x, v) = N \left[ \frac{1}{N} \sum_{y=0}^{N-1} f(x, y) \exp[-j 2\pi v y / N] \right]$$
Separability in specific forms:
– Separable: $T(x, y, u, v) = T_1(x, u)\, T_2(y, v)$
– Symmetric: $T(x, y, u, v) = T_1(x, u)\, T_1(y, v)$
For each value of x, the expression inside the brackets is a 1-D transform, with frequency values v = 0, 1, ..., N-1. Thus, the 2-D function F(x,v) is obtained by taking a transform along each row of f(x,y) and multiplying the result by N. The desired result F(u,v) is then obtained by taking a transform along each column of F(x,v).
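Separability is easy to verify numerically: a 2-D DFT equals 1-D DFTs along the rows followed by 1-D DFTs along the columns. The sketch below uses NumPy's FFT (an assumption; scaling conventions differ from the slides but cancel in the comparison):

```python
import numpy as np

f = np.random.rand(8, 8)
row_then_col = np.fft.fft(np.fft.fft(f, axis=1), axis=0)   # rows first, then columns
print(np.allclose(row_then_col, np.fft.fft2(f)))           # True
```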
Energy preservation
$$\|g\|^2 = \|f\|^2: \qquad \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} |f(x, y)|^2 = \sum_{u=0}^{N-1} \sum_{v=0}^{N-1} |g(u, v)|^2$$
Energy Compaction!
An Atom
Both functions have circular symmetry. The atom is a sharp feature, whereas its transform is a broad smooth function. This illustrates the reciprocal relationship between a function and its Fourier transform.
Original Image – FA-FP
A Molecule
Fourier duck
Reconstruction from Phase of cat & Amplitude of duck
Reconstruction from Phase of duck & Amplitude of cat
Original Image – Fourier Amplitude: keep part of the amplitude around the origin and reconstruct the original image (LOW PASS filtering)
Keep part of the amplitude far from the origin and reconstruct the original image (HIGH PASS filtering)
Example: reconstruction from the phase of one image and the amplitude of the other
Reconstruction Example: Cheetah image, Fourier magnitude (above), Fourier phase (below)
Reconstruction example: Zebra image, Fourier magnitude (above), Fourier phase (below)
Reconstruction with Zebra phase, Cheetah magnitude
Reconstruction with Cheetah phase, Zebra magnitude
Optical illusion (a sequence of image slides)
Discrete Cosine Transform, 1-D
$$C(u) = a(u) \sum_{x=0}^{N-1} f(x) \cos\!\left[\frac{(2x + 1) u \pi}{2N}\right], \qquad u = 0, 1, \ldots, N-1$$
$$a(u) = \begin{cases} \sqrt{1/N}, & u = 0 \\ \sqrt{2/N}, & u = 1, \ldots, N-1 \end{cases}$$
IDCT, 1-D
$$f(x) = \sum_{u=0}^{N-1} a(u)\, C(u) \cos\!\left[\frac{(2x + 1) u \pi}{2N}\right]$$
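A sketch of the 1-D DCT formula above, checked against SciPy's DCT-II with orthonormal scaling, which uses the same a(u) factors (the SciPy comparison and the random test vector are assumptions):

```python
import numpy as np
from scipy.fft import dct

def dct_1d(f):
    """Direct evaluation of C(u) = a(u) * sum_x f(x) cos[(2x+1) u pi / 2N]."""
    N = len(f)
    x = np.arange(N)
    a = np.full(N, np.sqrt(2.0 / N))
    a[0] = np.sqrt(1.0 / N)
    return np.array([a[u] * np.sum(f * np.cos((2 * x + 1) * u * np.pi / (2 * N)))
                     for u in range(N)])

f = np.random.rand(8)
print(np.allclose(dct_1d(f), dct(f, norm='ortho')))   # True
```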
1D Basis functions, N = 8 (figure): the eight DCT basis functions u = 0, 1, ..., 7, cosine waveforms of increasing frequency, each plotted between -1.0 and 1.0.
1D Basis functions, N = 16 (figure).
Example: a 1D signal and its DCT (figure).
2-D DCT
$$C(u, v) = a(u)\, a(v) \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f(x, y) \cos\!\left[\frac{(2x + 1) u \pi}{2N}\right] \cos\!\left[\frac{(2y + 1) v \pi}{2N}\right]$$
$$f(x, y) = \sum_{u=0}^{N-1} \sum_{v=0}^{N-1} a(u)\, a(v)\, C(u, v) \cos\!\left[\frac{(2x + 1) u \pi}{2N}\right] \cos\!\left[\frac{(2y + 1) v \pi}{2N}\right]$$
$$u, v = 0, 1, \ldots, N-1$$
Advantages
Notice that the DCT is a real transform. The DCT has excellent energy compaction properties. There are fast algorithms to compute the DCT, similar to the FFT.
2-D Basis functions, N = 4 (figure): the 4 x 4 grid of 2-D DCT basis images for u, v = 0, 1, 2, 3.
2-D Basis functions, N = 8 (figure).
Separable: the 2-D DCT can be computed as 1-D DCTs along the rows followed by 1-D DCTs along the columns.
Example: Energy Compaction (figure).
Relation between DCT & DFT
$$g(x) = f(x) + f(2N - 1 - x) = \begin{cases} f(x), & 0 \le x \le N-1 \\ f(2N - 1 - x), & N \le x \le 2N - 1 \end{cases}$$
An N-point DCT can be obtained via a 2N-point DFT:
$$f(x)\ (N\text{-point}) \rightarrow g(x)\ (2N\text{-point}) \xrightarrow{\text{DFT}} G(u)\ (2N\text{-point}) \rightarrow C_f(u)\ (N\text{-point})$$
DFT to DCT
From DFT to DCT (cont.): the DCT has a higher compression ratio than the DFT, because the DCT avoids the generation of spurious spectral components.
December 21, 1807
"An arbitrary function, continuous or with discontinuities, defined in a finite interval by an arbitrarily capricious graph can always be expressed as a sum of sinusoids." – J.B.J. Fourier
Jean B. Joseph Fourier (1768-1830)
Frequency analysis
Frequency Spectrum
– Basically the frequency components (spectral components) of the signal
– Shows what frequencies exist in the signal
Fourier Transform (FT)
– One way to find the frequency content
– Tells how much of each frequency exists in a signal
Discrete form:
$$X(k+1) = \sum_{n=0}^{N-1} x(n+1)\, W_N^{kn}, \qquad x(n+1) = \frac{1}{N} \sum_{k=0}^{N-1} X(k+1)\, W_N^{-kn}, \qquad W_N = e^{-j\left(\frac{2\pi}{N}\right)}$$
Continuous form:
$$X(f) = \int_{-\infty}^{\infty} x(t)\, e^{-2j\pi f t}\, dt, \qquad x(t) = \int_{-\infty}^{\infty} X(f)\, e^{2j\pi f t}\, df$$
Complex Function $= \sum_i (\text{weight})_i \cdot (\text{Simple Function})_i$
Complex function representation through simple building blocks – basis functions.
Using only a few blocks ⇒ compressed representation.
Using sinusoids as building blocks ⇒ Fourier transform – the frequency-domain representation of the function.
$$F(\omega) = \int f(t)\, e^{-j\omega t}\, dt, \qquad f(t) = \frac{1}{2\pi} \int F(\omega)\, e^{j\omega t}\, d\omega$$
How does it work, anyway?
Recall that the FT uses complex exponentials (sinusoids) as building blocks:
$$e^{j\omega t} = \cos(\omega t) + j\sin(\omega t)$$
For each frequency of complex exponential, the sinusoid at that frequency is compared to the signal. If the signal contains that frequency, the correlation is high ⇒ large FT coefficients. If the signal does not have any spectral component at a frequency, the correlation at that frequency is low or zero ⇒ small or zero FT coefficient.
$$F(\omega) = \int f(t)\, e^{-j\omega t}\, dt, \qquad f(t) = \frac{1}{2\pi} \int F(\omega)\, e^{j\omega t}\, d\omega$$
FT At work
$$x_1(t) = \cos(2\pi \cdot 5 \cdot t), \qquad x_2(t) = \cos(2\pi \cdot 25 \cdot t), \qquad x_3(t) = \cos(2\pi \cdot 50 \cdot t)$$
and their Fourier transforms $X_1(\omega)$, $X_2(\omega)$, $X_3(\omega)$ (figures).
$$x_4(t) = \cos(2\pi \cdot 5 \cdot t) + \cos(2\pi \cdot 25 \cdot t) + \cos(2\pi \cdot 50 \cdot t) \;\rightarrow\; X_4(\omega)$$
Complex exponentials (sinusoids) as basis functions:
$$F(\omega) = \int_{-\infty}^{\infty} f(t)\, e^{-j\omega t}\, dt, \qquad f(t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} F(\omega)\, e^{j\omega t}\, d\omega$$
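A quick numerical sketch of the x4(t) example above: its spectrum peaks at 5, 25 and 50 Hz. The sampling rate, duration and threshold below are assumptions made only for the demonstration.

```python
import numpy as np

fs = 1000                                   # sampling rate (assumed)
t = np.arange(0, 1.0, 1.0 / fs)
x4 = np.cos(2*np.pi*5*t) + np.cos(2*np.pi*25*t) + np.cos(2*np.pi*50*t)

spectrum = np.abs(np.fft.rfft(x4))
freqs = np.fft.rfftfreq(len(x4), d=1.0 / fs)
print(freqs[spectrum > 100])                # [ 5. 25. 50.]
```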
Stationarity of Signal
Stationary Signal – Signals with frequency content unchanged in time – All frequency components exist at all times
Non-stationary Signal – Frequency changes in time – One example: the “Chirp Signal”
Stationary & Non Stationary Signals
The FT identifies all spectral components present in the signal; however, it does not provide any information regarding the temporal (time) localization of these components. Why? Stationary signals consist of spectral components that do not change in time: all spectral components exist at all times, so no time information is needed and the FT works well. Non-stationary signals, however, consist of time-varying spectral components. How do we find out which spectral component appears when? The FT only provides what spectral components exist, not where in time they are located. We need some other way to determine the time localization of spectral components.
Stationary signals' spectral characteristics do not change with time, e.g.
$$x_4(t) = \cos(2\pi \cdot 5 \cdot t) + \cos(2\pi \cdot 25 \cdot t) + \cos(2\pi \cdot 50 \cdot t)$$
Non-stationary signals have time-varying spectra, e.g. $x_5(t) = [x_1 \,\|\, x_2 \,\|\, x_3]$ (the concatenation of the three signals).
Stationary & Nonstationary signals (figures)
Stationary example: 2 Hz + 10 Hz + 20 Hz present at all times; the time-domain signal and its magnitude spectrum (0-25 Hz) are shown.
Non-stationary example: 2 Hz during 0.0-0.4 s, 10 Hz during 0.4-0.7 s, 20 Hz during 0.7-1.0 s; its magnitude spectrum shows essentially the same frequency content.
Chirp signals (figures): one chirp sweeping from 2 Hz to 20 Hz and one from 20 Hz to 2 Hz are different in the time domain but the same in the frequency domain.
At what time do the frequency components occur? The FT cannot tell!
Stationary & Non Stationary Signals
Perfect knowledge of what frequencies exist, but no information about where these frequencies are located in time
FFT Vs Wavelet
FFT, basis functions: sinusoids Wavelet transforms: small waves, called wavelet FFT can only offer frequency information Wavelet: frequency + temporal information Fourier analysis doesn’t work well on discontinuous, “bursty” data 2/22/2009
Fourier Vs. Wavelet
Fourier – Loses time (location) coordinate completely – Analyses the whole signal – Short pieces lose “frequency” meaning
Wavelets – Localized time-frequency analysis – Short signal pieces also have significance – Scale = Frequency band
Shortcomings of FT
Sinusoids and exponentials
– Stretch into infinity in time: no time localization
– Instantaneous in frequency: perfect spectral localization
– Global analysis does not allow analysis of non-stationary signals
We need a local analysis scheme for a time-frequency representation (TFR) of non-stationary signals:
– Windowed FT or Short Time FT (STFT): segment the signal into narrow time intervals, narrow enough to be considered stationary, and then take the Fourier transform of each segment (Gabor, 1946).
– Followed by other TFRs, which differ from each other by the selection of the windowing function.
Nothing more, nothing less
The FT only gives what frequency components exist in the signal; the time and frequency information cannot be seen at the same time. A time-frequency representation of the signal is needed. Most transportation signals are non-stationary (we need to know not only whether but also when an incident happened).
ONE EARLIER SOLUTION: SHORT-TIME FOURIER TRANSFORM (STFT)
Short Time Fourier Transform (STFT)
1. Choose a window function of finite length
2. Place the window on top of the signal at t = 0
3. Truncate the signal using this window
4. Compute the FT of the truncated signal, and save it
5. Incrementally slide the window to the right
6. Go to step 3, until the window reaches the end of the signal
For each time location where the window is centered, we obtain a different FT. Hence, each FT provides the spectral information of a separate time-slice of the signal, providing simultaneous time and frequency information.
STFT
$$STFT_x^{\omega}(t', \omega) = \int_t \left[ x(t) \cdot W(t - t') \right] e^{-j\omega t}\, dt$$
where t' is the time parameter, ω the frequency parameter, x(t) the signal to be analyzed, W(t - t') the windowing function centered at t = t', and $e^{-j\omega t}$ the FT kernel (basis function). The STFT of the signal x(t) is computed for each window centered at t = t'.
STFT (figure): the window centered at t' = -8, -2, 4, and 8 selects successive time-slices of the signal.
STFT
The STFT provides time information by computing a different FT for consecutive time intervals, and then putting them together.
– It is a Time-Frequency Representation (TFR)
– It maps 1-D time-domain signals to 2-D time-frequency signals
Consecutive time intervals of the signal are obtained by truncating the signal using a sliding windowing function. How do we choose the windowing function?
– What shape? Rectangular, Gaussian, elliptic...?
– How wide? A wider window requires fewer time steps → low time resolution. Also, the window should be narrow enough to make sure that the portion of the signal falling within the window is stationary. Can we choose an arbitrarily narrow window...?
Selection of STFT Window
$$STFT_x^{\omega}(t', \omega) = \int_t \left[ x(t) \cdot W(t - t') \right] e^{-j\omega t}\, dt$$
Two extreme cases:
– W(t) infinitely long, W(t) = 1: the STFT turns into the FT, providing excellent frequency information (good frequency resolution) but no time information.
– W(t) infinitely short, W(t) = δ(t):
$$STFT_x^{\omega}(t', \omega) = \int_t \left[ x(t) \cdot \delta(t - t') \right] e^{-j\omega t}\, dt = x(t')\, e^{-j\omega t'}$$
The STFT then gives the time signal back, with a phase factor: excellent time information (good time resolution), but no frequency information.
Drawbacks of STFT
The window is unchanged, leading to a dilemma of resolution:
– Narrow window → poor frequency resolution
– Wide window → poor time resolution
Heisenberg Uncertainty Principle
– We cannot know exactly what frequency exists at what time interval
Heisenberg principle
$$\Delta t \cdot \Delta f \ge \frac{1}{4\pi}$$
Time resolution: how well two spikes in time can be separated from each other in the transform domain. Frequency resolution: how well two spectral components can be separated from each other in the transform domain. Both time and frequency resolutions cannot be arbitrarily high: we cannot precisely know at what time instant a frequency component is located; we can only know what interval of frequencies is present in which time intervals.
Drawbacks of STFT (figure: fixed tiling of the time-frequency plane, with frequency F against time T).
Multiresolution analysis
Wavelet Transform
– An alternative approach to the short time Fourier transform to overcome the resolution problem – Similar to STFT: signal is multiplied with a function
Multiresolution Analysis
– Analyze the signal at different frequencies with different resolutions – Good time resolution and poor frequency resolution at high frequencies – Good frequency resolution and poor time resolution at low frequencies – More suitable for short duration of higher frequency; and longer duration of lower frequency components
Wavelet Definition
“The wavelet transform is a tool that cuts up data, functions or operators into different frequency components, and then studies each component with a resolution matched to its scale”
Principles of wavelet transform
Split Up the Signal into a Bunch of Signals Representing the Same Signal, but all Corresponding to Different Frequency Bands Only Providing What Frequency Bands Exists at What Time Intervals
The wavelet transform
Overcomes the preset resolution problem of the STFT by using a variable-length window. Analysis windows of different lengths are used for different frequencies:
– Analysis of high frequencies → use narrower windows for better time resolution
– Analysis of low frequencies → use wider windows for better frequency resolution
This works well if the signal to be analyzed mainly consists of slowly varying characteristics with occasional short high-frequency bursts. The Heisenberg principle still holds! The function used to window the signal is called the wavelet.
Wavelet transform
Scale and shift original waveform Compare to a wavelet Assign a coefficient of similarity
Definition of continuous wavelet transform
$$CWT_x^{\psi}(\tau, s) = \Psi_x^{\psi}(\tau, s) = \frac{1}{\sqrt{|s|}} \int_t x(t)\, \psi^{*}\!\left(\frac{t - \tau}{s}\right) dt$$
Here τ is the translation parameter (the location of the window), s is the scale parameter (a measure of frequency), $1/\sqrt{|s|}$ is a normalization constant, x(t) is the signal to be analyzed, and ψ(·) is the analysis wavelet. ψ is the mother wavelet: all kernels are obtained by translating (shifting) and/or scaling the mother wavelet. Scale = 1/frequency.
Wavelet
– "Small wave": the window function is of finite length
Mother Wavelet
– A prototype for generating the other window functions
– All the windows used are its dilated or compressed and shifted versions
CWT
for each scale S
    for each position P
        Coefficient(S, P) = ∫ Signal × Wavelet(S, P) dt over all time
    end
end
Scaling – value of "stretch"
Scaling a wavelet simply means stretching (or compressing) it.
f(t) = sin(t): scale factor 1;  f(t) = sin(2t): scale factor 2;  f(t) = sin(3t): scale factor 3.
Scale
– It lets you either narrow down the frequency band of interest, or determine the frequency content in a narrower time interval
– Scaling = frequency band
– Good for non-stationary data
Scale parameter:
– S > 1: dilates the signal
– S < 1: compresses the signal
High scale → a stretched wavelet → non-detailed global view of the signal → spans the entire signal → low frequency → slowly changing, coarse features.
Low scale → a compressed wavelet → rapidly changing details → high frequency → a detailed view lasting only a short time. Only a limited interval of scales is necessary.
Scale is (sort of) like frequency
Small scale: rapidly changing details, like high frequency. Large scale: slowly changing details, like low frequency. The scale factor works exactly the same with wavelets: the smaller the scale factor, the more "compressed" the wavelet.
Shifting Shifting a wavelet simply means delaying (or hastening) its onset. Mathematically, delaying a function f(t) by k is represented by f(t-k)
Shifting (figure): the correlation coefficient changes as the wavelet is shifted along the signal, e.g. C = 0.0004 at one position and C = 0.0034 at another.
Computation of CWT
$$CWT_x^{\psi}(\tau, s) = \Psi_x^{\psi}(\tau, s) = \frac{1}{\sqrt{|s|}} \int_t x(t)\, \psi^{*}\!\left(\frac{t - \tau}{s}\right) dt$$
Step 1: The wavelet is placed at the beginning of the signal, and s is set to 1 (the most compressed wavelet).
Step 2: The wavelet function at scale 1 is multiplied by the signal, integrated over all times, and then multiplied by the normalization constant $1/\sqrt{|s|}$.
Step 3: Shift the wavelet to t = τ, and get the transform value at t = τ and s = 1.
Step 4: Repeat the procedure until the wavelet reaches the end of the signal.
Step 5: Increase the scale s by a sufficiently small value and repeat the above procedure for all s.
Step 6: Each computation for a given s fills a single row of the time-scale plane.
Step 7: The CWT is obtained once all scales s have been calculated.
Simple steps for CWT
1. Take a wavelet and compare it to a section at the start of the original signal.
2. Calculate a correlation coefficient c.
2/22/2009
371
Simple steps to CWT
3. Shift the wavelet to the right and repeat steps 1 and 2 until you've covered the whole signal.
4. Scale (stretch) the wavelet and repeat steps 1 through 3.
5. Repeat steps 1 through 4 for all scales.
2/22/2009
372
$CWT_x^{\psi}(\tau, s) = \Psi_x^{\psi}(\tau, s) = \frac{1}{\sqrt{|s|}} \int_t x(t)\, \psi^{*}\!\left(\frac{t-\tau}{s}\right) dt$
WT At work
Low frequency (large scale)
2/22/2009
373
WT At work
2/22/2009
374
WT At work
2/22/2009
375
WT At work
2/22/2009
376
Resolution of Time & Frequency
[Figure: time-frequency tiling, frequency vs. time axes]
– Better time resolution; poor frequency resolution
– Better frequency resolution; poor time resolution
• Each box represents an equal portion of the time-frequency plane
• Resolution in the STFT is selected once for the entire analysis
2/22/2009
377
Comparison of Transformations
2/22/2009
From http://www.cerm.unifi.it/EUcourse2001/Gunther_lecturenotes.pdf, p.10
378
Discretization of CWT
It is necessary to sample the time-frequency (scale) plane. At high scale s (lower frequency f), the sampling rate N can be decreased. The scale parameter s is normally discretized on a logarithmic grid; the most common base is 2. The discretized CWT is not a true discrete transform.

Discrete Wavelet Transform (DWT)
– Provides sufficient information both for analysis and synthesis
– Reduces the computation time significantly
– Easier to implement
– Analyzes the signal at different frequency bands with different resolutions
– Decomposes the signal into a coarse approximation and detail information
2/22/2009
379
Discrete Wavelet transforms
The CWT computed by computers is really not the CWT; it is a discretized version of the CWT. The resolution of the time-frequency grid can be controlled (within Heisenberg's inequality) by the time and scale step sizes. Often this results in a very redundant representation.
How do we discretize the continuous time-frequency plane so that the representation is non-redundant?
– Sample the time-frequency plane on a dyadic (octave) grid:

$\psi_{k,n}(t) = 2^{-k/2}\, \psi\!\left(2^{-k} t - n\right), \qquad k, n \in \mathbb{Z}$

2/22/2009
380
Multiresolution analysis
Analyzing a signal in both the time domain and the frequency domain is often needed.
– But the resolution in both domains is limited by the Heisenberg uncertainty principle.
How does multiresolution analysis (MRA) overcome this?
– It gives good time resolution and poor frequency resolution at high frequencies, and good frequency resolution and poor time resolution at low frequencies.
– This helps because most natural signals have low-frequency content spread over long durations and high-frequency content only for short durations.
2/22/2009
381
Discrete wavelet transform
[Block diagram: the signal is passed through a lowpass filter to give the approximation (a) and through a highpass filter to give the details (d)]
2/22/2009
382
Discrete wavelet transform
Dyadic sampling of the time-frequency plane results in a very efficient algorithm for computing the DWT:
– Subband coding using multiresolution analysis
– Dyadic sampling and multiresolution are achieved through a series of filtering and up/down-sampling operations

$y[n] = x[n] * h[n] = h[n] * x[n] = \sum_{k=1}^{N} x[k]\, h[n-k] = \sum_{k=1}^{N} h[k]\, x[n-k]$

2/22/2009
383
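As a hedged sketch of the "filter, then downsample" idea (the Haar pair below is chosen only as a simple example of half-band filters, not the filters used in the lecture), the following NumPy code performs one analysis level: convolve with the lowpass and highpass filters, then keep every other sample.

```python
# One DWT analysis level as "convolve, then downsample by 2", using Haar filters.
import numpy as np

h = np.array([1.0, 1.0]) / np.sqrt(2.0)   # half-band lowpass (Haar scaling filter)
g = np.array([1.0, -1.0]) / np.sqrt(2.0)  # half-band highpass (Haar wavelet filter)

def analysis_level(x, h, g):
    """Return (approximation, detail): filter with h and g, then keep every other sample."""
    lo = np.convolve(x, h)[1::2]   # lowpass output, downsampled by 2
    hi = np.convolve(x, g)[1::2]   # highpass output, downsampled by 2
    return lo, hi

x = np.arange(8, dtype=float)
a, d = analysis_level(x, h, g)
print(a)   # coarse approximation (half as many samples)
print(d)   # detail coefficients (half as many samples)
```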
DWT implementation
Analysis (decomposition): at each level the signal x[n] is filtered by the half-band highpass filter G̃ and the half-band lowpass filter H̃, and each output is downsampled by 2:

$y_{high}[k] = \sum_n x[n]\, g[2k - n]$
$y_{low}[k] = \sum_n x[n]\, h[2k - n]$

The lowpass (approximation) branch is fed to the next level, so the decomposition can be repeated.
Reconstruction: each subband is upsampled by 2, filtered by the synthesis filters G and H, and the branches are summed:

$x[n] = \sum_k \left( y_{high}[k]\, g[2k - n] + y_{low}[k]\, h[2k - n] \right)$

Legend: G – half-band highpass filter; H – half-band lowpass filter; ↓2 – down-sampling; ↑2 – up-sampling (G̃, H̃ denote the analysis filters, G, H the synthesis filters)
[Figure: 2-level DWT decomposition and reconstruction filter bank]
2-level DWT decomposition. The decomposition can be continued as long as there are enough samples for down-sampling.
2/22/2009
384
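The same decomposition/reconstruction structure is available off the shelf. As an illustrative sketch (assuming PyWavelets; the 'db2' wavelet and random signal are arbitrary choices), one analysis level followed by the matching synthesis level recovers the signal.

```python
# One-level analysis and synthesis with PyWavelets: decompose into an
# approximation (cA) and detail (cD) band, then reconstruct the signal.
import numpy as np
import pywt

x = np.random.default_rng(0).standard_normal(256)

cA, cD = pywt.dwt(x, 'db2')        # analysis: lowpass/highpass filtering + downsampling
x_rec = pywt.idwt(cA, cD, 'db2')   # synthesis: upsampling + filtering + summation

print(len(cA), len(cD))              # roughly half the input length each
print(np.allclose(x, x_rec[:len(x)]))  # perfect reconstruction (up to boundary padding)
```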
DWT - Demystified
[Figure: three-level analysis filter bank; x[n] (length 512, band 0 ~ π) is repeatedly split by the half-band highpass g[n] and lowpass h[n] filters, each followed by ↓2; |G(jw)| passes the upper half band π/2 ~ π]
– d1: level-1 DWT coefficients, length 256, band π/2 ~ π
– a1: level-1 approximation, length 256, band 0 ~ π/2
– d2: level-2 DWT coefficients, length 128, band π/4 ~ π/2
– a2: level-2 approximation, length 128, band 0 ~ π/4
– d3: level-3 DWT coefficients, length 64, band π/8 ~ π/4
– a3: level-3 approximation coefficients, length 64, band 0 ~ π/8
2/22/2009
385
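To connect the figure to code, a hedged sketch with PyWavelets (the Haar wavelet and random signal are arbitrary choices) reproduces the coefficient lengths for a 512-sample signal: 256, 128 and 64 detail coefficients plus a 64-sample level-3 approximation.

```python
# Three-level DWT of a 512-sample signal: coefficient lengths follow the
# 256 / 128 / 64 pattern described in the figure above.
import numpy as np
import pywt

x = np.random.default_rng(1).standard_normal(512)
a3, d3, d2, d1 = pywt.wavedec(x, 'haar', level=3)

print(len(d1), len(d2), len(d3), len(a3))   # 256 128 64 64
```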
2D DWT
Generalization of the concept to 2D: 2D functions ↔ images; f(x, y) ↔ I[m, n], the intensity function.
Why would we want to take the 2D DWT of an image anyway?
– Compression
– Denoising
– Feature extraction
Mathematical form:

$f_o(x, y) = \sum_{i=-\infty}^{\infty} \sum_{j=-\infty}^{\infty} a_o(i, j)\, s_{\phi\phi}(x - i, y - j)$
$a_o(i, j) = \langle f(x, y),\, s_{\phi\phi}(x - i, y - j) \rangle$
$s_{\phi\phi}(x, y) = \phi(x)\, \phi(y)$
$s_{\psi\psi}(x, y) = \psi(x)\, \psi(y)$

2/22/2009
386
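As a minimal sketch of a single-level 2D DWT (assuming PyWavelets; the Haar wavelet and the random array standing in for an image are assumptions), dwt2 returns one approximation subband and three detail subbands.

```python
# Single-level 2D DWT: one approximation subband and three detail subbands.
import numpy as np
import pywt

img = np.random.default_rng(2).standard_normal((256, 256))  # stand-in for an image

# cH, cV, cD correspond to the horizontal, vertical and diagonal detail subbands
LL, (cH, cV, cD) = pywt.dwt2(img, 'haar')

print(LL.shape, cH.shape, cV.shape, cD.shape)  # each subband is 128 x 128
```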
Implementation of 2D DWT
The 2D DWT is separable: filter and downsample along the rows, then along the columns.
– Filter the rows of the input image with the lowpass filter H̃ and the highpass filter G̃, and downsample by 2 along the rows.
– Filter the columns of each result with H̃ and G̃, and downsample by 2 along the columns.
– This yields four subbands per level: LL (approximation A_{k+1}), LH (horizontal details D^(h)_{k+1}), HL (vertical details D^(v)_{k+1}) and HH (diagonal details D^(d)_{k+1}).
– Repeating the decomposition on the LL subband gives the next level: LL splits into LLL, LLH, LHL and LHH, while LH, HL and HH are kept.
[Figure: row/column analysis filter bank and the resulting subband layout of the input image]
2/22/2009
387
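The row/column structure above can be written out directly. The following NumPy sketch performs one separable level with Haar filters (an arbitrary choice), filtering and downsampling the rows first and then the columns; it is only a demonstration of the data flow, not an exact copy of the lecture's filters.

```python
# One level of a separable 2D DWT with Haar filters: rows first, then columns.
import numpy as np

h = np.array([1.0, 1.0]) / np.sqrt(2.0)   # lowpass
g = np.array([1.0, -1.0]) / np.sqrt(2.0)  # highpass

def filt_down_rows(img, f):
    """Convolve each row with f and keep every other column."""
    return np.array([np.convolve(row, f)[1::2] for row in img])

def dwt2_level(img):
    lo = filt_down_rows(img, h)            # lowpass along the rows
    hi = filt_down_rows(img, g)            # highpass along the rows
    # Filter the columns by transposing, filtering rows, and transposing back.
    LL = filt_down_rows(lo.T, h).T
    LH = filt_down_rows(lo.T, g).T
    HL = filt_down_rows(hi.T, h).T
    HH = filt_down_rows(hi.T, g).T
    return LL, LH, HL, HH

img = np.arange(64, dtype=float).reshape(8, 8)
LL, LH, HL, HH = dwt2_level(img)
print(LL.shape, LH.shape, HL.shape, HH.shape)   # each subband is 4 x 4
```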
Up and down … up and down
2↓1 — Downsample columns along the rows: for each row, keep the even-indexed columns and discard the odd-indexed columns.
1↓2 — Downsample rows along the columns: for each column, keep the even-indexed rows and discard the odd-indexed rows.
2↑1 — Upsample columns along the rows: for each row, insert a zero between every other sample (column).
1↑2 — Upsample rows along the columns: for each column, insert a zero between every other sample (row).
2/22/2009
388
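These four operations are one-liners in NumPy; a minimal sketch (the array and its size are illustrative):

```python
# Down/up-sampling of a 2D array along rows or columns.
import numpy as np

A = np.arange(16).reshape(4, 4)

cols_down = A[:, ::2]    # keep the even-indexed columns of every row
rows_down = A[::2, :]    # keep the even-indexed rows of every column

cols_up = np.zeros((A.shape[0], 2 * A.shape[1]), dtype=A.dtype)
cols_up[:, ::2] = A      # insert a zero after every sample along each row

rows_up = np.zeros((2 * A.shape[0], A.shape[1]), dtype=A.dtype)
rows_up[::2, :] = A      # insert a zero after every sample along each column

print(cols_down.shape, rows_down.shape, cols_up.shape, rows_up.shape)  # (4,2) (2,4) (4,8) (8,4)
```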
Reconstruction
[Figure: 2D DWT synthesis filter bank reconstructing the original image]
– Column stage: each subband is upsampled by 2 in the vertical direction (1↑2) and filtered along the columns — LL (A_{k+1}) and HL (D^(v)_{k+1}) with H, LH (D^(h)_{k+1}) and HH (D^(d)_{k+1}) with G; the LL and LH outputs are summed, as are the HL and HH outputs.
– Row stage: the two sums are upsampled by 2 in the horizontal direction (2↑1), filtered along the rows with H and G respectively, and added to give the original image.
2/22/2009
389
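The synthesis bank sketched above corresponds to idwt2 in PyWavelets. A hedged round-trip sketch (Haar again as an arbitrary choice, with a random stand-in image) shows that the four subbands reconstruct the original image.

```python
# Round trip: 2D decomposition followed by 2D reconstruction recovers the image.
import numpy as np
import pywt

img = np.random.default_rng(3).standard_normal((128, 128))

LL, details = pywt.dwt2(img, 'haar')        # analysis
img_rec = pywt.idwt2((LL, details), 'haar') # synthesis (upsample, filter, sum)

print(np.allclose(img, img_rec))   # True: perfect reconstruction
```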
Subband coding algorithm
– Halves the time resolution: only half the number of samples results
– Doubles the frequency resolution: the spanned frequency band is halved

Example: x[n], 512 samples, 0-1000 Hz
– Filter 1: A1 (256 samples, 0-500 Hz) and D1 (256 samples, 500-1000 Hz)
– Filter 2 (applied to A1): A2 (128 samples, 0-250 Hz) and D2 (128 samples, 250-500 Hz)
– Filter 3 (applied to A2): A3 (64 samples, 0-125 Hz) and D3 (64 samples, 125-250 Hz)
2/22/2009
390
Application of wavelets
Compression
De-noising
Feature extraction
Discontinuity detection
Distribution estimation
Data analysis
– Biological data
– NDE data
– Financial data
2/22/2009
391
Fingerprint compression
Wavelet: Haar, Level: 3
2/22/2009
392
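The slide names only the wavelet (Haar) and the level (3). A hedged sketch of the general idea, assuming PyWavelets and a random placeholder array in place of the fingerprint image, decomposes to level 3, discards the smallest coefficients and reconstructs; the 5% retention ratio is an assumption for illustration.

```python
# Toy wavelet compression: 3-level Haar decomposition, keep the largest
# coefficients, zero the rest, then reconstruct. The image here is a stand-in.
import numpy as np
import pywt

img = np.random.default_rng(4).standard_normal((256, 256))  # placeholder "fingerprint"

coeffs = pywt.wavedec2(img, 'haar', level=3)
arr, slices = pywt.coeffs_to_array(coeffs)           # flatten coefficients into one array

keep = 0.05                                           # keep the top 5% by magnitude (assumed)
thresh = np.quantile(np.abs(arr), 1 - keep)
arr_c = np.where(np.abs(arr) >= thresh, arr, 0.0)     # zero the small coefficients

img_rec = pywt.waverec2(pywt.array_to_coeffs(arr_c, slices, output_format='wavedec2'), 'haar')
print(img_rec.shape)   # same size as the input; most coefficients are now zero
```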
Image denoising using wavelet transform
Image de-noising using the wavelet transform utilizes the same principles as signal decomposition and de-noising. Each column of the image matrix is convolved with a high-pass and a low-pass filter, followed by downsampling; the same process is then applied to the rows of the image matrix. A threshold limit δ is chosen for each decomposition level, and the coefficients c(k), k = 0, 1, …, N−1, are modified: only those with |c(k)| > δ are retained. The image is finally reconstructed by the backward (inverse) wavelet transform of the modified coefficients.
2/22/2009
393
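A hedged sketch of the thresholding step described above, assuming PyWavelets; the noisy test "image", the 'db2' wavelet, the level and the threshold δ are all assumptions, and soft thresholding via pywt.threshold stands in for the |c(k)| > δ rule.

```python
# Wavelet denoising sketch: decompose, threshold the detail coefficients, reconstruct.
import numpy as np
import pywt

rng = np.random.default_rng(5)
img = np.zeros((128, 128))
img[32:96, 32:96] = 1.0                        # simple square "scene"
noisy = img + 0.2 * rng.standard_normal(img.shape)

coeffs = pywt.wavedec2(noisy, 'db2', level=2)
delta = 0.3                                     # threshold limit (assumed)
denoised_coeffs = [coeffs[0]] + [
    tuple(pywt.threshold(c, delta, mode='soft') for c in level)
    for level in coeffs[1:]
]
denoised = pywt.waverec2(denoised_coeffs, 'db2')

# The denoised image is usually closer to the clean image than the noisy one.
print(np.mean((denoised[:128, :128] - img) ** 2) < np.mean((noisy - img) ** 2))
```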
Image enhancement using wavelet transform
2/22/2009
394