Digital Image Processing SHADAB KHAN ARRO
What is Image Processing?
Essentially, it is a set of tools for analyzing image data: it is concerned with extracting meaningful information from real-world images. Digital image processing has evolved enormously in recent times and is still evolving; it remains one of the hottest topics of research across the globe.
Typical applications of IP
Automated visual inspection systems: visually checking objects for defects. Satellite image processing. Classification (OCR), identification (handwriting, fingerprints), etc. Machine inspection can satisfy tight tolerances, and because it has no subjectivity, the inspection parameters can be changed at will.
Typical applications of IP (continued)
Biomedical field: IP techniques are used extensively to improve the dark images that are typical in biomedicine, e.g. the MEMICA arm system. Robotics: UGVs, UAVs, AUVs, ROVs. Miscellaneous: image forensics, movies, industry, defense, etc.
Traffic Monitoring
Face Detection
Medical IP
Stanley: Beginning of a new era
Morphing
Image representation:
Computers cannot handle continuous images, only arrays of digital numbers. An image is therefore represented as a 2-D array of points (a 2-D matrix). A point on this 2-D grid (corresponding to a matrix element) is called a pixel (picture element). It represents the average irradiance over the area of the pixel.
Image Basics:
Image: a distribution f(x, y) over 2-D space. When x, y and f(x, y) all take finite, discrete values, the image is a digital image.
The tristimulus theory of color perception implies that any color can be obtained from a mix of the three primaries: red, green and blue.
Radiance is the total amount of energy that flows from the light source; it is usually measured in watts (W).
Luminance, measured in lumens (lm), gives a measure of the amount of energy an observer perceives from a light source. For example, light emitted from a source operating in the far infrared region of the spectrum could have significant energy (radiance), but an observer would hardly perceive it; its luminance would be almost zero.
Brightness is a subjective descriptor of light perception that is practically impossible to measure. It embodies the achromatic notion of intensity and is one of the key factors in describing color sensation.
Overlapping of primaries
Image sensing:
Image acquisition:
f(x, y) = illumination(x, y) × reflectance(x, y), with reflectance in [0, 1] and illumination in [0, ∞).
Sampling and Quantization. A digital image a[m, n], described in a 2-D discrete space, is derived from an analog image a(x, y) in a 2-D continuous space through a sampling process that is frequently referred to as digitization. If the samples are a distance d apart, we can write this as: f[i, j] = Quantize{ f(i·d, j·d) }. The image can now be represented as a matrix of integer values, for example:
 62  79  23 119 120 105   4   0
 10  10   9  62  12  78  34   0
 10  58 197  46  46   0   0  48
176 135   5 188 191  68   0  49
  2   1   1  29  26  37   0  77
  0  89 144 147 187 102  62 208
255 252   0 166 123  62   0  31
166  63 127  17   1   0  99  30
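A minimal sketch of this digitization step, assuming Python with NumPy and a synthetic continuous image standing in for a real sensor signal:

```python
import numpy as np

def sample_and_quantize(f, d, levels=256):
    """Sample a continuous image f(x, y) on a grid with spacing d,
    then quantize each sample to one of `levels` integer values.
    f is any function of two floats returning a value in [0, 1]."""
    n = int(1.0 / d)                      # samples per axis over the unit square
    img = np.empty((n, n), dtype=np.uint8)
    for i in range(n):
        for j in range(n):
            img[i, j] = int(f(i * d, j * d) * (levels - 1))  # Quantize{f(i*d, j*d)}
    return img

# Example: a smooth intensity pattern sampled at d = 1/8 gives an 8x8 matrix
f = lambda x, y: (np.sin(10 * x) * np.cos(10 * y) + 1) / 2
print(sample_and_quantize(f, 1 / 8))
```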
S & Q continued…
Gray level resolution
Spatial Resolution
Image as function:
A plot of the surface f(x, y) over the x-y plane.
Types of images:
Color Images: A color image is just three functions pasted together. We can write this as a "vector-valued" function:
f(x, y) = [ r(x, y), g(x, y), b(x, y) ]ᵀ, with each of the B, G, R channels taking values in [0, 255].
Gray scale images
Each pixel has only one component, its grayscale value, which represents brightness on a 0-255 scale. A grayscale image can also be stored in color memory with R = G = B. Back-conversion into a color image is possible with techniques that map the 2^8 gray levels into the 2^24 color space, e.g. using artificial neural networks (ANNs).
Binary images
Each pixel can take only two values, 0 and 1. Binary images are the easiest to work with and give fast processing times. For rendering on a gray scale, 1 is mapped to 255 and 0 to 0.
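A small sketch of producing a binary image from a grayscale one and rendering it back on the 0-255 scale (NumPy assumed; the threshold value 128 is an arbitrary choice):

```python
import numpy as np

def to_binary(gray, threshold=128):
    """Threshold a grayscale image to a 0/1 binary image."""
    return (gray >= threshold).astype(np.uint8)

def render_binary(binary):
    """Map 1 -> 255 and 0 -> 0 so the binary image displays on a gray scale."""
    return binary * 255

gray = np.random.randint(0, 256, (4, 4), dtype=np.uint8)  # stand-in image
print(render_binary(to_binary(gray)))
```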
Image zooming and shrinking
Zooming can be viewed as oversampling the image; shrinking can be viewed as undersampling it. The simplest method is nearest-neighbor interpolation (NNI). Various other techniques, such as bilinear interpolation and cubic interpolation, are used to produce better results and to reduce contouring in the image.
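A sketch of nearest-neighbor resizing (NumPy assumed; bilinear and cubic variants would interpolate between pixels instead of copying the nearest one):

```python
import numpy as np

def nn_resize(img, factor):
    """Nearest-neighbor interpolation: each output pixel copies the
    closest input pixel. factor > 1 zooms (oversampling), < 1 shrinks."""
    h, w = img.shape
    rows = (np.arange(int(h * factor)) / factor).astype(int)
    cols = (np.arange(int(w * factor)) / factor).astype(int)
    return img[np.ix_(rows, cols)]

img = np.arange(16, dtype=np.uint8).reshape(4, 4)
print(nn_resize(img, 2))    # 8x8 zoomed image
print(nn_resize(img, 0.5))  # 2x2 shrunken image
```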
Zooming
Neighborhood
The N4 (4-connectivity, von Neumann) neighborhood of a pixel at position (x, y) is given by the locations: (x+1, y), (x, y+1), (x-1, y), (x, y-1). The N8 (8-connectivity) neighborhood of a pixel at (x, y) is given by: (x+1, y), (x, y+1), (x-1, y), (x, y-1), (x+1, y+1), (x+1, y-1), (x-1, y+1), (x-1, y-1).
N 4 and N 8
N4(a) = { b : L_city(b, a) = 1 }
N8(a) = { b : L_chess(b, a) = 1 }
Distance measurement
Types of distances between pixels p = (x, y) and q = (s, t):
1. D4 or city-block distance: D4(p, q) = |x - s| + |y - t|
2. D8 or chessboard distance: D8(p, q) = max{ |x - s|, |y - t| }
3. Euclidean distance: De(p, q) = [ (x - s)² + (y - t)² ]^(1/2)
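The three measures are simple to compute directly; a small sketch (plain Python, pixel coordinates given as (x, y) tuples):

```python
def d4(p, q):
    """City-block distance: |x - s| + |y - t|."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def d8(p, q):
    """Chessboard distance: max(|x - s|, |y - t|)."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

def de(p, q):
    """Euclidean distance: sqrt((x - s)^2 + (y - t)^2)."""
    return ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5

p, q = (0, 0), (3, 4)
print(d4(p, q), d8(p, q), de(p, q))  # 7, 4, 5.0
```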
Connected set, component, region, boundary
Let S represent a subset of pixels in an image. Two pixels p and q are said to be connected in S if there exists a path between them consisting entirely of pixels in S. For any pixel p in S, the set of pixels that are connected to it in S is called a connected component of S. If it has only one connected component, then the set S is called a connected set.
Let R be a subset of pixels in an image. We call R a region of the image if R is a connected set. The boundary (also called border or contour) of a region R is the set of pixels in the region that have one or more neighbors that are not in R.
Edges are formed from pixels with derivative values that exceed a preset threshold. Depending on the type of connectivity and edge operators used, the edge extracted from a binary region will be the same as the region boundary.
Image operations:
Image processing: image in -> image out.
Image analysis: image in -> measurements out.
Image understanding: image in -> high-level description out.
As we shall see in later lectures, we apply some kind of transformation function to the image such that g(x, y) = T{f(x, y)}. This transformation function is applied in the following ways:
Operation   | Characterization                                                                      | Generic complexity/pixel
Point       | the output value at a coordinate depends only on the input value at that coordinate  | constant
Local       | the output value at a coordinate depends on the input values in its neighborhood     | P²
Global      | the output value at a coordinate depends on all the values in the input image        | N²
(Image size = N × N; neighborhood size = P × P. Complexity is specified in operations per pixel.)
Image Enhancement in Spatial Domain SHADAB KHAN ARRO
Image enhancement in spatial domain
The principal objective of enhancement is to process an image so that the result is more suitable than the original for a specific application. There are two broad categories:
Spatial domain: These approaches are based on direct manipulation of pixels in an image. Frequency domain: These techniques are based on modifying the Fourier transform of an image.
Spatial domain basics
Spatial domain refers to the aggregate of pixels composing an image. Spatial-domain methods are procedures that operate directly on these pixels: g(x, y) = T[f(x, y)], where f(x, y) is the input image, g(x, y) is the processed image, and T is an operator on f defined over some neighborhood of (x, y). In addition, T can operate on a set of input images, such as performing the pixel-by-pixel sum of K images for noise reduction.
Spatial domain basics
The simplest form of T is when the neighborhood is of size 1×1 (a single pixel). In this case, g depends only on the value of f at (x, y), and T becomes a gray-level (also called an intensity or mapping) transformation function of the form s = T(r), where, for simplicity of notation, r and s are variables denoting the gray level of f(x, y) and g(x, y) at any point (x, y).
Thresholding function
Point processing and Masks
Since enhancement at any point in an image usually depends on the gray level at that point, techniques in this category are often referred to as point processing. The more general approach is to use a function of the values of f in a predefined neighborhood of (x, y) to determine the value of g at (x, y). One of the principal approaches in this formulation is based on the use of so-called masks. A mask is a small 2-D array in which the values of the coefficients determine the nature of the process, such as image sharpening. Enhancement techniques based on this type of approach are often referred to as mask processing or filtering. A 3×3 mask:

w(1,1) w(1,2) w(1,3)
w(2,1) w(2,2) w(2,3)
w(3,1) w(3,2) w(3,3)
Spatial domain techniques What is Contrast stretching?
How to enhance the contrast?
A low-contrast image has values concentrated in a narrow range (mostly dark, mostly bright, or mostly medium values). Contrast enhancement changes the image value distribution to cover a wide range. The contrast of an image can be revealed by its histogram.
Some basic transformation functions:
Image negative: s = L - 1 - r
Log transformation: s = c·log(1 + r)
Power-law transformation: s = c·r^γ
The exponent γ in the power-law transform is referred to as "gamma".
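A sketch of the three transforms built as lookup tables over all L gray levels (NumPy assumed; the scaling constants c are chosen here only to keep outputs within [0, L-1]):

```python
import numpy as np

L = 256
r = np.arange(L, dtype=np.float64)          # all possible input gray levels

negative = (L - 1) - r                      # s = L - 1 - r
c_log = (L - 1) / np.log(L)                 # choose c so s stays in [0, L-1]
log_t = c_log * np.log(1 + r)               # s = c * log(1 + r)
gamma = 0.4
power = (L - 1) * (r / (L - 1)) ** gamma    # s = c * r^gamma, here c = L - 1

# Applying a point transform to an image is a lookup into the s = T(r) table:
img = np.random.randint(0, L, (4, 4), dtype=np.uint8)
print(negative.astype(np.uint8)[img])       # negative of img
```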
Gamma Correction: since image-rendering devices vary in their gamma error, images that are telecast or uploaded to the internet are usually preprocessed to an averaged gamma value so that they look fine on most devices.
Power-law transformation: (a) Magnetic resonance (MR) image of a fractured human spine. (b)-(d) Results of applying the transformation with c = 1 and γ = 0.6, 0.4, and 0.3, respectively.
Power-law transformation: (a) Aerial image. (b)-(d) Results of applying the transformation with c = 1 and γ = 3.0, 4.0, and 5.0, respectively. (Original image for this example courtesy of NASA.)
Log/power law transformation
Piecewise linear stretching
Contrast stretching
Contrast stretching (a) Form of transformation function. (b) A low-contrast image. (c) Result of contrast stretching. (d) Result of thresholding.
Contrast stretching
If r1 = s1 and r2 = s2, the transformation is a linear function that produces no change in gray levels. If r1 = r2, s1 = 0 and s2 = L - 1, the transformation becomes a thresholding function that creates a binary image. Intermediate values of (r1, s1) and (r2, s2) produce various degrees of spread in the gray levels of the output image, thus affecting its contrast. In general, r1 ≤ r2 and s1 ≤ s2 is assumed so that the function is single-valued and monotonically increasing. This condition preserves the order of gray levels.
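A sketch of the piecewise-linear stretch through the two control points (NumPy assumed; the sample (r1, s1), (r2, s2) values are arbitrary):

```python
import numpy as np

def contrast_stretch(img, r1, s1, r2, s2, L=256):
    """Piecewise-linear stretch through (r1, s1) and (r2, s2).
    Requires r1 <= r2 and s1 <= s2 so T is single-valued and monotonic."""
    r = np.arange(L, dtype=np.float64)
    T = np.interp(r, [0, r1, r2, L - 1], [0, s1, s2, L - 1])  # three linear segments
    return T.astype(np.uint8)[img]

img = np.random.randint(80, 150, (4, 4), dtype=np.uint8)  # low-contrast stand-in
print(contrast_stretch(img, 80, 10, 150, 245))            # spread toward full range
```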
Gray level slicing (a) This transformation highlights range [A, B] of gray levels and reduces all others to a constant level. (b) This transformation highlights range [A, B] but preserves all other levels. (c) An image. (d) Result of using the transformation in (a).
Gray level slicing
Bit plane slicing
Usually the visually significant data is stored in the higher four bits; the remaining bits account for the subtle detail of an image. Bit-plane slicing is used when we need to modify the contribution made by a particular bit plane.
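A sketch of extracting bit planes and reconstructing an approximation from the upper four planes alone (NumPy assumed):

```python
import numpy as np

def bit_plane(img, k):
    """Extract bit plane k (0 = least significant) of an 8-bit image."""
    return (img >> k) & 1

img = np.random.randint(0, 256, (4, 4), dtype=np.uint8)

# The upper four planes carry most of the visually significant data;
# summing only their contributions approximates the original image:
approx = sum(bit_plane(img, k).astype(np.uint16) << k for k in range(4, 8))
print(img)
print(approx.astype(np.uint8))
```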
An 8-bit fractal image
A fractal is an image generated by mathematical expressions
Contribution of each bit plane
The eight bit planes of the image in the earlier slide. The number at the bottom right of each image identifies the bit plane.
Histogram
The histogram of a digital image with gray levels in the range [0, L-1] is a discrete function h(r_k) = n_k, where r_k is the kth gray level and n_k is the number of pixels in the image having gray level r_k. Loosely speaking, p(r_k) gives an estimate of the probability of occurrence of gray level r_k.
Histogram
The graph on the right shows the histogram of the image on the left; notice the distribution of discrete lines along the x-axis (gray levels r_k, with heights n_k).
Histogram of a low-contrast image: narrow and concentrated toward the middle of the gray scale.
Histogram of a high-contrast image, whose pixels have a large variety of gray tones: not far from uniform.
Can two images have the same histogram?
Different images
But have the same histogram
More examples of histogram
Histogram Normalization
It is very common to normalize the histogram. This is done by dividing each value by the total number of pixels n in the image, so a normalized histogram is given by:
p(r_k) = n_k / n
Note that the sum of all components of a normalized histogram equals 1.
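A sketch of computing a normalized histogram (NumPy assumed):

```python
import numpy as np

def normalized_histogram(img, L=256):
    """p(r_k) = n_k / n for k = 0 .. L-1; the components sum to 1."""
    n_k = np.bincount(img.ravel(), minlength=L)  # pixel count per gray level
    return n_k / img.size

img = np.random.randint(0, 256, (64, 64), dtype=np.uint8)
p = normalized_histogram(img)
print(p.sum())  # 1.0
```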
Histogram equalization
Histogram equalization is done in order to spread the histogram uniformly over the gray-level range. Consider a transformation function of the form s = T(r), 0 ≤ r ≤ 1, that produces a level s for every pixel having value r in the original image.
Histogram equalization
Assumptions for T(r):
(a) T(r) is a single-valued and monotonically increasing function in the interval 0 ≤ r ≤ 1, and
(b) 0 ≤ T(r) ≤ 1 for 0 ≤ r ≤ 1.
Assumption (a) guarantees that the inverse transformation exists and that gray levels are not inverted (their order is preserved); assumption (b) guarantees that the output stays in the allowed range.
Let p_r(r) be the pdf of r and p_s(s) the pdf of s. Then we have:

p_s(s) = p_r(r) · |dr/ds|

Consider a transformation function of the form:

s = T(r) = ∫₀^r p_r(w) dw

According to Leibniz's rule, the derivative of a definite integral with respect to its upper limit is simply the integrand evaluated at that limit. Applying this rule to the equation above:

ds/dr = dT(r)/dr = d/dr [ ∫₀^r p_r(w) dw ] = p_r(r)

Therefore we have:

p_s(s) = p_r(r) · |dr/ds| = p_r(r) · 1/p_r(r) = 1, for 0 ≤ s ≤ 1

which shows that applying a transformation function of the form described produces a uniform histogram. The discrete formulation is:

s_k = T(r_k) = Σ_{j=0}^{k} p_r(r_j)
Thus the processed image is obtained by mapping each pixel with value r_k in the input image to a corresponding pixel with level s_k in the output image via the transformation function above.
Histogram equalization: notice how the vertical bars get separated and cover the entire range in the transformed result.
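A sketch of the discrete mapping, scaled by L-1 so the output levels span the usual [0, 255] range rather than [0, 1] (NumPy assumed):

```python
import numpy as np

def equalize(img, L=256):
    """Discrete histogram equalization: s_k = (L-1) * sum_{j<=k} p_r(r_j).
    The cumulative distribution of gray levels serves as T(r)."""
    p = np.bincount(img.ravel(), minlength=L) / img.size
    s = np.round((L - 1) * np.cumsum(p)).astype(np.uint8)  # s_k scaled to [0, L-1]
    return s[img]                                          # map each pixel r_k -> s_k

img = np.random.randint(100, 140, (64, 64), dtype=np.uint8)  # narrow histogram
out = equalize(img)
print(img.min(), img.max(), "->", out.min(), out.max())      # much wider spread
```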
Spatial filtering and Masks
Sometimes we need to combine values from neighboring pixels to form the output image. Example: how can we compute the average value of the pixels in a 3×3 region centered at a pixel z? Consider the image:

2 4 1 2 6 2
9 2 3 4 4 4
7 2 9 7 6 7
5 2 3 6 1 5
7 4 2 5 1 2
2 5 2 3 2 8
How masking is done.
Masking continued. Step 1: Select only the needed pixels — the 3×3 subimage centered at pixel z (here the 7 in the third row of the image above):

3 4 4
9 7 6
3 6 1
Masking continued. Step 2: Multiply every pixel by 1/9 and then sum up the values. The mask (or window, or template) has all nine coefficients equal to 1/9:

1/9 1/9 1/9
1/9 1/9 1/9
1/9 1/9 1/9

y = (1/9)·3 + (1/9)·4 + (1/9)·4 + (1/9)·9 + (1/9)·7 + (1/9)·6 + (1/9)·3 + (1/9)·6 + (1/9)·1 = 43/9 ≈ 4.8
Averaging an image. Question: how do we compute the 3×3 average value at every pixel of an image, such as:

2 4 1 2 6 2
9 2 3 4 4 4
7 2 9 7 6 7
5 2 3 6 1 5
7 4 2 5 1 2

Solution: imagine a 3×3 masking window that can be placed anywhere on the image.
Step 1: Move the window to the first location where we want to compute the average value and select only the pixels inside the window. For the top-left location of the original image above, the subimage p is:

2 4 1
9 2 3
7 2 9

Step 2: Compute the average value:

y = Σᵢ Σⱼ (1/9)·p(i, j) = (2 + 4 + 1 + 9 + 2 + 3 + 7 + 2 + 9) / 9 = 39/9 ≈ 4.3

Step 3: Place the result at the corresponding pixel in the output image.
Step 4: Move the window to the next location and go to Step 2.
The 3×3 averaging method is one example of a mask operation, or spatial filtering.
A mask operation has a corresponding mask (sometimes called a window or template). The mask contains the coefficients to be multiplied with the pixel values. Example: moving average. The mask coefficients are laid out as:

w(1,1) w(1,2) w(1,3)
w(2,1) w(2,2) w(2,3)
w(3,1) w(3,2) w(3,3)
Mask coefficients: the mask of the 3×3 moving-average filter has all nine coefficients equal to 1/9:

1/9 1/9 1/9
1/9 1/9 1/9
1/9 1/9 1/9
The mask operation at each point is performed by:
1. Moving the reference point (center) of the mask to the location to be computed.
2. Computing the sum of products between the mask coefficients w(i, j) and the pixels p(i, j) of the subimage under the mask:

y = Σ_{i=1}^{M} Σ_{j=1}^{N} w(i, j) · p(i, j)

where the mask is of size M × N.
The spatial filtering on the whole image is given by: 1. Move the mask over the image at each location. 2. Compute sum of products between the mask coefficients and pixels inside subimage under the mask. 3. Store the results at the corresponding pixels of the output image. 4. Move the mask to the next location and go to step 2 until all pixel locations have been used.
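A direct, loop-based sketch of these four steps (NumPy assumed; border pixels where the mask does not fit are simply left at zero here, one of several possible border policies):

```python
import numpy as np

def spatial_filter(img, mask):
    """Slide the mask over the image; at each interior location store the
    sum of products between mask coefficients and the subimage beneath."""
    M, N = mask.shape
    out = np.zeros_like(img, dtype=np.float64)
    for x in range(M // 2, img.shape[0] - M // 2):
        for y in range(N // 2, img.shape[1] - N // 2):
            sub = img[x - M // 2 : x + M // 2 + 1, y - N // 2 : y + N // 2 + 1]
            out[x, y] = np.sum(mask * sub)   # y = sum_i sum_j w(i,j) * p(i,j)
    return out

avg_mask = np.full((3, 3), 1 / 9)            # 3x3 moving-average mask
img = np.random.randint(0, 10, (6, 6)).astype(np.float64)
print(spatial_filter(img, avg_mask))
```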
(a) Original image, of size 500×500 pixels. (b)-(f) Results of smoothing with square averaging filter masks of sizes n = 3, 5, 9, 15, and 35, respectively. The black squares at the top are of sizes 3, 5, 9, 15, 25, 35, 45, and 55 pixels, respectively; their borders are 25 pixels apart. The letters at the bottom range in size from 10 to 24 points, in increments of 2 points; the large letter at the top is 60 points. The vertical bars are 5 pixels wide and 100 pixels high; their separation is 20 pixels. The diameter of the circles is 25 pixels, and their borders are 15 pixels apart; their gray levels range from 0% to 100% black in increments of 20%. The background of the image is 10% black. The noisy rectangles are of size 50×120 pixels.
Image smoothing and thresholding
(a) Image from the Hubble Space Telescope. (b) Image processed by a 15×15 averaging mask. (c) Result of thresholding (b).
Order-statistics filter: a moving window slides over the original image; at each location, a statistic of the subimage under the window (mean, median, mode, min, max, etc.) is computed and stored at the corresponding pixel of the output image.
Example of filtering
(a) X-ray image of a circuit board corrupted by salt-and-pepper noise. (b) Noise reduction with a 3×3 averaging mask. (c) Noise reduction with a 3×3 median filter.
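A sketch of an order-statistics filter with a pluggable statistic, defaulting to the median (NumPy assumed; borders are left unfiltered):

```python
import numpy as np

def order_statistic_filter(img, size=3, statistic=np.median):
    """Replace each pixel by a statistic (median, min, max, ...) of the
    pixels inside the moving window. The median suppresses impulses."""
    k = size // 2
    out = img.copy()
    for x in range(k, img.shape[0] - k):
        for y in range(k, img.shape[1] - k):
            out[x, y] = statistic(img[x - k : x + k + 1, y - k : y + k + 1])
    return out

img = np.full((6, 6), 100, dtype=np.uint8)
img[2, 2], img[4, 1] = 255, 0                 # salt and pepper impulses
print(order_statistic_filter(img))            # impulses removed by the median
```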
Derivatives of an image. Any definition we use for a first derivative (1) must be zero in flat segments (areas of constant gray-level values); (2) must be nonzero at the onset of a gray-level step or ramp; and (3) must be nonzero along ramps. Similarly, any definition of a second derivative (1) must be zero in flat areas; (2) must be nonzero at the onset and end of a gray-level step or ramp; and (3) must be zero along ramps of constant slope. Since we are dealing with digital quantities whose values are finite, the maximum possible gray-level change is also finite, and the shortest distance over which that change can occur is between adjacent pixels. A first-order derivative is defined by the difference:

∂f/∂x = f(x + 1) - f(x)

and a second-order derivative by:

∂²f/∂x² = f(x + 1) + f(x - 1) - 2f(x)
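A sketch of both differences applied to a synthetic scan line containing a ramp, a flat run, and a step, which reproduces the behavior described below (NumPy assumed):

```python
import numpy as np

# Gray-level profile along a scan line: a ramp, a flat run, then a step
f = np.array([6, 5, 4, 3, 2, 1, 1, 1, 1, 6, 6, 6], dtype=np.float64)

first = f[1:] - f[:-1]                  # f(x+1) - f(x)
second = f[2:] + f[:-2] - 2 * f[1:-1]   # f(x+1) + f(x-1) - 2 f(x)

print(first)    # nonzero along the ramp and at the step; zero in flat areas
print(second)   # nonzero only at onset/end of the ramp; double response at the step
```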
We consider the image shown next and calculate the first- and second-order derivatives along a line that passes through an isolated point.
First, we note that the first-order derivative is nonzero along the entire ramp, while the second-order derivative is nonzero only at the onset and end of the ramp. Because edges in an image resemble this type of transition, we conclude that first-order derivatives produce "thick" edges and second-order derivatives much finer ones. Next we encounter the isolated noise point. Here, the response at and around the point is much stronger for the second-order than for the first-order derivative. Finally, the response of the two derivatives is the same at the gray-level step (in most cases, when the transition into a step is not from zero, the second derivative will be weaker).
To summarize: (1) first-order derivatives generally produce thicker edges in an image; (2) second-order derivatives have a stronger response to fine detail, such as thin lines and isolated points; (3) first-order derivatives generally have a stronger response to a gray-level step; (4) second-order derivatives produce a double response at step changes in gray level. We also note of second-order derivatives that, for similar changes in gray-level values in an image, their response is stronger to a line than to a step, and to a point than to a line.
Use of second derivatives for enhancement: the Laplacian. It can be shown (Rosenfeld and Kak [1982]) that the simplest isotropic derivative operator is the Laplacian, which, for a function (image) f(x, y) of two variables, is defined as:

∇²f = ∂²f/∂x² + ∂²f/∂y²

Since a derivative of any order is a linear operator, the Laplacian is also a linear operator. The partial second-order derivative along the x-axis is:

∂²f/∂x² = f(x + 1, y) + f(x - 1, y) - 2f(x, y)

and the partial second-order derivative along the y-axis is:

∂²f/∂y² = f(x, y + 1) + f(x, y - 1) - 2f(x, y)

Hence the discrete formulation of our Laplacian linear operator is:

∇²f = f(x + 1, y) + f(x - 1, y) + f(x, y + 1) + f(x, y - 1) - 4f(x, y)

The mask that implements the above formulation is:

0  1  0
1 -4  1
0  1  0
(a) Filter mask used to implement the digital Laplacian, as defined above. (b) Mask used to implement an extension of this equation that includes the diagonal neighbors. (c) and (d) Two other implementations of the Laplacian.
Because the Laplacian is a derivative operator, its use highlights gray-level discontinuities in an image and deemphasizes regions with slowly varying gray levels. This tends to produce images that have grayish edge lines and other discontinuities, all superimposed on a dark, featureless background. Background features can be "recovered", while still preserving the sharpening effect of the Laplacian, simply by adding the original and Laplacian images. It is important to keep in mind which definition of the Laplacian is used: if the definition has a negative center coefficient, we subtract, rather than add, the Laplacian image to obtain a sharpened result. Thus, the basic way in which we use the Laplacian for image enhancement is:

g(x, y) = f(x, y) - ∇²f(x, y)   if the center coefficient of the Laplacian mask is negative
g(x, y) = f(x, y) + ∇²f(x, y)   if the center coefficient of the Laplacian mask is positive
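A sketch of Laplacian sharpening using the 4-neighbor, negative-center definition above, so the Laplacian image is subtracted (NumPy assumed):

```python
import numpy as np

def laplacian_sharpen(img):
    """g(x,y) = f(x,y) - lap(x,y), with lap computed from the mask
    0 1 0 / 1 -4 1 / 0 1 0 (negative center). Borders stay unchanged."""
    f = img.astype(np.float64)
    lap = np.zeros_like(f)
    lap[1:-1, 1:-1] = (f[:-2, 1:-1] + f[2:, 1:-1] +
                       f[1:-1, :-2] + f[1:-1, 2:] - 4 * f[1:-1, 1:-1])
    return np.clip(f - lap, 0, 255).astype(np.uint8)

img = np.tile(np.linspace(0, 255, 8, dtype=np.uint8), (8, 1))  # horizontal ramp
print(laplacian_sharpen(img))
```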
(a) Image of the North Pole of the moon. (b) Laplacian-filtered image. (c) Laplacian image scaled for display purposes. (d) Image enhanced using the equation stated above.
Unsharp Masking and High-Boost Filtering. High-boost filtering can be implemented with one pass of either composite mask.

Mask including the diagonal neighbors (center k+8):
-1  -1  -1
-1  k+8 -1
-1  -1  -1

Mask using only the 4-neighbors (center k+4):
 0  -1   0
-1  k+4 -1
 0  -1   0

Equation:

f_hb(x, y) = k·f(x, y) - ∇²f(x, y)   if the center of the Laplacian mask is negative
f_hb(x, y) = k·f(x, y) + ∇²f(x, y)   if the center of the Laplacian mask is positive
Result of high-boost filtering: (b) Laplacian of (a) computed with the mask above using k = 0. (c) Laplacian-enhanced image using the same mask with k = 1. (d) Same as (c), but using k = 1.7.
References
Rafael C. Gonzalez and Richard E. Woods, Digital Image Processing, 2nd ed., Prentice Hall India.
End of lecture 1. I look forward to receiving any constructive criticism or suggestions to improve my further presentations. SHADAB KHAN
[email protected] +91 99004 04678 Post your doubts by e-mail or at the Image Processing community on Orkut.