Property                          NTSC      PAL       SECAM
------------------------------------------------------------
images / second                   29.97     25        25
ms / image                        33.37     40.0      40.0
lines / image                     525       625       625
aspect ratio (horiz./vert.)       4:3       4:3       4:3
interlace                         2:1       2:1       2:1
µs / line                         63.56     64.00     64.00

Table 3: Standard video parameters
In an interlaced image the odd numbered lines (1,3,5,...) are scanned in half of the
allotted time (e.g. 20 ms in PAL) and the even numbered lines (2,4,6,...) are
scanned in the remaining half. The image display must be coordinated with this
scanning format. (See Section 8.2.) The reason for interlacing the scan lines of a
video image is to reduce the perception of flicker in a displayed image. If one is
planning to use images that have been scanned from an interlaced video source, it is
important to know if the two half-images have been appropriately "shuffled" by the
digitization hardware or if that should be implemented in software. Further, the
analysis of moving objects requires special care with interlaced video to avoid
"zigzag" edges.
The number of rows (N) from a video source generally corresponds one-to-one
with lines in the video image. The number of columns, however, depends on the
nature of the electronics that is used to digitize the image. Different frame grabbers
for the same video camera might produce M = 384, 512, or 768 columns (pixels)
per line.
3. Tools
Certain tools are central to the processing of digital images. These include
mathematical tools such as convolution, Fourier analysis, and statistical
descriptions, and manipulative tools such as chain codes and run codes. We will
present these tools without any specific motivation. The motivation will follow in
later sections.
3.1 CONVOLUTION
There are several possible notations to indicate the convolution of two (multi-
dimensional) signals to produce an output signal. The most common are:
c = a \otimes b = a * b        (1)
We shall use the first form, c = a ⊗ b, with the following formal definitions.
In 2D continuous space:
c(x, y) = a(x, y) \otimes b(x, y) = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} a(\chi, \zeta)\, b(x - \chi,\, y - \zeta)\, d\chi\, d\zeta        (2)
In 2D discrete space:
c[m, n] = a[m, n] \otimes b[m, n] = \sum_{j=-\infty}^{+\infty} \sum_{k=-\infty}^{+\infty} a[j, k]\, b[m - j,\, n - k]        (3)
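For images of finite support the double sum in eq. (3) can be evaluated directly. The following sketch (Python with NumPy/SciPy, simply the language chosen for illustration here) implements it and checks the result against a library routine:

```python
import numpy as np
from scipy.signal import convolve2d

def conv2d_direct(a, b):
    """Direct evaluation of eq. (3) for finite-support images a and b.

    c[m, n] = sum_j sum_k a[j, k] * b[m - j, n - k]
    The full output support is (Ma + Mb - 1) x (Na + Nb - 1).
    """
    Ma, Na = a.shape
    Mb, Nb = b.shape
    c = np.zeros((Ma + Mb - 1, Na + Nb - 1), dtype=float)
    for j in range(Ma):
        for k in range(Na):
            # each input sample a[j, k] adds a shifted, scaled copy of b
            c[j:j + Mb, k:k + Nb] += a[j, k] * b
    return c

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a = rng.random((5, 4))
    b = rng.random((3, 3))
    assert np.allclose(conv2d_direct(a, b), convolve2d(a, b, mode="full"))
```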
3.2 PROPERTIES OF CONVOLUTION
There are a number of important mathematical properties associated with
convolution.
· Convolution is commutative.

c = a \otimes b = b \otimes a        (4)

· Convolution is associative.

c = a \otimes (b \otimes d) = (a \otimes b) \otimes d = a \otimes b \otimes d        (5)

· Convolution is distributive.

c = a \otimes (b + d) = (a \otimes b) + (a \otimes d)        (6)
where a, b, c, and d are all images, either continuous or discrete.
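These properties are easy to check numerically on small discrete images; a minimal sketch using SciPy's 2D convolution (an illustration, not part of the text's toolset):

```python
import numpy as np
from scipy.signal import convolve2d

rng = np.random.default_rng(1)
a, b, d = (rng.random((4, 4)) for _ in range(3))
conv = lambda x, y: convolve2d(x, y, mode="full")

# commutativity:  a (x) b == b (x) a
assert np.allclose(conv(a, b), conv(b, a))
# associativity:  a (x) (b (x) d) == (a (x) b) (x) d
assert np.allclose(conv(a, conv(b, d)), conv(conv(a, b), d))
# distributivity: a (x) (b + d) == a (x) b + a (x) d
assert np.allclose(conv(a, b + d), conv(a, b) + conv(a, d))
```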
3.3 FOURIER TRANSFORMS
The Fourier transform produces another representation of a signal, specifically a
representation as a weighted sum of complex exponentials. Because of Euler's
formula:
e^{jq} = \cos(q) + j\sin(q)        (7)
where j^2 = -1, we can say that the Fourier transform produces a representation of
a (2D) signal as a weighted sum of sines and cosines. The defining formulas for
the forward Fourier and the inverse Fourier transforms are as follows. Given an
image a and its Fourier transform A, then the forward transform goes from the
spatial domain (either continuous or discrete) to the frequency domain which is
always continuous.
A = \mathcal{F}\{a\}        Forward        (8)
The inverse Fourier transform goes from the frequency domain back to the spatial
domain.
a = \mathcal{F}^{-1}\{A\}        Inverse        (9)
The Fourier transform is a unique and invertible operation so that:
a = \mathcal{F}^{-1}\{\mathcal{F}\{a\}\} \qquad \text{and} \qquad A = \mathcal{F}\{\mathcal{F}^{-1}\{A\}\}        (10)
The specific formulas for transforming back and forth between the spatial domain
and the frequency domain are given below.
In 2D continuous space:
A(u, v) = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} a(x, y)\, e^{-j(ux + vy)}\, dx\, dy        Forward        (11)
a(x, y) = \frac{1}{4\pi^2} \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} A(u, v)\, e^{+j(ux + vy)}\, du\, dv        Inverse        (12)
In 2D discrete space:
A(\Omega, \Psi) = \sum_{m=-\infty}^{+\infty} \sum_{n=-\infty}^{+\infty} a[m, n]\, e^{-j(\Omega m + \Psi n)}        Forward        (13)

a[m, n] = \frac{1}{4\pi^2} \int_{-\pi}^{+\pi} \int_{-\pi}^{+\pi} A(\Omega, \Psi)\, e^{+j(\Omega m + \Psi n)}\, d\Omega\, d\Psi        Inverse        (14)
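Equation (13) defines a transform that is continuous in (Ω, Ψ) even though the image is discrete. For a finite-support image a common way to compute it is to sample it with the FFT; the sketch below (Python/NumPy, an assumption of this illustration rather than anything prescribed by the text) confirms that np.fft.fft2 returns the values of eq. (13) at Ω = 2πk/M, Ψ = 2πl/N:

```python
import numpy as np

def dsft(a, Omega, Psi):
    """Evaluate eq. (13) at a single frequency pair for a finite-support image."""
    M, N = a.shape
    m = np.arange(M)[:, None]
    n = np.arange(N)[None, :]
    return np.sum(a * np.exp(-1j * (Omega * m + Psi * n)))

rng = np.random.default_rng(2)
a = rng.random((8, 8))
A = np.fft.fft2(a)          # samples of A(Omega, Psi) on a 2*pi/8 grid
k, l = 3, 5
assert np.isclose(dsft(a, 2 * np.pi * k / 8, 2 * np.pi * l / 8), A[k, l])
```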
3.4 PROPERTIES OF FOURIER TRANSFORMS
There are a variety of properties associated with the Fourier transform and the
inverse Fourier transform. The following are some of the most relevant for digital
image processing.
· The Fourier transform is, in general, a complex function of the real frequency
variables. As such the transform can be written in terms of its magnitude and
phase.
A(u, v) = |A(u, v)|\, e^{j\varphi(u, v)} \qquad A(\Omega, \Psi) = |A(\Omega, \Psi)|\, e^{j\varphi(\Omega, \Psi)}        (15)
· A 2D signal can also be complex and thus written in terms of its magnitude and
phase.
a(x, y) = |a(x, y)|\, e^{j\vartheta(x, y)} \qquad a[m, n] = |a[m, n]|\, e^{j\vartheta[m, n]}        (16)
· If a 2D signal is real, then the Fourier transform has certain symmetries.
A(u, v) = A^{*}(-u, -v) \qquad A(\Omega, \Psi) = A^{*}(-\Omega, -\Psi)        (17)
The symbol (*) indicates complex conjugation. For real signals eq. (17) leads
directly to:
|A(u, v)| = |A(-u, -v)| \qquad \varphi(u, v) = -\varphi(-u, -v)
|A(\Omega, \Psi)| = |A(-\Omega, -\Psi)| \qquad \varphi(\Omega, \Psi) = -\varphi(-\Omega, -\Psi)        (18)
· If a 2D signal is real and even, then the Fourier transform is real and even.
A(u, v) = A(-u, -v) \qquad A(\Omega, \Psi) = A(-\Omega, -\Psi)        (19)
· The Fourier and the inverse Fourier transforms are linear operations.
\mathcal{F}\{w_1 a + w_2 b\} = \mathcal{F}\{w_1 a\} + \mathcal{F}\{w_2 b\} = w_1 A + w_2 B
\mathcal{F}^{-1}\{w_1 A + w_2 B\} = \mathcal{F}^{-1}\{w_1 A\} + \mathcal{F}^{-1}\{w_2 B\} = w_1 a + w_2 b        (20)

where a and b are 2D signals (images) and w_1 and w_2 are arbitrary, complex constants.
· The Fourier transform in discrete space, A(Ω, Ψ), is periodic in both Ω and Ψ.
Both periods are 2π.

A(\Omega + 2\pi j,\, \Psi + 2\pi k) = A(\Omega, \Psi) \qquad j, k \text{ integers}        (21)
· The energy, E , in a signal can be measured either in the spatial domain or the
frequency domain. For a signal with finite energy:
Parseval's theorem (2D continuous space):
E = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} |a(x, y)|^2\, dx\, dy = \frac{1}{4\pi^2} \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} |A(u, v)|^2\, du\, dv        (22)
Parseval's theorem (2D discrete space):
E = \sum_{m=-\infty}^{+\infty} \sum_{n=-\infty}^{+\infty} |a[m, n]|^2 = \frac{1}{4\pi^2} \int_{-\pi}^{+\pi} \int_{-\pi}^{+\pi} |A(\Omega, \Psi)|^2\, d\Omega\, d\Psi        (23)
This "signal energy" is not to be confused with the physical energy in the
phenomenon that produced the signal. If, for example, the value a[m,n] represents a
photon count, then the physical energy is proportional to the amplitude, a, and not
the square of the amplitude. This is generally the case in video imaging.
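For a finite image and the sampled spectrum returned by an FFT, the 1/4π² integral of eq. (23) reduces to a 1/(MN)-normalized sum over the FFT samples, which gives a quick numerical check of Parseval's theorem (a sketch under that assumption, in Python/NumPy):

```python
import numpy as np

rng = np.random.default_rng(3)
a = rng.random((16, 16))
A = np.fft.fft2(a)

E_spatial = np.sum(np.abs(a) ** 2)
# DFT counterpart of eq. (23): the 1/(4*pi^2) integral over one period
# becomes a 1/(M*N) normalized sum over the FFT samples.
E_freq = np.sum(np.abs(A) ** 2) / a.size
assert np.isclose(E_spatial, E_freq)
```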
· Given three multi-dimensional signals a, b, and c and their Fourier transforms A, B, and C:

c = a \otimes b \;\;\overset{\mathcal{F}}{\longleftrightarrow}\;\; C = A \cdot B
\qquad \text{and} \qquad
c = a \cdot b \;\;\overset{\mathcal{F}}{\longleftrightarrow}\;\; C = \frac{1}{4\pi^2}\, A \otimes B        (24)
In words, convolution in the spatial domain is equivalent to multiplication in the
Fourier (frequency) domain and vice-versa. This is a central result which provides
not only a methodology for the implementation of a convolution but also insight
into how two signals interact with each other--under convolution--to produce a
third signal. We shall make extensive use of this result later.
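A minimal sketch of that methodology (Python/NumPy/SciPy; the zero-padding choice below is the standard way to make the DFT's circular convolution agree with the linear convolution of eq. (3), not something prescribed by the text):

```python
import numpy as np
from scipy.signal import convolve2d

def fft_convolve2d(a, b):
    """Linear convolution via the convolution theorem.

    Zero-pad both signals to the full output size so that the circular
    convolution implied by the DFT equals the linear convolution of eq. (3).
    """
    M = a.shape[0] + b.shape[0] - 1
    N = a.shape[1] + b.shape[1] - 1
    A = np.fft.fft2(a, s=(M, N))
    B = np.fft.fft2(b, s=(M, N))
    return np.real(np.fft.ifft2(A * B))

rng = np.random.default_rng(4)
a, b = rng.random((32, 32)), rng.random((7, 7))
assert np.allclose(fft_convolve2d(a, b), convolve2d(a, b, mode="full"))
```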
· If a two-dimensional signal a(x,y) is scaled in its spatial coordinates then:
\text{If}\;\; a(x, y) \;\overset{\mathcal{F}}{\longleftrightarrow}\; A(u, v) \;\;\text{then}\;\; a(M_x \cdot x,\, M_y \cdot y) \;\overset{\mathcal{F}}{\longleftrightarrow}\; \frac{1}{M_x \cdot M_y}\, A\!\left(\frac{u}{M_x}, \frac{v}{M_y}\right)        (25)
· If a two-dimensional signal a(x,y) has Fourier spectrum A(u,v) then:
A(u = 0, v = 0) = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} a(x, y)\, dx\, dy

a(x = 0, y = 0) = \frac{1}{4\pi^2} \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} A(u, v)\, du\, dv        (26)
· If a two-dimensional signal a(x,y) has Fourier spectrum A(u,v) then:
\frac{\partial a(x, y)}{\partial x} \;\overset{\mathcal{F}}{\longleftrightarrow}\; j u\, A(u, v) \qquad \frac{\partial a(x, y)}{\partial y} \;\overset{\mathcal{F}}{\longleftrightarrow}\; j v\, A(u, v)

\frac{\partial^2 a(x, y)}{\partial x^2} \;\overset{\mathcal{F}}{\longleftrightarrow}\; -u^2\, A(u, v) \qquad \frac{\partial^2 a(x, y)}{\partial y^2} \;\overset{\mathcal{F}}{\longleftrightarrow}\; -v^2\, A(u, v)        (27)
3.4.1 Importance of phase and magnitude
Equation (15) indicates that the Fourier transform of an image can be complex.
This is illustrated below in Figures 4a-c. Figure 4a shows the original image
a[m,n], Figure 4b the magnitude in a scaled form as log(|A(Ω,Ψ)|), and Figure 4c
the phase ϕ(Ω,Ψ).
Figure 4: (a) original image a[m,n]; (b) magnitude displayed as log(|A(Ω,Ψ)|); (c) phase ϕ(Ω,Ψ).
Both the magnitude and the phase functions are necessary for the complete
reconstruction of an image from its Fourier transform. Figure 5a shows what
happens when Figure 4a is restored solely on the basis of the magnitude
information and Figure 5b shows what happens when Figure 4a is restored solely
on the basis of the phase information.
Figure 5: (a) reconstruction from the magnitude only (phase set to ϕ(Ω,Ψ) = 0); (b) reconstruction from the phase only (magnitude set to |A(Ω,Ψ)| = constant).
Neither the magnitude information nor the phase information is sufficient to restore
the image. The magnitude-only image (Figure 5a) is unrecognizable and has severe
dynamic range problems. The phase-only image (Figure 5b) is barely recognizable,
that is, severely degraded in quality.
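Figures 5a and 5b can be reproduced along the following lines (a sketch with NumPy FFTs; using the DFT and a unit constant magnitude are assumptions of this illustration):

```python
import numpy as np

def magnitude_only(a):
    """Reconstruct using |A| only (phase set to zero), cf. Figure 5a."""
    A = np.fft.fft2(a)
    return np.real(np.fft.ifft2(np.abs(A)))

def phase_only(a):
    """Reconstruct using the phase only (|A| set to a constant), cf. Figure 5b."""
    A = np.fft.fft2(a)
    return np.real(np.fft.ifft2(np.exp(1j * np.angle(A))))
```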
3.4.2 Circularly symmetric signals
An arbitrary 2D signal a(x,y) can always be written in a polar coordinate system as
a(r,θ). When the 2D signal exhibits a circular symmetry this means that:
a(x, y) = a(r, \theta) = a(r)        (28)

where r² = x² + y² and tan θ = y/x. As a number of physical systems such as lenses
exhibit circular symmetry, it is useful to be able to compute an appropriate Fourier
representation.
The Fourier transform A(u, v) can be written in polar coordinates A(ωr,ξ) and then,
for a circularly symmetric signal, rewritten as a Hankel transform:
A(u, v) = \mathcal{F}\{a(x, y)\} = 2\pi \int_{0}^{\infty} a(r)\, J_0(\omega_r r)\, r\, dr = A(\omega_r)        (29)

where ω_r² = u² + v² and tan ξ = v/u, and J₀(·) is a Bessel function of the first kind
of order zero.
The inverse Hankel transform is given by:
a(r) = \frac{1}{2\pi} \int_{0}^{\infty} A(\omega_r)\, J_0(\omega_r r)\, \omega_r\, d\omega_r        (30)
The Fourier transform of a circularly symmetric 2D signal is a function of only the
radial frequency, ωr. The dependence on the angular frequency, ξ, has vanished.
Further, if a(x,y) = a(r) is real, then it is automatically even due to the circular
symmetry. According to equation (19), A(ωr) will then be real and even.
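Equation (29) can be evaluated with ordinary numerical quadrature. The sketch below (Python/SciPy; the quadrature routine and the Gaussian test case taken from entry T.6 of Table 4 are choices made for this illustration) checks the Hankel transform of a unit-volume Gaussian against its known transform:

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import j0

def hankel(a_of_r, omega_r, rmax=np.inf):
    """Numerically evaluate eq. (29): A(w_r) = 2*pi * int_0^inf a(r) J0(w_r r) r dr."""
    integrand = lambda r: a_of_r(r) * j0(omega_r * r) * r
    val, _ = quad(integrand, 0.0, rmax, limit=200)
    return 2.0 * np.pi * val

# Check against Table 4, entry T.6: a unit-volume Gaussian of width sigma
# transforms to exp(-sigma^2 * w_r^2 / 2).
sigma = 1.5
gauss = lambda r: np.exp(-r**2 / (2 * sigma**2)) / (2 * np.pi * sigma**2)
for w in (0.0, 0.7, 2.0):
    assert np.isclose(hankel(gauss, w), np.exp(-sigma**2 * w**2 / 2), atol=1e-6)
```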
3.4.3 Examples of 2D signals and transforms
Table 4 shows some basic and useful signals and their 2D Fourier transforms. In
using the table entries in the remainder of this chapter we will refer to a spatial
domain term as the point spread function (PSF) or the 2D impulse response and its
Fourier transform as the optical transfer function (OTF) or simply transfer
function. Two standard signals used in this table are u(·), the unit step function, and
J1(·), the Bessel function of the first kind. Circularly symmetric signals are treated
as functions of r as in eq. (28).
3.5 STATISTICS
In image processing it is quite common to use simple statistical descriptions of
images and sub-images. The notion of a statistic is intimately connected to the
concept of a probability distribution, generally the distribution of signal amplitudes.
For a given region--which could conceivably be an entire image--we can define
the probability distribution function of the brightnesses in that region and the
probability density function of the brightnesses in that region. We will assume in
the discussion that follows that we are dealing with a digitized image a[m,n].
3.5.1 Probability distribution function of the brightnesses
The probability distribution function, P (a), is the probability that a brightness
chosen from the region is less than or equal to a given brightness value a. As a
increases from -∞ to +∞, P(a) increases from 0 to 1. P(a) is monotonic, non-
decreasing in a and thus dP/da ≥ 0.
3.5.2 Probability density function of the brightnesses
The probability that a brightness in a region falls between a and a + Δa, given the
probability distribution function P(a), can be expressed as p(a)Δa where p(a) is the
probability density function:

p(a)\,\Delta a = \left(\frac{dP(a)}{da}\right) \Delta a        (31)
T.1 Rectangle:
    R_{a,b}(x, y) = \frac{1}{4ab}\, u(a^2 - x^2)\, u(b^2 - y^2)
    \;\overset{\mathcal{F}}{\longrightarrow}\; \left(\frac{\sin(a\omega_x)}{a\omega_x}\right) \left(\frac{\sin(b\omega_y)}{b\omega_y}\right)

T.2 Pyramid:
    R_{a,b}(x, y) \otimes R_{a,b}(x, y)
    \;\overset{\mathcal{F}}{\longrightarrow}\; \left(\frac{\sin(a\omega_x)}{a\omega_x}\right)^{2} \left(\frac{\sin(b\omega_y)}{b\omega_y}\right)^{2}

T.3 Cylinder:
    P_a(r) = \frac{1}{\pi a^2}\, u(a^2 - r^2)
    \;\overset{\mathcal{F}}{\longrightarrow}\; 2\, \frac{J_1(a\omega)}{a\omega}

T.4 Cone:
    P_a(r) \otimes P_a(r)
    \;\overset{\mathcal{F}}{\longrightarrow}\; 4 \left(\frac{J_1(a\omega)}{a\omega}\right)^{2}

T.5 Airy PSF:
    \mathrm{PSF}(r) = \frac{1}{\pi} \left(\frac{J_1(\omega_c r / 2)}{r}\right)^{2}
    \;\overset{\mathcal{F}}{\longrightarrow}\; \frac{2}{\pi} \left[\cos^{-1}\!\left(\frac{\omega}{\omega_c}\right) - \frac{\omega}{\omega_c}\sqrt{1 - \left(\frac{\omega}{\omega_c}\right)^{2}}\,\right] u(\omega_c^2 - \omega^2)

T.6 Gaussian:
    g_{2D}(r, \sigma) = \frac{1}{2\pi\sigma^2} \exp\!\left(-\frac{r^2}{2\sigma^2}\right)
    \;\overset{\mathcal{F}}{\longrightarrow}\; G_{2D}(\omega, \sigma) = \exp\!\left(-\frac{\sigma^2 \omega^2}{2}\right)

T.7 Peak:
    \frac{1}{r}
    \;\overset{\mathcal{F}}{\longrightarrow}\; \frac{2\pi}{\omega}

T.8 Exponential decay:
    e^{-ar}
    \;\overset{\mathcal{F}}{\longrightarrow}\; \frac{2\pi a}{(\omega^2 + a^2)^{3/2}}

Table 4: 2D images and their Fourier transforms
Because of the monotonic, non-decreasing character of P (a) we have that:
p(a) \ge 0 \qquad \text{and} \qquad \int_{-\infty}^{+\infty} p(a)\, da = 1        (32)
For an image with quantized (integer) brightness amplitudes, the interpretation of
Δa is the width of a brightness interval. We assume constant width intervals. The
brightness probability density function is frequently estimated by counting the
number of times that each brightness occurs in the region to generate a histogram,
h[a]. The histogram can then be normalized so that the total area under the
histogram is 1 (eq. (32)). Said another way, the p[a] for a region is the normalized
count of the number of pixels, Λ, in a region that have quantized brightness a:
p[a] = \frac{1}{\Lambda}\, h[a] \qquad \text{with} \qquad \Lambda = \sum_a h[a]        (33)
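A sketch of this estimate for an integer-valued region (Python/NumPy; the 256-level default and the function names are assumptions of this illustration):

```python
import numpy as np

def brightness_pdf(a, levels=256):
    """Estimate p[a] for an integer-valued region a[m, n] (eq. (33))."""
    h = np.bincount(a.ravel(), minlength=levels)   # histogram h[a]
    Lam = h.sum()                                  # number of pixels in the region
    return h / Lam

def brightness_cdf(a, levels=256):
    """Estimate the distribution function P[a] as the running sum of p[a]."""
    return np.cumsum(brightness_pdf(a, levels))

# Example: an 8-bit test region
rng = np.random.default_rng(5)
region = rng.integers(0, 256, size=(64, 64))
p = brightness_pdf(region)
assert np.isclose(p.sum(), 1.0)
```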
The brightness probability distribution function for the image shown in Figure 4a is
shown in Figure 6a. The (unnormalized) brightness histogram of Figure 4a which
is proportional to the estimated brightness probability density function is shown in
Figure 6b. The height in this histogram corresponds to the number of pixels with a
given brightness.
Figure 6: (a) Brightness distribution function of Figure 4a with minimum, median, and
maximum indicated. See text for explanation. (b) Brightness histogram of Figure 4a.
Both the distribution function and the histogram as measured from a region are a
statistical description of that region. It must be emphasized that both P [a] and p[a]
should be viewed as estimates of true distributions when they are computed from a
specific region. That is, we view an image and a specific region as one realization of
the various random processes involved in the formation of that image and that
region. In the same context, the statistics defined below must be viewed as
estimates of the underlying parameters.
3.5.3 Average
The average brightness of a region is defined as the sample mean of the pixel
brightnesses within that region. The average, ma, of the brightnesses over the Λ
pixels within a region ℜ is given by:

m_a = \frac{1}{\Lambda} \sum_{(m,n) \in \Re} a[m, n]        (34)
Alternatively, we can use a formulation based upon the (unnormalized) brightness
histogram, h[a] = Λ·p[a], with discrete brightness values a. This gives:

m_a = \frac{1}{\Lambda} \sum_a a \cdot h[a]        (35)
The average brightness, ma, is an estimate of the mean brightness, µa, of the
underlying brightness probability distribution.
3.5.4 Standard deviation
The unbiased estimate of the standard deviation, sa, of the brightnesses within a
region ℜ with Λ pixels is called the sample standard deviation and is given by:

s_a = \sqrt{\frac{1}{\Lambda - 1} \sum_{(m,n) \in \Re} \left(a[m, n] - m_a\right)^2} = \sqrt{\frac{\sum_{(m,n) \in \Re} a^2[m, n] - \Lambda m_a^2}{\Lambda - 1}}        (36)
Using the histogram formulation gives:

s_a = \sqrt{\frac{\sum_a a^2 \cdot h[a] - \Lambda \cdot m_a^2}{\Lambda - 1}}        (37)
The standard deviation, sa, is an estimate of σa of the underlying brightness
probability distribution.
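Equations (35) and (37) translate directly into a histogram-based computation; a sketch (Python/NumPy, with hypothetical helper names) that cross-checks the result against the direct definitions:

```python
import numpy as np

def region_stats(a, levels=256):
    """Sample mean (eq. (35)) and unbiased standard deviation (eq. (37))
    computed from the brightness histogram of an integer-valued region."""
    h = np.bincount(a.ravel(), minlength=levels).astype(float)
    Lam = h.sum()
    grey = np.arange(levels)
    m_a = np.sum(grey * h) / Lam
    s_a = np.sqrt((np.sum(grey**2 * h) - Lam * m_a**2) / (Lam - 1))
    return m_a, s_a

rng = np.random.default_rng(6)
region = rng.integers(0, 256, size=(64, 64))
m_a, s_a = region_stats(region)
assert np.isclose(m_a, region.mean())
assert np.isclose(s_a, region.std(ddof=1))
```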
3.5.5 Coefficient-of-variation
The dimensionless coefficient-of-variation, CV, is defined as:

CV = \frac{s_a}{m_a} \times 100\%        (38)
3.5.6 Percentiles
The percentile, p%, of an unquantized brightness distribution is defined as that
value of the brightness a such that:
P(a) = p\% \qquad \text{or equivalently} \qquad \int_{-\infty}^{a} p(\alpha)\, d\alpha = p\%        (39)
Three special cases are frequently used in digital image processing.
· 0%: the minimum value in the region
· 50%: the median value in the region
· 100%: the maximum value in the region
All three of these values can be determined from Figure 6a.
3.5.7 Mode
The mode of the distribution is the most frequent brightness value. There is no
guarantee that a mode exists or that it is unique.
3.5.8 Signal-to-Noise ratio
The signal-to-noise ratio, SNR, can have several definitions. The noise is
characterized by its standard deviation, sn. The characterization of the signal can
differ. If the signal is known to lie between two boundaries, amin ≤ a ≤ amax, then
the SNR is defined as:

Bounded signal:        SNR = 20 \log_{10}\!\left(\frac{a_{max} - a_{min}}{s_n}\right) \text{ dB}        (40)
If the signal is not bounded but has a statistical distribution then two other
definitions are known:

Stochastic signal, S & N inter-dependent:        SNR = 20 \log_{10}\!\left(\frac{m_a}{s_n}\right) \text{ dB}        (41)
Stochastic signal, S & N independent:        SNR = 20 \log_{10}\!\left(\frac{s_a}{s_n}\right) \text{ dB}        (42)
where ma and sa are defined above.
The various statistics are given in Table 5 for the image and the region shown in
Figure 7.
Statistic               Image     ROI
--------------------------------------
Average                 137.7     219.3
Standard Deviation       49.5       4.0
Minimum                    56       202
Median                    141       220
Maximum                   241       226
Mode                       62       220
SNR (dB)                   NA      33.3

Figure 7: The region of interest (ROI) is the interior of the circle.
Table 5: Statistics from Figure 7
An SNR calculation for the entire image based on eq. (40) is not directly available.
The variations in the image brightnesses that lead to the large value of s (= 49.5) are
not, in general, due to noise but to the variation in local information. With the help
of the region there is a way to estimate the SNR. We can use the s (= 4.0) of the
region and the dynamic range, amax - amin, for the image (= 241 - 56) to calculate a
global SNR (= 33.3 dB). The underlying assumptions are that 1) the signal is
approximately constant in that region and the variation in the region is therefore due
to noise, and, 2) that the noise is the same over the entire image with a standard
deviation given by sn = s.
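The quoted value follows directly from eq. (40); a one-line check (Python, with the numbers taken from Table 5):

```python
import numpy as np

# Worked check of the global SNR in Table 5 using eq. (40):
# dynamic range of the image divided by the noise estimate from the ROI.
a_max, a_min = 241, 56      # image maximum and minimum (Table 5)
s_n = 4.0                   # standard deviation inside the ROI (Table 5)

snr_db = 20 * np.log10((a_max - a_min) / s_n)
print(f"SNR = {snr_db:.1f} dB")   # ~33.3 dB, matching Table 5
```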
3.6 CONTOUR REPRESENTATIONS
When dealing with a region or object, several compact representations are available
that can facilitate manipulation of and measurements on the object. In each case we
assume that we begin with an image representation of the object as shown in Figure
8a,b. Several techniques exist to represent the region or object by describing its
contour.
3.6.1 Chain code
This representation is based upon the work of Freeman [11]. We follow the
contour in a clockwise manner and keep track of the directions as we go from one
contour pixel to the next. For the standard implementation of the chain code we
consider a contour pixel to be an object pixel that has a background (non-object)
pixel as one or more of its 4-connected neighbors. See Figures 3a and 8c.
The codes associated with eight possible directions are the chain codes and, with x
as the current contour pixel position, the codes are generally defined as:
                 3 2 1
Chain codes  =   4 x 0        (43)
                 5 6 7
Figure 8: Region (shaded) as it is transformed from (a) continuous to (b)
discrete form and then considered as a (c) contour or (d) run lengths
illustrated in alternating colors.
3.6.2 Chain code properties
· Even codes {0,2,4,6} correspond to horizontal and vertical directions; odd codes
{1,3,5,7} correspond to the diagonal directions.
· Each code can be considered as the angular direction, in multiples of 45°, that we
must move to go from one contour pixel to the next.
· The absolute coordinates [m,n] of the first contour pixel (e.g. top, leftmost)
together with the chain code of the contour represent a complete description of the
discrete region contour.
· When there is a change between two consecutive chain codes, then the contour
has changed direction. This point is defined as a corner.
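Given contour pixel coordinates that are already ordered clockwise, the chain code of eq. (43) is simply the sequence of coded steps between consecutive pixels. A sketch (Python; the lookup dictionary and the row-down/column-right image coordinate convention are assumptions of this illustration):

```python
# Step (delta_row, delta_col) -> chain code, following the layout of eq. (43):
# 0 = right, 2 = up, 4 = left, 6 = down (rows increase downward in the image).
STEP_TO_CODE = {(0, 1): 0, (-1, 1): 1, (-1, 0): 2, (-1, -1): 3,
                (0, -1): 4, (1, -1): 5, (1, 0): 6, (1, 1): 7}

def chain_code(contour):
    """Freeman chain code of an ordered list of 8-connected contour pixels [m, n]."""
    codes = []
    for (m0, n0), (m1, n1) in zip(contour, contour[1:]):
        codes.append(STEP_TO_CODE[(m1 - m0, n1 - n0)])
    return codes

# A small clockwise contour around a 2x2 square of object pixels.
square = [(0, 0), (0, 1), (1, 1), (1, 0), (0, 0)]
print(chain_code(square))   # [0, 6, 4, 2]
```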
3.6.3 "Crack" code
An alternative to the chain code for contour encoding is to use neither the contour
pixels associated with the object nor the contour pixels associated with background
but rather the line, the "crack", in between. This is illustrated with an enlargement
of a portion of Figure 8 in Figure 9.
The "crack" code can be viewed as a chain code with four possible directions
instead of eight.
                  1
Crack codes  =  2 x 0        (44)
                  3
Figure 9: (a) Object including the part to be studied (close-up). (b) Contour
pixels as used in the chain code are diagonally shaded. The "crack" is shown
with the thick black line.
The chain code for the enlarged section of Figure 9b, from top to bottom, is
{5,6,7,7,0}. The crack code is {3,2,3,3,0,3,0,0}.
3.6.4 Run codes
A third representation is based on coding the consecutive pixels along a row--a
run--that belong to an object by giving the starting position of the run and the
ending position of the run. Such runs are illustrated in Figure 8d. There are a
number of alternatives for the precise definition of the positions. Which alternative
should be used depends upon the application and thus will not be discussed here.
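A sketch of one such convention, coding each run as (row, first column, last column) (Python/NumPy; the tuple layout is an assumption of this illustration, one of the alternatives alluded to above):

```python
import numpy as np

def run_codes(binary):
    """Encode each row of a binary object mask as (row, start column, end column) runs."""
    runs = []
    for m, row in enumerate(np.asarray(binary, dtype=int)):
        padded = np.concatenate(([0], row, [0]))
        d = np.diff(padded)
        starts = np.flatnonzero(d == 1)       # 0 -> 1 transitions
        ends = np.flatnonzero(d == -1) - 1    # 1 -> 0 transitions (inclusive end)
        runs.extend((m, int(s), int(e)) for s, e in zip(starts, ends))
    return runs

mask = np.array([[0, 1, 1, 0, 1],
                 [1, 1, 0, 0, 0]])
print(run_codes(mask))   # [(0, 1, 2), (0, 4, 4), (1, 0, 1)]
```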