image filtering
COS 351 - Computer Vision
three views of filtering
image filters in spatial domain
a filter is a mathematical operation on a grid of numbers
smoothing, sharpening, measuring texture
image filters in the frequency domain
filtering is a way to modify the frequencies of images
denoising, sampling, image compression
templates and image pyramids
filtering is a way to match a template to the image
detection, coarse-to-fine registration
image filtering
Image filtering: compute function of local neighborhood at each position
Really important!
Enhance images
denoise, resize, increase contrast, etc.
Extract information from images
texture, edges, distinctive points, etc.
Detect patterns
example: box filter
\[
\texttt{g[} \cdot, \cdot \texttt{]} =
\frac{1}{9}
\begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix}
\]
image filtering
\[
\texttt{g[} \cdot, \cdot \texttt{]} =
\frac{1}{9}
\begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix}
\]
\[h[m,n] = \sum_{k,l} g[k,l]\ \ f[m+k,n+l]\]
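The sum above can be sketched directly as nested loops. This is a minimal illustration, not an efficient implementation: the function name `correlate2d_valid` and the 5x5 test image are assumptions made for the example, the kernel indices start at the window's top-left corner rather than its center, and only positions where the window fits entirely inside the image are computed.

```python
def correlate2d_valid(f, g):
    """h[m][n] = sum over k, l of g[k][l] * f[m + k][n + l].

    Only windows that fit entirely inside the image are evaluated.
    """
    M, N = len(f), len(f[0])
    P, Q = len(g), len(g[0])
    h = []
    for m in range(M - P + 1):
        row = []
        for n in range(N - Q + 1):
            acc = 0.0
            for k in range(P):
                for l in range(Q):
                    acc += g[k][l] * f[m + k][n + l]
            row.append(acc)
        h.append(row)
    return h


# 3x3 box filter: every weight is 1/9.
box = [[1 / 9] * 3 for _ in range(3)]

# Hypothetical 5x5 image: one bright pixel (90) on a dark background.
f = [[0] * 5 for _ in range(5)]
f[2][2] = 90

h = correlate2d_valid(f, box)
for row in h:
    print([round(v, 3) for v in row])
```

Every 3x3 window in this small image contains the bright pixel, so each output value is about 90/9 = 10: the spike has been spread evenly over its neighborhood.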
box filter
What does it do?
Replaces each pixel with an average of its neighborhood
Achieves smoothing effect (remove sharp features)
\[
\texttt{g[} \cdot, \cdot \texttt{]} =
\frac{1}{9}
\begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix}
\]
smoothing with box filter
practice with linear filters
original → identity (no change)
original → shifted left by 1 pixel
original → sharpened
[ note that filter sums to 1 ]
sharpening
Sharpening filter accentuates differences with local average
-0.111 -0.111 -0.111
-0.111 1.889 -0.111
-0.111 -0.111 -0.111
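One way to read this kernel is as twice the unit impulse minus the 3x3 box filter, so the output is the center pixel plus its difference from the local average. The sketch below builds the kernel that way and applies it to two hypothetical 3x3 patches; the helper `apply_at_center` and the patch values are assumptions for illustration.

```python
impulse = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
box = [[1 / 9] * 3 for _ in range(3)]

# sharpen = 2 * impulse - box
sharpen = [[2 * impulse[k][l] - box[k][l] for l in range(3)] for k in range(3)]
total = sum(sum(row) for row in sharpen)


def apply_at_center(kernel, patch):
    """Dot product of a 3x3 kernel with a 3x3 patch (the output at the patch center)."""
    return sum(kernel[k][l] * patch[k][l] for k in range(3) for l in range(3))


# Hypothetical patches: a flat region, and a center slightly brighter than its neighbors.
flat = [[50] * 3 for _ in range(3)]
bump = [[50, 50, 50], [50, 60, 50], [50, 50, 50]]

print([round(v, 3) for v in sharpen[1]])    # middle row of the kernel
print(round(apply_at_center(sharpen, flat), 3))
print(round(apply_at_center(sharpen, bump), 3))
```

Because the weights sum to 1, the flat patch is left at 50. In the second patch the center exceeds the local mean (about 51.1) by about 8.9, and the output is pushed that much further above the center, to about 68.9.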
other filters: sobel
Sobel filter finds vertical and horizontal edges
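The kernel values are not reproduced in these notes, so the sketch below assumes the commonly used 3x3 Sobel coefficients (sign conventions vary between references). It applies both kernels to a hypothetical image containing a single vertical step edge.

```python
def correlate2d_valid(f, g):
    """Correlation over windows that fit entirely inside the image."""
    P, Q = len(g), len(g[0])
    return [
        [
            sum(g[k][l] * f[m + k][n + l] for k in range(P) for l in range(Q))
            for n in range(len(f[0]) - Q + 1)
        ]
        for m in range(len(f) - P + 1)
    ]


# Assumed standard Sobel coefficients (not taken from the slides).
sobel_x = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # responds to vertical edges
sobel_y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # responds to horizontal edges

# Hypothetical 5x6 image: dark left half, bright right half (a vertical edge).
f = [[0, 0, 0, 10, 10, 10] for _ in range(5)]

gx = correlate2d_valid(f, sobel_x)
gy = correlate2d_valid(f, sobel_y)
print(gx[0])  # [0, 40, 40, 0]
print(gy[0])  # [0, 0, 0, 0]
```

The horizontal-derivative kernel responds only where the window straddles the edge, and the vertical-derivative kernel gives zero everywhere because the image does not change from row to row.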
filtering vs. convolution
With kernel g
and image f
2D filtering
h = filter2(g, f);
or h = imfilter(f, g);
\(h[m,n] = \sum_{k,l} g[k,l]\ \ f[m+k,n+l]\)
2D convolution
h = conv2(g, f);
\(h[m,n] = \sum_{k,l} g[k,l]\ \ f[m-k,n-l]\)
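The only difference between the two sums is the sign of the offsets, which amounts to flipping the kernel in both directions. A minimal sketch, assuming an odd-sized kernel indexed from its center, zero padding, and output of the same size as the image (the function `filter2d` and the test arrays are hypothetical):

```python
def filter2d(f, g, flip=False):
    """flip=False: correlation, f[m+k][n+l]. flip=True: convolution, f[m-k][n-l].

    Assumes an odd-sized kernel indexed from its center; pixels outside the
    image are treated as zero and the output has the same size as f.
    """
    M, N = len(f), len(f[0])
    ck, cl = len(g) // 2, len(g[0]) // 2
    sign = -1 if flip else 1
    h = [[0] * N for _ in range(M)]
    for m in range(M):
        for n in range(N):
            for k in range(-ck, ck + 1):
                for l in range(-cl, cl + 1):
                    mm, nn = m + sign * k, n + sign * l
                    if 0 <= mm < M and 0 <= nn < N:
                        h[m][n] += g[k + ck][l + cl] * f[mm][nn]
    return h


g = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
g_flipped = [row[::-1] for row in g[::-1]]

# Unit impulse in the middle of a 5x5 image.
impulse = [[0] * 5 for _ in range(5)]
impulse[2][2] = 1

corr = filter2d(impulse, g)
conv = filter2d(impulse, g, flip=True)
center = lambda h: [row[1:4] for row in h[1:4]]

print(center(conv))  # [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(center(corr))  # [[9, 8, 7], [6, 5, 4], [3, 2, 1]]
```

Convolving an impulse reproduces the kernel, while filtering (correlation) reproduces the flipped kernel. For symmetric kernels such as the box or Gaussian the two operations give the same result.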
key properties of linear filters
Linearity
filter(f1 + f2) = filter(f1) + filter(f2)
Shift invariance
same behavior regardless of pixel location
filter(shift(f)) = shift(filter(f))
Any linear, shift-invariant operator can be represented as a convolution
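Both properties can be checked numerically on a small 1D signal. The sketch below is an illustration under one assumption that is not in the slides: it uses wrap-around boundaries so that shifting the signal is exact at the ends. The signals and kernel are arbitrary.

```python
def filt(f, g):
    """1D circular correlation with a kernel indexed from its center."""
    n, c = len(f), len(g) // 2
    return [sum(g[k + c] * f[(i + k) % n] for k in range(-c, c + 1)) for i in range(n)]


def shift(f, s):
    """Circularly shift a signal to the right by s samples."""
    n = len(f)
    return [f[(i - s) % n] for i in range(n)]


def add(a, b):
    return [x + y for x, y in zip(a, b)]


f1 = [1, 4, 2, 8, 5, 7]
f2 = [3, 0, 6, 1, 9, 2]
g = [1, 2, -1]

linear = filt(add(f1, f2), g) == add(filt(f1, g), filt(f2, g))
shift_invariant = filt(shift(f1, 2), g) == shift(filt(f1, g), 2)
print(linear, shift_invariant)  # True True
```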
key properties of linear filters
Commutative: \(a * b = b * a\)
conceptually no difference between filter and signal
but particular filtering implementations might break this equality
Associative: \(a * (b * c) = (a * b) * c\)
often apply several filters one after another: \((((a * b_1) * b_2) * b_3)\)
this is equivalent to applying one filter: \(a * (b_1 * b_2 * b_3)\)
Distributes over addition: \(a * (b + c) = (a * b) + (a * c)\)
Scalars factor out: \(ka * b = a * kb = k(a * b)\)
Identity: unit impulse \(e = [0,0,1,0,0]\), \(a * e = a\)
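A short numerical check of these properties, as a sketch that uses full 1D convolution (output length is the sum of the input lengths minus 1). The signal and filters are arbitrary values chosen for illustration.

```python
def conv(a, b):
    """Full 1D convolution: output length is len(a) + len(b) - 1."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


a = [1, 2, 3, 4]
b1, b2, b3 = [1, 1], [1, -1], [2, 0, 1]

# Commutative
print(conv(a, b1) == conv(b1, a))

# Associative: three filters in sequence equal one combined filter
chained = conv(conv(conv(a, b1), b2), b3)
combined = conv(a, conv(conv(b1, b2), b3))
print(chained == combined)

# Identity: with full output, the impulse returns the signal padded with zeros
e = [0, 0, 1, 0, 0]
print(conv(a, e))  # [0, 0, 1, 2, 3, 4, 0, 0]
```

The zero padding in the last line is why implementations that crop the output to a fixed size can break exact equalities such as commutativity.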
important filter: gaussian
Weight contributions of neighboring pixels by nearness
\[G_\sigma = \frac{1}{2\pi\sigma^2} e^{-\frac{(x^2+y^2)}{2\sigma^2}}\]
0.003 0.013 0.022 0.013 0.003
0.013 0.059 0.097 0.059 0.013
0.022 0.097 0.159 0.097 0.022
0.013 0.059 0.097 0.059 0.013
0.003 0.013 0.022 0.013 0.003
\[ 5\times5, \sigma = 1\]
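The table values can be reproduced by sampling \(G_\sigma\) on an integer grid. In the sketch below the function name `gaussian_kernel` and its `half_width` parameter are assumptions for illustration.

```python
import math


def gaussian_kernel(sigma, half_width):
    """Sample G_sigma(x, y) at integer offsets in [-half_width, half_width]."""
    norm = 1.0 / (2.0 * math.pi * sigma ** 2)
    return [
        [norm * math.exp(-(x * x + y * y) / (2.0 * sigma ** 2))
         for x in range(-half_width, half_width + 1)]
        for y in range(-half_width, half_width + 1)
    ]


kernel = gaussian_kernel(sigma=1.0, half_width=2)
for row in kernel:
    print(" ".join(f"{v:.3f}" for v in row))

total = sum(sum(row) for row in kernel)
normalized = [[v / total for v in row] for row in kernel]
print(round(total, 3))
```

The sampled weights sum to about 0.98 rather than 1 because the 5x5 window cuts off the tails. Dividing by the sum, as in `normalized`, is a common way to keep overall image brightness unchanged.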
smoothing with box filter
smoothing with Gaussian filter
gaussian filters
remove "high-frequency" components from the image (low-pass filter)
image becomes more smooth
convolution with self is another gaussian
so can smooth with small-width kernel, repeat, and get same result as larger-width kernel would have
convolving two times with Gaussian kernel of width \(\sigma\) is same as convolving once with kernel of width \(\sigma\sqrt{2}\)
Separable kernel
factors into product of two 1D Gaussians
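The \(\sigma\sqrt{2}\) claim can be checked numerically in 1D. This is a sketch under assumptions chosen for the example: \(\sigma = 2\), kernels truncated at six standard deviations, and full convolution. Sampling and truncation make the match approximate rather than exact.

```python
import math


def gauss1d(sigma, radius):
    """1D Gaussian sampled at integer offsets in [-radius, radius]."""
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return [norm * math.exp(-x * x / (2.0 * sigma ** 2)) for x in range(-radius, radius + 1)]


def conv_full(a, b):
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


sigma, radius = 2.0, 12
g = gauss1d(sigma, radius)

twice = conv_full(g, g)                                # Gaussian convolved with itself
wider = gauss1d(sigma * math.sqrt(2.0), 2 * radius)    # one Gaussian of width sigma * sqrt(2)

max_err = max(abs(u - v) for u, v in zip(twice, wider))
print(max_err)
```

The largest difference is far smaller than any kernel weight, so smoothing twice with width \(\sigma\) and smoothing once with width \(\sigma\sqrt{2}\) are interchangeable in practice.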
separability of the gaussian filter
\[ \begin{array}{rl} G_\sigma(x,y) & = \frac{1}{2\pi\sigma^2} e^{-\frac{(x^2+y^2)}{2\sigma^2}} \\ & = \left( \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{x^2}{2\sigma^2}} \right) \left( \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{y^2}{2\sigma^2}} \right) \end{array} \]
The 2D Gaussian can be expressed as the product of two functions, one a function of \(x\) and the other a function of \(y\).
In this case, the two functions are the (identical) 1D Gaussian.
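The factorization can be confirmed on a sampled kernel: the outer product of a 1D Gaussian with itself should match the 2D Gaussian evaluated directly. The values \(\sigma = 1.5\) and a half-width of 4 are arbitrary choices for this sketch.

```python
import math

sigma, hw = 1.5, 4
offsets = range(-hw, hw + 1)

# 1D Gaussian samples
g1 = [math.exp(-x * x / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi)) for x in offsets]

# Outer product of the 1D kernel with itself
outer = [[gy * gx for gx in g1] for gy in g1]

# 2D Gaussian evaluated directly
direct = [
    [math.exp(-(x * x + y * y) / (2 * sigma ** 2)) / (2 * math.pi * sigma ** 2) for x in offsets]
    for y in offsets
]

max_diff = max(abs(a - b) for ra, rb in zip(outer, direct) for a, b in zip(ra, rb))
print(max_diff)
```

The two kernels agree up to floating-point rounding.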
separability example
2D convolution (center location only)
2D filter factors into a product of 1D filters
Perform convolution along rows...
... then along remaining column.
separability
Why is separability useful in practice?
Because it is faster!
Filtering an \(M\)-by-\(N\) image with a \(P\)-by-\(Q\) filter requires roughly \(MNPQ\) multiplies and adds.
If the kernel is separable, filtering can be done in two steps
first step requires \(MNP\) multiplies and adds
second step requires \(MNQ\) multiplies and adds
In total, filtering requires \(MN(P+Q)\) multiplies and adds, which is a \(PQ/(P+Q)\) speed-up.
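The two-step procedure and the operation counts can be sketched as follows. The kernel \([1, 2, 1]\), the test image, and the helper names are assumptions for illustration, and only windows that fit inside the image are computed.

```python
def correlate2d_valid(f, g):
    P, Q = len(g), len(g[0])
    return [
        [sum(g[k][l] * f[m + k][n + l] for k in range(P) for l in range(Q))
         for n in range(len(f[0]) - Q + 1)]
        for m in range(len(f) - P + 1)
    ]


def rows_then_cols(f, row_k, col_k):
    """Step 1: filter along each row. Step 2: filter the result along each column."""
    Q, P = len(row_k), len(col_k)
    tmp = [[sum(row_k[l] * r[n + l] for l in range(Q)) for n in range(len(r) - Q + 1)]
           for r in f]
    return [[sum(col_k[k] * tmp[m + k][n] for k in range(P)) for n in range(len(tmp[0]))]
            for m in range(len(tmp) - P + 1)]


def multiply_counts(M, N, P, Q):
    """Approximate multiply-add counts: (direct 2D, separable two-pass)."""
    return M * N * P * Q, M * N * (P + Q)


row_k = col_k = [1, 2, 1]
g2d = [[c * r for r in row_k] for c in col_k]   # separable 2D kernel

# Hypothetical 6x7 integer image
f = [[(3 * i + 5 * j) % 11 for j in range(7)] for i in range(6)]

print(correlate2d_valid(f, g2d) == rows_then_cols(f, row_k, col_k))  # True

direct, separable = multiply_counts(512, 512, 9, 9)
print(direct, separable, direct / separable)  # 21233664 4718592 4.5
```

For a 9x9 kernel the speed-up is \(81/18 = 4.5\), and it grows with kernel size.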
practical matters
How big should the filter be?
Values at edges should be near zero
Rule of thumb for Gaussian: set filter half-width to about \(3\sigma\)
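A small sketch of the rule of thumb, assuming the half-width is rounded up to a whole number of pixels (the rounding choice and the function names are assumptions). It also reports how small the outermost weight is relative to the center weight along one axis.

```python
import math


def kernel_size(sigma):
    """Return (half_width, full_width), with half_width = ceil(3 * sigma)."""
    half_width = math.ceil(3 * sigma)
    return half_width, 2 * half_width + 1


def edge_to_center_ratio(sigma, half_width):
    """Weight at the edge of the window divided by the center weight (along one axis)."""
    return math.exp(-half_width ** 2 / (2 * sigma ** 2))


for sigma in (1.0, 2.5):
    hw, width = kernel_size(sigma)
    print(sigma, hw, width, round(edge_to_center_ratio(sigma, hw), 4))

# For comparison, a 5x5 kernel with sigma = 1 has half-width 2
print(round(edge_to_center_ratio(1.0, 2), 4))
```

At \(3\sigma\) the edge weight is about 1% of the center weight, while a half-width of \(2\sigma\) leaves it at roughly 13.5%.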
practical matters
What about near the edge?
the filter window falls off the edge of the image
need to extrapolate
methods:
clip filter (black)
wrap around
copy edge
reflect across edge
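The four methods can be illustrated on a single image row. In this sketch the function `pad` and its mode names are hypothetical, and "reflect" is taken to mirror about the image border so that the edge pixel itself is repeated, which is one of several conventions in use.

```python
def pad(row, r, mode):
    """Extend a 1D row by r samples on each side (assumes r <= len(row))."""
    n = len(row)
    out = []
    for i in range(-r, n + r):
        if 0 <= i < n:
            out.append(row[i])
        elif mode == "zero":        # clip filter (black)
            out.append(0)
        elif mode == "wrap":        # wrap around
            out.append(row[i % n])
        elif mode == "copy":        # copy edge
            out.append(row[min(max(i, 0), n - 1)])
        elif mode == "reflect":     # reflect across edge
            out.append(row[-i - 1] if i < 0 else row[2 * n - 1 - i])
        else:
            raise ValueError(f"unknown mode: {mode}")
    return out


row = [1, 2, 3, 4]
for mode in ("zero", "wrap", "copy", "reflect"):
    print(mode, pad(row, 2, mode))
# zero    [0, 0, 1, 2, 3, 4, 0, 0]
# wrap    [3, 4, 1, 2, 3, 4, 1, 2]
# copy    [1, 1, 1, 2, 3, 4, 4, 4]
# reflect [2, 1, 1, 2, 3, 4, 4, 3]
```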
practical matters
What is the size of the output?
filter2(g, f, 'full'); % size(f) + size(g) - 1
filter2(g, f, 'same'); % same size as f (default)
filter2(g, f, 'valid'); % size(f) - size(g) + 1
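The three output sizes follow from counting window positions. A minimal sketch, where the function `output_shape` and the 480x640 image size are assumptions for illustration:

```python
def output_shape(f_shape, g_shape, mode):
    """Output size of filtering an M-by-N image with a P-by-Q kernel."""
    (M, N), (P, Q) = f_shape, g_shape
    if mode == "full":    # every position where the kernel overlaps the image at all
        return M + P - 1, N + Q - 1
    if mode == "same":    # cropped to the image size
        return M, N
    if mode == "valid":   # only positions where the kernel fits entirely inside
        return max(M - P + 1, 0), max(N - Q + 1, 0)
    raise ValueError(f"unknown mode: {mode}")


for mode in ("full", "same", "valid"):
    print(mode, output_shape((480, 640), (5, 5), mode))
# full  (484, 644)
# same  (480, 640)
# valid (476, 636)
```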
median filters
A median filter operates over a window by selecting the median intensity in the window
What advantage does a median filter have over a mean filter?
Is a median filter a kind of convolution?
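Both questions can be explored with a small 1D experiment. This is a sketch on hypothetical data: a flat signal with one "salt" outlier, filtered with a window of 3, followed by a check of whether the median respects linearity.

```python
import statistics


def median_filter(signal, w=3):
    return [statistics.median(signal[i:i + w]) for i in range(len(signal) - w + 1)]


def mean_filter(signal, w=3):
    return [sum(signal[i:i + w]) / w for i in range(len(signal) - w + 1)]


# Flat signal with a single outlier
signal = [10, 10, 10, 255, 10, 10, 10]

print(median_filter(signal))                        # [10, 10, 10, 10, 10]
print([round(v, 1) for v in mean_filter(signal)])   # [10.0, 91.7, 91.7, 91.7, 10.0]

# Linearity check on one window
a, b = [0, 0, 9], [9, 0, 0]
summed = [x + y for x, y in zip(a, b)]
print(statistics.median(summed), statistics.median(a) + statistics.median(b))  # 9 0
```

The median discards the outlier entirely, while the mean smears it across three outputs. The last line shows that the median of a sum is not the sum of the medians, so the median filter is not linear and cannot be written as a convolution.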
comparison: salt and pepper noise
[ © 2006 Steve Marschner, Slide by Steve Seitz ]
take-home messages
Linear filtering is a dot product between the kernel and the local neighborhood at each position
can smooth, sharpen, translate (among many other uses)
Be aware of details for filter size, extrapolation, cropping
practice questions
Write down a 3x3 filter that returns a positive value if the average value of the 4-adjacent neighbors is less than the center and a negative value otherwise.
Write down a filter that will compute the gradient in the x-direction:
gradx(y,x) = im(y,x+1)-im(y,x) for each x,y
practice questions
Fill in the blanks, where \(*\) is the filtering operator
_ = D * B
A = _ * _
F = D * _
_ = D * D
Laplacian of Gaussian
\[G_\sigma = \frac{1}{2\pi\sigma^2} e^{-\frac{(x^2+y^2)}{2\sigma^2}}\]
\[ LoG_\sigma(x,y) = - \frac{1}{\pi\sigma^4} \left( 1 - \frac{x^2+y^2}{2\sigma^2} \right) e^{-\frac{x^2+y^2}{2\sigma^2}}\]
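The closed form can be sanity-checked against a numerical Laplacian of \(G_\sigma\). The sketch below uses central finite differences; the step size, the test points, and the function names are assumptions made for the example.

```python
import math


def gauss(x, y, s):
    return math.exp(-(x * x + y * y) / (2 * s * s)) / (2 * math.pi * s * s)


def log_closed_form(x, y, s):
    r2 = x * x + y * y
    return -(1.0 / (math.pi * s ** 4)) * (1.0 - r2 / (2 * s * s)) * math.exp(-r2 / (2 * s * s))


def log_numeric(x, y, s, h=1e-3):
    """Second derivatives in x and y by central differences, then summed."""
    return (gauss(x + h, y, s) + gauss(x - h, y, s)
            + gauss(x, y + h, s) + gauss(x, y - h, s)
            - 4.0 * gauss(x, y, s)) / (h * h)


sigma = 1.5
points = [(0.0, 0.0), (1.0, 0.5), (2.0, 2.0)]
errors = [abs(log_closed_form(x, y, sigma) - log_numeric(x, y, sigma)) for x, y in points]
print(errors)

# At the origin with sigma = 1 the closed form gives -1/pi
print(log_closed_form(0.0, 0.0, 1.0))
```

The closed form is negative at the center and changes sign at \(x^2 + y^2 = 2\sigma^2\), which gives the familiar center-surround shape.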