CS651: Computer Vision
Spring 2007
Assignment 4: Stereo
Due Thursday, Apr. 5
Overview
In this assignment you will analyze several algorithms for performing
stereo-based 3D surface reconstruction. You should read and
understand the paper by
Scharstein & Szeliski (particularly Sections 1-3), which describes the
notion of "disparity space" along with several diffusion approaches
for improving the quality of the matches found.
This entire project should be doable in MATLAB.
1. Disparity space (20 points)
In class, we discussed the notion of "disparity space": the 3D volume
of intensity differences parameterized over the pixels in one stereo
image and the range of possible disparities. It is useful to think
about stereo algorithms as performing some type of aggregation (or
diffusion) across disparity space in order to form a final robust
estimate of the depth at each pixel in the respective camera (e.g.,
summing the differences within small windows of pixels is the most
common example). The first goal of this assignment is to visualize
disparity space for some real scenes and evaluate the performance of
three different strategies for estimating scene depth.
Do the following:
- Download stereo data from here
(courtesy of Middlebury College and Diego Nehab). You will find three
test scenes. First focus on the 'cones' and 'teddy' datasets. Note
that you are given a left/right rectified stereo pair and "ground
truth" disparity images. Disparities are encoded using a scale factor
of 4 for gray levels 1 .. 255, while gray level 0 means "unknown
disparity". Therefore, the encoded disparity range is 0.25-63.75
pixels. These disparities will be used later to evaluate the performance
of your algorithms, so make sure you understand how they are encoded.
- For each of the two stereo pairs, select 4 segments of
neighboring pixels in the same row (e.g., 200 pixels wide) at
"interesting" regions in the scene (i.e., across depth boundaries,
textured areas, texture-less areas, above non-Lambertian surfaces,
etc.). So you will have 8 segments, each 200 pixels wide, altogether.
- For each line segment (indexed here by 'x'), compute the
following 2D function of pixel intensity differences between the two
images for different disparity values (note this is a 2D slice of the
full 3D disparity space we discussed in class and described in
[Scharstein & Szeliski 1996]):
- Show these images in your write-up along
with an indication of the corresponding segments in the stereo images
(you only need to show the reference image from each pair). Comment
on which features of the disparity space image indicate that finding
the *correct* disparity would be easy or difficult. Relate properties
of the disparity space image to properties of the scene.
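The decoding rule and the disparity space slice described above can be sketched as follows. This is a minimal illustration in plain Python, not the required implementation. It assumes the left image is the reference, that a pixel at column x in the left row matches column x - d in the right row, and that the per-pixel cost is the squared intensity difference; the function names and the use of infinity for out-of-range matches are hypothetical choices made for this sketch.

```python
def decode_disparity(gray, scale=4.0):
    """Map an encoded gray level to a disparity in pixels; gray level 0 means unknown."""
    return None if gray == 0 else gray / scale


def dsi_slice(left_row, right_row, x0, width, max_disp):
    """Return dsi[d][i] = (left_row[x0 + i] - right_row[x0 + i - d]) ** 2.

    Assumption: the left image is the reference and the match lies at x - d.
    Entries whose match would fall outside the right row are set to infinity.
    """
    dsi = []
    for d in range(max_disp + 1):
        row = []
        for x in range(x0, x0 + width):
            xr = x - d
            if 0 <= xr < len(right_row):
                row.append((left_row[x] - right_row[xr]) ** 2)
            else:
                row.append(float("inf"))  # no corresponding pixel in the right row
        dsi.append(row)
    return dsi
```

Each row of the returned list corresponds to one candidate disparity and each column to one pixel of the segment, so the result can be displayed directly as an image. Infinite entries should be replaced by a large finite value before display.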
2. Stereo (50 points)
Your next task is to implement several stereo matching algorithms.
- Implement the standard "sum-of-squares" algorithm for a range
of window sizes. You may wish to think of this as aggregating the
squared intensity differences (which are stored in the function you
previously computed) within N-size windows and selecting that
disparity at each pixel with the lowest total sum. Run your code on
the segments from the first part for different window sizes (1, 3,
5, 10, 20) and plot the final disparity for each along with ground
truth. Note that because we are considering only one row of pixels
at a time you will only need to aggregate intensity differences
within Nx1 image windows. Include these plots in your write-up and
comment on what you see. How well does the reconstruction algorithm
perform in mainly flat textured areas? In texture-free areas? Near
depth boundaries? Around non-Lambertian surfaces? How does its
performance depend on the size of the window considered?
- Implement the "membrane model" of [Scharstein & Szeliski 1996]
(Section 3 of their paper). Because we are only considering 1D
segments of pixels, the neighborhood (N_4 in the paper) will only
include the two neighboring pixels to the left and right of each
pixel. Use the values of \lambda and \beta that are reported in the
paper and perform n=10 iterations of the discrete update rule. For
one segment, include 10 images showing the disparity space image
after each application of the update rule (you should see the
"diffusion" process at work).
- Set the disparity at each pixel to that with the lowest value
after applying these updates and add this result to the plot from
above (so you are comparing this with the standard "sum-of-squares"
approaches with varying window size and ground truth). Does the
diffusion process improve the results? Where? Why?
- Implement diffusion with local stopping criteria (Section 3.2
in [Scharstein & Szeliski 1996]). The first step is to compute the
winner margin certainty measure at each pixel in each segment
from part 1. Visualize this certainty by plotting it on top of its
corresponding 2D disparity space image (use the plot command
and watch the scales of the axes). Comment on this measure of
certainty: How well does it identify pixels at which you would
expect accurate matches? Next, apply the discrete update rules that
follow this certainty measure as described in Section 3.2 of the
paper. Show 10 images after applying each update for the same
segment as above. Lastly, add the plot of the final depth estimate
to the plot with the other results (and, of course, include this in
your write-up). Does incorporating a notion of certainty improve
the quality of the matches at each pixel? Your answer should refer
to specific areas and relate them to properties of the scene.
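The aggregation steps in this part can be sketched in plain Python as follows; treat this as an illustration under stated assumptions, not as the paper's exact formulation. The input `cost` is assumed to be a list of rows, one per disparity, holding finite squared differences for one segment. The function names are hypothetical. The membrane-style update is written as neighbor smoothing plus a pull toward the initial costs, which is one reading of the rule and should be checked against Section 3 of the paper; `lam` and `beta` are placeholders for the values reported there.

```python
def aggregate(cost, n):
    """Sum each disparity row over an n x 1 window around each pixel.

    Windows are truncated at the segment borders, so border pixels sum fewer
    terms; the truncation is the same for every disparity at a given pixel.
    """
    out = []
    for row in cost:
        w = len(row)
        out.append([
            sum(row[max(0, i - (n - 1) // 2):min(w, i + n // 2 + 1)])
            for i in range(w)
        ])
    return out


def winner_take_all(cost):
    """Return, for each pixel, the disparity index with the lowest cost."""
    width = len(cost[0])
    return [min(range(len(cost)), key=lambda d: cost[d][i]) for i in range(width)]


def membrane_step(cost, cost0, lam, beta):
    """One assumed membrane-style update, applied independently per disparity row:
    E <- E + lam * sum over neighbors (E_nbr - E) + beta * (E0 - E).
    Only the left and right neighbors are used because the segment is 1D.
    Assumes all costs are finite.
    """
    out = []
    for row, row0 in zip(cost, cost0):
        w = len(row)
        new_row = []
        for i in range(w):
            nbrs = [row[j] for j in (i - 1, i + 1) if 0 <= j < w]
            smooth = sum(v - row[i] for v in nbrs)
            new_row.append(row[i] + lam * smooth + beta * (row0[i] - row[i]))
        out.append(new_row)
    return out
```

A typical use would call `winner_take_all(aggregate(cost, n))` for each window size, and separately apply `membrane_step` ten times starting from `cost` (passing the original `cost` as `cost0` each time) before taking the winner.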
3. Space-time Stereo (30 points)
In this part of the assignment you will explore the benefits of
finding correspondences by projecting "unstructured" light patterns
into the scene and considering windows in time across which these
patterns change. Refer to the space-time
stereo paper by Davis et al. for all the details.
Implement the following:
- Download data from the 'tablet scene' here. This includes reference
images from the left and right cameras to give you a sense of what
this scene contains along with space-time image sequences captured
with synchronized cameras and a digital projector displaying
high-frequency intensity patterns. You do not have ground truth
disparity information for this scene.
- Find two 200-pixel-wide segments, as before, that intersect
"interesting" features in the data.
- Using only the first pair of frames (st_left/h001.png and
st_right/h001.png), compute the disparity space function as before.
Show this in your write-up along with an indication of the
originating image segments.
- Using windows of varying sizes (1, 5, 10), find the disparity
at each pixel in each segment that has a minimum SSD score. Plot
these depth estimates together. Discuss the quality of these
estimates keeping in mind that you don't have ground truth (refer to
features in the reference image).
- Compute a slightly modified version of the disparity space
function as follows (note the sum is over frames in the sequence):
- Compute the certainty of each pixel in each segment using the
winner margin metric as before. Next, compare this visualization of
disparity space that considers windows in time and its certainty to
those computed for only the first frame in the sequence. What do
you notice? How has the certainty improved? Where is the certainty
still small?
- Finally, assign the disparity at each pixel in each segment to
that with the minimum squared difference across the entire sequence
of frames (i.e., this corresponds to a space-time window with a 1x1
spatial extent and a temporal extent equal to the length of the
sequence, 32 in our case). Compare this estimate with the estimates
from above that consider only the first frame and use varying window
sizes. Comment on the accuracy obtained with the space-time
approach.
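The space-time variant can be sketched in the same style. This plain-Python illustration assumes the modified disparity space function is the per-frame squared difference summed over all frames, with the left image as reference and matches at x - d. Those conventions, and the function names, are assumptions for this sketch and not taken from the assignment's formula.

```python
def spacetime_dsi(left_rows, right_rows, x0, width, max_disp):
    """Return dsi[d][i] = sum over frames t of (L_t[x0 + i] - R_t[x0 + i - d]) ** 2.

    left_rows and right_rows hold the same image row taken from each frame of
    the left and right sequences. Out-of-range matches are set to infinity.
    """
    dsi = [[0.0] * width for _ in range(max_disp + 1)]
    for left, right in zip(left_rows, right_rows):
        for d in range(max_disp + 1):
            for i in range(width):
                xr = x0 + i - d
                if 0 <= xr < len(right):
                    dsi[d][i] += (left[x0 + i] - right[xr]) ** 2
                else:
                    dsi[d][i] = float("inf")  # no corresponding pixel
    return dsi


def min_cost_disparity(dsi):
    """Pick, for each pixel, the disparity index with the smallest summed cost."""
    width = len(dsi[0])
    return [min(range(len(dsi)), key=lambda d: dsi[d][i]) for i in range(width)]
```

Passing a single frame pair reproduces the one-frame disparity space function, which makes it straightforward to compare the two cases with the same plotting code.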
Submitting
This assignment is due Thursday, April 5, 2007 at 11:59 PM. Please see
the general notes on submitting
your assignments, as well as the late policy and the collaboration policy.
Please submit:
- Your write-up as an HTML file with
embedded images and links to your code. Please include all of your
beautiful visualizations of disparity space and pixel certainty along
with answers to any questions posed in the assignment.