CS 4501: Computer Vision
Spring 2011
Final Project
Proposals due Wednesday, April 13
Presentations on Monday, May 2, 6PM-9PM in MEC 341
Written reports due Monday, May 2 (immediately following your presentation)
No late presentations or reports allowed.
The final assignment for this semester is to do an in-depth project
implementing a nontrivial vision system. You will be expected to
design a complete pipeline, read up on the relevant literature,
implement the system, and evaluate it on real-world data. You will
work in small groups (2-3 people), and must deliver
- A short (2-paragraph) proposal by April 13
- A 10-15 minute group presentation describing your system on May 2, and
- A report on your system. This should be in the style of
a research paper, and should include sections on previous work,
design and implementation, results, and a discussion of the strengths
and weaknesses of your system. The report should be in HTML format,
and we expect lots of pretty pictures!
Project ideas:
- Implement a photo stitching program that takes as input a set of images and aligns and blends them into a single panorama (a minimal alignment sketch appears after this list).
- Set up a webcam in a public space and perform tracking, counting, and/or classification of people, cars, etc.
- Using a similar camera setup (and perhaps a microphone), design an "anomaly detector" that recognizes any behavior out of the ordinary (e.g., falls or robberies).
- Foliage/tourist removal from several photos of a building. An important question to answer is whether you want to attempt 3D reconstruction as part of the process, or whether you want to consider it as a purely 2D problem.
- Video textures - see the SIGGRAPH paper linked from the video textures web page.
- OCR or handwriting recognition. This can be based on templates or on (some simplified version of) the "shape context" approach of Belongie, Malik, and Puzicha. See the ICCV paper on their web page.
- Implement a system for stabilizing video captured with a hand-held camera.
- Implement a system for performing view interpolation from sparse viewpoints (see Zitnick et al., "High-quality video view interpolation using a layered representation," SIGGRAPH 2004).
- Implement a system that recovers rough 3D geometry from a sequence of video frames by combining SIFT features with Structure from Motion (a two-view starting point is sketched after this list).
- Any of the above on a camera phone. Jason has several Nokia N900 phones (described here) that you are free to use for this project.
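For the photo stitching idea, the heart of the pipeline is pairwise alignment: detect features in each image, match them, and fit a homography with RANSAC before blending. Below is a minimal alignment-only sketch in Python using OpenCV (cv2.SIFT_create needs a recent OpenCV build); the file names are placeholders, and the final paste is not a real blend, so seam handling is left to you:

    import cv2
    import numpy as np

    # Placeholder file names -- replace with your own overlapping images.
    img1 = cv2.imread("left.jpg")
    img2 = cv2.imread("right.jpg")

    # 1. Detect and describe keypoints (SIFT; ORB is a drop-in alternative on older builds).
    sift = cv2.SIFT_create()
    kp1, des1 = sift.detectAndCompute(cv2.cvtColor(img1, cv2.COLOR_BGR2GRAY), None)
    kp2, des2 = sift.detectAndCompute(cv2.cvtColor(img2, cv2.COLOR_BGR2GRAY), None)

    # 2. Match descriptors and keep confident matches via Lowe's ratio test.
    matches = cv2.BFMatcher().knnMatch(des2, des1, k=2)
    good = [m for m, n in matches if m.distance < 0.75 * n.distance]

    # 3. Fit a homography mapping img2's coordinates into img1's frame with RANSAC.
    src = np.float32([kp2[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([kp1[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    H, inliers = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)

    # 4. Warp img2 onto a shared canvas and paste img1 on top (alignment only -- no blending).
    canvas = cv2.warpPerspective(img2, H, (img1.shape[1] + img2.shape[1], img1.shape[0]))
    canvas[:img1.shape[0], :img1.shape[1]] = img1
    cv2.imwrite("panorama.jpg", canvas)

A full panorama generalizes this to many images and replaces the paste with feathering or multi-band blending.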
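For the SIFT + Structure from Motion idea (and for the camera-pose recovery needed to insert computer-generated objects into video, listed below), a sensible first milestone is two-view geometry: estimate an essential matrix from matched SIFT features, recover the relative camera pose, and triangulate a sparse point cloud. The sketch below assumes a guessed intrinsic matrix K and placeholder frame names, and stops well short of a full SfM system (no bundle adjustment, and the reconstruction is only up to scale):

    import cv2
    import numpy as np

    # Placeholder frame names and a guessed intrinsic matrix K (calibrate for real results).
    img1 = cv2.imread("frame000.jpg", cv2.IMREAD_GRAYSCALE)
    img2 = cv2.imread("frame010.jpg", cv2.IMREAD_GRAYSCALE)
    h, w = img1.shape
    focal = 0.9 * w  # hypothetical focal length in pixels
    K = np.array([[focal, 0, w / 2], [0, focal, h / 2], [0, 0, 1]])

    # Match SIFT features between the two frames (Lowe's ratio test).
    sift = cv2.SIFT_create()
    kp1, des1 = sift.detectAndCompute(img1, None)
    kp2, des2 = sift.detectAndCompute(img2, None)
    matches = cv2.BFMatcher().knnMatch(des1, des2, k=2)
    good = [m for m, n in matches if m.distance < 0.75 * n.distance]
    pts1 = np.float32([kp1[m.queryIdx].pt for m in good])
    pts2 = np.float32([kp2[m.trainIdx].pt for m in good])

    # Essential matrix -> relative pose -> sparse triangulated structure (up to scale).
    E, inliers = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC, prob=0.999, threshold=1.0)
    _, R, t, inliers = cv2.recoverPose(E, pts1, pts2, K, mask=inliers)
    P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
    P2 = K @ np.hstack([R, t])
    pts4d = cv2.triangulatePoints(P1, P2, pts1.T, pts2.T)
    pts3d = (pts4d[:3] / pts4d[3]).T  # N x 3 point cloud in the first camera's frame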
Project ideas for those with graphics experience:
- Inserting computer-generated objects into a video sequence taken with a moving camera. Use a calibration or structure from motion method to recover the camera pose.
- Some variant of Facade (human-assisted architectural modeling from a small number of photographs). See the SIGGRAPH 96 paper linked from the Facade web page.
- Vision-based automatic image morphing (e.g., of faces). That is, use an optical flow or other correspondence method to generate matches between the images, then use a morphing algorithm to generate the intermediate frames (a sketch of this pipeline follows this list).
- Image-based visual hull (shape from silhouettes) for moving
scenes. See the SIGGRAPH 2000 paper, linked from their
web page.
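For the vision-based morphing idea above, one simple (and admittedly approximate) pipeline is: compute dense optical flow between the two endpoint images in both directions, warp each image part of the way along the reverse flow, and cross-dissolve. A Python/OpenCV sketch with placeholder file names is below; Farneback flow is just one correspondence choice, and convincing face morphs will likely need something better (e.g., feature-based warping):

    import cv2
    import numpy as np

    # Placeholder file names -- replace with your own image pair.
    a = cv2.imread("face_a.jpg")
    b = cv2.imread("face_b.jpg")
    gray_a = cv2.cvtColor(a, cv2.COLOR_BGR2GRAY)
    gray_b = cv2.cvtColor(b, cv2.COLOR_BGR2GRAY)

    # Dense correspondences in both directions (Farneback optical flow).
    flow_ab = cv2.calcOpticalFlowFarneback(gray_a, gray_b, None, 0.5, 3, 15, 3, 5, 1.2, 0)
    flow_ba = cv2.calcOpticalFlowFarneback(gray_b, gray_a, None, 0.5, 3, 15, 3, 5, 1.2, 0)

    h, w = gray_a.shape
    grid = np.dstack(np.meshgrid(np.arange(w), np.arange(h))).astype(np.float32)

    for i, t in enumerate(np.linspace(0.0, 1.0, 11)):
        # Backward-warp each endpoint part of the way toward the other, then cross-dissolve.
        # (Scaling the reverse flow by t is a rough approximation that assumes smooth flow.)
        map_a = grid + t * flow_ba          # sample A along the B->A correspondences
        map_b = grid + (1.0 - t) * flow_ab  # sample B along the A->B correspondences
        warp_a = cv2.remap(a, map_a[..., 0], map_a[..., 1], cv2.INTER_LINEAR)
        warp_b = cv2.remap(b, map_b[..., 0], map_b[..., 1], cv2.INTER_LINEAR)
        frame = cv2.addWeighted(warp_a, 1.0 - t, warp_b, t, 0)
        cv2.imwrite(f"morph_{i:02d}.png", frame)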