Kamin Whitehouse :: cs651 Fall '06



Course Projects

All students in this course must produce a paper by the end of the class. The topic can address any aspect of the design, implementation, or usage of networked, embedded systems. There are two components to this project: (1) writing a paper of your own and (2) critiquing the papers of your classmates.

Three ideas for the project must be submitted by email by 9/20. The first outline for the paper is due 9/27 and should be one page long. Each student should bring two copies of the paper to class every Thursday: (i) one to hand in and (ii) one to give to another student. A copy of the previous week's version, with comments from a classmate, must also be handed in.

The focus of the paper will be on the framing of your project: identifying the problem and goals, making a hypothesis, and designing an experiment to test that hypothesis. The actual implementation will not be graded. This is to ensure that the project you choose is both well motivated and will produce a compelling conclusion, before you spend any time implementing. Thus, you will spend the semester creating a paper that frames a project that you have not necessarily implemented, with the goal of abstracting away the research process so that you can apply it to subsequent projects (including, perhaps, the one that is the subject of your paper).

Proposals should cover all topics in this checklist, and an outline for the paper is provided below. All versions of the paper, including the initial outline, should have all sections and should address each point in one way or another. Critiques of the outlines/papers should evaluate these points explicitly: provide suggestions on broadening or narrowing the problem definition, suggest alternative goals, suggest alternative types of related work, suggest alternative designs, suggest alternative experimental setups, evaluation criteria, etc.

We suggest using LaTeX to create this project report, since most academic papers you will write in graduate school will probably be in LaTeX. LaTeX is a tag-based markup language like HTML, and the basics can easily be learned by reading a sample file. We will use the ACM sig-alternate style, which is also a great template to learn from.

Some ideas for projects can be found here.

Introduction

Define the application, the problem, your goals, and your hypothesis

Describe a sensor application and the constraints that this application domain dictates. Describe how the problem is or would likely be solved with current technology. Then, identify the limitations (which constraints are violated) or bottlenecks (which constraints are tightest) for the existing solution(s). Be sure to cover the following points:

  • The specific application scenario
  • The constraints of this application
  • The obvious solution(s) to this problem using current technology
  • Why the obvious solution(s) is(are) obviously not good enough

To find ideas for projects, it may be useful to skim the programs/abstracts of conferences from the application domain in which you are interested. For example, work on human activity recognition may appear in conferences like Pervasive, CHI, and UIST. Environmental Sciences have their own conferences, etc.

Example 1: If your application is to track zebras in the Sahara with GPS, the constraints are: 1 sample every 3 minutes, a maximum weight of the collar, no fixed base stations, mobile nodes, etc. The current solution is to use VHF transmitters or commercially available GPS systems. VHF is limited to tracking for a few hours a day. Commercial GPS systems are bottlenecked by storage space (3000 readings max). Possible other solutions are to use Wi-Fi or long-range modem radio, both of which are bottlenecked by power (i.e., weight) constraints. (Satellite? Wi-Max?)
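As a rough illustration of how a bottleneck falls out of the constraints in Example 1, the sketch below combines the stated sampling interval (1 sample every 3 minutes) with the stated storage limit (3000 readings) to estimate how long a commercial GPS collar could log before filling up. The function name is made up for illustration, and the calculation assumes the collar samples continuously at the required rate.

```python
def days_until_storage_full(max_readings, minutes_per_sample):
    """Days of continuous logging before the reading buffer is full."""
    total_minutes = max_readings * minutes_per_sample
    return total_minutes / (60 * 24)

# Figures from Example 1: 3000 readings max, 1 sample every 3 minutes.
print(days_until_storage_full(3000, 3))  # 6.25
```

Under these assumptions the storage limit is reached in under a week, which is the kind of quantitative statement that makes a bottleneck concrete.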

Example 2: If your application is to monitor elders in their home, the main constraint is rapid installation. The current technology is to have an expert install the hardware. The bottleneck here is cost (hundreds of dollars per installation?). An alternative is self-installation: people install the sensors and indicate in a computer program the structure of the house, where the sensors are located, and how they are situated. The bottleneck here is the user installation time.

Related Work

Relate this problem/solution to other problems/solutions

Read all of the papers that provide insight and/or complete solutions into this problem. If one of the existing solutions adequately solves the problem, go back to Step 1. Otherwise, write a section that first summarizes the known facts about this problem and second describes the existing solutions you have found, making a point to indicate for each why it does not completely solve your problem. Be sure to cover the following points:

  • An introductory paragraph of what kinds of solutions might be related
  • A brief summary of every existing system that solves some subset of your problem
  • For each, a description of which constraint that system does not satisfy
  • A summary paragraph

To find the papers, try searching Google and Google Scholar first. Find a few papers that come very close to your topic, and follow their references. Also, try looking through the programs/abstracts of SenSys, IPSN, EWSN, Secon, MobiSys, MobiHoc, MobiCom, and InfoCom (all of which may also be useful for finding applications).

Example 1: If your project is on exploiting the capture effect to design a new low-power MAC or routing protocol, you would need to find basic literature on the capture effect itself, including fundamental causes, which radios it occurs with, etc. You would need to find any existing MAC or routing protocols that already do exploit capture. And you would need to find all other MAC or routing protocols that are also designed for low-power operation.


Example 2: If your project is on the automatic semantic interpretation of sensor data, you should find all projects dealing with data interpretation. There may not be much work in this area. In such a case, finding only a few papers will not suffice. You must then find sensor projects that could/should have benefited from automatic data interpretation, and describe what these projects did instead (e.g., human interpretation or supervised machine learning). You must also find projects that interpret other kinds of data, such as data mining from the web, and describe how these solutions are or are not suited for your problem.

System Proposal

Describe the system that will address your problem/goals

Propose a system to solve your problem. This should include the sensors you will use, the sensor platform (mote) you will use, the power source, and any additional hardware (laptops, PDAs, etc.). Then, any algorithms you are proposing should be thoroughly described. Each component of the system should be profiled in terms of all application constraints listed in Step 1 (e.g., power, bandwidth, latency, user time, programming time, etc.), as well as expected total system usage. If the system does not meet application constraints, go back to Step 1. Be sure to cover the following points:

  • Your hardware specifications, data flow model (routing), and deployment procedure
  • An enumeration of the new components
  • For each component, an analysis of the tradeoffs in terms of system constraints

To profile your system resource consumption, you may need to go to spec sheets for the sensors, processors, radios, etc. to obtain this data, and will likely need to do some simple calculations similar to those we did in class for other projects. Some of the numbers can be found more easily in existing papers. Numbers about programming time, user time, etc may be more difficult to estimate, but try to make some quantitative hypothesis.
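A minimal sketch of the kind of simple calculation meant here is a duty-cycled power budget: weight the active and sleep currents by the fraction of time spent in each state, then divide the battery capacity by the result. The function names and every number below are hypothetical placeholders, not values from any spec sheet; substitute the figures for your own hardware.

```python
def average_current_ma(active_ma, sleep_ma, duty_cycle):
    """Time-weighted average current for a node that is active a fraction
    `duty_cycle` of the time and asleep otherwise."""
    if not 0.0 <= duty_cycle <= 1.0:
        raise ValueError("duty_cycle must be between 0 and 1")
    return duty_cycle * active_ma + (1.0 - duty_cycle) * sleep_ma

def lifetime_days(battery_mah, avg_current_ma):
    """Idealized lifetime: capacity divided by average draw, ignoring
    self-discharge and voltage drop-off."""
    return battery_mah / avg_current_ma / 24.0

# Hypothetical placeholder numbers, NOT taken from any spec sheet.
avg = average_current_ma(active_ma=20.0, sleep_ma=0.02, duty_cycle=0.01)
print(f"average current: {avg:.4f} mA")
print(f"lifetime: {lifetime_days(2000.0, avg):.0f} days")
```

The same pattern (a per-state cost weighted by time in that state) can be reused for bandwidth or storage, which makes it easy to profile each component against each constraint.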

Example 1: If your project is to design a system to automatically learn to identify salient features from streaming sensor data, you must describe the learning algorithm, the user interface for providing input and training data to this learning algorithm, and the software architecture for integrating this into a real sensor system. Then, you must conjecture an estimated profile of your system in terms of the application constraints. In this case, you will probably have to estimate the time required of the user (since this system is probably designed to save developer time), and an estimate for the false positive and false negative rates, based on previous studies of similar learning algorithms with similar data.


Example 2: If your project is to design a new mobile-mobile routing protocol, you need to specify exactly how the protocol works, including the user interface, how many packet types, the header formats of each, and the route update and maintenance protocols. Then, you must conjecture an estimated profile of how this protocol will do in terms of your application constraints. If your application is constrained by bandwidth, you must estimate how much bandwidth is consumed over the range of network loads (high vs. low traffic) and mobility patterns (fast vs. slow, as well as random vs. directed motion). If your application is power-constrained, a similar analysis should be performed for power consumption.
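One possible way to start the bandwidth estimate in Example 2 is a back-of-the-envelope model of control traffic. The sketch below assumes, purely for illustration, that every node broadcasts one fixed-size route update per update interval and that faster mobility requires a shorter interval. The model, the function names, and all numbers are hypothetical; a real protocol would need its own model.

```python
def control_overhead_bps(num_nodes, update_bytes, update_interval_s):
    """Aggregate control traffic if every node broadcasts one route update
    of `update_bytes` every `update_interval_s` seconds (a simplifying
    assumption; it ignores retransmissions and MAC overhead)."""
    return num_nodes * update_bytes * 8 / update_interval_s

def overhead_fraction(control_bps, data_bps):
    """Share of total offered traffic that is control rather than data."""
    return control_bps / (control_bps + data_bps)

# Hypothetical sweep: fast mobility needs frequent updates, slow does not.
for label, interval_s in [("fast mobility", 1.0), ("slow mobility", 30.0)]:
    for load, data_bps in [("low traffic", 1_000.0), ("high traffic", 50_000.0)]:
        ctrl = control_overhead_bps(num_nodes=20, update_bytes=30,
                                    update_interval_s=interval_s)
        print(label, load, f"{overhead_fraction(ctrl, data_bps):.1%}")
```

Sweeping both axes at once gives the table of conjectured overheads that the profile asks for, one cell per combination of load and mobility pattern.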

Experimental Setup

Describe the experiment that will test your hypothesis

Propose an experiment to evaluate how well your system works in comparison with previous technologies. This usually means actually using the system for the application that was proposed while instrumenting the deployment to collect ground truth and/or evaluation metrics. “Ground truth” must be collected if there are things in your experiment that are not controlled, like user actions or goals. Evaluation metrics must always be collected. Typically, you should collect exactly as many evaluation metrics as there are application constraints listed in Step 1. In this report, you must describe exactly how your system will be used in an application setting, how the important variables will be controlled or measured (ground truth), and how you will collect the evaluation metrics.

Typically, you will need to run the same experimental setup with two system implementations: one using old technology and one using your new technology. This is not always necessary, but is highly recommended because it often makes the difference in terms of conference acceptance. You will typically also repeat the experiment multiple times in order to provide statistically significant results. You must state how many times you repeat, as well as whether or not you are going to test the effects of any independent variables, such as transmission power, sampling rate, sensor model, etc. Be sure to cover the following points:

  • A complete description of the scenario/simulation that you will be using
  • An enumeration of how many times you will run the experiment, and which independent variable will be changed in each run
  • A description of how you will collect data on the evaluation metrics, uncontrolled variables, and ground truth
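The enumeration of runs can be written down mechanically. The sketch below assumes a full factorial design (every combination of independent variable values, each repeated the same number of times), which is only one of several reasonable designs. The function name and the specific values are hypothetical; the variable names are borrowed from the examples of independent variables given above.

```python
from itertools import product

def experiment_plan(independent_vars, repetitions):
    """Return one dict per run: every combination of the independent
    variable values, repeated `repetitions` times."""
    names = list(independent_vars)
    runs = []
    for values in product(*(independent_vars[n] for n in names)):
        for rep in range(1, repetitions + 1):
            run = dict(zip(names, values))
            run["repetition"] = rep
            runs.append(run)
    return runs

# Hypothetical values for two independent variables, 5 repetitions each.
plan = experiment_plan(
    {"transmission_power_dbm": [-10, 0], "sampling_rate_hz": [1, 10, 100]},
    repetitions=5,
)
print(len(plan))  # 30
```

Writing the plan out this way shows immediately how the number of runs grows with each added variable, which helps decide which variables are worth testing.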

Example 1: If you are designing a system to automatically identify activities in an office, such as talking on the phone, typing, etc., you must describe how the office is set up, how you choose the subjects, how many times you repeat the experiments, and whether you are repeating the experiments with different independent variables, such as subject gender, sensor model, etc. You must also indicate how you obtain ground truth, i.e., what the person is actually doing or wants to do, and how you correlate the ground truth in time with what your system is estimating. Perhaps you use a screen scraper and pressure sensors on the phone and in the chair for ground truth. Finally, you must indicate how you obtain your evaluation metrics, including false positive and false negative rate, power consumption, storage requirements, etc. If you only care about false positives/negatives, most people would not expect you to compare against another system implementation but only to compare . However, if your application has systems constraints like bandwidth or power, you should create or borrow an older or more naive implementation to provide an empirical comparison.
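Once ground truth and system estimates are aligned in time, the false positive and false negative rates in Example 1 reduce to simple counting. The sketch below assumes both signals have already been resampled to the same time steps and reduced to one boolean per step for a single activity (say, "talking on the phone"). The function name and the data are hypothetical.

```python
def error_rates(ground_truth, predictions):
    """False positive and false negative rates for one activity, given two
    equal-length sequences of booleans sampled at the same time steps."""
    if len(ground_truth) != len(predictions):
        raise ValueError("sequences must be time-aligned and equal length")
    fp = sum(1 for g, p in zip(ground_truth, predictions) if p and not g)
    fn = sum(1 for g, p in zip(ground_truth, predictions) if g and not p)
    negatives = sum(1 for g in ground_truth if not g)
    positives = len(ground_truth) - negatives
    # Report 0.0 when a class never occurs, to avoid dividing by zero.
    fpr = fp / negatives if negatives else 0.0
    fnr = fn / positives if positives else 0.0
    return fpr, fnr

# Hypothetical time-aligned labels: True means the activity is occurring.
truth    = [True, True, False, False, True, False, False, False]
estimate = [True, False, False, True, True, False, False, False]
fpr, fnr = error_rates(truth, estimate)
print(f"false positive rate: {fpr:.2f}, false negative rate: {fnr:.2f}")
```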

Example 2: If you are designing a tracking algorithm, you need to describe an experiment in which a moving object is being tracked. This includes: the physical environment, the type of object, and the type of motion. You will also need to describe how to collect ground truth, i.e., how you know where the object being tracked really is. Either you must carefully and repeatably control the movement of the object or you must carefully measure the movement of the object. GPS only provides 3-10ft accuracy. DGPS can provide centimeter accuracy but not at high velocity. Then, you must describe how you will collect evaluation metrics. Clearly, you will need to compare the estimated position of the object with the ground truth. If your application is power or bandwidth constrained, you will also need to indicate how you are measuring these variables. Because systems like these are difficult to implement, it is tempting to not compare with a second system. Resist this temptation. At the very least, build an off-the-shelf camera and 802.11 based system with another school's tracking software. Measure the power consumption or bandwidth. You must provide information again about how to obtain ground truth and evaluation metrics.
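For the tracking case, comparing estimated positions with ground truth could look like the sketch below. It assumes 2D positions in a shared coordinate frame, sampled at the same time steps, and it summarizes the error with the root-mean-square error, which is one common choice rather than the only valid metric. The function names and the data are hypothetical.

```python
import math

def position_errors(estimates, ground_truth):
    """Euclidean distance between each estimated (x, y) position and the
    ground-truth position recorded at the same time step."""
    if len(estimates) != len(ground_truth):
        raise ValueError("tracks must be time-aligned and equal length")
    return [math.hypot(e[0] - g[0], e[1] - g[1])
            for e, g in zip(estimates, ground_truth)]

def rmse(errors):
    """Root-mean-square of a non-empty list of per-step errors."""
    return math.sqrt(sum(err * err for err in errors) / len(errors))

# Hypothetical tracks, in meters.
estimated = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
actual    = [(0.0, 0.0), (0.0, 0.0), (6.0, 8.0)]
errors = position_errors(estimated, actual)
print(errors, f"RMSE = {rmse(errors):.2f} m")
```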

Experimental Results

Draw plots with the hypothesized data produced by your experiments

Before actually doing the experiment, add the outline of the section in which you would present the results. Typically, you should first summarize the key results. Then, present more detailed graphs of the evaluation metrics collected from each experiment. Towards the end of the section, or throughout, indicate interesting or unexpected details in the data. Since your data will not be collected yet, most of this text will only be a placeholder. However, there are two points you should be sure to nail down in advance: statistical analysis and graphs. At some point, you will need to compare some data sets to indicate improvement; be sure to indicate how you will do this comparison (e.g., a one-sided t-test?). Doing this will ensure that you are, for example, collecting enough readings to draw statistically significant conclusions. You should also include graphs of your actual predicted results; do not just put a blank pair of x and y axes as a placeholder for your graphs. This will help you identify which data sets will be interesting, and which will be hard to visualize. It also helps focus your experiment so that you collect exactly the data you need (not too much and not too little). Boring graphs will probably just be summarized with simple statistics, but if all of your graphs turn out to be boring, return to 4a, 3, or even Step 1. Be sure to cover the following points:

  • A summary of what you expect your experiments to show, in general
  • Graphs showing the hypothesized data
  • Explanations of why the data is hypothesized that way
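To make the planned statistical comparison concrete, the sketch below computes a t statistic for two sets of hypothesized measurements. It uses Welch's variant of the t-test (which does not assume equal variances); that is a choice made here for illustration, not a course requirement. The function name and the latency values are hypothetical, and turning the statistic into a p-value still requires a t table or a statistics package.

```python
import math
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Welch's t statistic and approximate degrees of freedom for the
    difference in means (a minus b) of two independent samples."""
    na, nb = len(sample_a), len(sample_b)
    va = variance(sample_a) / na  # squared standard error of mean a
    vb = variance(sample_b) / nb  # squared standard error of mean b
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df

# Hypothesized latencies in ms (placeholders, not measured data).
new_system = [10.0, 12.0, 11.0, 13.0, 9.0]
old_system = [30.0, 33.0, 29.0, 35.0, 31.0]
t, df = welch_t(new_system, old_system)
# One-sided hypothesis "new is faster": expect t well below zero.
print(f"t = {t:.2f}, df = {df:.1f}")
```

Running this on hypothesized data before the experiment shows roughly how many readings are needed: if plausible numbers give a t statistic near zero, plan for more repetitions or a larger expected effect.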

Example 1: If you are designing a system that exploits infrequent voice traffic to reduce energy consumption and latency in a streaming audio application, your first paragraph would indicate that, for certain kinds of patterns, the data shows that your system has 3 times lower latency and 50% lower energy consumption than the competing method. However, at other traffic patterns, latency is slightly higher and energy consumption is the same between the two. You would then have graphs, with the traffic patterns on the x axis, of both latency and power consumption, pointing out and explaining interesting parts of the curves, and explaining which statistical tests were used on the data to make the statements drawn in the first paragraph.


Example 2: If you are designing a system to help scientists deploy sensors in an effective topology, with proper sensor coverage and proper wireless coverage, the first paragraph will explain for which types of applications and constraints your technique worked and for which it did not. Similar to before, you would present graphs with the types of applications on the x axis showing what kind of coverage you achieved for each. You then need to describe the statistical tests you ran on the data to compare with results from the naive techniques.

Conclusions

Discuss what you have learned about your hypothesis, and why this matters

Draw conclusions from your study. This means more than just summarizing the results of Part 4b; quantify the benefits in terms of the application, i.e., in terms of money, saved developer time, effectiveness of wildlife conservation efforts, etc. Also generalize your problem and solution: does this system solve a general class of problems (possibly with an enumerable set of modifications), or is it only applicable to this particular problem? Finally, identify some open problems based on either limitations of your experimental setup or on unexplained observations during the experiment.



Kamin Whitehouse
Computer Science Department
The University of Virginia
217 Olsson Hall
Charlottesville, Virginia 22904