The goal of the course project is to give students an opportunity to explore research directions in Natural Language Processing and to develop useful NLP applications and tools. The project therefore aims at producing a "deliverable" result, meaning that your project should be self-contained, reproducible (scientifically correct), and related to the course content. A typical successful project consists of 1) a novel and sound solution to an interesting problem, 2) correct and meaningful comparisons to baselines or existing approaches, and 3) a comprehensive literature review and discussion (e.g., error analysis). The best outcome of the project is something publishable, reaching the quality of short or workshop papers at major NLP conferences (see the criteria for short papers in the ACL CFP and the papers in the ACL Anthology). However, unlike when submitting a paper, we will not penalize negative results, as long as the proposed approach is well explored.

We recommend, but do not require, forming a group whose members have diverse backgrounds.


Here is a summary of how your project will be graded:

  • Project proposal (0%)
    • State your motivation, your plan, and the expected outcome of your project. No more than 2 pages.
    • Although we will not grade the proposal, it is important to use this opportunity to plan your project and communicate with the instructor and TA. In particular, if your final project fails, we will take the proposal into consideration when assigning the final score.
  • Project report (35%)
    • Written like a research paper. No more than 8 pages (references excluded).
    • You can use the project proposal to draft your final report; i.e., the writing in the proposal can be reused in the final report.
    • Different projects may be graded differently. For example, a project that reimplements existing approaches and performs a comprehensive comparison may receive a high score even if it proposes no new approach. Here is a general grading rubric:
      • 10% on clarity (Is it clear what was done? Is the report well-written and well-structured? Is the idea well-motivated? Is the literature review comprehensive?)
      • 10% on soundness and correctness (Is the technical approach well-chosen and deep enough? Is the implementation correct?)
      • 10% on meaningful comparison (Are the experiment settings correct? Are the experimental results of the approaches correctly interpreted?)
      • 5% on novelty and substance (What new things can we learn from this project?)
  • Project presentation (15%)
    • 5% graded by the instructor (and TA)
    • 10% graded by your classmates (5% based on the average of all scores + 5% based on the average of the top 20 scores)
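
To illustrate how the two peer-graded halves combine, here is a minimal sketch. The 0 to 10 raw scale, the function name `peer_presentation_points`, and the linear mapping from raw scores to percentage points are assumptions made for illustration only; the actual scoring form and conversion are decided by the instructor.

```python
from statistics import mean


def peer_presentation_points(scores, max_score=10.0, top_k=20):
    """Combine classmates' raw scores into the 10% peer-graded portion.

    Assumption (not specified in the course description): each classmate
    gives one raw score between 0 and max_score, and raw averages map
    linearly onto percentage points.
    """
    if not scores:
        raise ValueError("at least one peer score is required")
    # The top_k highest scores (all scores if fewer than top_k were given).
    top_scores = sorted(scores, reverse=True)[:top_k]
    avg_all = mean(scores) / max_score      # fraction in [0, 1]
    avg_top = mean(top_scores) / max_score  # fraction in [0, 1]
    # 5 points from the overall average, 5 points from the top-k average.
    return 5.0 * avg_all + 5.0 * avg_top


# Hypothetical class of 40 students: 20 give full marks, 20 give zero.
example_scores = [10] * 20 + [0] * 20
example_points = peer_presentation_points(example_scores)
```

Under these assumptions, averaging the top 20 scores separately softens the effect of a few very low scores on a team's peer grade.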

By default, students in the same team will get the same score unless there are special circumstances. We encourage students to use a version control system (e.g., GitHub or GitLab). It is important to keep your hard work in a safe place and to log the contributions of each individual. If your team members complain about you and you cannot provide evidence of your contribution, we may lower your score.

Pick a topic

Because this is a research project, we recommend that you do not reinvent the wheel. When picking a topic, it is therefore important to know which existing resources you can leverage. Ask yourself the following questions:

  • What is the problem? Why is this problem interesting and important?
  • Is there an existing approach? How is your idea different from others?
  • How will you evaluate your idea? What data can be used for evaluating the proposed approach? Is the data set available?
  • What software packages and resources can you use for implementing your idea?
  • What are the best and worst possible outcomes of the project? (i.e., measure your risk).
  • Who will be your group members? Do they have special expertise? How will you split the workload?

Some possible ways to find a topic are:

  • Take an existing problem we mentioned in the class and come up with some new ideas.
  • Read a published paper carefully and ask yourself if there is any challenge left from the paper or if you can improve the proposed approach.
  • There are many NLP shared tasks at SemEval, CoNLL, and some workshops. A shared task often provides a well-defined problem and data set, allowing different teams to compare their approaches fairly. You can use shared tasks from previous years as a testbed for your approach, or participate in a shared task this year (it is okay if you cannot get results on the final test data set by the time the semester ends; just evaluate your approach on the development split).
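
As an illustration of evaluating on a development split and comparing against a baseline, here is a minimal sketch for a classification-style task. The toy labels, the variable names, and the choice of accuracy and macro-averaged F1 are assumptions for illustration; a real shared task defines its own data format and official metric.

```python
from collections import Counter


def accuracy(gold, pred):
    """Fraction of examples whose predicted label matches the gold label."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def macro_f1(gold, pred):
    """Unweighted mean of per-label F1 over labels seen in gold or pred."""
    labels = set(gold) | set(pred)
    f1_scores = []
    for label in labels:
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        denom = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


def majority_baseline(train_gold, n_examples):
    """Predict the most frequent training label for every example."""
    label = Counter(train_gold).most_common(1)[0][0]
    return [label] * n_examples


# Hypothetical toy data standing in for a shared task's train/dev splits.
train_gold = ["pos", "pos", "neg"]
dev_gold = ["pos", "neg", "pos", "neg"]
system_pred = ["pos", "neg", "neg", "neg"]  # output of your approach on dev

baseline_pred = majority_baseline(train_gold, len(dev_gold))
for name, pred in [("baseline", baseline_pred), ("system", system_pred)]:
    print(f"{name}: acc={accuracy(dev_gold, pred):.3f} "
          f"macro-F1={macro_f1(dev_gold, pred):.3f}")
```

Reporting a simple baseline next to your system on the same split makes the comparison easier to interpret.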

We will give students a chance to recruit their group members in class.

Project proposal

A two-page project proposal is due on 10/13 before class. Your proposal should address the questions in the "Pick a topic" section. You can use this chance to draft the introduction and the related work section of your final report.

Project report

Each team must submit a written project report. Treat the report as a short conference paper. If you have a demo system, you can include some screenshots of it. We also recommend including a discussion of how your research can be further extended. You must use the provided ACL LaTeX style files and submit the report in PDF format.

The report should be no more than 8 pages, excluding references (there is no minimum length). A short, concise report is better than a lengthy one.

Project presentation (15%)

Each project team is expected to give a presentation of their project. We expect everyone to attend the final project presentations unless there are special circumstances. The presentation date will be announced later, as will the presentation length (5 to 10 minutes), which depends on the number of groups.

Your presentation will be graded mainly based on

  • The clarity of your slides and presentation
  • How well your key messages are delivered to the audience
  • Time control
  • How well you handle the questions from the audience (note that the instructor may randomly pick team members to answer questions during the presentation).

The presentation will be graded by both the instructor and the student peers.