CS 6501: Text Mining Spring 2019 · CS@UVa

Homework

Our machine problems are typically due in two weeks. Your report must be in PDF format and named "CompID-MPx.PDF", where CompID is your computing ID and MPx is the machine problem index (e.g., MP1 or MP2). A Collab submission site will be created when the homework becomes official. All machine problems are individual assignments, and code sharing of any sort is prohibited.

Machine Problems

  • March 28th, 2019 --- MP3— Text Categorization

    This assignment is designed to help you practice the general steps in building a text categorization system. It consists of five tasks, including feature selection, building a Naive Bayes classifier and a kNN classifier, and evaluating classification performance, totaling 100 points.

  • March 2nd, 2019 --- MP2— Hidden Markov Model for Part-of-speech Tagging

    This assignment is designed to help you get familiar with the supervised Hidden Markov Model for part-of-speech tagging. It consists of three parts, totaling 100 points. The implementation can be largely based on what you developed for MP1.

  • January 31st, 2019 --- MP1— Getting Familiar with Text Processing

    This assignment is designed to help you get familiar with basic document representation and analysis techniques. It consists of two parts, totaling 100 points.