Spring 2019: CS 6501 - Learning Theory

Table of Contents

Course Description
Basic Information
Homework Assignments
Final Projects
Learning Theory Class Resources
Assigned Readings
Optional Readings
Additional Resources
Class Log

Course Description

Learning theory is a field that develops models for capturing learning phenomena and ultimately studies the efficiency and the limitations of learning algorithms in these frameworks. We will explore basic aspects of machine learning theory.

We will start with learning algorithms that use membership and equivalence queries, version spaces, and decision trees. We will then switch to the probably approximately correct (PAC) model of learning, Occam's razor, VC theory, and ultimately prove the fundamental theorem of learning theory.

From this point on, we will cover, not necessarily in this order, the following topics: the relationship between weak and strong learning (boosting), and online learning and learning with expert advice in the mistake bound model, where we will see algorithms such as randomized halving, weighted majority, and winnow. We will also cover aspects of adversarial machine learning (poisoning and evasion attacks) and thus explore connections between machine learning and security.

Time permitting, we may cover further related topics. For example, theoretical aspects of learning that relate to evolution, where evolution is seen as a learning process, drawing on results from evolutionary algorithms and Valiant's framework of evolvability; or active learning.

Basic Information

Syllabus

The syllabus is available here.

Time and Location

Mondays, Wednesdays, and Fridays 1:00pm – 1:50pm, Olsson Hall 009.

Contact Information

Please see here.

Teaching Assistants

The graduate student Suya (fs5xz) will be a TA for this class this year.

Office Hours

We have office hours at the following times and places.

Tuesdays: 11:00am – 1:00pm, 414 Rice Hall (Dimitris)
Wednesdays: 2:00pm – 4:00pm, 414 Rice Hall (Dimitris)
Fridays: 2:00pm – 6:00pm, 436 Rice Hall (Suya)

Exceptions to the Regular Schedule of Office Hours

If you want to meet us outside of our office hours, the most reliable method is to send an email and arrange an appointment.

Homework Assignments

Assignment 1: Here. Announced on January 16. Due on January 30.

Assignment 2: Here. Announced on January 30. Due on February 8.

Assignment 3: Here. Announced on February 10. Due on February 20.

Final Projects

Graduate students should deliver a final project. This is optional for undergraduates.

In general I am open to discussing various ideas for the final project. You are allowed to work alone, but the route that I expect most people will take is to form a group of 2-3 people, read a research paper, and write a summary of about two pages (11pt) that will be distributed to the rest of the students in class. Depending on our pace, as well as on the number of groups that are formed, we may or may not have oral presentations of these research papers. For a list of candidate papers or book chapters to be presented at the end of the course, please have a look below. The candidate material is split into one- and multi-person projects.

I am happy to discuss final projects on other subjects related to the course. For example, you may want to use a data set that you acquired online to train a classifier, estimate its generalization ability by performing cross-validation, and ultimately compare how close this empirical estimate is to the prediction that you obtain from PAC learning (for an appropriate tradeoff between the accuracy parameter ε and the confidence parameter δ). I am also open to other suggestions; for example, one could implement Shannon's mind-reading machine. In any case, you need to talk to me well in advance and make sure that I agree with the path that you want to follow for your final project.

Candidate Material

Please have a look here.

Clicking on the title of a paper takes you to the paper. It might be easier if you first download the pdf file locally to your computer.

Learning Theory Class Resources

Assigned Readings

Optional Readings

Additional Resources

Class Log

Class 1 (Jan 14, 2019)

Introduction. Overview of the course by going through the syllabus.

Discuss the Coupon Collector's problem as an example of the kind of problems that we will be looking at in later parts of the course.
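For reference, the standard calculation for the expected number of draws T needed to collect all n distinct coupons (each new coupon is a geometric waiting time):

    \[ \mathbb{E}[T] = \sum_{i=1}^{n} \frac{n}{n-i+1} = n \sum_{k=1}^{n} \frac{1}{k} = n H_n \approx n \ln n. \]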

Discussion on basic terminology about machine learning.

Perhaps a good idea is to review Sections 1–2.4 (up to the weak law of large numbers) from the document Essentials on the Analysis of Randomized Algorithms. Also, have a look at Section 3 which discusses the Coupon Collector's problem.

For a general review on probability theory as well as a basic technique for proving bounds in randomized algorithms (which thus applies to learning theory as well), please have a look at the document Basic Tools and Techniques for Algorithms in Learning Theory.

Finally, the document Tools for Bounding Probabilities discusses the bounds that we will use most frequently.

Class 2 (Jan 16, 2019)

Homework 1 announced. Due on Wed, Jan 30.

Terminology and basic notions in machine learning: hypothesis class (hypothesis space), concept class, hypothesis, concept (ground truth), instances, examples, symmetric difference.

Boolean functions and the class of functions that will be the workhorse for our course: conjunctions and monotone conjunctions.

Started the discussion of queries and concept learning from Dana Angluin's paper Queries and Concept Learning.

Class 3 (Jan 18, 2019)

We discussed the learnability of monotone conjunctions using equivalence and membership queries. We also discussed whether the algorithms extend to general conjunctions.
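As a minimal sketch of one such strategy: a monotone conjunction over n variables can be identified exactly with n membership queries, by setting each variable to 0 in turn while keeping all others at 1. (This is illustrative code in my own notation; the membership_query interface and the toy target below are assumptions, not the exact presentation from class.)

    # Learn a monotone conjunction over n Boolean variables with n membership
    # queries: variable i is in the target iff flipping it to 0 (with all other
    # variables set to 1) makes the oracle answer negative.
    def learn_monotone_conjunction(n, membership_query):
        relevant = []
        for i in range(n):
            x = tuple(0 if j == i else 1 for j in range(n))
            if not membership_query(x):
                relevant.append(i)  # x_i appears in the target conjunction
        return relevant

    # Hypothetical target x_0 AND x_2 over 5 variables.
    target = {0, 2}
    oracle = lambda x: all(x[i] == 1 for i in target)
    print(learn_monotone_conjunction(5, oracle))  # [0, 2]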

Here are notes from what was discussed in the class.

Monday, January 21, 2019

Martin Luther King day – no classes.

Class 4 (Jan 23, 2019)

Discussed a few more things from Dana Angluin's paper. One example was how the learner may be forced to query all but one of the hypotheses when the hypothesis space is the set of all full conjunctions. We also discussed Lemmas 1 and 2, which are relevant to one of the homework problems.

Started discussion from Tom Mitchell's book; Chapter 2. The more-general-than-or-equal-to relation. Lattice with partial ordering of hypotheses in the hypothesis space. Correspondence between hypotheses in the hypothesis space and sets of examples in the instance space.

Class 5 (Jan 25, 2019)

Consistency and version spaces.

Algorithms Find-S and List-Then-Eliminate.
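A minimal sketch of Find-S in the attribute-value setting of Mitchell's Chapter 2 (the toy data below is an assumption, not necessarily the example used in class):

    # Find-S: start with the most specific hypothesis and minimally generalize
    # it on every positive example; negative examples are ignored.
    def find_s(examples):
        h = None  # maximally specific hypothesis: matches nothing
        for x, label in examples:
            if not label:
                continue
            if h is None:
                h = list(x)  # first positive example
            else:
                # replace every violated constraint by the wildcard '?'
                h = [hi if hi == xi else '?' for hi, xi in zip(h, x)]
        return h

    examples = [
        (('Sunny', 'Warm', 'Normal'), True),
        (('Sunny', 'Warm', 'High'), True),
        (('Rainy', 'Cold', 'High'), False),
    ]
    print(find_s(examples))  # ['Sunny', 'Warm', '?']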

Most specific and most general boundaries.

Class 6 (Jan 28, 2019)

Version space representation theorem.

Candidate Elimination Algorithm. Execution on our toy example.

Discussion on the convergence of the Candidate Elimination Algorithm (assuming no noise and realizability).

Class 7 (Jan 30, 2019)

Using partially learned concepts to make predictions. What queries the learner should request next, and the Halving algorithm.
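A minimal sketch of the Halving algorithm over an explicit finite hypothesis class (the tiny class of single-coordinate hypotheses below is an assumption made for the example); each mistake at least halves the version space, so at most log2 |H| mistakes are made.

    # Halving: predict by majority vote of the current version space, then
    # discard every hypothesis that disagrees with the revealed label.
    def halving_predict(version_space, x):
        votes = sum(h(x) for h in version_space)
        return 1 if 2 * votes >= len(version_space) else 0

    def halving_update(version_space, x, true_label):
        return [h for h in version_space if h(x) == true_label]

    # Hypotheses over 3-bit instances, each reading a single coordinate.
    V = [lambda x, i=i: x[i] for i in range(3)]
    for x, y in [((1, 0, 1), 1), ((0, 1, 1), 1)]:
        print('predicted', halving_predict(V, x), '- true label', y)
        V = halving_update(V, x, y)
    print('hypotheses remaining:', len(V))  # 1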

Inductive bias and the futility of bias-free learners.

Introduction to decision trees and inductive bias on decision trees.

Class 8 (Feb 1, 2019)

Entropy, information gain and description of ID3 algorithm for decision tree learning.
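A minimal sketch of the two quantities driving ID3's choice of attribute (the tiny data set below is an assumption, not the class example):

    # Entropy of a label sequence and the information gain of an attribute:
    # Gain(S, A) = Entropy(S) - sum_v (|S_v|/|S|) * Entropy(S_v).
    from math import log2
    from collections import Counter

    def entropy(labels):
        n = len(labels)
        return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

    def information_gain(examples, labels, attribute):
        n = len(labels)
        remainder = 0.0
        for v in set(x[attribute] for x in examples):
            subset = [y for x, y in zip(examples, labels) if x[attribute] == v]
            remainder += len(subset) / n * entropy(subset)
        return entropy(labels) - remainder

    X = [{'Outlook': 'Sunny'}, {'Outlook': 'Sunny'}, {'Outlook': 'Rain'}, {'Outlook': 'Rain'}]
    y = ['No', 'No', 'Yes', 'Yes']
    print(information_gain(X, y, 'Outlook'))  # 1.0: a perfect split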

Partial execution of the algorithm.

Brief discussion on the characteristics of the search space in decision tree learning.

Class 9 (Feb 4, 2019)

Searching the hypothesis space: characteristics and contrasting the search of ID3 with the search performed by the Candidate-Elimination algorithm.

Bias in decision tree learning and Occam's razor.

Definition of overfitting. Handling overfitting: Reduced-Error Pruning and Rule Post-Pruning.

Partitioning the original examples into training set, validation set, and test set.

Class 10 (Feb 6, 2019)

Handling continuous-valued attributes. One needs only to examine boundary points for ID3.

Discussion on missing attributes or attributes that have different costs for testing.

Perceptron and geometry of the solution (hypothesis).

The perceptron update rule for learning from linearly separable data.
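A minimal sketch of the update rule (the toy linearly separable data set below is an assumption):

    # Perceptron: on a mistake on example (x, y) with y in {-1, +1}, add y*x to
    # the weight vector; repeat passes over the data until no mistakes remain.
    def perceptron(data, max_epochs=100):
        d = len(data[0][0])
        w = [0.0] * (d + 1)  # last coordinate acts as the bias
        for _ in range(max_epochs):
            mistakes = 0
            for x, y in data:
                xb = list(x) + [1.0]
                if y * sum(wi * xi for wi, xi in zip(w, xb)) <= 0:
                    w = [wi + y * xi for wi, xi in zip(w, xb)]
                    mistakes += 1
            if mistakes == 0:
                break
        return w

    data = [((2.0, 1.0), +1), ((1.0, 3.0), +1), ((-1.0, -2.0), -1), ((-2.0, 0.0), -1)]
    print(perceptron(data))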

Using a perceptron to learn a circle by augmenting the coordinates of the training examples. In general this is not efficient, as we introduce exponentially many monomials in the representation; see the "stars and bars" problem, which can be used to compute precisely the number of monomials of multivariate polynomials of degree d.
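For the counting mentioned above: by "stars and bars," the number of monomials of degree at most d over n variables is

    \[ \binom{n+d}{d}, \]

which grows roughly like n^d, so adding all degree-d monomials as extra coordinates quickly becomes impractical.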

Class 11 (Feb 8, 2019)

No free lunch theorems.

Proper learning vs. representation-independent ('improper') learning. Learning in a realizable setting. Learning in an agnostic setting.

Definition of PAC learning and efficient PAC learning. Definition of agnostic PAC learning and efficient agnostic PAC learning.
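For reference, one standard way to state the (realizable) PAC criterion: for every target concept c in C, every distribution D over the instance space, and every ε, δ in (0, 1), after seeing polynomially many i.i.d. labeled examples the learner outputs a hypothesis h such that

    \[ \Pr\big[\operatorname{err}_D(h) \le \epsilon\big] \ge 1 - \delta, \qquad \text{where } \operatorname{err}_D(h) = \Pr_{x \sim D}[h(x) \ne c(x)]. \]

Efficient PAC learning additionally requires polynomial running time, and the agnostic version asks for error at most min over h' in H of err_D(h'), plus ε, instead.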

Class 12 (Feb 11, 2019)

Homework 3 announced last night.

Efficient PAC learnability of axis-aligned rectangles in the plane.
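For reference, one standard analysis of the tightest-fit algorithm (output the smallest axis-aligned rectangle containing the positive examples) bounds the error by four strips of weight ε/4 each, giving sample complexity on the order of

    \[ m \ge \frac{4}{\epsilon} \ln \frac{4}{\delta}. \]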

Class 13 (Feb 13, 2019)

Announced a list of papers that are candidates for presentation; see here.

Decided by plurality vote that the midterm will take place on Monday, February 25.

Using Find-S to properly and efficiently PAC learn general conjunctions.

Class 14 (Feb 15, 2019)

PAC learning finite concept classes. Efficiency depends on whether or not we also have a polynomial time algorithm for learning.

As an immediate application we improved the sample complexity of (properly) learning conjunctions merely by giving an upper bound on the size of the concept class.
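For reference, the bound behind this: in the realizable case, any consistent learner over a finite hypothesis class H satisfies the PAC criterion once

    \[ m \ge \frac{1}{\epsilon} \left( \ln |H| + \ln \frac{1}{\delta} \right). \]

For conjunctions over n Boolean variables, |H| ≤ 3^n + 1 (each variable appears positively, negatively, or not at all), so ln |H| = O(n).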

Next time we will discuss the intractability of properly PAC learning 3-term DNF formulae. Meanwhile, we discussed the efficient PAC learnability of 3-term DNFs using 3-CNFs. In other words, whereas PAC learning 3-term DNFs properly (using 3-term DNFs as the hypothesis class) will be shown to be hard, by using 3-CNF formulae as our hypothesis class (learning in the realizable case, but not properly) we are able to efficiently PAC learn 3-term DNF formulae.
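For reference, the reduction rests on distributing the OR over the ANDs: a 3-term DNF T_1 ∨ T_2 ∨ T_3 is equivalent to the 3-CNF

    \[ \bigwedge_{\ell_1 \in T_1,\ \ell_2 \in T_2,\ \ell_3 \in T_3} (\ell_1 \vee \ell_2 \vee \ell_3), \]

which has at most (2n)^3 clauses; introducing one meta-variable per possible clause reduces the problem to learning a conjunction over O(n^3) variables.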

Class 15 (Feb 18, 2019)

Intractability of PAC learning 3-term DNF formulae properly. (Section 1.4 from the Kearns-Vazirani book.)

We still need to discuss a few things next time.

Class 16 (Feb 20, 2019)

TBA

Class 17 (Feb 22, 2019)

TBA

Class 18 (Feb 25, 2019)

Midterm.

Class 19 (Feb 27, 2019)

TBA

Class 20 (Mar 1, 2019)

TBA

Class 21 (Mar 4, 2019)

TBA

Class 22 (Mar 6, 2019)

TBA

Class 23 (Mar 8, 2019)

TBA

Monday, March 11, 2019

No class; Spring recess.

Wednesday, March 13, 2019

No class; Spring recess.

Friday, March 15, 2019

No class; Spring recess.

Class 24 (Mar 18, 2019)

TBA

Class 25 (Mar 20, 2019)

TBA

Class 26 (Mar 22, 2019)

TBA

Class 27 (Mar 25, 2019)

TBA

Class 28 (Mar 27, 2019)

TBA

Class 29 (Mar 29, 2019)

TBA

Class 30 (Apr 1, 2019)

TBA

Class 31 (Apr 3, 2019)

TBA

Class 32 (Apr 5, 2019)

TBA

Class 33 (Apr 8, 2019)

TBA

Class 34 (Apr 10, 2019)

TBA

Class 35 (Apr 12, 2019)

TBA

Class 36 (Apr 15, 2019)

TBA

Class 37 (Apr 17, 2019)

TBA

Class 38 (Apr 19, 2019)

TBA

Class 39 (Apr 22, 2019)

TBA

Class 40 (Apr 24, 2019)

TBA

Class 41 (Apr 26, 2019)

TBA

Class 42 (Apr 29, 2019)

TBA

Friday, May 3, 2019; 9:00am – 12:00pm

Final exam in Olsson Hall 009.
