Course Contents and Notes

Instructor: Prof. Yanjun (Jane) Qi (yanjun@virginia.edu)

Lectures are on Tuesdays and Thursdays from 12:30 PM to 1:45 PM in Rice Hall.

We will cover topics related to large-scale machine learning.

We will also cover major learning topics that are not covered by the UVA introductory machine-learning course.

We split the content into FIVE or more major sections.

For each section, we will cover about SIX online tutorials, video lectures or relevant papers.

More references will be provided for each topic / section as well.

Major Sections of Course Contents

Intro: Large-Scale Machine Learning Topics
Topic I: Deep Learning Topics
Topic II: Kernel Methods Topics
Topic III: Optimization for ML and High-Dim Topics
Topic IV: Graphical Model Topics
Topic V: Assorted: structured, low-rank, metric, and more related topics
Topic VI: Assorted: scalable, random, parallel, and more related topics

Tag	Title and Information	URLs (Video + Slide)	Talk Year	Week No.	Date
Introduction to Large-Scale Machine Learning Topics
✓Basic	Sanjiv Kumar (Columbia EECS 6898), Lecture: Introduction to large-scale machine learning	(PDFSlide)	2010	W1	Tu - 0113
✓Basic	Alex Smola - Berkeley SML: Scalable Machine Learning: Syllabus	(SyllabusURL)	2012
✓Basic	William Cohen - CMU Machine Learning with Large Datasets 10-605: Syllabus	(SyllabusURL)	2014
Topic I: Deep Learning Topics
✓Deep	DeepLearningSummerSchool12: Yann LeCun (New York University), Deep Learning, Graphical Models, Energy-Based Models, Structured Prediction (Part1 - deepNN supervised)	(Video) + (PDFslide)	2012	W1	Th - 0115
✓Deep	DeepLearningSummerSchool12: Yann LeCun (New York University), Deep Learning, Graphical Models, Energy-Based Models, Structured Prediction (Part2 - deepNN unsupervised)	(Video) + (PDFslide)	2012
✓DeepStructured	DeepLearningSummerSchool12: Yann LeCun (New York University), Deep Learning, Graphical Models, Energy-Based Models, Structured Prediction (Part3 - deepNN graph transformer network)	(Video) + (PDFslide) + (PDF2)	2012
✓DeepGM	DeepLearningSummerSchool12: Geoffrey Hinton: Introduction to Deep Learning , Deep Belief Nets (Parts 1 / Relevant Paper: A fast learning algorithm for deep belief nets )	(Video) + (PDFslide)	2012	W2	Th - 0122
✓Hardware Parallel	DeepLearningSummerSchool12: Marc'Aurelio Ranzato (Google Inc.), Large Scale Deep Learning	(Video) + (PDFslide)	2012	W2	Th - 0122
✓Deep	MLSS2005: Yann LeCun, Tutorial on Energy-based models	(Video) + (Slide)	2005	W3	Tu - 0127
✓Deep	Geoffrey Hinton: Learning Energy-Based Models of High-Dimensional Data	(Video) + (PDFslide)	2012
✓DeepScaling	KDD14: Yoshua Bengio, Scaling Up Deep Learning	(Video) + (PDFslide)	2014
✓Deep	Yoshua Bengio (University of Montreal): Representation Learning with auto-encoder / decoder variants	(Video)	2012	More
✓Deep	EML07: Yoshua Bengio (University of Montreal): Speeding Up Stochastic Gradient Descent	(Video) + (PDFslide)	2007
✓Theory	DeepLearningSummerSchool12: Nando de Freitas (University of British Columbia) An Informal Mathematical Tour of Feature Learning	(Video) + (PDFslide)	2012
✓Deep	KDD14: Ruslan Salakhutdinov, Deep Learning	(Video) + (PDFslide)	2014
Topic II: Kernel Methods Topics
✓Kernel	Alexander J. Smola, Kernel methods and Support Vector Machines (Part 3)	(Video) + (PDFslide)	2008	W4	Tu - 0203
✓ParamReduct	Sanjiv Kumar (Columbia EECS 6898), Lecture: Kernel Methods (I: Scaling up kernel methods)	(PDFslide)	2010	W4	Tu - 0203
✓Hardware Parallel	ICML08: Fast Support Vector Machine Training and Classification on Graphics Processors	(Video) + (PDFslide)	2008	W5	Tu - 0210
✓DataStructure	ECML2007: Large Scale Learning with String Kernels	(Video) + (PDFslide)	2007	W5	Tu - 0210
✓Advanced	Francis R. Bach, INRIA: Multiple kernel learning for multiple sources	(Video) + (PDFslide)	2008	W6	Tu - 0217
✓Random	Sanjiv Kumar (Columbia EECS 6898), Lecture: Randomized Algorithms	(PDFSlide)	2010
✓Random	ECML2007: Efficient Machine Learning using Random Projections	(Video)	2007
✓Random	NIPS2007: Random features for large-scale kernel machines (original paper PDF) + ECCV2012: Fourier Kernel Learning	(Video) + (PDFslide)	2007	W7	Tu - 0224
✓Random	Fast Random Feature Expansions for Nonlinear Regression	(Video) + (PDFslide)	2010
✓FastOptim	NIPS2010: Multiple Kernel Learning and the SMO Algorithm	(Video) + (PDFPaper)	2010
✓FastOptim	Fast training of support vector machines using sequential minimal optimization. In Book: Advances in Kernel Methods - Support Vector Learning, MIT Press	(PaperPDF)	1999	W8	Tu - 0303
✓FastOptim	JMLR 2005: Working Set Selection Using Second Order Information for Training Support Vector Machines	(PaperPDF)	2005	W8	Tu - 0303
✓Kernel	NIPS2009: Fast Subtree Kernels on Graphs	(Video)	2009	More
✓Kernel	NIPS09: Locality-Sensitive Binary Codes from Shift-Invariant Kernels	(Video) + (PDF)	2009
✓Kernel	PASCAL07: Graph kernels and applications in chemoinformatics	(Video)	2007
✓Kernel	S.V.N. Vishwanathan, Random walk graph kernels and rational kernels	(Video)	2007
W9: SPRING RECESS
Topic III: Optimization for ML or High-Dim/Sparsity Topics
✓SparsityOptim	KDD08: Trevor Hastie: Regularization Paths and Coordinate Descent	(Video)+ (PDFslide)	2008	W10	Tu - 0317
✓Sparsity	ICML09: Group Lasso with Overlaps and Graph Lasso. (Original Paper PDF)	(Video)	2009
✓Basic	Mark Schmidt's Note: Least Squares Optimization with L1-Norm Regularization	(NotePDF)	2005
✓Basic	Mark Schmidt's MLSS2015 tutorial:	(Video) + (Slide)	2015
✓ Optim Sparse	Mark Schmidt: Fast Non-Smooth and Big-Data Optimization	(Video)	2014
✓ Optim basic	Convex Optimization and Applications - Stephen Boyd	(Video)	2015
✓FastOptim	Sanjiv Kumar (Columbia EECS 6898), Lecture: Kernel Methods (II: fast optimization of kernel methods)	(PDFslide)	2010	W11	Tu - 0324
✓FastOptim	Sanjiv Kumar (Columbia EECS 6898), Lecture: Large-Scale Optimization Techniques	(PDFslide)	2010
✓Optim	MLSS2013: Stephen Wright (University of Wisconsin-Madison) Optimization 1-3	(video) + (slide)	2013
✓Sparsity	Sanjiv Kumar (Columbia EECS 6898), Lecture: Sparse Methods	(PDFslide)	2010
Optim	DeepSummer12: Jorge Nocedal (Northwestern University) Tutorial on Optimization methods for machine learning	(video) + (PDFslide)	2012	W12	Tu - 0331
OptimDiscrete	MLSS2014: Submodularity and Optimization -- Jeff Bilmes	(VideoI-III)+ (PDFslide)	2014
Optim	DeepSummerSchool12: Stephen Wright (University of Wisconsin-Madison) Some Relevant Topics in Optimization (PartI+II)	(video) + (PDFslide)	2010
SparsityOptim	DeepSummer12: Stephen Wright (University of Wisconsin-Madison) Sparse and Regularized Optimization	(video) + (PDFslide)	2012
Sparse	NIPS2009 tutorial: Francis R. Bach: Sparse Methods for Machine Learning: Theory and Algorithms	(video) + (PDFslide)	2009	More
Sparse	MLSS09: Emmanuel Candes, An Overview of Compressed Sensing and Sparse Signal Recovery via L1 Minimization	(Video)	2009
HighDim	NIPS2010: Peter Buhlmann, High-dimensional Statistics: Prediction, Association and Causal Inference	(Video)+ (PDFslide)	2011
HighDim	Martin J. Wainwright, High-Dimensional Statistics: Some progress and challenges ahead	(PDFslide)	2010
HighDim	AISTATS11: Martin J. Wainwright, Convex Relaxation and Estimation of High-Dimensional Matrices	(Video)+ (PDFslide)	2011
OptimAdvance	NIPS10 tutorial: Stephen J. Wright: Optimization Algorithms in Machine Learning Tutorial	(video) + (PDFslide)	2010
OptimDiscrete	NIPS12: Satoru Fujishige, Submodularity and Discrete Convexity	(Video)	2012
OptimDiscrete	ICML13: Tutorial, Submodularity In Machine Learning New-Directions	(Video)	2013
OptimDiscrete	NIPS11: Francis R. Bach, Learning with Submodular Functions: A Convex Optimization Perspective	(Video)	2011
MinMax	ICM2014 VideoSeries IL12.13 : Martin Wainwright on constrained form of statistical MinMax, Privacy, Communication and Computation	(Video)	2014
Topic IV: Graphical Model Topics
✓GM	MLSS2006: Sam Roweis, Machine Learning, Probability and Graphical Models (Part 1-4)	(Video)+ (PDFslide)	2007	W13	Tu - 0407
✓GM	MLSS2012: Martin J. Wainwright: Tutorial Materials on Graphical Models, Variational Methods and Message-Passing (PDFNote)	(Video-07)+ (Part1)+ (Part2)+ (Part3)	2012
✓MCMC Basic	MLSS2009: Iain Murray : Markov Chain Monte Carlo	(Video)+ (Slide)	2009
GM	MLSS2007: Zoubin Ghahramani, Graphical models (Part 1-6)	(Video)+ (PDFslide)	2007	W14	Tu - 0414
✓GMTopic	MLSS2009: David Blei, Topic Models (Part I+II)	(Video)+ (PDFslide)	2009
GM	Wainwright and Jordan monograph: More advanced material on exponential families, duality, and variational methods	(PDFpaper)	2008
GM	MLSS2008: Nando de Freitas, Monte Carlo Simulation for Statistical Inference, Model Selection and Decision Making (Part 1-6)	(Video) + (PDFslide)	2008	W15	Tu - 0421
GMScaling	(LSOLDM)2013: Nando de Freitas, Bayesian Optimization in a Billion Dimensions via Random Embeddings	(Video)+ (PDFslide)	2013
GMScaling	KDD2011: Ron Bekkerman, Misha Bilenko and John Langford, Scaling Up Graphical Model Inference	(PDFSlide)	2011
GM	Gaussian Processes in Practice Workshop 2006, David MacKay: Gaussian Process Basics	(Video)+ (PDFslide)	2006	More
GMScaling	MLSS2009: Tom Minka, Microsoft Research, Approximate Inference	(Video) + (PDFslide)	2009
GM	MLSS12: Dilan Gorur, Yahoo! Research, Dirichlet Process: Practical Course	(Video)+ (PDFpaper)	2012
GM	DeepSummer12: Iain Murray (University of Edinburgh) Density estimation	(Video)+ (PDFpaper)	2008
GM	ICML13 Tutorial: Gal Elidan, Copulas in Machine Learning (Part I+II)	(Video)	2013
GMScaling	NIPS09: Pedro Domingos, Large-Scale Learning and Inference: What We Have Learned with Markov Logic Networks	(Video)+ (PDFslide)	2009
✓GMScaling	KDD14: Pedro Domingos, Principles of Very Large Scale Modeling	(Video) + (PDFslide)	2014
GMScaling	Ralf Herbrich, Distributed, Real-Time Bayesian Learning in Online Service	(Video) + (PDFpaper)	2013
W16 (Tu - 0428): Project Presentation
EXTRA READINGS from here
Topic V: Assorted: structured, low-rank, Metric, and more
✓Metric	ICML07 Best Paper - Information-Theoretic Metric Learning	(Video) + (PDF)	2007	Summer15
✓Structured	CIKM08: Charles Elkan, Log-linear Models and Conditional Random Fields	(Video) + (PDF)	2008
✓Structured	ECML2012: Thomas Gartner, Fraunhofer IAIS , Algorithms for Predicting Structured Data (Part 1-3)	(Video) + (PDF)	2012
✓Structured	MLG08: Thorsten Joachims, Structured Output Prediction with Structural SVMs	(Video) + (PDF)	2008
✓Matrix	MLSS2009 : Emmanuel Candes, Department of Statistics, Stanford University : Tutorial, Matrix Completion via Convex Optimization: Theory and Algorithms	(Video)	2009
✓LowRank	MLSS2011: Emmanuel Candes, Department of Statistics, Stanford University, Title: Low-rank modeling	(Video) + (PDF)	2011
✓LowRank	Sanjiv Kumar (Columbia EECS 6898), Lecture: Matrix Approximations (Part I + Part II)	(PDF-1)+ (PDF-2)	2010
✓LowRank	ICML13 Tutorial: Tensor Decomposition Algorithms for Latent Variable Model Estimation	(Video)	2013
✓Spectral	Sham Kakade, Scalable Spectral Approaches for Learning Topics, Clusters, and Communities (JMLR paper: Tensor Decompositions for Learning Latent Variable Models)	(Video) + (PDFpaper)	2014
✓Spectral	Arik Azran, Department of Engineering, University of Cambridge: Tutorial, Spectral Clustering	(Video) + (PDF)	2008
✓DimReduct	Sanjiv Kumar (Columbia EECS 6898), Lecture: Dimensionality Reduction	(PDF)	2010
✓ApproxNN	Sanjiv Kumar (Columbia EECS 6898), Lecture: Approximate Nearest Neighbor Search (Part I + Part II)	(PDF-1)+ (PDF-2)	2010
Topic VI: Scalable / Parallel / Random / Streaming Related Topics
Hashing	John Langford, NYU Course on Big Data, Large Scale Machine Learning - Feature Hashing	(Video)	2012	Summer16
Basic	Alex Smola: Scalable ML Course: statistics	(Video)	2012
System	Alex Smola: Scalable ML Course: System	(Video)	2012
scalable	Alex Smola: MLSS 2014: Scalable machine learning	(Video)	2014
random	Michael Mahoney on Recent Results in Randomized Numerical Linear Algebra (NIPS 2013 Workshop on Randomized Algorithms)	(Video)	2013
random	Francis Bach on Beyond stochastic gradient descent for large-scale machine learning (NIPS 2013 Workshop on Randomized Algorithms)	(Video)	2013
random	Gautam Dasarathy: Sketching Sparse Covariance Matrices (NIPS 2013 Workshop on Randomized Algorithms)	(Video)	2013
Reinforcement	Reinforcement Learning: Michael Littman, MLSS 2009	(Video) + (Slide)	2009
many more exciting video tutorials @ http://videolectures.net

For Reference: The official Academic Calendar at UVA Registrar.
