Machine Learning Course MIT OpenCourseWare
MIT 6.034 Artificial Intelligence, Fall 2010
Instructor: Patrick Winston
This lecture covers a symbolic integration program from the early days of AI. We use safe and heuristic transformations to simplify the problem, and then consider broader questions of how much knowledge is involved, and how the knowledge is represented.
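To make the flavor of those transformations concrete, here is a minimal Python sketch of two safe transformations: splitting an integral of a sum, and moving a constant factor outside the integral. The tuple-based expression format is purely an assumption of this illustration, not the lecture's actual program.

```python
# A minimal sketch of two "safe" transformations from the lecture:
# pulling a constant factor out of an integral and splitting an
# integral of a sum. Expressions are nested tuples, purely
# illustrative -- the original program used a richer representation.

def apply_safe(expr):
    """expr is ('int', body); returns an equivalent simplified form."""
    op, body = expr
    assert op == 'int'
    kind = body[0]
    if kind == 'sum':                      # ∫(f+g) dx -> ∫f dx + ∫g dx
        _, f, g = body
        return ('sum', apply_safe(('int', f)), apply_safe(('int', g)))
    if kind == 'scale':                    # ∫c·f dx -> c·∫f dx
        _, c, f = body
        return ('scale', c, apply_safe(('int', f)))
    return expr                            # atomic: leave for other methods

# ∫(3·x^2 + cos x) dx  ->  3·∫x^2 dx + ∫cos x dx
problem = ('int', ('sum', ('scale', 3, ('pow', 'x', 2)), ('cos', 'x')))
print(apply_safe(problem))
```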
Hill climbing is a heuristic search method used for mathematical optimization problems in the field of artificial intelligence.
In numerical analysis, hill climbing is a mathematical optimization technique which belongs to the family of local search. It is an iterative algorithm that starts with an arbitrary solution to a problem, then attempts to find a better solution by making an incremental change to the solution. If the change produces a better solution, another incremental change is made to the new solution, and so on until no further improvements can be found.
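As a concrete illustration of that loop, here is a minimal Python sketch; the objective and neighbor function are toy choices made up for this example.

```python
import random

def hill_climb(initial, neighbors, score, max_steps=10_000):
    """Generic hill climbing: repeatedly move to the best neighbor
    until no neighbor improves on the current solution."""
    current = initial
    for _ in range(max_steps):
        candidates = neighbors(current)
        best = max(candidates, key=score, default=current)
        if score(best) <= score(current):
            return current          # local optimum: no improving move
        current = best
    return current

# Toy usage: maximize f(x) = -(x - 3)^2 over integer steps.
f = lambda x: -(x - 3) ** 2
result = hill_climb(initial=random.randint(-10, 10),
                    neighbors=lambda x: [x - 1, x + 1],
                    score=f)
print(result)  # converges to 3 for this unimodal objective
```

Note that the loop demands strict improvement, so it halts at the first local optimum it reaches; on multimodal objectives, restarts from different initial solutions are a common remedy.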
Undergraduate ML Courses
Introduction to Machine Learning
Introduces principles, algorithms, and applications of machine learning from the point of view of modeling and prediction; formulation of learning problems; representation, over-fitting, generalization; clustering, classification, probabilistic modeling; and methods such as support vector machines, hidden Markov models, and neural networks. Students taking graduate version complete additional assignments. Meets with 6.862 when offered concurrently. Recommended prerequisites: 6.006 and 18.06. Enrollment may be limited; no listeners.
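For a flavor of the methods named above, here is an illustrative scikit-learn snippet that fits a support vector machine and checks held-out accuracy as a rough proxy for generalization; the synthetic data set is an assumption of this sketch, not course material.

```python
# Illustrative only: a tiny support vector machine fit with
# scikit-learn (not drawn from the course materials).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, n_features=5, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

clf = SVC(kernel="rbf", C=1.0).fit(X_tr, y_tr)
# Held-out accuracy gauges generalization rather than training fit.
print(clf.score(X_te, y_te))
```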
Introduction to Inference
Probabilistic modeling for problems of inference and machine learning from data, emphasizing analytical and computational aspects. Distributions, marginalization, conditioning, and structure; graphical and neural network representations. Belief propagation, decision-making, classification, estimation, and prediction. Sampling methods and analysis. Introduces asymptotic analysis and information measures. Computational laboratory component explores the concepts introduced in class in the context of contemporary applications.
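Marginalization and conditioning, two of the core operations listed above, reduce to simple array manipulations for small discrete distributions; the joint table below is invented for illustration.

```python
import numpy as np

# A small sketch of marginalization and conditioning on a
# discrete joint distribution p(x, y).
p_xy = np.array([[0.30, 0.10],    # rows index x, columns index y
                 [0.20, 0.40]])

p_x = p_xy.sum(axis=1)            # marginalization: p(x) = sum_y p(x, y)
p_y_given_x0 = p_xy[0] / p_x[0]   # conditioning: p(y | x=0) = p(x=0, y) / p(x=0)

print(p_x)            # [0.4 0.6]
print(p_y_given_x0)   # [0.75 0.25]
```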
Statistics, Computation and Applications
Hands-on analysis of data demonstrates the interplay between statistics and computation. Includes four modules, each centered on a specific data set, and introduced by a domain expert. Provides instruction in specific, relevant analysis methods and corresponding algorithmic aspects. Potential modules may include medical data, gene regulation, social networks, finance data (time series), traffic, transportation, weather forecasting, policy, or industrial web applications. Projects address a large-scale data analysis question. Students taking graduate version complete additional assignments. Limited enrollment; priority to Statistics and Data Science minors and to juniors and seniors.
Introduction to Data Science
Introduction to the methodological foundations of data science, emphasizing basic concepts, but also modern methodologies. Learning of distributions and their parameters. Testing of multiple hypotheses. Linear and nonlinear regression and prediction. Classification. Learning of dynamical models. Uncertainty quantification. Model validation. Causal inference. Applications and case studies drawn from electrical engineering, computer science, the life sciences, finance, and social networks.
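As a small illustration of the regression topic above, the following sketch fits a line by ordinary least squares; the synthetic data and true coefficients are assumptions of the example.

```python
import numpy as np

# Linear regression by least squares on synthetic data.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=100)
y = 2.0 * x + 1.0 + rng.normal(scale=0.5, size=100)

A = np.column_stack([x, np.ones_like(x)])      # design matrix [x, 1]
(slope, intercept), *_ = np.linalg.lstsq(A, y, rcond=None)
print(slope, intercept)   # close to the true values 2.0 and 1.0
```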
Graduate ML Courses
Machine Learning
Principles, techniques, and algorithms in machine learning from the point of view of statistical inference; representation, generalization, and model selection; and methods such as linear/additive models, active learning, boosting, support vector machines, non-parametric Bayesian methods, hidden Markov models, Bayesian networks, and convolutional and recurrent neural networks. Recommended prerequisite: 6.036 or other previous experience in machine learning.
Statistical Learning Theory and Applications
Among different approaches in modern machine learning, the course focuses on a regularization perspective and includes both shallow and deep networks. The content is roughly divided into two parts. In the first part, key algorithmic ideas are introduced, with an emphasis on the interplay between modeling and optimization aspects. Algorithms discussed include classical regularization networks (regularized least squares, SVM, logistic regression), stochastic gradient methods, implicit regularization, sketching, sparsity-based methods, and deep neural networks. In the second part, key ideas in statistical learning theory are developed to analyze the properties of the algorithms previously introduced. Classical concepts like generalization, uniform convergence, and Rademacher complexities are developed, together with topics such as bounds based on margin, stability, and privacy. The final part of the course focuses on deep learning networks. It introduces an emerging theoretical framework addressing three key puzzles in deep learning: approximation theory (which functions can be represented more efficiently by deep networks than by shallow ones), optimization theory (why stochastic gradient descent can easily find global minima), and machine learning (whether classical learning theory can explain generalization in deep networks). It also discusses connections with the architecture of visual cortex, which was the original inspiration for the layered local connectivity of modern networks and may provide ideas for future developments of deep learning.
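Of the algorithms listed, regularized least squares has a particularly compact closed form, sketched below; the data are synthetic and the regularization strength is an arbitrary illustrative choice.

```python
import numpy as np

# A minimal sketch of regularized least squares, the first of the
# "classical regularization networks" named above:
#   w* = argmin_w ||Xw - y||^2 + lam * ||w||^2
#      = (X^T X + lam I)^{-1} X^T y
def ridge(X, y, lam):
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 10))
w_true = rng.normal(size=10)
y = X @ w_true + 0.1 * rng.normal(size=50)

print(np.linalg.norm(ridge(X, y, lam=0.1) - w_true))  # small recovery error
```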
Algorithms for Estimation and Inference
Introduction to statistical inference with probabilistic graphical models. Directed and undirected graphical models, and factor graphs, over discrete and Gaussian distributions; hidden Markov models, linear dynamical systems. Sum-product and junction tree algorithms; forward-backward algorithm, Kalman filtering and smoothing. Min-sum and Viterbi algorithms. Variational methods, mean-field theory, and loopy belief propagation. Particle methods and filtering. Building graphical models from data, including parameter estimation and structure learning; Baum-Welch and Chow-Liu algorithms. Selected special topics.
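As a concrete instance of the message-passing algorithms above, here is a sketch of the forward pass of the forward-backward algorithm for a two-state discrete HMM; all probability tables are invented for illustration.

```python
import numpy as np

# Forward pass of the forward-backward algorithm for a discrete HMM.
T_mat = np.array([[0.7, 0.3],    # transition p(z_t | z_{t-1})
                  [0.4, 0.6]])
E = np.array([[0.9, 0.1],        # emission p(x_t | z_t)
              [0.2, 0.8]])
pi = np.array([0.5, 0.5])        # initial state distribution
obs = [0, 1, 0]                  # observed symbol indices

alpha = pi * E[:, obs[0]]        # alpha_1(z) = p(z_1) p(x_1 | z_1)
for x in obs[1:]:
    alpha = (alpha @ T_mat) * E[:, x]   # recursion: sum over previous state
print(alpha.sum())               # likelihood p(x_1, ..., x_T)
```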
Graphical Models: A Geometric, Algebraic, and Combinatorial Perspective
In this research-oriented course we will introduce graphical models in the framework of exponential families. We will see that polynomial equations and combinatorial constraints naturally arise and call for algebraic and combinatorial methods to advance the statistical methodology. In particular, we will highlight the role of conic duality for Gaussian graphical models and polyhedral geometry for discrete graphical models. We will also develop methods for causal inference making use of the inherent combinatorial and algebraic structure in directed graphical models. Finally, we will discuss graphical models with hidden variables by highlighting the connections to tensor decompositions. The overarching goal of this course is to provide an overview of the interplay of techniques from combinatorics and applied algebraic geometry with problems arising in statistics, in particular in graphical models. Specific topics include exponential families, Gröbner bases, conditional independence ideals, Bayesian networks, determinantal varieties, and hyperbolic polynomials.
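A standard small instance of those polynomial constraints: for two binary random variables, independence is exactly a rank-one (determinantal) condition on the 2x2 table of joint probabilities.

```latex
% For binary X and Y with joint table (p_ij) = P(X = i, Y = j),
% independence is a determinantal (rank-one) condition:
\[
  X \perp\!\!\!\perp Y
  \iff
  \det\begin{pmatrix} p_{00} & p_{01} \\ p_{10} & p_{11} \end{pmatrix}
  = p_{00}\,p_{11} - p_{01}\,p_{10} = 0 .
\]
```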
Inference and Information
Introduction to principles of Bayesian and non-Bayesian statistical inference. Hypothesis testing and parameter estimation, sufficient statistics; exponential families. EM algorithm. Log-loss inference criterion, entropy and model capacity. Kullback-Leibler distance and information geometry. Asymptotic analysis and large deviations theory. Model order estimation; nonparametric statistics. Computational issues and approximation techniques; Monte Carlo methods. Selected topics such as universal inference and learning, and universal features and neural networks.
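For instance, the Kullback-Leibler distance mentioned above is a one-line computation for discrete distributions; the two distributions below are arbitrary examples.

```python
import numpy as np

# Kullback-Leibler distance D(p || q) = sum_i p_i log(p_i / q_i)
# between two discrete distributions with full support.
p = np.array([0.5, 0.3, 0.2])
q = np.array([0.4, 0.4, 0.2])

kl = np.sum(p * np.log(p / q))
print(kl)   # nonnegative, and zero iff p == q
```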
Machine Learning for Healthcare
Introduces students to machine learning in healthcare, including the nature of clinical data and the use of machine learning for risk stratification, disease progression modeling, precision medicine, diagnosis, subtype discovery, and improving clinical workflows. Topics include causality, interpretability, algorithmic fairness, time-series analysis, graphical models, deep learning and transfer learning. Guest lectures by clinicians from the Boston area and course projects with real clinical data emphasize subtleties of working with clinical data and translating machine learning into clinical practice. Limited to 55.
Bayesian Modeling and Inference
As both the number of data sets and data set sizes grow, practitioners are interested in learning increasingly complex information and interactions from data. Probabilistic modeling in general, and Bayesian approaches in particular, provide a unifying framework for flexible modeling that includes prediction, estimation, and coherent uncertainty quantification. In this course, we will cover the modern challenges of Bayesian inference, including (but not limited to) speed of approximate inference, making use of distributed architectures, streaming data, and complex data interactions. We will study Bayesian nonparametric models, wherein model complexity grows with the size of the data; this allows us to learn, e.g., a greater diversity of topics as we read more documents from Wikipedia, identify more friend groups as we process more of Facebook's network structure, etc.
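As a minimal example of the coherent uncertainty quantification described above, here is a conjugate Beta-Bernoulli posterior update; the prior and data are illustrative, and the course itself concerns far richer nonparametric and large-scale models.

```python
# Conjugate Beta-Bernoulli updating: a textbook-scale sketch.
a, b = 1.0, 1.0                 # Beta(1, 1) prior on the success rate
data = [1, 0, 1, 1, 0, 1]       # observed Bernoulli outcomes

a += sum(data)                  # posterior: Beta(a + #successes,
b += len(data) - sum(data)      #                 b + #failures)

post_mean = a / (a + b)         # point estimate with uncertainty intact
print(post_mean)                # 0.625 for this data
```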
Algorithmic Aspects of Machine Learning