University of Warsaw - Central Authentication System

Statistical machine learning

General data

Course ID: 1000-1M18SUM
Erasmus code / ISCED: (unknown) / (unknown)
Course title: Statistical machine learning
Name in Polish: Statystyczne uczenie maszynowe
Organizational unit: Faculty of Mathematics, Informatics, and Mechanics
Course groups: Elective courses for 2nd stage studies in Bioinformatics
Elective courses for 2nd stage studies in Mathematics
ECTS credit allocation (and other scores): 6.00
Basic information on ECTS credit allocation principles:
  • the annual workload required of the student to achieve the expected learning outcomes for a given stage is 1500-1800 h, corresponding to 60 ECTS;
  • the student's weekly workload is 45 h;
  • 1 ECTS point corresponds to 25-30 hours of student work needed to achieve the assumed learning outcomes;
  • the weekly student workload necessary to achieve the assumed learning outcomes allows the student to obtain 1.5 ECTS;
  • the work required to pass a course which has been assigned 3 ECTS constitutes 10% of the semester student workload.

Language: English
Main fields of studies for MISMaP:

computer science
mathematics

Type of course:

elective monographs

Prerequisites (description):

The lecture is a continuation of "Statistical data analysis" or "Statistics" or "Probability and Statistics".

Mode:

Classroom
Remote learning

Short description:

The lecture is an introduction to supervised learning, in other words statistical prediction, focused on modern linear methods for tabular data and based partly on "The Elements of Statistical Learning" by Hastie, Tibshirani and Friedman. In addition, I will discuss the use of linear methods and deep neural networks (ConvNets, Vision Transformers) for image prediction tasks such as classification and segmentation.

Full description:

In recent years, the view has crystallized that deep neural networks are effective where there is a lot of data, the predictors have a certain spatial or temporal organization (as with signals, texts or images), and the signal-to-noise ratio is high. However, where there is little data, no special organization, a weak signal, or a need for interpretable results, modern linear methods such as the lasso or gradient boosting are better (Robert Tibshirani in his presentation after receiving the ISI award, May 2021). The lecture is an introduction to supervised learning, in other words statistical prediction, focused on such modern methods and partly based on "The Elements of Statistical Learning" by Hastie, Tibshirani and Friedman.

In the first part, I discuss the basic method for predicting a continuous response, that is, the linear regression model, together with ridge regression, the lasso and best subset selection. In the second part, I present linear methods for predicting a discrete response, that is, classifiers such as Fisher's linear discriminant analysis, logistic regression and support vector machines. In the third part, I discuss universal, non-linear predictors such as the k-nearest neighbors method and decision trees. The fourth part is devoted to regularizing the learning process and boosting predictive power. In particular, I discuss penalization of the prediction error, kernelization of explanatory variables, and boosting (linear combinations of "weak" predictors). In the last part, I will discuss the use of linear methods and deep neural networks (ConvNets, Vision Transformers) for image prediction tasks such as classification and segmentation.
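To illustrate the first part above, here is a minimal sketch (not part of the course materials) of ridge regression in one dimension with centered data and no intercept: the penalized least-squares solution shrinks the ordinary least-squares coefficient toward zero as the penalty weight grows. All names and the toy data are illustrative.

```python
def ols_coef(x, y):
    """Ordinary least-squares slope for centered 1-D data (no intercept)."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)

def ridge_coef(x, y, lam):
    """Ridge slope: argmin over b of sum (y - b*x)^2 + lam * b^2."""
    return sum(xi * yi for xi, yi in zip(x, y)) / (sum(xi * xi for xi in x) + lam)

# Toy centered data, roughly y = 2x + noise.
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [-4.1, -1.9, 0.2, 2.1, 3.9]

b_ols = ols_coef(x, y)            # unpenalized slope
b_ridge = ridge_coef(x, y, 5.0)   # penalized slope, shrunk toward zero
assert abs(b_ridge) < abs(b_ols)
```

The only change from ordinary least squares is the `+ lam` in the denominator, which is exactly how the quadratic penalty enters the closed-form solution in this one-dimensional case.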

The lecture focuses on important machine learning methods that arise as solutions of penalized loss minimization on the training data. In this way, we obtain families of predictors indexed by a "hyperparameter" (e.g. the weight of the function that penalizes the predictor parameters), whose value is selected on additional validation data or by means of cross-validation on the training data. I will devote a lot of attention to rigorously explaining popular validation procedures (also called model selection).
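The selection procedure described above can be sketched as follows (a toy illustration, not the course's own code): fit a one-dimensional ridge predictor for each candidate penalty weight, score each by K-fold cross-validated squared error on the training data, and keep the value with the smallest estimate. The fold scheme and data are illustrative assumptions.

```python
def ridge_coef(x, y, lam):
    """Ridge slope for 1-D data: argmin over b of sum (y - b*x)^2 + lam * b^2."""
    return sum(a * c for a, c in zip(x, y)) / (sum(a * a for a in x) + lam)

def cv_error(x, y, lam, k=5):
    """K-fold cross-validated mean squared error for a given penalty lam."""
    n = len(x)
    folds = [range(i, n, k) for i in range(k)]  # simple interleaved folds
    err = 0.0
    for fold in folds:
        held_out = set(fold)
        x_tr = [x[i] for i in range(n) if i not in held_out]
        y_tr = [y[i] for i in range(n) if i not in held_out]
        b = ridge_coef(x_tr, y_tr, lam)
        err += sum((y[i] - b * x[i]) ** 2 for i in held_out)
    return err / n

# Toy data: true slope 2 plus small alternating noise.
x = [float(i) for i in range(-5, 5)]
y = [2.0 * xi + ((-1) ** i) * 0.5 for i, xi in enumerate(x)]

grid = [0.0, 0.1, 1.0, 10.0, 100.0]
best_lam = min(grid, key=lambda lam: cv_error(x, y, lam))
```

Each candidate hyperparameter indexes one predictor in the family; cross-validation supplies the error estimate by which the family member is chosen.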

Bibliography:

1. http://statweb.stanford.edu/~tibs/ftp/ISI.pdf

2. Hastie T., Tibshirani R. and Friedman J. The Elements of Statistical Learning, Springer 2009.

3. Shalev-Shwartz S. and Ben-David S. Understanding Machine Learning: From Theory to Algorithms, Cambridge University Press 2014.

4. Bishop C. M. and Bishop H. Deep Learning: Foundations and Concepts, Springer 2023.

Learning outcomes:

Knowledge and skills:

1. Comprehends the basic methods of prediction.

2. Is able to fit a prediction function to the training data, select its hyperparameter on validation data, and estimate the prediction error on test data.
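The workflow in outcome 2 can be made concrete with a toy sketch (illustrative only, not course code) in which the three-way data split is explicit: candidate ridge predictors are fitted on the training set, the penalty is chosen on the validation set, and the error of the chosen predictor is estimated on the held-out test set.

```python
def ridge_coef(x, y, lam):
    """Ridge slope for 1-D data: argmin over b of sum (y - b*x)^2 + lam * b^2."""
    return sum(a * c for a, c in zip(x, y)) / (sum(a * a for a in x) + lam)

def mse(x, y, b):
    """Mean squared prediction error of the slope-b predictor."""
    return sum((yi - b * xi) ** 2 for xi, yi in zip(x, y)) / len(x)

# Toy data with true slope 3 plus small alternating noise,
# split into train / validation / test.
xs = [float(i) for i in range(-9, 9)]
ys = [3.0 * xi + ((-1) ** i) * 0.4 for i, xi in enumerate(xs)]
x_tr, y_tr = xs[0::3], ys[0::3]   # training set: fit the predictors
x_va, y_va = xs[1::3], ys[1::3]   # validation set: pick the hyperparameter
x_te, y_te = xs[2::3], ys[2::3]   # test set: estimate the final error

# Learn one predictor per hyperparameter value on the training data.
fits = {lam: ridge_coef(x_tr, y_tr, lam) for lam in (0.0, 1.0, 10.0)}
# Select the hyperparameter on the validation data.
best_lam = min(fits, key=lambda lam: mse(x_va, y_va, fits[lam]))
# Estimate the prediction error of the chosen predictor on the test data.
test_err = mse(x_te, y_te, fits[best_lam])
```

Keeping the test set out of both fitting and hyperparameter selection is what makes `test_err` an honest estimate of the prediction error.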

Social competence:

Can use prediction to study natural or social phenomena.

Assessment methods and assessment criteria:

The final grade will be equal to the maximum of:

- grade for activity in the classroom (e.g. detecting an error in calculations, alternative proof or derivation of a prediction method),

- grade for solving homework during the course,

- grade for the oral exam or the programming project.

Classes in period "Winter semester 2024/25" (past)

Time span: 2024-10-01 - 2025-01-26
Type of class:
Classes, 30 hours
Lecture, 30 hours
Coordinators: Piotr Pokarowski
Group instructors: Piotr Pokarowski
Students list: (inaccessible to you)
Credit: Examination

Classes in period "Winter semester 2025/26" (future)

Time span: 2025-10-01 - 2026-01-25

Type of class:
Classes, 30 hours
Lecture, 30 hours
Coordinators: Piotr Pokarowski
Group instructors: Piotr Pokarowski
Students list: (inaccessible to you)
Credit: Course - Examination
Lecture - Examination
Course descriptions are protected by copyright.
Copyright by University of Warsaw.
ul. Banacha 2
02-097 Warszawa
tel: +48 22 55 44 214 https://www.mimuw.edu.pl/