Machine Learning in Finance I
General information
Course code: | 2400-QFU1MLF |
Erasmus / ISCED code: | 14.3 |
Course name: | Machine Learning in Finance I |
Unit: | Faculty of Economic Sciences (Wydział Nauk Ekonomicznych) |
Groups: |
English-language course offer of WNE UW
Compulsory courses for 1st-year Quantitative Finance |
ECTS credits and other: | 4.00 |
Language of instruction: | English |
Course type: | compulsory |
Short description: |
(English only) This course provides a broad perspective on the theory of machine learning methods and their application in finance. It covers supervised learning for regression and classification problems. The theoretical part of the course (the first four classes) covers the basics of machine learning: measuring performance, model testing, validation methods, feature engineering and selection, simple linear and logistic regression, discriminant analysis, K-nearest neighbours, Support Vector Machines, ridge and Lasso regression, decision trees and random forest. In the practical part, each lecture tackles a particular financial problem faced by modellers and showcases an ML solution to it. The solutions focus on the end-to-end process, including data handling and feature generation, as well as techniques for gaining executive support. |
Full description: |
(English only) The course is conducted in two sequential stages: 1) lectures, 2) case study laboratories. The lecture stage is followed by an exam that verifies the required theoretical knowledge. The laboratory stage is followed by practical projects carried out by the student.

Part I - Lectures
1. Introduction to Machine Learning
a. Types of Machine Learning
b. Introduction to Supervised Statistical Learning
c. Types of predictions, types of models, types of tabular data structures
d. Notation and general concepts - loss function, cost function, gradient descent
e. Simple Supervised Learning models - linear regression and logistic regression
2. Assessing model accuracy, machine learning diagnostics
a. Evaluation metrics - regression and classification
b. Learning curves
c. Training, validation and testing sets
d. Cross-validation technique
e. The concept of bias and variance and their trade-off
f. Possible remedies for underfitting or overfitting
3. Basic Supervised Learning models
a. K-nearest neighbours
b. Support Vector Machines
c. Decision trees and Random Forest (bagging idea)
4. Crucial machine learning techniques
a. Dataset preparation steps
b. Initial feature selection methods
c. Feature engineering
d. Regularization
e. Rebalancing
f. Explainable Artificial Intelligence*
Exam, in the style of recruitment questions for Data Scientist, Quantitative Analyst and Machine Learning Engineer positions

Part II - Labs
5. Python - lightning-fast course
a. Preparation of the environment
b. Variables, data types, operators and control structures
c. Functions
d. Modules
e. Data science toolkit: NumPy, Pandas, Matplotlib, Sklearn
6. Case study - credit risk modeling
a. Building a first end-to-end machine learning pipeline for classification
b. Handling an unbalanced data set
c. Testing multiple machine learning models
d. Comparing the results of the models in the business context
e. Addressing the explainability of the solution*
7. Case study - medical insurance premium prediction
a. Building a first end-to-end machine learning pipeline for regression
b. Testing multiple machine learning models
c. Comparing the results of the models in the business context
d. Addressing the explainability of the solution*
8. Case study - simple algorithmic trading
a. Building a first machine learning pipeline for time series
b. Handling the specifics of data preparation for time series
c. Testing multiple machine learning models
d. Running a simple investment strategy based on the trained model
Additional case study - life insurance assessment (multiclass classification)
9. Project presentations |
Literature: |
(English only)
- James, G., Witten, D., Hastie, T., & Tibshirani, R. (2021). An Introduction to Statistical Learning. Springer, New York, NY.
- Hastie, T., Tibshirani, R., & Friedman, J. (2009). The Elements of Statistical Learning. Springer-Verlag.
- Harrington, P. (2012). Machine Learning in Action (Vol. 5). Greenwich, CT: Manning.
- Intel (2018). Introduction to Machine Learning. Retrieved from https://www.intel.com/content/www/us/en/developer/learn/course-machine-learning.html
- VanderPlas, J. (2016). Python Data Science Handbook: Essential Tools for Working with Data. O'Reilly Media, Inc.
Additional literature is assigned to the case studies. |
Learning outcomes: |
(English only) After completing the course, the student will have reliable, structured knowledge of a wide range of supervised learning algorithms for regression and classification problems, such as linear and logistic regression, linear discriminant analysis, kNN, ridge regression, LASSO, Support Vector Machines, decision trees, and random forest. They will know the theoretical foundations of these algorithms and have the programming skills needed to apply them in finance. They will be able to select the predictive modeling algorithms best suited to a specific research problem, validate models reliably, select and transform variables, and carry out an independent research project using the methods learned. K_U02, K_U05 |
Assessment methods and criteria: |
(English only) The final grade consists of three elements. The first is the theoretical exam, which consists of 10 open-ended questions. The second is to prepare individual machine learning projects and write an extended report in a Python notebook, containing code blocks that allow the teacher to fully reproduce the analysis. Each project should use a different dataset selected by the student and approved by the tutor (for example from https://www.kaggle.com): one reasonably small dataset and one large dataset. The third is to present the results in public.
The following weights are used to determine the final grade:
40% - Exam
20% - Presentation
40% - Extended report
The threshold to pass is 60%. |
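As a worked illustration of the weighting above, here is a minimal Python sketch. It assumes each component is scored on a 0-100 scale and that the 60% threshold applies to the weighted total; the syllabus states neither explicitly. The names `WEIGHTS`, `final_score` and `passes` are hypothetical and serve only this example.

```python
# Weights are taken from the syllabus; the 0-100 component scale is an assumption.
WEIGHTS = {"exam": 0.40, "presentation": 0.20, "report": 0.40}

# Pass threshold in percent, applied here to the weighted total (assumption).
PASS_THRESHOLD = 60.0


def final_score(scores: dict[str, float]) -> float:
    """Return the weighted total (0-100) of the three graded components."""
    missing = WEIGHTS.keys() - scores.keys()
    if missing:
        raise ValueError(f"missing component scores: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


def passes(scores: dict[str, float]) -> bool:
    """Return True when the weighted total reaches the pass threshold."""
    return final_score(scores) >= PASS_THRESHOLD


if __name__ == "__main__":
    # Hypothetical scores for one student.
    example = {"exam": 55.0, "presentation": 80.0, "report": 70.0}
    print(final_score(example), passes(example))
```

Under these assumptions, a weak exam can be offset by a strong report and presentation, because only the weighted total is compared with the threshold.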
Classes in the "Winter semester 2023/24" cycle (ended)
Period: | 2023-10-01 - 2024-01-28 |
Weekly schedule: | Tue: exercises (CW), lecture (WYK) |
Class types: | Exercises, 15 hours; Lecture, 15 hours |
Coordinators: | Szymon Lis, Michał Woźniak |
Group instructors: | Szymon Lis, Michał Woźniak |
Assessment: | Course - Exam; Exercises - Graded credit; Lecture - Exam |
Classes in the "Winter semester 2024/25" cycle (in progress)
Period: | 2024-10-01 - 2025-01-26 |
Weekly schedule: | Tue: exercises (CW), lecture (WYK); Thu: exercises (CW) |
Class types: | Exercises, 15 hours; Lecture, 15 hours |
Coordinators: | Szymon Lis, Michał Woźniak |
Group instructors: | Szymon Lis, Michał Woźniak |
Assessment: | Course - Exam; Exercises - Graded credit; Lecture - Exam |
Copyright is held by the University of Warsaw.