OPTIMIZATION METHODS FOR MACHINE LEARNING
Single channel
Chair (Coordinator) and Rapporteur: LAURA PALAGI
Lecturers
Objectives
Knowledge and understanding
The aim of the course is to introduce students to the application of optimization techniques to training problems arising in machine learning. Students are expected to gain insight into standard models in machine learning (Deep Networks and Support Vector Machines) and into more recent optimization algorithms for determining the parameters (training) of such models that best fit the available data.
Applying knowledge and understanding
By the end of the course, students should be able to select the correct model for the problem at hand and either use standard software specialized for the application or develop their own optimization algorithm.
Making judgements
Lectures, practical exercises and project sessions will provide students with the ability to assess the main strengths and weaknesses of the different machine learning models applied to case studies in machine learning.
Communication
By the end of the course, students should be able to point out the main features of a machine learning problem and explain techniques for its solution to both specialized and non-specialized audiences. These abilities are tested and evaluated in the projects developed in small groups, thus encouraging team building and a proactive learning process combined with collaborative learning. These abilities can also be checked in the final oral exam.
Lifelong learning skills
Students are expected to develop the learning skills necessary to undertake additional studies on the relevant topics with a high degree of autonomy. During the classes, students are encouraged to work on projects in small groups, thus stimulating student activity and engagement. They are encouraged to consult supplementary research publications and internet sites to learn the tricks and implementation choices needed to accomplish the tasks effectively. These capabilities are tested and evaluated in the final project reports, where students have to discuss the main issues of the addressed problems and the choices made to overcome the difficulties, based on the topics and material covered in class.
Learning outcomes
By the end of the course, students should be able to select the correct machine learning model for the problem at hand and either use standard libraries specialized for the application or develop their own optimization algorithm.
Prerequisites
Linear algebra, principles of mathematical analysis for multivariate functions (Taylor expansion, partial derivatives), convexity.
No prerequisite (propedeutic) courses are required.
Programme
1. Introduction.
Definition of learning systems. Goals and applications of machine learning (classification and regression). Basics of statistical learning theory (Vapnik-Chervonenkis bound). Underfitting and overfitting. Use of data: training set, test set, validation set.
2. Review of optimization tools and comparison of learning algorithms from the optimization point of view. (3 lectures)
3. Artificial Neural Networks.
Neurons and biological motivation. Linear threshold units. The Perceptron and its learning algorithm (proof of convergence). Classification of linearly
separable patterns.
Multi-Layer Feedforward Neural Networks. Gradient method: basics. Back-propagation (BP) algorithm. BP batch version: proof of convergence and
choice of the learning rate. BP on-line version: incremental method, theorem of convergence. Momentum updating rule.
Radial-Basis Function (RBF) networks: regularized and generalized RBF networks. Their use in interpolation and approximation. Learning strategies and error functions. Unsupervised selection of centers. Supervised selection of weights and centers: decomposition methods into two blocks and decomposition methods into more blocks. Convergence theory of decomposition methods.
Early stopping
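As an illustration of the perceptron learning algorithm covered in point 3, the following is a minimal sketch (not the course's reference implementation): Rosenblatt's update rule on a toy linearly separable dataset, which by Novikoff's theorem terminates after finitely many updates.

```python
import numpy as np

def perceptron_train(X, y, max_epochs=100):
    """Rosenblatt perceptron for labels y in {-1, +1}; returns (w, b).

    On linearly separable data the loop stops after finitely many
    mistakes (Novikoff's convergence theorem)."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for xi, yi in zip(X, y):
            if yi * (w @ xi + b) <= 0:  # misclassified (or on the boundary)
                w += yi * xi            # update rule: w <- w + y_i x_i
                b += yi                 #              b <- b + y_i
                mistakes += 1
        if mistakes == 0:               # a separating hyperplane was found
            break
    return w, b

# Toy linearly separable data (hypothetical example)
X = np.array([[0.0, 0.0], [2.0, 2.0], [0.0, 2.5], [2.5, 0.0]])
y = np.array([-1, 1, 1, 1])
w, b = perceptron_train(X, y)
preds = np.sign(X @ w + b)  # all training points correctly classified
```

The same loop run over a non-separable dataset would simply exhaust `max_epochs`, which is why the course pairs the perceptron with the convergence proof for the separable case.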
4. Support Vector Machines (Kernel methods)
Soft and hard maximum margin classifiers. Quadratic programming formulation of the soft/hard maximum margin separators. Kernel methods.
Dual formulation of the primal QP problem. Wolfe duality theory for QP. KKT conditions. Frank-Wolfe method: basics. Decomposition methods: SMO-type algorithms, MVP algorithm, SVMlight, cyclic methods. Convergence theory. Implementation tricks: caching, shrinking.
Choosing parameters: k-fold cross-validation. Multiclass SVM problems: one-against-one and one-against-all.
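In the standard notation (not taken from the course material itself), the soft maximum margin classifier of point 4 and its Wolfe dual can be written as:

```latex
% Primal soft-margin QP: slack variables \xi_i penalized with parameter C
\min_{w,\,b,\,\xi}\ \frac{1}{2}\|w\|^2 + C\sum_{i=1}^{n}\xi_i
\quad \text{s.t. } y_i\,(w^\top x_i + b) \ge 1 - \xi_i,\ \ \xi_i \ge 0,\ i=1,\dots,n

% Wolfe dual: a QP in \alpha only, where k(\cdot,\cdot) is the kernel
\max_{\alpha}\ \sum_{i=1}^{n}\alpha_i
 - \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n}\alpha_i\alpha_j\,y_i y_j\,k(x_i,x_j)
\quad \text{s.t. } \sum_{i=1}^{n}\alpha_i y_i = 0,\ \ 0 \le \alpha_i \le C
```

The box constraints $0 \le \alpha_i \le C$ in the dual are what the SMO-type and decomposition methods listed above exploit, optimizing over a small working set of $\alpha_i$ at a time while keeping the single equality constraint satisfied.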
5. Practical use of learning algorithms.
6. Use of open-source software
Books
Teaching material consists of lecture slides and lecture notes.
The following books are also suggested:
Pattern Recognition and Machine Learning - Bishop - 2006
Deep Learning - Goodfellow, Bengio, Courville - 2016
Lessons mode
In person
Attendance
In person
Exam mode
The evaluation is based on one or two projects. If carried out during the semester of the course, there are two projects, each paired with a written test of multiple-choice and/or open questions. If carried out after the end of the course, there is a single project, together with one written and one oral test.
The aim of the project is to acquire skills in the autonomous development of a machine learning system at various levels, ranging from the use of open-source software to the development of one's own code.
The written and/or oral test is aimed at testing methodological skills.
Example exam questions
Texts of past exams are available on the teacher's website.
Sustainability goals
- Academic year: 2025/2026
- Degree program to which the course belongs: Management Engineering
- Lesson code: 1041415
- Year and semester: 2nd year - 1st semester
- Activity type: Related and integrative learning activities (Attività formative affini ed integrative)
- Academic area: Related or integrative learning activities (Attività formative affini o integrative)
- SSD: MAT/09
- Mandatory presence: No
- Language: English
- CFU: 6 CFU
- Total duration: 60 hours
- Hours distribution: 36 classroom hours, 24 training hours