Mathematical models for neural networks

Course objectives

General objectives: acquire basic knowledge of the mathematical methods used in modelling artificial intelligence, with particular attention to machine learning.

Specific objectives
Knowledge and understanding: at the end of the course the student will know the basic notions and results (mainly in the areas of stochastic processes and statistical mechanics) used in the study of the main models of neural networks (e.g., Hopfield networks, Boltzmann machines, feed-forward networks).
Applying knowledge and understanding: the student will be able to identify the optimal architecture for a given task and to solve the resulting model by determining a phase diagram; the student will have the basis to independently develop algorithms for learning and retrieval.
Critical and judgmental skills: the student will be able to determine the parameters that control the qualitative behaviour of a neural network and to estimate the values of these parameters that yield good network performance; she/he will also be able to investigate the analogies and relationships between the topics covered in this course and in courses dedicated to statistics and data analysis.
Communication skills: the ability to present the course contents in the oral and written parts of the assessment, possibly by means of presentations.
Learning skills: the knowledge acquired will allow further study, individual or within an LM course, of more specialised aspects of statistical mechanics, the development of algorithms, and the use of big data.

Channel 1
ELENA AGLIARI

Program - Frequency - Exams

Course program
The course starts with the introduction of simple models of neurons and neural networks, whose investigation can be addressed by analytical techniques. Then, we will move to more sophisticated models, which will be studied from a statistical-mechanics perspective. To this goal, the basic concepts of statistical mechanics and information theory will be recalled. We will especially focus on models (e.g., the Hopfield network and Boltzmann machines) able to mimic associative memory and (simple) learning processes, and their information-processing capabilities will be treated as emergent collective properties. Some practical exercises on these cases will be proposed (see the sketch after this list). Finally, state-of-the-art models will be briefly presented, both from a theoretical and an algorithmic point of view.

- Introduction: biological and artificial models; McCulloch-Pitts neuron; attractor neural networks; deterministic and stochastic neuronal dynamics; Hebbian storage
- Introduction to equilibrium statistical mechanics: Curie-Weiss model; phase transitions, ergodicity breaking, spontaneous symmetry breaking; Mattis model; introduction to disordered systems: frustration, "quenched" and "annealed" media, self-averaging; Hopfield model; solution of the Hopfield model in the low-load regime by the saddle-point method and interpolating techniques; pure and spurious states; signal-to-noise technique; introduction to the Sherrington-Kirkpatrick model; solution of the Hopfield model in the high-load regime by interpolating techniques; non-local coupling with pseudo-inverse interaction matrix
- Review of statistical inference: elements of information theory; Bayesian approach (and Occam's razor); maximum likelihood; maximum entropy
- Introduction to machine learning: the Rosenblatt perceptron; Bayesian learning; definitions for supervised and unsupervised learning; notes on autoencoders, feed-forward networks, convolutional neural networks; Boltzmann machines; contrastive divergence; phase diagrams for Boltzmann machines; notes on deep learning
- Advanced models inspired by neurophysiology
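As an illustration of the kind of practical exercise proposed on these models, here is a minimal sketch (not taken from the course materials) of Hebbian storage and stochastic retrieval in a Hopfield network; the network size N, the number of patterns P, the inverse temperature beta, and the corruption level are illustrative choices.

    import numpy as np

    rng = np.random.default_rng(0)

    N, P = 200, 10                            # N neurons, P patterns (low load: P/N small)
    xi = rng.choice([-1, 1], size=(P, N))     # random +/-1 patterns to be stored

    # Hebbian storage: J_ij = (1/N) sum_mu xi_i^mu xi_j^mu, no self-coupling
    J = (xi.T @ xi) / N
    np.fill_diagonal(J, 0.0)

    def retrieve(s, beta=5.0, sweeps=20):
        """Asynchronous Glauber dynamics at inverse temperature beta."""
        s = s.copy()
        for _ in range(sweeps):
            for i in rng.permutation(N):
                h = J[i] @ s                  # local field acting on neuron i
                p_up = 1.0 / (1.0 + np.exp(-2.0 * beta * h))
                s[i] = 1 if rng.random() < p_up else -1
        return s

    # Start from a corrupted copy of pattern 0 (20% of the spins flipped)
    s0 = xi[0].copy()
    s0[rng.random(N) < 0.2] *= -1

    m = (retrieve(s0) @ xi[0]) / N            # Mattis overlap with the stored pattern
    print(f"overlap with pattern 0: {m:.3f}")

In the low-load regime the final overlap is expected to be close to 1, illustrating retrieval as an emergent collective property of the network.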
Prerequisites
Basic concepts in statistical mechanics (e.g., the canonical ensemble, the Curie-Weiss model) and in stochastic processes (e.g., Markov chains).
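As a pointer to the assumed background (standard textbook material, not part of the course notes): for the Curie-Weiss model of N spins with uniform coupling J/N and external field h, in the thermodynamic limit the magnetization m solves the self-consistency equation

    m = \tanh\big(\beta\,(J m + h)\big),

which at h = 0 acquires nonzero solutions for \beta J > 1, the paramagnetic-ferromagnetic phase transition recalled during the course.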
Books
Lecturer’s notes, available on classroom
Teaching mode
Lectures in the classroom with slides, proofs and examples on the blackboard with discussions, and numerical exercises in the lab. Active learning methods such as group projects, flipped classrooms, and just-in-time teaching are also proposed and encouraged. Classroom lectures are meant to introduce the main concepts of the statistical mechanics of complex systems and to define models of neural networks and quantitative tools for their analysis. The proofs and other material presented on the blackboard (e.g., sketches of proofs for different models, links to other contexts, simple exercises) are meant to provide a deep and rigorous knowledge of the subject and to prompt critical reasoning and mental flexibility. Exercises in the lab are meant to improve the computational skills of students, make them aware of the problems that may arise in the practical implementation of neural networks, and prepare them to find suitable solutions. Finally, active learning is meant to improve soft skills such as the ability to work in a team and to give effective presentations.
Frequency
Attendance not mandatory
Exam mode
Two options are possible:
- Oral examination on the whole program of the course (as detailed on classroom), in which the student should prove able to present topics correctly and with logical reasoning, to provide details and examples, and to make connections with other topics of the course.
- Project carried out in small groups (2-4 people): the topic must be agreed with the lecturer and developed autonomously; group composition is chosen by the students. Each project must be accompanied by a report (10-20 pages), which has to be sent to the lecturer at least one week before the exam date and which is evaluated up to 13/30. On the exam date the students present their project (at the blackboard or with slides) in 20 minutes, followed by 10 minutes of questions and discussion on the specific topic of the project and by 5 minutes of questions on related topics from the course program, evaluated up to 13/30 and 5/30, respectively. Cum laude corresponds to 31/30.
Bibliography
A.C.C. Coolen, R. Kühn, P. Sollich, Theory of Neural Information Processing Systems, Oxford University Press (2005).
C.M. Bishop, Neural Networks for Pattern Recognition, Oxford University Press (1995).
C.M. Bishop, Pattern Recognition and Machine Learning, Springer (2009).
S.O. Haykin, Neural Networks and Learning Machines, Pearson (2009).
D.J. Amit, Modeling Brain Function: The World of Attractor Neural Networks, Cambridge University Press (1989).
B. Tirozzi, Modelli matematici di reti neurali, CEDAM (1995).
Lecturer’s notes and slides, available on classroom.
  • Lesson code: 10605752
  • Academic year: 2025/2026
  • Course: Applied Mathematics
  • Curriculum: Matematica per Data Science
  • Year: 2nd year
  • Semester: 1st semester
  • SSD: MAT/07
  • CFU: 6