Distributed Systems

Course objectives

Computing systems are at the heart of any information systems today. This course aims to provide students with a clear characterization of concurrency in a distributed system considering its characteristics like failures, variable latencies and the absence of a global clock. The course will analyze the main system models and abstractions for basic communications and synchronization. Finally, we will provide the concepts related to distributed consensus and an introduction to distributed ledgers. Knowledge and understanding The student will be able to design systems and distributed algorithms on top of different system models such as synchronous, asynchronous and partially synchronous, while understanding impossibilities and limitations in performance. Students will also have the ability to abstract systems and platforms using models that are easier to treat.

Channel 1
GIUSEPPE ANTONIO DI LUNA Lecturers' profile

Program - Frequency - Exams

Course program
Distributed systems are a fundamental concept in modern computing, enabling the creation of large-scale, fault-tolerant systems that can provide robust, reliable services to users. This university-level course is designed to provide an in-depth understanding of distributed systems, starting with a brief history and overview of the challenges and motivations that led to their development. Program: The course is divided into seven modules, each of which covers a different aspect of distributed systems: M1: Models, Abstractions, and Basic Concepts In this module, students will learn about the core models and abstractions used in distributed systems, including processes, communication, traces and execution, and I/O Automata. Students will also study different types of failures, such as crash-stop and Byzantine failures, and learn about event-based programming and its abstraction, specification, and implementation. Finally, students will explore different types of links, including fair-lossy, stubborn, and point-to-point links. M2: Time in Distributed Systems This module focuses on time in distributed systems, covering synchronous, asynchronous, and eventually synchronous systems. Students will learn about different clock synchronization techniques, including Christian, Berkeley, and NTP, and study tree-based versus fully distributed algorithms. The module also covers logical clocks and vector clocks and the happened-before relationship. Students will also learn about encapsulating time in failure detectors, leader election, eventual leader election, and distributed system models, such as fail-stop, fail-noisy, and fail-silent systems. M3: Basic Broadcast Primitives In this module, students will explore different types of broadcast primitives, including best-effort broadcast, reliable broadcast, uniform reliable broadcast, and causal broadcast. M4: Shared Memories This module focuses on shared memories and explores different types of consistency, including regular, sequential consistency, and linearizability. Students will also learn about different types of registers, such as (1, N) register and atomic (1, 1), (1, N), and (N, N) registers. The module also covers the non-composability of sequential consistency and the relationship between consistency properties. M5: Consensus This module covers consensus and its specification. Students will explore hierarchical consensus, including uniform and non-uniform consensus. The module also covers FLP, including the proof that is not required, only the statement, and Paxos. M6: Total Ordering and Replicated State Machine In this module, students will learn about total order broadcast and replicated state machines, including Raft. The module also covers active replication versus primary backup. M7: Byzantine Fault Tolerance (BFT) This module covers the basics of BFT, including Byzantine failures and authenticated channels and crypto assumptions. Students will learn about Byzantine broadcast, including its consistency and reliability. The module also covers synchronous systems, including an impossibility result and the necessity of 3f+1 and King Algorithm. Tentative course calendar: The course is designed to be taught over 12 weeks, with five hours of teaching each week. The tentative course calendar is as follows: Week 1: Introduction to Distributed Systems -Brief history; -Challenges; -Motivations. Models, Abstractions, and Basic Concepts (M1) -Processes, communication, traces and execution, I/O Automata; -Failures: crash-stop, and Byzantine; -Event-based programming: abstraction, specification, and implementations; -Links: fair-lossy, stubborn, point-to-point. Week 2/3: Time in Distributed Systems (M2) -Synchronous, asynchronous, and eventually synchronous systems; -Clock Synchronization: Christian, Berkley, NTP; -Clock sync. Tree-based vs Fully-Distributed Algorithms, Local Skew vs Global Skew, The limits of ordering with timestamps -Logical clocks and Vector Clock, Happened-before relationship. Week 3/4: Time in Distributed Systems (M2) continued -Encapsulating time in failure detectors: P, $\Diamond$-P; -Leader election and Eventual Leader Election -Distributed system models: fail-stop, fail-noisy, fail-silent." Week 3/4: Basic Broadcast Primitives (M3) -Best effort broadcast; -Reliable broadcast; -Uniform reliable broadcast; -Causal broadcast. Week 5: Shared Memories (M4) -Consistency: regular, sequential consistency, and linearizability; -Regular (1,N) register: Message Passing and (1,1)-Regular implementation -Atomic (1,1), (1,N), and (N,N) registers; -Non-composability of sequential consistency; -Relationship between consistency properties;" Week 7: Consensus (M5) -Consensus specification; -Hierarchical consensus: uniform and non-uniform; -FLP: proof not required only the statement -Paxos. Week 8: Total Ordering and Replicated State Machine (M6) -Total Order broadcast -Replicated State Machine and Raft. -Active Replication vs Primary Backup Week 9: BFT (M7) -Intro: Byzantine failures. -Authenticated Channel and crypto assumptions. -Byzantine Broadcast, consistent and reliable. Week 10: BFT (M7) continued -Consensus: synchronous systems, an impossibility result: the necessity of 3f+1. -Consensus: synchronous systems: King Algorithm Week 11: Recap and Exercises M1: Models, Abstractions, and Basic Concepts -Processes, communication, traces and execution, I/O Automata; -Failures: crash-stop, and byzantine; -Event based programming: abstraction, specification and implementations; -Links: fair-lossy, stubborn, point-to-point. References: See reading material on https://sites.google.com/view/distributed-systems-2020/lectures M2: Time in Distributed Systems -Synchronous, asynchronous, and eventually synchronous systems; -Clock Synchronization: Christian, Berkley, NTP; - Clock sync. Tree based vs Fully-Distributed Algorithms, Local Skew vs Global Skew, The limits of ordering with timestamps -Logical clocks and Vector Clock, Happened-before relationship. -Encapusaliting time in failure detectors: P, $\Diamond$-P; -Leader election and Eventual Leader Election -Distributed system models: fail-stop, fail-noisy, fail-silent." References: See reading material on https://sites.google.com/view/distributed-systems-2020/lectures M3: Basic Broadcast Primitives - Best effort broadcast; - Reliable broadcast; - Uniform reliable broadcast; - Causal broadcast. References: See reading material on https://sites.google.com/view/distributed-systems-2020/lectures M4: Shared Memories - Consistency: regular, sequential consistency and linearizability; - Regular (1,N) register: Message Passing and (1,1)-Regular implementation - Atomic (1,1), (1,N) and (N,N) registers; - Non composability of sequential consistency; -Relationship between consistency properties;" M5: Consensus -Consensus specification; -Hiearchical consensus: uniform and non-uniform; -FLP: proof not required only the statement -Paxos. References: See reading material on https://sites.google.com/view/distributed-systems-2020/lectures M6: Total Ordering and Replicated State Machine -Total Order broadcast -Replicated State Machine and Raft. -Active Replication vs Primary Backup M7: BFT Intro: - Byzantine failures. - Authenticated Channel and crypto assumptions. - Byzantine Broadcast, consistent and reliable. Consensus: -synchronous systems: an impossibility result: the necessity of 3f+1 (Not Formal see slides). -synchronous systems: King Algorithm -eventually-synchronous system: PBFT" References: See reading material on https://sites.google.com/view/distributed-systems-2020/lectures Blockchains: Permissioned Blockchains: Definition, Hyperledger fabric architecture and example of a smart contract, other blockchains: Corda and Sawtooth Permssionless Blockchains: Definition, A deep study of the bitcoin network, Pitfails and attacks against permissionless blockchain based on Proof of Work References: See reading material on https://sites.google.com/view/distributed-systems-2020/lectures
Prerequisites
Basic knowledge of algorithms and complexity: concept of algorithm, complexity measures. Basic knowledge of computer science and networks: communication protocols, client/server systems. Basic knowledge of logic: induction, implications, proofs by contradiction.
Books
Introduction to Reliable and Secure Distributed Programming - C. Chacin, R. Guerraoui, and L. Rodrigues. 2011 Bibliografia di riferimento Distributed Algorithms - N. Lynch. 1998. Distributed Computing - H. Attiya, J.L. Welch. 2004 Distributed Systems 3rd edition - M. Van Steen and A.S. Tanenbaum. The authors give away free copies: https://www.distributed-systems.net/index.php/books/distributed-systems-3rd-edition-2017/ Design and Analysis of Distributed Algorithms- N. Santoro, 2008.
Teaching mode
Mixed mode on zoom and physically
Frequency
Not Mandatory
Exam mode
The evaluation is a written exam that tests the student’s knowledge and understanding of the fundamental concepts and principles of distributed systems, as well as their ability to apply them to solve simple problems. The exam consists of multiple-choice questions, short-answer questions, and problem-solving exercises. The sufficient mark is 18/30 the highest possible mark is 30/30, the laude is awarded in special cases (complete mastery of the discipline). The marks (18-21) are given to students that show a sufficient knowledge and understanding of fundamentals concepts and principle, but that have difficulties to apply these concepts in a novel way in problem solving exercises. The marks (22-24) are given to students that show a good knowledge and understanding of fundamentals concepts and principle, but that can sufficiently apply these concepts in a novel way in problem solving exercises. The marks (25-27) are given to students that show a good or complete knowledge and understanding of fundamentals concepts and principle, but that can apply these concepts in a novel way in problem solving exercises with minimal mistakes. The marks (27-29) are given to students that show a complete knowledge and understanding of fundamentals concepts and principle, but that can apply these concepts in a novel way in problem solving exercises without mistakes. The marks (30-30 Laude) are given to students that show a complete knowledge and understanding of fundamentals concepts and principle, but that can apply these concepts in a novel way in problem solving exercises without mistakes and with an elegant and efficient methodology. The Laude is awarded only to students that show a complete and total mastery of the discipline during an additional oral examination reserved to students that reach the mark of 30 in the written exam.
Bibliography
Distributed Algorithms - N. Lynch. 1998. Distributed Computing - H. Attiya, J.L. Welch. 2004 Distributed Systems 3rd edition - M. Van Steen and A.S. Tanenbaum. The authors give away free copies: https://www.distributed-systems.net/index.php/books/distributed-systems-3rd-edition-2017/ Design and Analysis of Distributed Algorithms- N. Santoro, 2008.
Lesson mode
The teaching is divided as divided: 70% of lectures explaining algorithms and concepts. The explanation of each algorithm is done with an interactive approach in which the algorithm is iteratively refined until the correct form using the inputs of the students. 30% of the teaching is devoted to problem solving and exercises.
  • Lesson code1022807
  • Academic year2024/2025
  • CourseCybersecurity
  • CurriculumSingle curriculum
  • Year1st year
  • Semester1st semester
  • SSDING-INF/05
  • CFU6
  • Subject areaAmbito Tecnologico