MULTIVARIATE STATISTICS

Course objectives

Learning goals. Know how to organize multivariate data for statistical analysis. Acquire the tools for analyzing multivariate statistical data.
Knowledge and understanding. Knowledge of multivariate statistical methodologies and their formalization through matrix algebra.
Applying knowledge and understanding. Understand which techniques are most appropriate for making decisions based on empirical evidence, responding to business information requests and extracting relevant information from observed data. Be able to carry out a statistical survey, using the skills already acquired in IT, Descriptive Statistics, Inferential Statistics and Sampling, and to analyze the resulting multivariate data with the most appropriate methods of Multivariate Statistics.
Making judgements. Students develop critical skills by applying multivariate statistical methodologies to a wide range of statistical models. They learn to interpret critically the results obtained by applying the procedures to real data sets.
Communication skills. Through study and practical exercises, students acquire the technical and scientific language of the discipline, which must be used properly in the intermediate and final written tests and in the oral tests. Communication skills are also developed through group activities.
Learning skills. Students who pass the exam have learned a method of analysis that they can apply in a professional setting.

Channel 1
MAURIZIO VICHI

Course program
Data, their statistical organization (data warehouse), and their control and correction (preprocessing)
Statistical and mathematical structures of data: questionnaire; data matrix; binary coding of qualitative variables; contingency tables; block matrix of contingency tables; distance matrices.
Computer structures of data and their organization: records; files; relational model for databases; statistical information systems.
Control and correction of data in the phases of a statistical survey: random and systematic errors; missing data; out-of-range values; logical inconsistencies; outliers; imputation of missing data.

Data syntheses, their transformations and the main theoretical multivariate distributions
Syntheses: mean vector; vectors of standard deviations and variances; variance-covariance matrix; correlation matrix; chi-square matrix; correlation ratio.
Transformations: centered data matrix; standardized data matrix; doubly centered distance matrix.
Multivariate distributions: multivariate normal distribution; multinomial distribution.

Unsupervised classification (cluster analysis)
Characteristics of clusters based on (dis)similarity and distance between multivariate units: measures of (dis)similarity and distance; measures of cluster homogeneity and isolation.
Non-hierarchical methods: K-means, PAM, fuzzy K-means, EM; choice of variables; choice of the number of clusters; interpretation of results.
Agglomerative hierarchical methods: single linkage, average linkage, complete linkage, centroid, Ward's method; interpretation of the dendrogram; methods for choosing a partition.

Supervised classification (predictive model of group membership)
Methods: linear discriminant analysis; logistic regression; classification trees. Evaluation of the classification model and selection of variables.

Dimensionality reduction methods
Principal component analysis (PCA); rotation of the components; factor analysis; biplots; correspondence analysis and multiple correspondence analysis; nonlinear principal component analysis (CATPCA); interpretation of results.
Prerequisites
Statistics and Inferential Statistics
Books
Course notes and MATLAB algorithms.
Suggested books:
  • G. McLachlan, D. Peel (2000). Finite Mixture Models. Wiley Series in Probability and Statistics.
  • A. C. Rencher (2002). Methods of Multivariate Analysis, 2nd edition. Wiley Series in Probability and Statistics.
  • A. D. Gordon (1999). Classification, 2nd edition. Chapman & Hall.
Software: SPSS, SAS.
Teaching mode
Oral Examination
Frequency
3 times per week from October to December
Exam mode
Oral examination
Lesson mode
Oral Examination
  • Lesson code: 1022894
  • Academic year: 2024/2025
  • Course: Statistics for management
  • Curriculum: Single curriculum
  • Year: 3rd year
  • Semester: 1st semester
  • SSD: SECS-S/01
  • CFU: 9
  • Subject area: Statistics and probability