
Statistical Modeling and Machine Learning

Module IN2332

This module is offered by the Faculty of Informatics (Fakultät für Informatik).

In addition to describing the content, learning outcomes, teaching and learning methods, and examination formats, this module description links to the current courses and module examination dates in the respective sections.


IN2332 is a one-semester module taught in English at Master's level and offered in the summer semester.

The module is part of the following catalogs in the physics degree programs.

  • General catalog of non-physics elective courses

Total workload: 240 h
Contact hours: 120 h
Credits (ECTS): 8 CP

Content, Learning Outcomes and Prerequisites


0. Univariate and simple multivariate calculus and summary of linear algebra with intuitive explanations
1. Concepts in machine learning: supervised vs. unsupervised learning, classification vs. regression, overfitting, curse of dimensionality
2. Probability theory, Bayes' theorem, conditional independence, distributions (multinomial, Poisson, Gaussian, gamma, beta, ...), central limit theorem, entropy, mutual information
3. Generative models for discrete data: likelihood, prior, posterior, Dirichlet-multinomial model, naive Bayes classifiers
4. Gaussian models: maximum likelihood estimation, linear discriminant analysis, linear Gaussian systems
5. Bayesian statistics: maximum a posteriori estimation, model selection, uninformative and robust priors, hierarchical and empirical Bayes, Bayesian decision theory
6. Frequentist statistics: bootstrap, statistical testing
7. Linear regression: ordinary least squares, robust linear regression, ridge regression, Bayesian linear regression
8. Logistic regression and optimization: (Bayesian) logistic regression, optimization, L2 regularization, Laplace approximation, Bayesian information criterion
9. Generalized linear models: the exponential family, probit regression
10. Expectation Maximization (EM) algorithm with applications
11. Latent linear models: Principal Component Analysis, Bayesian PCA
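
As an illustration of topic 3, the Dirichlet-multinomial model can be sketched in a few lines. The module itself uses R; Python is used here only for illustration. The function name `dirichlet_multinomial_posterior`, the symmetric prior with `alpha=1.0`, and the toy data are assumptions made for this sketch and are not taken from the course material.

```python
from collections import Counter


def dirichlet_multinomial_posterior(observations, categories, alpha=1.0):
    """Conjugate update for a multinomial likelihood with a symmetric Dirichlet prior.

    Returns the posterior Dirichlet parameters (alpha + count per category)
    and the posterior predictive probability of each category.
    """
    counts = Counter(observations)
    # Posterior parameter of category k is alpha_k + n_k.
    posterior = {k: alpha + counts.get(k, 0) for k in categories}
    total = sum(posterior.values())
    # Posterior predictive: (alpha_k + n_k) / (sum of alphas + N).
    predictive = {k: v / total for k, v in posterior.items()}
    return posterior, predictive


if __name__ == "__main__":
    # Hypothetical toy data: six draws over four possible categories.
    draws = ["a", "a", "b", "a", "c", "a"]
    post, pred = dirichlet_multinomial_posterior(draws, ["a", "b", "c", "d"])
    for k in post:
        print(k, post[k], round(pred[k], 3))
```

Category "d" was never observed, yet its predictive probability stays above zero because of the prior pseudo-count. This is the smoothing effect that also underlies naive Bayes classifiers for discrete data.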


At the end of the module, students are able to:
1. recall the concepts of supervised and unsupervised learning and implement cross-validation procedures
2. recall the concepts of Bayesian probability and of conditional and unconditional dependence
3. mathematically derive the models and inference procedures of Bayesian linear regression, generalized linear models, Bayesian Principal Component Analysis, and k-means
4. identify use cases for the above-mentioned models
5. apply the above-mentioned models using the R programming language
6. assess the performance and significance of their results
7. develop simple novel Bayesian models, and inference procedures for them, for situations in which the above-mentioned models do not apply
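
The cross-validation procedure named in learning outcome 1 can be sketched as follows. The module itself uses R; Python is used here only for illustration. The helper names (`k_fold_splits`, `fit_line`, `cross_validated_mse`), the choice of a one-dimensional least-squares line as the model, and the default of 5 folds are all assumptions made for this sketch.

```python
import random


def k_fold_splits(n, k, seed=0):
    """Yield (train_indices, test_indices) for k-fold cross-validation."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    # Every index lands in exactly one fold.
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f_i, fold in enumerate(folds) if f_i != i for j in fold]
        yield train, test


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one predictor)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def cross_validated_mse(xs, ys, k=5, seed=0):
    """Average held-out mean squared error over the k folds."""
    fold_errors = []
    for train, test in k_fold_splits(len(xs), k, seed):
        slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
        errors = [(ys[i] - (slope * xs[i] + intercept)) ** 2 for i in test]
        fold_errors.append(sum(errors) / len(errors))
    return sum(fold_errors) / k


if __name__ == "__main__":
    # Hypothetical noise-free data on the line y = 2x + 1.
    xs = [float(i) for i in range(10)]
    ys = [2.0 * x + 1.0 for x in xs]
    print(cross_validated_mse(xs, ys))
```

The model is refit on each training split and scored only on the held-out fold, so the averaged error estimates performance on unseen data rather than goodness of fit on the training data.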


Prerequisites: linear algebra and multivariate calculus

Courses, Teaching and Learning Methods, and Literature

Courses and Dates

VO (lecture), 4 SWS: Statistical Modeling and Machine Learning (IN2332), Gagneur, J.; Tue, 14:00–17:00, 5613.03.010
UE (exercise), 4 SWS: Exercise Statistical Modeling and Machine Learning (IN2332), Gagneur, J.; Fri, 12:30–15:30, 5613.02.010 and Fri, 12:30–15:30, 5609.01.034

Teaching and Learning Methods

The class is based on Christopher Bishop's book "Pattern Recognition and Machine Learning". The lecture is held in inverted-classroom style. Each week, we give a ~30 min overview of the next reading assignment, a section of the book, pointing out the essential messages to make the reading at home easier. Exercises to be solved by the next lecture are assigned, including mathematical derivations of some results from the book. In the next lecture, the exercises are discussed (~30 min) and questions and difficulties with the material are addressed (~20 min). Then, practical exercises using the newly acquired material are solved in teams, using the R statistics framework (100 min). Further exercises are done in smaller groups during the Friday classes (3 hours). In our experience, the inverted-classroom style is better suited than conventional lecturing to quantitative topics that require students to think through or retrace mathematical derivations at their own pace.


Weekly exercises (math and programming) posted online, slides, chalkboard, live demos


Pattern Recognition and Machine Learning by Christopher Bishop


Description of Examinations and Coursework

The learning outcomes are assessed by a final exam, a two-hour written exam. It includes knowledge questions (learning outcomes 1, 2, 4), statistical modeling questions (derivation of the likelihood and of the inference procedure for a model not seen in class; learning outcomes 3, 7), some R programming (learning outcome 5), and interpretation of results (learning outcome 6).


A repeat examination is offered in the following semester.
