Courses, together with exams, are the building blocks of modules. Please keep in mind that information on the contents, learning outcomes and, especially, examination conditions is given at the module level only; see section "Assignment to Modules" above.
additional remarks |
This lab course covers various advanced machine learning (ML) techniques, mainly in the area of classification, for processing speech from audio signals. It is a programming course in which ML concepts are applied to audio corpora in Python. Proficiency in any object-oriented language is sufficient, as Python is amazingly easy to use.
Covered areas (among others):
- Voice activity detection
- Speaker detection
- Emotion detection
- Channel compensation
Covered methods (among others):
- Feature extraction: Mel-Frequency Cepstral Coefficients (MFCC)
- Generic classifiers: Random Forests, Support Vector Machines, Gaussian Mixture Models
- Classifiers for audio: GMM-UBMs and MAP adaptation, supervector GMMs
- Factor Analysis for channel compensation: Joint Factor Analysis, I-Vectors
- Performance metrics |
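To give a flavour of the kind of method listed above, the following is a minimal sketch of mean-only MAP adaptation of a GMM-UBM, written in plain Python for one-dimensional features. It is not taken from the course material: the function names, the toy data and the relevance factor of 16 are assumptions chosen for illustration, and real systems work on multi-dimensional MFCC vectors.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def map_adapt_means(data, weights, means, variances, relevance=16.0):
    """Mean-only MAP adaptation of a 1-D GMM (the UBM) towards `data`.

    `relevance` is an assumed relevance factor; it controls how strongly
    the adapted means stay tied to the UBM means.
    """
    n_comp = len(means)
    counts = [0.0] * n_comp  # soft counts n_k per component
    sums = [0.0] * n_comp    # posterior-weighted sums of the data

    for x in data:
        likes = [w * gaussian_pdf(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
        total = sum(likes)
        if total == 0.0:
            continue  # sample is numerically outside every component
        for k in range(n_comp):
            gamma = likes[k] / total  # posterior of component k given x
            counts[k] += gamma
            sums[k] += gamma * x

    adapted = []
    for k in range(n_comp):
        if counts[k] == 0.0:
            adapted.append(means[k])  # no evidence: keep the UBM mean
            continue
        alpha = counts[k] / (counts[k] + relevance)  # data-dependent weight
        data_mean = sums[k] / counts[k]
        adapted.append(alpha * data_mean + (1.0 - alpha) * means[k])
    return adapted


# Hypothetical two-component UBM and toy "speaker" data clustered near 3.0.
ubm_weights = [0.5, 0.5]
ubm_means = [-2.0, 2.0]
ubm_vars = [1.0, 1.0]
speaker_data = [2.5, 3.0, 3.5, 2.8, 3.2] * 10

speaker_means = map_adapt_means(speaker_data, ubm_weights, ubm_means, ubm_vars)
print(speaker_means)
```

In this toy setup the second component sees nearly all of the data, so its mean moves from 2.0 towards the data mean, while the first component receives almost no evidence and stays close to its UBM value.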
Links |
TUMonline entry
|