Machine Learning for Regulatory Genomics
Module IN2393
This module handbook describes the contents, learning outcomes, methods and examination type, and links to current dates for courses and module examinations in the respective sections.
Basic Information
IN2393 is a semester module taught in English at Master’s level and offered in the summer semester.
This module is included in the following catalogue within the study programs in physics:
- Catalogue of non-physics elective courses
Total workload | Contact hours | Credits (ECTS) |
---|---|---|
180 h | 60 h | 6 CP |
Content, Learning Outcome and Preconditions
Content
This is a two-part module: (1) Six lectures introduce biological mechanisms, experimental assays, and computational models for regulatory genomics. The six lectures are supported by modeling exercises in Python. This is followed by (2) an eight-week hands-on project.
The lectures are organized around steps of gene expression:
- Introduction to gene regulation and sequence-based computational models of gene regulation
- Transcriptional regulation
- Chromatin-mediated regulation
- RNA splicing
- RNA modification and degradation
- Translation
Over these lectures, computational methods are introduced, including:
- Fitting procedures for deep neural networks
- Convolutional neural networks
- LSTMs and transformers
- Embeddings for sequence data
- Multi-task learning and transfer learning
- End-to-end learning
- Analytical and visualisation techniques for model interpretation
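As a rough illustration of what a sequence-based model operates on, the sketch below one-hot encodes a DNA string and slides a single convolutional filter along it, which is the basic operation of a convolutional layer. It is not taken from the course material: the function names `one_hot` and `scan`, the toy motif "TATA", and the example sequence are all assumptions made for this example, and only NumPy is used.

```python
import numpy as np

ALPHABET = "ACGT"  # assumed column order of the encoding


def one_hot(seq):
    """Encode a DNA string as a (length, 4) array.

    Bases outside ACGT (e.g. N) are left as all-zero rows.
    """
    index = {base: i for i, base in enumerate(ALPHABET)}
    encoded = np.zeros((len(seq), len(ALPHABET)), dtype=np.float32)
    for pos, base in enumerate(seq.upper()):
        if base in index:
            encoded[pos, index[base]] = 1.0
    return encoded


def scan(encoded, kernel):
    """Slide one filter of shape (width, 4) along the sequence, without padding."""
    width = kernel.shape[0]
    n_positions = encoded.shape[0] - width + 1
    return np.array(
        [np.sum(encoded[i:i + width] * kernel) for i in range(n_positions)]
    )


# Hypothetical filter: scores one point per base matching the toy motif "TATA".
kernel = one_hot("TATA")
scores = scan(one_hot("GGTATAGG"), kernel)
best = int(np.argmax(scores))  # start position of the best-scoring window
```

In a trained convolutional network the filter weights are learned from data rather than set by hand, and many such filters are applied in parallel.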
Learning Outcome
Gene expression refers to how cells read the information encoded in genomes. At the end of the module students are able to:
1. Describe major steps of gene expression from accessing DNA to determining protein abundance.
2. Describe genome-wide assays employed to assess various steps of gene expression.
3. Describe the concept of massively parallel reporter assays.
4. Describe and apply deep learning methods to perform sequence-based predictions.
5. Describe and apply the concept of model interpretation.
6. Describe and apply the concept of convolutional neural networks.
7. Describe and apply the concept of transformers.
8. Apply deep learning for sequence-based modeling of a genome-wide assay. Evaluate model performance and provide biological interpretation of its application to real data.
Preconditions
Bachelor's degree in mathematics, bioinformatics, computer science, physics, statistics or a related field. One lecture on machine learning (e.g., IN2064 or MA4802). Strong interest in biological and biomedical research questions.
Courses, Learning and Teaching Methods and Literature
Courses and Schedule
Type | SWS | Title | Lecturer(s) | Dates | Links |
---|---|---|---|---|---|
VO | 2 | Lecture Machine Learning for Regulatory Genomics (IN2393) | Gagneur, J.; Heinig, M. | Tue, 14:00–15:30, MI 00.08.038 | eLearning |
UE | 2 | Exercise Machine Learning for Regulatory Genomics (IN2393) | Gagneur, J.; Heinig, M. | Tue, 15:30–17:00, MI 00.08.038 | |
Learning and Teaching Methods
The module is designed in two parts (with 4 SWS contact hours per week): (1) Six lectures introduce the students to the most relevant topics and methods for regulatory genomics. This is followed by (2) an eight-week project in which students work on a specific gene regulatory modeling topic in a partner research lab to gain hands-on experience.
The lectures present the state of the art in regulatory genomics modeling approaches. These concepts are first applied in in-class tutorials following each lecture. During the project work, they are applied to real biological or biomedical data problems under the mentoring of the teaching team. The results of the project work are summarized in a final talk and a written report.
Media
Exercises posted online weekly, slides, script, live demos
Literature
Goodfellow et al., Deep Learning, MIT Press, https://www.deeplearningbook.org/
Eraslan et al., Deep learning: New computational modeling techniques for genomics, Nature Reviews Genetics, 2019
Module Exam
Description of exams and course work
Students are evaluated individually on their project work by the project supervisors, according to their performance:
- During the project work (motivation, problem-solving capacity, data analysis skills, programming capabilities).
- At the final presentation (clarity of presentation and slides, methods used, results achieved). 10 minutes.
- In the written report (conciseness, language, methods used). 20 pages maximum.
The final mark will be given by the supervisors who attend the final lectures.
Exam Repetition
The exam may be repeated at the end of the semester.