Advanced Deep Learning for Computer Vision: Visual Computing
Module IN2390
This module handbook describes the module's contents, learning outcomes, teaching methods and examination type, and links to current dates for courses and the module examination in the respective sections.
Basic Information
IN2390 is a one-semester module taught in English at Master’s level and offered every semester.
This module is included in the following catalogues within the study programs in physics:
- Catalogue of non-physics elective courses
Total workload | Contact hours | Credits (ECTS) |
---|---|---|
240 h | 75 h | 8 CP |
Content, Learning Outcome and Preconditions
Content
Note: this lecture is closely related to, and mutually exclusive with, the lecture “Advanced Deep Learning for Computer Vision: Dynamic Vision”. The two lectures share some theoretical content, but the “Visual Computing” module focuses on visual computing tasks, which is especially important for the practical part, a semester-long project.
Lectures shared with “ADL4CV: Dynamic Vision”:
- Recap of Neural Networks and CNNs
- Advanced Auto-encoders: Probabilistic approaches and the mathematical foundations (e.g., variational auto-encoders)
- Generative Adversarial Networks (from Goodfellow to CycleGANs and Progressive GANs)
- Autoregressive Networks and their parallelization
- Probabilistic vs deterministic generative methods
- Graph neural networks
- Transformer Networks
- Open Problems in Deep Learning for Computer Vision
Unique lectures for this module:
- Multi-dimensional CNNs: from audio to 3D scene environments; 3D vs multi-view CNNs, sparse CNNs (e.g., Octrees)
- Pointer Networks (focus on Scene Understanding, Meshes and 3D Geometry)
- Neural rendering: from novel viewpoint synthesis to video generation and editing
- Deep Fakes: Creation and Detection
- CNNs on meshes: learning on structured and unstructured graphs. Mesh data structures and derived convolution operators on meshes using differential geometry.
Learning Outcome
Upon completion of this module, students will have acquired extensive knowledge of the theoretical concepts behind advanced neural network architectures, in particular in the context of computer vision tasks in visual computing. In addition to the theoretical foundations, significant emphasis is placed on the practical realization and training of neural networks.
Preconditions
MA0902 Analysis for Informatics
MA0901 Linear Algebra for Informatics
IN2346 Introduction to Deep Learning (expert knowledge required!)
This is the advanced lecture for deep learning with a specific focus on computer vision. Taking the “Introduction to Deep Learning” course is expected.
Courses, Learning and Teaching Methods and Literature
Courses and Schedule
Type | SWS | Title | Lecturer(s) | Dates | Links |
---|---|---|---|---|---|
VO | 2 | Advanced Deep Learning for Computer Vision: Visual Computing (IN2390) | Chen, Z.; Franzmann, A.; Nie, Y.; Nießner, M.; Rössle, B.; … (6 in total) | Wed, 10:00–12:00, MI 01.09.014 | eLearning |
Learning and Teaching Methods
The lectures will cover in depth the theory of neural networks, in particular deep learning architectures, with an emphasis on advanced methods in the field of computer vision.
The practical sessions are key: students will become familiar with deep learning through hours of training and testing. They will work with PyTorch and implement advanced network architectures. The project will focus on visual computing, including the following topics:
- neural rendering
- generative neural networks (GANs)
- neural radiance fields
- deep fake generation
- media forensics (forgery detection)
- scene reconstruction (multi-view, depth sensors, etc.)
- generative geometric models
- semantic scene understanding (object detection, instance segmentation, semantic segmentation)
- 3D scene understanding for autonomous driving (e.g., with Lidar/Radar)
- reinforcement learning (e.g., for 3d modeling, 3d auto-scanning, 3d navigation)
- natural language processing (NLP) for scene understanding
We recommend taking a look at the recent list of publications at https://niessnerlab.org/ to get a better idea of recent research projects.
Media
Projector, blackboard, PC
Literature
- Slides given during the course
- www.deeplearningbook.org
Module Exam
Description of exams and course work
- Written test of 60 minutes at the end of the course (for the lecture)
- The lecture will have reading assignments (e.g., from the Deep Learning book and recent CVPR/ICCV/ECCV papers)
- After each practical session, students must submit their working code to the teaching assistant for evaluation. Students who successfully complete all practical assignments will be awarded a bonus.
- In the written exam, we will ask questions regarding lecture theory
- In addition to the written exam, the results of the projects will be evaluated; we will evaluate projects on a (bi-)weekly basis, including reports (33.33%), oral presentations (33.33%), and code/submissions (33.33%).
Exam Repetition
The exam may be repeated at the end of the semester.