The automatic analysis and indexing of media data is a prerequisite for effective search in multimedia collections. The learning objectives of the lecture are that students understand the methods needed to analyze image, audio and video data (with a focus on visual data) and can assess their advantages and disadvantages. Students also learn about quality measures for evaluating such methods, methods for visualizing and exploring media collections, and the structure of multimedia search engines, and understand the basic principles of each. Finally, students gain insight into how analysis methods can be implemented using software libraries.
Recommended prior knowledge: Foundations of Information Retrieval, Computer Vision or Image Processing, Pattern Recognition.
The course covers the following topics: 1. Introduction, building search engines for media data; 2. Semantic image, audio and video analysis, recognition of objects, scenes and events ("concept detection"); 3. Face detection and person recognition in images; 4. Multimodal person recognition in videos; 5. Temporal video segmentation ("cut detection"); 6. Text recognition in images and video OCR; 7. Similarity search; 8. Recognition of camera motion; 9. Visualizations for exploring media data.
Prof. Dr. Ralph Ewerth