Dr. Wenguan Wang is a Lecturer and DECRA Fellow at the University of Technology Sydney, Australia. His research interests include Autonomous Driving, Human-Centered AI, and Embodied AI. He has published over 70 papers in journals and conferences such as TPAMI, TIP, TVCG, NeurIPS, CVPR, ICCV, ECCV, AAAI, and SIGGRAPH Asia, including one CVPR Best Paper Finalist, one CVPR workshop Best Paper, 12 top-conference Oral papers, and 16 TPAMI papers. He also serves as an Associate Editor for TCSVT and Neurocomputing. He has more than 10,000 Google Scholar citations and an h-index of 44. He has won awards in 15 international academic competitions. He has received several honors, including the Australian Research Council (ARC) Discovery Early Career Researcher Award (DECRA) (2021), Elsevier Highly Cited Chinese Researchers (2020, 2021), the World Artificial Intelligence Conference Youth Outstanding Paper Award (2020), the China Association of Artificial Intelligence Doctoral Dissertation Award (2019), the ACM China Doctoral Dissertation Award (2018), and the Baidu Scholarship (2016).
Personal website: https://sites.google.com/view/wenguanwang
Modeling Data Structures for Effective and Explainable Visual Recognition
Modern deep learning models for visual recognition are predominantly built upon the fully parametric, discriminative softmax classifier. Though straightforward, this de facto regime neglects the underlying data structure and therefore suffers from poor explainability and inferior robustness to out-of-distribution inputs. In this talk, taking image classification and segmentation as examples, I will present my recent work that addresses these limitations by revisiting classic prototype theory and exemplar-based classifiers. We further propose a new family of deep visual recognition models that learn generative, Gaussian Mixture Model (GMM) based classifiers together with end-to-end discriminative representations in a compact and collaborative manner. Together, these efforts lay a solid foundation for effective, robust, and explainable visual recognition.
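For readers unfamiliar with prototype-based classification, the basic idea can be illustrated with a toy nearest-centroid classifier. This is only a minimal sketch under stated assumptions, not the speaker's method: the function names `fit_centroids` and `predict`, the 2-D toy features, and the use of a single mean per class are all illustrative choices, whereas the models discussed in the talk operate on learned deep features.

```python
from math import dist


def fit_centroids(features, labels):
    """Average the feature vectors of each class into one prototype (centroid)."""
    grouped = {}
    for x, y in zip(features, labels):
        grouped.setdefault(y, []).append(x)
    # Column-wise mean of each class's feature vectors.
    return {y: [sum(col) / len(xs) for col in zip(*xs)] for y, xs in grouped.items()}


def predict(x, centroids):
    """Return the label whose centroid is closest to x in Euclidean distance."""
    return min(centroids, key=lambda y: dist(x, centroids[y]))


# Hypothetical toy data: two well-separated classes in 2-D.
train_x = [[0.0, 0.0], [0.0, 1.0], [4.0, 4.0], [4.0, 5.0]]
train_y = ["a", "a", "b", "b"]

centroids = fit_centroids(train_x, train_y)
preds = [predict(x, centroids) for x in [[0.2, 0.4], [3.9, 4.6]]]
```

Unlike a softmax layer, the decision here can be traced back to a concrete prototype in feature space, which is the kind of property the abstract refers to as explainability.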
Nov 24, 2022, 19:30
Zoom
References:
[1] Rethinking Semantic Segmentation: A Prototype View. CVPR 2022 (Oral).
[2] Visual Recognition with Deep Nearest Centroids.
[3] GMMSeg: Gaussian Mixture based Generative Semantic Segmentation Models. NeurIPS 2022.