Primary supervisor
Munawar Hayat
Research area
Computational and Collective Intelligence

Zero-shot detection aims to simultaneously identify and localize (by predicting bounding box coordinates) objects that were never observed during training. Existing zero-shot detection approaches project the visual features of seen objects into the semantic domain using textual embeddings that are learned in a stand-alone manner, without any joint incorporation of image data. This project aims to leverage recent developments in joint image-text modeling to find a more meaningful correspondence between visual features and their semantic embeddings. Specific challenges to address include improving localization for unseen objects, reducing confusion between background and unseen objects, and handling class imbalance.
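
As a rough illustration of the projection step described above, the following minimal sketch maps visual region features into a semantic space and assigns each region to the nearest class embedding by cosine similarity. It is not the method of any particular paper or of this project: the function names, dimensions, and random toy data are all hypothetical, and the projection matrix merely stands in for a mapping that would be learned on seen classes.

```python
import numpy as np


def cosine_scores(projected, class_embeddings):
    """Cosine similarity between each projected feature and each class embedding."""
    p = projected / np.linalg.norm(projected, axis=1, keepdims=True)
    c = class_embeddings / np.linalg.norm(class_embeddings, axis=1, keepdims=True)
    return p @ c.T


def classify_regions(region_features, projection, class_embeddings):
    """Project region features into the semantic space and pick the nearest class."""
    projected = region_features @ projection  # shape: (n_regions, d_sem)
    scores = cosine_scores(projected, class_embeddings)
    return scores.argmax(axis=1), scores


# Toy data: every value below is made up for illustration only.
rng = np.random.default_rng(0)
d_vis, d_sem = 8, 4
projection = rng.normal(size=(d_vis, d_sem))   # assumed to be learned on seen classes
unseen_embeddings = np.eye(d_sem)[:3]          # three hypothetical unseen classes
region_features = rng.normal(size=(5, d_vis))  # five hypothetical region proposals

labels, scores = classify_regions(region_features, projection, unseen_embeddings)
```

Because the class embeddings enter only at scoring time, embeddings for unseen classes can be swapped in without retraining the projection. In this simplified picture, that is also why the quality of the embeddings matters, which is the motivation for the jointly trained image-text embeddings the project proposes to explore.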