
Faithful and Salient Multimodal Data-to-Text Generation

Primary supervisor

Teresa Wang

Co-supervisors

Research area

Vision and Language

While large multimodal models (LMMs) achieve strong performance on many multimodal tasks, they can still hallucinate when generating text, and their ability to detect salient features in visual data remains unclear. This project will develop a framework for generating faithful and salient text from mixed-modal data comprising images and structured data.

