Audio retrieval using text prompts

Primary supervisor

Thanh Thi Nguyen

This project aims to develop techniques that let users find relevant audio content by entering textual queries. The approach draws on machine learning, in particular natural language processing and audio signal processing, to bridge the gap between text and audio. When a user submits a query, the system analyses the text to understand its intent and context. It then searches a database of audio files, using techniques such as keyword extraction, semantic understanding, and speech recognition, to match the query with relevant audio clips. Recent state-of-the-art deep learning methods will be reviewed to identify their strengths and weaknesses, and then evaluated empirically to assess how they perform in practical applications and where they could be improved. Retrieval of this kind makes it more efficient to locate specific sounds, speech, or music within large audio collections.
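As a rough illustration of the query-to-clip matching step described above, the sketch below ranks audio clips by keyword overlap between a text query and each clip's text metadata. It is only a toy baseline under stated assumptions: the catalogue `CLIPS`, the stopword list, and the function names are all hypothetical, and the text descriptions stand in for tags or speech-recognition transcripts. The deep learning methods the project will review would typically replace the keyword vectors with learned text and audio embeddings.

```python
import math
import re
from collections import Counter

# Hypothetical toy catalogue: clip name -> text metadata (tags or a transcript).
CLIPS = {
    "rain_window.wav": "heavy rain falling on a window with distant thunder",
    "lecture_intro.wav": "speaker welcomes students to a machine learning lecture",
    "piano_solo.wav": "slow solo piano music in a quiet room",
}

# Small illustrative stopword list (an assumption, not a standard resource).
STOPWORDS = {"a", "an", "the", "on", "in", "of", "to", "with", "and"}


def keywords(text):
    """Lowercase the text, split it into words, and count the non-stopwords."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(t for t in tokens if t not in STOPWORDS)


def cosine(a, b):
    """Cosine similarity between two keyword-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, clips=CLIPS, top_k=2):
    """Return up to top_k (clip name, score) pairs with non-zero similarity."""
    q = keywords(query)
    scored = [(cosine(q, keywords(desc)), name) for name, desc in clips.items()]
    scored.sort(reverse=True)
    return [(name, score) for score, name in scored[:top_k] if score > 0]


if __name__ == "__main__":
    for name, score in retrieve("sound of rain and thunder"):
        print(f"{name}: {score:.3f}")
```

Keyword overlap only finds clips whose metadata shares words with the query, so a query such as "storm" would miss the rain clip; closing that gap is what the semantic, embedding-based methods are meant to address.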

Student cohort

Single Semester

Double Semester

Required knowledge

Python programming

Machine learning background
