Social media has become a dominant means for users to share their opinions, emotions, and everyday experiences. A large body of work has shown that informal exchanges on platforms such as online forums can be leveraged to supplement traditional approaches to a broad range of public health questions, such as monitoring depression, domestic abuse, cancer, and epidemics. In this project, we will look at public discourse through the lens of online forums (specifically, Reddit) in order to identify intersecting topics (e.g., health-related conditions, treatment costs, medical support accessibility, finance, misinformation, food & nutrition) around eye disorders. The student is expected to gain highly sought-after skills in data collection (i.e., crawling subreddits), text analytics (i.e., topic modelling), and biomedical text processing (i.e., entity linking and grounding on biomedical knowledge graphs such as SNOMED CT and UMLS). The student will have the opportunity to share the findings with the world-renowned ophthalmologist Professor Hiroshi Ishikawa, and to work closely with Dr Bhavna Antony (a former scientist at IBM and current research coordinator at Alfred Hospital). The findings of the work will be submitted to Bioinformatics.
- Proficiency in Python is required
- Working knowledge of NLP (in particular, named entity recognition and topic modelling) is required
- Familiarity with the gensim and PyTorch libraries is desired
- Familiarity with text-based Transformer models and Hugging Face is desired
- Very good verbal and written communication skills are required