Primary supervisor

Wray Buntine

Co-supervisors

  • Khoa Doan, VinUniversity

AI is starting to be used for reviewing at international AI conferences. For instance, http://paperreview.ai, announced by Andrew Ng, is an “Agentic Reviewer” trained on ICLR 2025 reviews, and Google is field-testing an AI-based Paper Assistant Tool for authors submitting to NeurIPS 2026. Significant research is going into developing these tools, and good test data collected from previous conferences already exists. In this project we will work with the VinUniversity AI group, which is developing a benchmarking framework for evaluating AI reviewing. Research papers usually contain important components such as hypotheses, claims, empirical evidence and literature reviews. One aspect not often explicitly mentioned is the assumptions underlying the research. A critical assumption concerns construct validity: that the empirical measurements in the paper do in fact measure the right thing. We will explore how such assumptions might be evaluated.

Aim/outline

Take the testbed of 1000 AI papers from VinUniversity and explore using AI to extract the assumptions they make about construct validity, along with other underlying assumptions. Develop a strategy for proposing assumptions for empirical AI papers, or for papers in other areas. Demonstrate the approach within the testbed. A minimal sketch of one possible extraction step is shown below.
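The sketch below is illustrative only, not part of the project specification: it assumes a chat-based LLM API (the OpenAI Python client is used here, and the model name is a hypothetical choice) to ask for construct-validity assumptions in one paper's text. Any comparable LLM interface could be substituted.

```python
# Minimal sketch: prompt a general-purpose LLM to list the construct-validity
# assumptions made in one paper. Assumes the `openai` package is installed and
# OPENAI_API_KEY is set; the model name below is a hypothetical choice.
from openai import OpenAI

client = OpenAI()

PROMPT = (
    "You are reviewing an empirical AI paper. List the assumptions the paper "
    "makes about construct validity, i.e. whether its measurements actually "
    "measure the constructs it claims to study. Return one assumption per line."
)

def extract_assumptions(paper_text: str) -> list[str]:
    """Ask the model for construct-validity assumptions in the given paper text."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # hypothetical model choice
        messages=[
            {"role": "system", "content": PROMPT},
            {"role": "user", "content": paper_text},
        ],
    )
    reply = response.choices[0].message.content or ""
    # Split the reply into one assumption per line, dropping blank lines.
    return [line.strip("- ").strip() for line in reply.splitlines() if line.strip()]

if __name__ == "__main__":
    # One paper from the testbed, converted to plain text (filename is illustrative).
    with open("example_paper.txt") as f:
        for assumption in extract_assumptions(f.read()):
            print(assumption)
```

In practice the project would run such an extraction over the full testbed and evaluate the proposed assumptions against the benchmarking framework.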

URLs/references

http://paperreview.ai

Required knowledge

Good experience using AI chatbots, some familiarity with reading AI research papers, and Python coding.