Primary supervisor

Lizhen Qu

This project develops a focused prototype for LLM-assisted causal claim extraction and verification in mental health research. Clinical psychologists and psychiatrists increasingly rely on the rapidly growing biomedical literature to identify risk factors, evaluate treatments, and update practice, but the volume of new publications makes manual synthesis impractical. Large language models (LLMs) can read and summarise this literature at scale, but they routinely conflate correlation with causation, hallucinate findings, and offer no straightforward way for clinicians to check whether a stated causal claim is actually supported by the underlying evidence. This project investigates whether a lightweight multi-agent pipeline can extract causal claims from mental health publications, classify the strength of evidence supporting each claim, and flag claims that are unsupported, overstated, or contradicted by other sources.

The project aims to:

  • Build a pipeline that uses an LLM to extract structured causal claims (cause, effect, study population, study design, evidence type) from a curated corpus of mental health abstracts or full-text papers in a single chosen subdomain, such as depression risk factors or anxiety treatments.
  • Implement a small multi-agent setup with an extractor agent, a retrieval agent, and a critic agent, in which the critic checks each extracted claim against its source passage and against related claims surfaced by the retrieval agent, producing a calibrated confidence score and a short justification.
  • Construct a small evaluation set with expert or proxy-expert annotations and report precision, recall, and calibration of the pipeline, comparing single-LLM and multi-agent configurations and conducting an error analysis that distinguishes hallucinations, correlation-to-causation overreaches, and overlooked confounders.
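
The claim structure and evaluation metrics named in the aims above could be sketched roughly as follows. This is a minimal illustration, not the project's implementation: the `CausalClaim` class, its field names, and both functions are hypothetical. Matching predictions to annotations by a normalised (cause, effect) pair is a simplifying assumption, and expected calibration error is only one common way to measure calibration; the project description does not say which measure will be used. The sample claims are placeholders, not real findings.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass(frozen=True)
class CausalClaim:
    """Hypothetical record for one extracted claim (fields follow the first aim)."""
    cause: str
    effect: str
    population: str
    study_design: str
    evidence_type: str
    confidence: float  # assumed critic score in [0, 1]

    def key(self) -> Tuple[str, str]:
        # Assumption: two claims match if their normalised cause and effect match.
        return (self.cause.strip().lower(), self.effect.strip().lower())


def precision_recall(
    predicted: List[CausalClaim], gold: Set[Tuple[str, str]]
) -> Tuple[float, float]:
    """Precision and recall of predicted claim keys against annotated keys."""
    pred_keys = {claim.key() for claim in predicted}
    true_positives = len(pred_keys & gold)
    precision = true_positives / len(pred_keys) if pred_keys else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


def expected_calibration_error(
    predicted: List[CausalClaim], gold: Set[Tuple[str, str]], n_bins: int = 5
) -> float:
    """Weighted mean gap between confidence and accuracy over equal-width bins."""
    bins: List[List[Tuple[float, float]]] = [[] for _ in range(n_bins)]
    for claim in predicted:
        # Clamp so that a confidence of exactly 1.0 falls in the last bin.
        index = min(int(claim.confidence * n_bins), n_bins - 1)
        correct = 1.0 if claim.key() in gold else 0.0
        bins[index].append((claim.confidence, correct))

    total = len(predicted)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_confidence = sum(conf for conf, _ in members) / len(members)
        accuracy = sum(hit for _, hit in members) / len(members)
        ece += (len(members) / total) * abs(mean_confidence - accuracy)
    return ece


# Placeholder data for illustration only.
predicted = [
    CausalClaim("Exposure A", "Outcome X", "adults", "RCT", "experimental", 0.9),
    CausalClaim("Exposure B", "Outcome Y", "adults", "cohort", "observational", 0.6),
    CausalClaim("Exposure C", "Outcome Z", "adults", "survey", "observational", 0.2),
]
gold = {
    ("exposure a", "outcome x"),
    ("exposure b", "outcome y"),
    ("exposure d", "outcome w"),
}

precision, recall = precision_recall(predicted, gold)
ece = expected_calibration_error(predicted, gold)
print(f"precision={precision:.2f} recall={recall:.2f} ece={ece:.2f}")
```

Running the same functions on the outputs of a single-LLM configuration and a multi-agent configuration would give directly comparable numbers. A real evaluation would likely need a softer matching rule than exact string equality, since the same cause or effect can be phrased in many ways.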

The expected outcome is a working prototype and an empirical study that characterises where current LLMs succeed and fail at causal-claim extraction in mental health, providing a baseline and a curated evaluation set for follow-on PhD-scale work on full neuro-symbolic causal discovery from combined clinical and literature data.