Large Language Models (LLMs) such as GPT, Llama, Qwen, and Mistral are increasingly used in commercial and academic applications. As more models become available, identifying which model generated a particular response, a task commonly called model fingerprinting, becomes important for copyright auditing, model verification, and AI transparency.
Current fingerprinting methods often rely on manually selected benchmark questions. However, designing discriminative questions by hand is time-consuming, and the resulting questions may not capture the behavioral differences that distinguish one model from another.