Retrieval-Augmented Generation (RAG) systems offer a promising approach to automating educational content creation, but challenges remain in optimizing retrieval configurations and ensuring instructional quality in outputs such as multiple-choice questions (MCQs). This thesis investigates how retrieval-stage design choices and large language model (LLM) selection affect the semantic precision and pedagogical alignment of MCQ generation in Java programming education. A modular two-phase RAG architecture was developed and evaluated. Phase 1 tested 432 retrieval configurations, varying chunking strategies, query processing, retrieval backends, and reranking methods. Configurations were scored using four reference-free metrics and ranked via a weighted composite scoring framework. Phase 2 used the best-performing retrieval setup to generate and evaluate 240 MCQs using five instruction-tuned LLMs under standardized conditions. Generated MCQs were scored with a rubric-based framework, implemented via GPT-4o, that assessed clarity, relevance, distractor quality, and cognitive alignment based on Bloom’s taxonomy. Results show that sliding window chunking and hybrid retrieval yield superior semantic performance. Claude 3 Opus and GPT-4o achieved the highest rubric scores, with 81% and 75% of their MCQs rated as Excellent, respectively. Qwen 2.5-7B followed closely at 67%, emerging as a competitive open-source alternative and outperforming Mistral 7B (40%) and LLaMA 3.1-8B (31%). The use of a standardized prompt schema and grounded retrieval contributed to cognitive diversity in the generated questions, including coverage of higher-order Bloom levels. The system offers a reproducible framework for scalable MCQ generation and retrieval evaluation, with implications for AI-assisted assessment in structured learning domains.
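The weighted composite ranking used in Phase 1 can be illustrated with a minimal sketch. The abstract does not name the four metrics, their weights, or the normalization used, so everything below is an assumption: the metric names (`metric_1` to `metric_4`), the equal weights, the min-max normalization, and the configuration labels and scores are invented placeholders, not values from the thesis.

```python
def min_max(values):
    """Rescale a list of raw scores to [0, 1]; constant lists map to 0.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def rank_configs(scores, weights):
    """Rank configurations by a weighted sum of min-max normalized metrics.

    scores:  {config_name: {metric_name: raw_value}}
    weights: {metric_name: weight}
    Returns a list of (config_name, composite_score), best first.
    """
    names = list(scores)
    composite = {name: 0.0 for name in names}
    for metric, weight in weights.items():
        normalized = min_max([scores[name][metric] for name in names])
        for name, value in zip(names, normalized):
            composite[name] += weight * value
    return sorted(composite.items(), key=lambda item: item[1], reverse=True)


# Placeholder inputs (hypothetical, not thesis data).
weights = {"metric_1": 0.25, "metric_2": 0.25, "metric_3": 0.25, "metric_4": 0.25}
scores = {
    "config_a": {"metric_1": 0.60, "metric_2": 0.50, "metric_3": 0.30, "metric_4": 0.20},
    "config_b": {"metric_1": 0.80, "metric_2": 0.70, "metric_3": 0.90, "metric_4": 0.60},
    "config_c": {"metric_1": 0.70, "metric_2": 0.40, "metric_3": 0.60, "metric_4": 0.40},
}

ranking = rank_configs(scores, weights)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Normalizing each metric before weighting keeps a metric with a wide numeric range from dominating the composite. A real evaluation would substitute the thesis's own metrics and weights.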
| Date of Award | 2025-Jun |
|---|---|
| Original language | English |
| Supervisor | Charlotte Sennersten (Supervisor) & Craig Lindley (Supervisor) |
- Bachelor programme in Computer Software Development
Enhancing RAG-Based MCQ Generation for Java Programming Education: A Modular Evaluation of Chunking, Retrieval and LLM Performance
Olibo, E. (Author). 2025-Jun
Student thesis: Bachelor