Background
Type: Conference Paper

Leveraging Retrieval-Augmented Generation for Persian University Knowledge Retrieval

Journal: ()
Year: 2024/01/01
Volume:
Issue:
Pages: 279 - 286
Hemmat A., Vadaei K., Heydari M.H., Fatemi A.
Green
DOI: 10.1109/IKT65497.2024.10892716
Language: English

Abstract

This paper introduces an innovative approach using Retrieval-Augmented Generation (RAG) pipelines with Large Language Models (LLMs) to enhance information retrieval and query response systems for university-related question answering. By systematically extracting data from the university’s official website, primarily in Persian, and employing advanced prompt engineering techniques, we generate accurate and contextually relevant responses to user queries. We developed a comprehensive university benchmark, UniversityQuestionBench (UQB), to rigorously evaluate our system’s performance. UQB focuses on Persian-language data, assessing accuracy and reliability through various metrics and real-world scenarios. Our experimental results demonstrate significant improvements in the precision and relevance of generated responses, enhancing user experiences, and reducing the time required to obtain relevant answers. In summary, this paper presents a novel application of RAG pipelines and LLMs for Persian-language data retrieval, supported by a meticulously prepared university benchmark, offering valuable insights into advanced AI techniques for academic data retrieval and setting the stage for future research in this domain. © 2024 IEEE.


Author Keywords

Academic Question Answering; Knowledge Retrieval; LLMs; Local Datasets; Data accuracy; Data mining; Information retrieval; Query languages; Structured Query Language

Other Keywords

Data accuracy; Data mining; Information retrieval; Query languages; Structured Query Language; Academic question answering; Data retrieval; Innovative approaches; Knowledge retrieval; Language model; Large language model; Local dataset; Persian languages; Persians; Question Answering