Optimizing Abstractive Arabic Summarization via RLHF and DPO with Llama 2

Mram Kahla; Zijian Győző Yang

doi:10.14232/actacyb.316434

Authors

Mram Kahla, Faculty of Information Technology and Bionics, Pázmány Péter Catholic University, Budapest, Hungary. ORCID: https://orcid.org/0000-0001-9885-8184
Zijian Győző Yang, HUN-REN Hungarian Research Centre for Linguistics, Budapest, Hungary. ORCID: https://orcid.org/0000-0001-9955-860X

DOI:

https://doi.org/10.14232/actacyb.316434

Keywords:

abstractive summarization, Arabic, reinforcement learning, Direct Preference Optimization, RLHF, DPO, Llama 2

Abstract

Reinforcement Learning from Human Feedback (RLHF) and Direct Preference Optimization (DPO) have shown clear advantages for English, which makes it promising to explore their effectiveness for abstractive summarization in languages with complex morphological and syntactic features, such as Arabic. In this study, we fine-tune the Llama 2 model for this task and find that it substantially improves summarization results. We show how Llama 2, combined with RLHF and DPO, markedly improves the quality of abstractive Arabic summarization. The AraSum corpus also plays a critical role in achieving these results, which indicates its usefulness for training summarization models. While this work focuses on Arabic, the techniques and insights presented are language-agnostic and may apply to abstractive summarization in other languages.
