Authors:
Tomohito Minami; Ryohei Orihara; Yasuyuki Tahara; Akihiko Ohsuga and Yuichi Sei
Affiliation:
The University of Electro-Communications, Chofu, Japan
Keyword(s):
Natural Language Processing, Natural Language Generation, Language Models, Preference Learning, Humor Generation, Pun Generation.
Abstract:
Puns are clever wordplays that exploit sound similarities while contrasting different meanings. Even with today’s advanced large language models, such puns remain challenging to create. This study focuses on generating Japanese juxtaposed puns that preserve the original meaning of input sentences. We propose a novel approach that applies Direct Preference Optimization (DPO) after supervised fine-tuning (SFT) of a pre-trained language model, using synthetic preference data generated by the SFT model to refine pun generation. Evaluated with a neural network-based metric and a rule-based metric designed to measure pun-ness, our approach improves on the baseline SFT model by 2.3 points and 7.9 points, respectively. These findings suggest that combining SFT with DPO enhances the model’s ability to capture the phonetic nuances essential for generating juxtaposed puns.
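To make the two-stage training recipe concrete, the sketch below applies SFT followed by DPO with the Hugging Face TRL library. The base model name, the Japanese example pairs, the preference-data format, and all hyperparameters are illustrative assumptions for a minimal runnable demo, not the authors’ reported setup.

```python
# A minimal sketch of the two-stage pipeline (SFT, then DPO) using the
# Hugging Face TRL library. The base model, dataset contents, and
# hyperparameters below are illustrative assumptions, not the authors'
# actual configuration.
from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer, SFTConfig, SFTTrainer

model_name = "llm-jp/llm-jp-3-1.8b"  # hypothetical Japanese base model

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Stage 1: supervised fine-tuning on (input sentence -> pun) pairs.
# "futon ga fukitonda" (the futon was blown away) ->
# "futon ga futtonda", a classic Japanese juxtaposed pun.
sft_data = Dataset.from_dict({
    "text": ["Input: 布団が吹き飛んだ。\nPun: 布団が吹っ飛んだ。"],
})
sft_trainer = SFTTrainer(
    model=model,
    processing_class=tokenizer,  # `tokenizer=` in older TRL releases
    train_dataset=sft_data,
    args=SFTConfig(output_dir="sft-pun", max_steps=1),
)
sft_trainer.train()

# Stage 2: DPO on synthetic preference pairs sampled from the SFT model.
# "chosen" keeps the meaning AND contains the sound-alike pair
# (arumi-kan / aru mikan); "rejected" is a pun-free paraphrase.
dpo_data = Dataset.from_dict({
    "prompt": ["Input: アルミ缶の上にみかんがある。\nPun:"],
    "chosen": [" アルミ缶の上にあるみかん。"],
    "rejected": [" 缶の上にみかんが置いてある。"],
})
dpo_trainer = DPOTrainer(
    model=sft_trainer.model,
    ref_model=None,  # TRL clones the policy as the frozen reference
    processing_class=tokenizer,
    train_dataset=dpo_data,
    args=DPOConfig(output_dir="dpo-pun", beta=0.1, max_steps=1),
)
dpo_trainer.train()
```

In this sketch the preference pairs are written by hand; in practice they would be built by sampling multiple candidate puns from the SFT model and ranking them with the pun-ness metrics, so that higher-scoring candidates become the "chosen" completions.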