Less is More for Long Document Summary Evaluation by LLMs

Yunshu Wu; Hayate Iso; Pouya Pezeshkpour; Nikita Bhutani; Estevam Hruschka

Main: Summarization Oral Paper

Session 8: Summarization (Oral)

Conference Room: Marie Louise 2

Conference Time: March 19, 16:00-17:30 (CET) (Europe/Malta)

Abstract: Large Language Models (LLMs) have shown promising performance in summary evaluation tasks, yet they face challenges such as high computational costs and the Lost-in-the-Middle problem where important information in the middle of long documents is often overlooked. To address these issues, this paper introduces a novel approach, Extract-then-Evaluate, which involves extracting key sentences from a long source document and then evaluating the summary by prompting LLMs. The results reveal that the proposed method not only significantly reduces evaluation costs but also exhibits a higher correlation with human evaluations. Furthermore, we provide practical recommendations for optimal document length and sentence extraction methods, contributing to the development of cost-effective yet more accurate methods for LLM-based text generation evaluation.
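
The two-stage idea in the abstract (extract key sentences, then prompt an LLM with the shortened source) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the word-overlap scorer is a simple stand-in for whatever extraction method is actually used, the prompt wording and the 1-to-5 scale are invented for the example, and `llm` is a hypothetical callable that maps a prompt string to a response string.

```python
import re
from typing import Callable, List, Set


def split_sentences(text: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokens(text: str) -> Set[str]:
    """Lowercased alphanumeric word types in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def extract_key_sentences(document: str, summary: str, max_words: int) -> List[str]:
    """Select source sentences under a word budget.

    Assumption: sentences are scored by word overlap with the summary,
    a stand-in for a real extraction method.
    """
    summary_tokens = tokens(summary)
    sentences = split_sentences(document)
    # Highest overlap first; sorted() is stable, so ties keep document order.
    ranked = sorted(
        range(len(sentences)),
        key=lambda i: -len(tokens(sentences[i]) & summary_tokens),
    )
    chosen, used = [], 0
    for i in ranked:
        n_words = len(sentences[i].split())
        if used + n_words <= max_words:
            chosen.append(i)
            used += n_words
    # Restore original document order for readability.
    return [sentences[i] for i in sorted(chosen)]


def build_prompt(extracted: List[str], summary: str) -> str:
    """Hypothetical prompt template; the wording is illustrative only."""
    source = " ".join(extracted)
    return (
        "Rate how well the summary reflects the source on a scale of 1 to 5.\n"
        f"Source: {source}\n"
        f"Summary: {summary}\n"
        "Score:"
    )


def extract_then_evaluate(
    document: str,
    summary: str,
    llm: Callable[[str], str],  # hypothetical: prompt in, raw response out
    max_words: int = 1000,      # assumed budget, not a recommended value
) -> str:
    """Extract key sentences, then ask the LLM to score the summary."""
    extracted = extract_key_sentences(document, summary, max_words)
    return llm(build_prompt(extracted, summary))
```

Because the LLM only sees the extracted sentences, the prompt length is bounded by `max_words` regardless of how long the source document is, which is where the cost reduction would come from in a setup like this.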