WHAT IS NEEDED FOR A GOOD SUMMARY? TWO DIFFERENT TYPES OF DOCUMENT SETS YET SEEMINGLY INDISTINGUISHABLE TO HUMAN USERS

Research Authors
Q. ZHAO, E. J. SANTOS, H. NGUYEN, AND A. MOHAMED
Research Department
Research Journal
IN PROCEEDINGS OF HAWAII INTERNATIONAL CONFERENCE ON SYSTEM SCIENCES, HAWAII,
Research Rank
1
Research Year
2006
Research Abstract

Working with the DUC 2002 collection for multi-document summarization, we considered two types of document sets: sets of closely correlated documents with highly overlapping content, and sets of diverse documents covering a wide scope of topics. Intuitively, creating a quality summary should be more difficult for the latter type. The two types of document sets can be identified automatically by our document graph approach. However, human evaluators proved fairly insensitive to this difference when they were asked to rank the performance of automated summarizers. In this paper, we examine and analyze our experiment in order to better understand this phenomenon and how we might address it to improve summarization.