Working with the DUC 2002 collection for multi-document summarization, we considered two types of document sets: sets of closely correlated documents with highly overlapping content, and sets of diverse documents covering a wide range of topics. Intuitively, creating a quality summary should be more difficult in the latter case. The two types of document sets can be identified automatically by our document graph approach. However, human evaluators proved fairly insensitive to this difference when asked to rank the performance of automated summarizers. In this paper, we examine and analyze our experiment in order to better understand this phenomenon and how we might address it to improve summarization.
Research Department
Research Journal
In Proceedings of Hawaii International Conference on System Sciences, Hawaii,
Research Member
Research Rank
1
Research Year
2006
Research Abstract