Conference paper Open Access
Every day, online newspapers and social media publish many articles covering the same news. To ease the exploration of news articles, sentence-based summarization algorithms aim to automatically generate, for each news item, a summary consisting of the most salient sentences in the original articles. However, since sentence selection is error-prone, the automatically generated summaries are still subject to manual validation by domain experts. If the validation step focuses not only on pruning less relevant content but also on enriching summaries with missing yet relevant sentences, this activity may become extremely time-consuming.
This paper focuses on summarizing news articles by means of an itemset-based technique. To tune summarizer performance, relevance feedback given on sentences is exploited to drive the generation of a new, more targeted summary. The feedback indicates the pertinence of the sentences that are already in the summary. Among the words or word combinations selected by the summarization model, those occurring in sentences with a high feedback score represent concepts that may be deemed particularly relevant. Therefore, they are exploited to drive the new sentence selection process.
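The feedback loop described above can be illustrated with a small sketch. This is not the paper's algorithm: the function names, the brute-force itemset counting, the greedy coverage-based selection, the feedback scale (scores in [0, 1], with 0.5 treated as neutral), and the multiplicative re-weighting are all assumptions made for illustration. Frequent itemsets (words or word pairs shared by several sentences) are weighted by their support, the weights of itemsets found in rated summary sentences are scaled by the feedback those sentences received, and the summary is then regenerated.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(sentences, min_support=2, max_size=2):
    """Count word sets of size 1..max_size; keep those found in >= min_support sentences."""
    counts = Counter()
    for words in sentences:
        unique = sorted(set(words))
        for size in range(1, max_size + 1):
            for combo in combinations(unique, size):
                counts[frozenset(combo)] += 1
    return {items: c for items, c in counts.items() if c >= min_support}


def itemset_weights(itemsets, sentences, feedback, boost=1.0):
    """Scale each itemset's support by the mean feedback of the rated sentences containing it.

    Assumption of this sketch: feedback maps a sentence index to a score in [0, 1],
    0.5 is neutral, and itemsets absent from every rated sentence keep their support.
    """
    weights = {}
    for items, support in itemsets.items():
        scores = [fb for i, fb in feedback.items() if items.issubset(sentences[i])]
        if scores:
            mean_fb = sum(scores) / len(scores)
            weights[items] = support * (1.0 + boost * (mean_fb - 0.5))
        else:
            weights[items] = float(support)
    return weights


def select_sentences(sentences, weights, k=2):
    """Greedily pick k sentences, each maximizing the weight of not-yet-covered itemsets."""
    chosen, covered = [], set()
    for _ in range(min(k, len(sentences))):
        def gain(i):
            return sum(w for items, w in weights.items()
                       if items not in covered and items.issubset(sentences[i]))
        candidates = [i for i in range(len(sentences)) if i not in chosen]
        best = max(candidates, key=gain)  # ties go to the earliest sentence
        chosen.append(best)
        covered |= {items for items in weights if items.issubset(sentences[best])}
    return chosen


# Toy, made-up collection: each sentence is already tokenized and stripped of stop words.
docs = [
    ["flood", "city", "evacuation"],
    ["flood", "city", "damage"],
    ["mayor", "press", "statement"],
    ["mayor", "press", "visit"],
    ["evacuation", "shelter", "open"],
    ["evacuation", "shelter", "full"],
]

itemsets = frequent_itemsets(docs)

# First pass: no feedback, so every weight equals the itemset support.
first = select_sentences(docs, itemsets_weights := itemset_weights(itemsets, docs, {}))
print(first)   # [0, 2]

# Second pass: the expert rates sentence 0 as pertinent and sentence 2 as not pertinent.
second = select_sentences(docs, itemset_weights(itemsets, docs, {0: 1.0, 2: 0.0}))
print(second)  # [0, 4]
```

In this toy run, the low score on sentence 2 lowers the weight of the concepts it contains, so the regenerated summary replaces it with a sentence about evacuation shelters. A real system would rely on a proper frequent itemset miner rather than exhaustive counting, which only scales to very small inputs.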
The proposed approach was tested on collections of news articles reporting on emergency situations. The results show its effectiveness.