Multi-Document Summarization Using Dependency Grammars

Advisor: Arzucan Ozgur

Assigned to: Saziye Betul Bilgin

Type:

Year: 2014

Status:

Summary:

Information overload has become one of the greatest challenges of recent years, driven largely by the rapid growth of data produced on the Internet. Automatic summarization of documents on similar topics is a prominent way to address this problem. There are two main approaches to the task: extractive multi-document summarization, where the summary is created by selecting salient sentences from the documents, and abstractive multi-document summarization, where new sentences are generated using natural language generation methods. Sentence similarity calculation plays an important role in most extractive multi-document summarization approaches. In this study, we introduce the use of dependency grammars to compute sentence similarity for the extractive multi-document summarization problem. We adapt two untyped dependency-tree-based sentence similarity kernels, originally proposed for relation extraction, to the multi-document summarization problem and investigate their effects. In addition, we propose a series of new dependency-grammar-based kernels that better represent the syntactic and semantic similarities among sentences. The proposed methods incorporate the type information of dependency relations into the sentence similarity calculation. Our best method achieves significantly better scores than the untyped dependency-tree-based kernels. We observe that representing sentences with dependency grammars leads to better results in finding similarities between sentences, and that the type of the dependency relations is crucial in identifying the important parts of sentences.
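To make the distinction between untyped and typed dependency information concrete, the following is a minimal sketch and not one of the kernels developed in the thesis. It assumes each sentence is already parsed into (head, relation, dependent) triples, which are written by hand here, and it substitutes a plain cosine similarity over edge counts for the tree kernels. The function names `edge_features`, `cosine`, and `sentence_similarity` are hypothetical.

```python
from collections import Counter
from math import sqrt

# Assumption: a sentence is a list of (head, relation, dependent) triples.
# These parses are hand-written for illustration, not real parser output.
S1 = [("chased", "nsubj", "dog"), ("chased", "dobj", "cat")]  # the dog chased the cat
S2 = [("chased", "nsubj", "cat"), ("chased", "dobj", "dog")]  # the cat chased the dog


def edge_features(triples, typed):
    """Count dependency edges, keeping or dropping the relation type."""
    if typed:
        return Counter(triples)
    return Counter((head, dep) for head, _rel, dep in triples)


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def sentence_similarity(s1, s2, typed=True):
    """Bag-of-edges similarity; a simple stand-in for a dependency tree kernel."""
    return cosine(edge_features(s1, typed), edge_features(s2, typed))


if __name__ == "__main__":
    print("untyped:", sentence_similarity(S1, S2, typed=False))
    print("typed:  ", sentence_similarity(S1, S2, typed=True))
```

In this toy case the untyped score is close to 1 because both sentences contain the same head-dependent pairs, while the typed score is 0 because no edge shares head, relation, and dependent. This only illustrates why relation types can matter; it makes no claim about the scores reported in the study.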
