# DistilBertSumm

This code is entirely based on the work of @nlpyang (https://github.com/nlpyang/PreSumm), which implements the EMNLP 2019 paper *Text Summarization with Pretrained Encoders*.

PreSumm, the original project, builds summarization models on top of a pretrained encoder (BERT). My interest in this work is in building lighter versions of those models based on DistilBERT.

My first experiment was to build an extractive summarization model half the size of the original BertSumExt. Despite this drastic reduction in size, it reaches comparable results to BertSumExt (losing at most 1 ROUGE point).

My next experiments will be to build an abstractive summarization model and try to achieve state-of-the-art results.

Results on CNN/DailyMail (20/8/2019):

| Models | ROUGE-1 | ROUGE-2 | ROUGE-L |
| --- | --- | --- | --- |
| **Extractive** | | | |
| TransformerExt | 40.90 | 18.02 | 37.17 |
| BertSumExt | 43.23 | 20.24 | 39.63 |
| BertSumExt (large) | 43.85 | 20.34 | 39.90 |
| **Abstractive** | | | |
| TransformerAbs | 40.21 | 17.76 | 37.09 |
| BertSumAbs | 41.72 | 19.39 | 38.76 |
| BertSumExtAbs | 42.13 | 19.60 | 39.18 |
| **My experiments** | | | |
| DistilBertSumExt (mine) | 42.74 | 19.98 | 39.22 |
| DistilBertSumExtAbs (mine) | -- | -- | -- |
| DistilBertSumAbs (mine) | -- | -- | -- |

**Python version**: This code is written for Python 3.6.

**Package requirements**: torch==1.1.0, pytorch_transformers, tensorboardX, multiprocess, pyrouge

**Updates**: To encode a text longer than 512 tokens (for example 800), set max_pos to 800 during both preprocessing and training.
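
As a rough sketch of what setting max_pos > 512 involves (not this repository's actual code), the snippet below stretches BERT's 512 learned position embeddings by reusing the last one for the extra positions; the use of BertModel from pytorch_transformers and the 'bert-base-uncased' checkpoint here are assumptions for illustration only.

```python
import torch
from pytorch_transformers import BertModel  # listed in the package requirements above

max_pos = 800  # must match the value used during preprocessing

# Load a pretrained encoder (a DistilBERT encoder would be handled analogously).
bert = BertModel.from_pretrained('bert-base-uncased')

if max_pos > 512:
    old = bert.embeddings.position_embeddings.weight.data        # shape (512, hidden_size)
    new_emb = torch.nn.Embedding(max_pos, old.size(1))
    new_emb.weight.data[:512] = old                               # keep the learned positions
    # Initialize the extra positions by repeating the last learned embedding.
    new_emb.weight.data[512:] = old[-1][None, :].repeat(max_pos - 512, 1)
    bert.embeddings.position_embeddings = new_emb
```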

Some code is borrowed from ONMT (https://github.com/OpenNMT/OpenNMT-py).
