
Commit fda918d

Update README.md
1 parent cc9408b commit fda918d

1 file changed (+27 −19)


README.md

# Summary Loop

This repository contains the code to apply the Summary Loop procedure to train a Summarizer in an unsupervised way, without example summaries.

<p align="center">
  <img width="460" height="300" src="https://people.eecs.berkeley.edu/~phillab/images/summary_loop.png">
</p>

## Training Procedure

We provide pre-trained models for each component needed in the Summary Loop Release:

- `keyword_extractor.joblib`: An sklearn pipeline that can be used to compute tf-idf scores of words according to the BERT vocabulary, which is used by the Masking Procedure (a loading sketch follows this list),
- `bert_coverage.bin`: A bert-base-uncased model finetuned on the task of Coverage for the news domain,
- `fluency_news_bs32.bin`: A GPT2 (base) model finetuned on a large corpus of news articles, used as the Fluency model,
- `gpt2_copier23.bin`: A GPT2 (base) model that can be used as a starting point for the Summarizer model.
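A minimal sketch of loading and querying the keyword extractor. The interface assumed here (`transform` and `get_feature_names_out`, as on a standard sklearn tf-idf vectorizer) is an assumption about the released artifact, not something documented in this README:

```
import joblib

# Load the released keyword extractor (path assumed to be the repo root).
keyword_extractor = joblib.load("keyword_extractor.joblib")

document = "The senate passed the new infrastructure bill on Tuesday."

# Assuming a TfidfVectorizer-like interface: one sparse row of tf-idf
# scores over the BERT vocabulary for the input document.
tfidf_row = keyword_extractor.transform([document])
vocab = keyword_extractor.get_feature_names_out()

# Rank the document's words by tf-idf score, highest first.
scores = {vocab[j]: tfidf_row[0, j] for j in tfidf_row.nonzero()[1]}
print(sorted(scores, key=scores.get, reverse=True)[:5])
```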
We also provide:
- `pretrain_coverage.py`: a script to train a Coverage model from scratch,
- `train_generator.py`: a script to train a Fluency model from scratch (we recommend finetuning the Fluency model on the domain of the summaries, such as news, legal, etc.).

Once all the pretrained models are ready, training a summarizer can be done with `train_summary_loop.py`:
```
python train_summary_loop.py --experiment wikinews_test --dataset_file data/wikinews.db
```

## Scorer Models

The Coverage and Fluency model scores can be used separately for analysis, evaluation, etc.
They are respectively in `coverage.py` and `fluency.py`; each model is implemented as a class with a `score(document, summary)` function.
Examples of how to run each model are included at the bottom of each class file.
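A minimal sketch of what calling the scorers directly might look like; the class names and constructor arguments below are assumptions (see the examples at the bottom of `coverage.py` and `fluency.py` for the actual entry points):

```
# Hypothetical usage; CoverageScorer / FluencyScorer are assumed names,
# not confirmed by this README.
from coverage import CoverageScorer
from fluency import FluencyScorer

document = "Full text of a news article ..."
summary = "A short candidate summary."

coverage = CoverageScorer()   # constructor arguments assumed
fluency = FluencyScorer()     # constructor arguments assumed

# Each scorer exposes score(document, summary); higher is assumed to be better.
print("Coverage score:", coverage.score(document, summary))
print("Fluency score:", fluency.score(document, summary))
```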

## Further Questions

Feel free to contact me at phillab@berkeley.edu to discuss the results, the code, or future steps.
