- 
                Notifications
    You must be signed in to change notification settings 
- Fork 19
Description
Hi,
I use spanbert large model with default parameters in config file, and I get Avg F1 78.27, lower than Avg.F1 79.9 in paper.
config as following:
num_docs = 2802
bert_learning_rate = 1e-05
task_learning_rate = 0.0003
max_segment_len = 512
ffnn_size = 3000
cluster_ffnn_size = 3000
max_training_sentences = 3
bert_tokenizer_name = bert-base-cased
max_top_antecedents = 50
max_training_sentences = 5
top_span_ratio = 0.4
max_num_extracted_spans = 3900
max_num_speakers = 20
max_segment_len = 256
Learning
bert_learning_rate = 1e-5
task_learning_rate = 2e-4
loss_type = marginalized  # {marginalized, hinge}
mention_loss_coef = 0
false_new_delta = 1.5  # For loss_type = hinge
adam_eps = 1e-6
adam_weight_decay = 1e-2
warmup_ratio = 0.1
max_grad_norm = 1  # Set 0 to disable clipping
gradient_accumulation_steps = 1
Model hyperparameters.
coref_depth = 1  # when 1: no higher order (except for cluster_merging)
higher_order = attended_antecedent # {attended_antecedent, max_antecedent, entity_equalization, span_clustering, cluster_merging}
coarse_to_fine = true
fine_grained = true
dropout_rate = 0.3
ffnn_size = 1000
ffnn_depth = 1
cluster_ffnn_size = 1000   # For cluster_merging
cluster_reduce = mean  # For cluster_merging
easy_cluster_first = false  # For cluster_merging
cluster_dloss = false  # cluster_merging
num_epochs = 24
feature_emb_size = 20
max_span_width = 30
use_metadata = true
use_features = true
use_segment_distance = true
model_heads = true
use_width_prior = true  # For mention score
use_distance_prior = true  # For mention-ranking score
Other.
conll_eval_path = dev.english.v4_gold_conll  # gold_conll file for dev
conll_test_path = test.english.v4_gold_conll  # gold_conll file for test
genres = ["bc", "bn", "mz", "nw", "pt", "tc", "wb"]
eval_frequency = 1000
report_frequency = 100