Reproduced results are poor on the Overall metric

I was unable to reproduce the results on the Overall metric well. I ran training on a V100; my parameter settings and experimental results are below. What could be the reason, and how should I reproduce the results correctly? Thank you!
python main.py --token_level word-level \
            --model_type roberta \
            --model_dir dir_base \
            --task mixatis \
            --data_dir data \
            --attention_mode label \
            --do_train \
            --do_eval \
            --num_train_epochs 100 \
            --intent_loss_coef 0.5 \
            --learning_rate 1e-5 \
            --train_batch_size 32 \
            --num_intent_detection \
            --use_crf

python main.py --token_level word-level \
            --model_type roberta \
            --model_dir misca \
            --task mixatis \
            --data_dir data \
            --attention_mode label \
            --do_train \
            --do_eval \
            --num_train_epochs 100 \
            --intent_loss_coef 0.5 \
            --learning_rate 1e-5 \
            --num_intent_detection \
            --use_crf \
            --base_model dir_base \
            --intent_slot_attn_type coattention
![not_good_overall](https://github.com/VinAIResearch/MISCA/assets/145429002/4e2ac516-38eb-4bdc-9f97-d9388bf466dd)


