Loading Hierarchical models #22
Replies: 2 comments
-
Hi @Just-Strato, thanks for raising this issue. It's true that the current implementation does not support post-hoc evaluation for hierarchical models, i.e., training a model and then re-running the script for evaluation only. The saved parameters (
For this reason, I recently uploaded this demo script (https://github.com/coastalcph/lex-glue/blob/main/utils/load_hierbert.py), which loads the model using
I think the easiest (fastest) way to do that is by modifying the
Line 286 in 22b6513
Then you can re-load the model as presented in the demo code, something along these lines:
if not training_args.do_train and model_args.hierarchical:
    # Load Hierarchical BERT model
    model_state_dict = torch.load(f'{training_args.output_dir}/pytorch_model.bin', map_location=torch.device('cpu'))
    model.load_state_dict(model_state_dict)
So, the model will use all saved parameters from the
Would you like to give it a try? I am up for reviewing your PR, merging it into the codebase, and giving you credit 😄
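Why the order of wrapping and loading matters can be seen on a toy model. This is only a minimal sketch, not the lex-glue code: `TinyBackbone`, `TinyHierWrapper`, `seg_encoder` and `build` are hypothetical stand-ins, and the checkpoint is written to a temporary directory. The only thing it relies on is that PyTorch prefixes the parameters of a nested module with the attribute name it is stored under.

```python
import os
import tempfile

import torch
from torch import nn


class TinyBackbone(nn.Module):
    """Hypothetical stand-in for the base model (not the real BERT class)."""

    def __init__(self):
        super().__init__()
        self.encoder = nn.Linear(4, 4)
        self.classifier = nn.Linear(4, 2)


class TinyHierWrapper(nn.Module):
    """Hypothetical stand-in for a hierarchical wrapper around an encoder."""

    def __init__(self, encoder):
        super().__init__()
        self.encoder = encoder              # nested module: adds an extra "encoder." prefix
        self.seg_encoder = nn.Linear(4, 4)  # parameters that exist only after wrapping


def build(hierarchical):
    model = TinyBackbone()
    if hierarchical:
        # Swap the inner encoder for the wrapper, as a hierarchical variant would.
        model.encoder = TinyHierWrapper(model.encoder)
    return model


# "Training" run: save the state dict of the wrapped model.
trained = build(hierarchical=True)
path = os.path.join(tempfile.mkdtemp(), "pytorch_model.bin")
torch.save(trained.state_dict(), path)
state_dict = torch.load(path, map_location=torch.device("cpu"))

# Loading into the unwrapped model: the parameter names do not line up.
plain = build(hierarchical=False)
report = plain.load_state_dict(state_dict, strict=False)
print("missing:", report.missing_keys)
print("unexpected:", report.unexpected_keys)

# Wrapping first, then loading: every saved parameter finds its slot.
rewrapped = build(hierarchical=True)
rewrapped.load_state_dict(state_dict)  # strict by default, raises on any mismatch
```

With `strict=False` the first load does not raise, but it reports the doubled `encoder.encoder.*` names as unexpected, which mirrors the symptom described in this thread. The second load succeeds because the model has the same structure as the one that was saved.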
-
Thank you for your time @iliaschalkidis.
-
Hi, I used the scripts and everything worked fine; I was able to train the models without any trouble.
The results reported by the test run right after training are also coherent.
But the issue (at the end of the message) occurred when I tried to load the model in order to predict on other samples.
It is not possible to load the model because the layer names the model expects differ from the layer names in the saved file. As we can see in the error message (at the end), "encoder" appears twice in some layer names of the saved file. When loading, the model does not use those layer names.
This problem happens with the ECtHR (A & B) and SCOTUS tasks (maybe others too) with BERT models, and it seems to occur only when using the hierarchical variant. Without the hierarchical variant, we don't have any problem loading the models after saving them, but the results are not as good as they should be.
Do you have the same issue? I am using Ubuntu 20.04 with Python 3.8.
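A quick way to confirm this kind of name mismatch is to compare the two sets of parameter names directly. The sketch below is only illustrative: the key names are hypothetical, shortened examples (not copied from the actual checkpoint), and `diff_keys` and `has_repeated_component` are helper names made up for this example.

```python
def diff_keys(saved_keys, expected_keys):
    """Return (names only in the saved file, names only in the model), both sorted."""
    saved, expected = set(saved_keys), set(expected_keys)
    return sorted(saved - expected), sorted(expected - saved)


def has_repeated_component(key):
    """True if a non-numeric name component appears twice in a row, e.g. 'encoder.encoder'."""
    parts = key.split(".")
    return any(a == b and not a.isdigit() for a, b in zip(parts, parts[1:]))


# Hypothetical parameter names, for illustration only.
saved_keys = [
    "bert.encoder.encoder.layer.0.output.dense.weight",
    "bert.encoder.seg_encoder.layers.0.linear1.weight",
    "classifier.weight",
]
expected_keys = [
    "bert.encoder.layer.0.output.dense.weight",
    "classifier.weight",
]

only_in_file, only_in_model = diff_keys(saved_keys, expected_keys)
doubled = [k for k in only_in_file if has_repeated_component(k)]

print("only in saved file:", only_in_file)
print("only in model:", only_in_model)
print("doubled component:", doubled)
```

With a real checkpoint, `saved_keys` would come from the keys of the dictionary returned by `torch.load(...)`, and `expected_keys` from `model.state_dict().keys()`.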