Analog Model Training Taking Longer in AIHWKIT, is this Normal? #695
Unanswered · adnanrana88 asked this question in General · Replies: 1 comment, 1 reply
-
Can you share the configuration you used for the analog HWA training? This is not normal behavior; it may be an issue with the way you configured the experiment. Please look at the example we used: https://github.com/IBM/aihwkit/blob/master/examples/06_lenet5_hardware_aware.py
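For reference, a minimal sketch of the digital-to-analog conversion flow that example illustrates is below. The layer sizes, learning rate, and the default `InferenceRPUConfig` are placeholder assumptions, not your actual setup (the example script customizes the forward and noise settings):

```python
from torch import nn

from aihwkit.nn.conversion import convert_to_analog
from aihwkit.optim import AnalogSGD
from aihwkit.simulator.configs import InferenceRPUConfig

# Placeholder digital 2-layer MLP; replace the sizes with your own model.
digital_model = nn.Sequential(
    nn.Linear(784, 256),
    nn.ReLU(),
    nn.Linear(256, 10),
)

# RPU configuration for hardware-aware training; defaults used here as an
# assumption, the example script sets forward/noise parameters explicitly.
rpu_config = InferenceRPUConfig()

# Convert the (pre-trained) digital model into an analog model.
analog_model = convert_to_analog(digital_model, rpu_config)

# Analog-aware optimizer; the learning rate is a placeholder value.
optimizer = AnalogSGD(analog_model.parameters(), lr=0.05)
optimizer.regroup_param_groups(analog_model)
```

If your configuration differs substantially from this pattern (for example, in how the RPU config is built or how the optimizer is set up), that would be the first place to look.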
-
Hi,
I am working with a 2-layer MLP model and using IBM AIHWKIT for hardware-aware (HWA) training. When training the digital model in PyTorch, I reach around 98% training accuracy and 91% validation accuracy in about 150 epochs.
However, after converting to an analog model, it takes nearly 1000 epochs to reach similar accuracy. Occasionally HWA needs fewer epochs, but in general it takes significantly longer.
• My RPU configuration seems correct for both loading and HWA training.
• Is this longer training time typical for HWA? Would training directly in analog (without starting from a digital model) make a difference, or would it be better than using the digital model?
Also, is it always necessary to start with a digital model before converting to analog, or can I train directly in analog from the start? What would you recommend?
Thank you for your suggestions!