Commit 7705a7e

committed
Added documentation
1 parent 8797fb3 commit 7705a7e

File tree

1 file changed: +62 -0 lines changed

docs/user-guide/training.rst

Lines changed: 62 additions & 0 deletions
@@ -280,3 +280,65 @@ finished training. However if the user wants to restart the model from a
specific point they can do this by setting
``config.hardware.files.warm_start`` to be the checkpoint they want to
restart from.

*******************
Transfer Learning
*******************

Transfer learning allows the model to reuse knowledge from a previously
trained checkpoint. This is particularly useful when the new task is
related to the old one, enabling faster convergence and often improving
model performance.

To enable transfer learning, set the ``config.training.transfer_learning``
flag to ``True`` in the configuration file:

.. code:: yaml

   training:
      # start the training from a checkpoint of a previous run
      fork_run_id: '51a97d40a49e48d284494a3b5d87ef2b'
      load_weights_only: True
      transfer_learning: True

When this flag is active and a checkpoint path is specified in
``config.hardware.files.warm_start`` or ``self.last_checkpoint``, the
system loads the pre-trained weights using the
``transfer_learning_loading`` function. This approach ensures only
compatible weights are loaded and mismatched layers are handled
appropriately.

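Conceptually, the compatible-weight loading can be pictured as in the
sketch below, which assumes a standard PyTorch checkpoint holding a
``state_dict``; the ``load_compatible_weights`` helper is an illustrative
stand-in, not the actual ``transfer_learning_loading`` implementation.

.. code:: python

   import torch

   def load_compatible_weights(model: torch.nn.Module, checkpoint_path: str) -> torch.nn.Module:
       """Copy only the checkpoint entries whose name and shape match the model."""
       checkpoint = torch.load(checkpoint_path, map_location="cpu")
       ckpt_state = checkpoint.get("state_dict", checkpoint)
       model_state = model.state_dict()

       # Keep entries that exist in the model with an identical shape;
       # mismatched or missing layers keep their fresh initialisation.
       compatible = {
           name: tensor
           for name, tensor in ckpt_state.items()
           if name in model_state and tensor.shape == model_state[name].shape
       }
       model_state.update(compatible)
       model.load_state_dict(model_state)
       return model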

For example, transfer learning might be used to adapt a weather
forecasting model trained on one geographic region to another region
with similar characteristics.

****************
Model Freezing
****************

Model freezing is a technique where specific parts (submodules) of a
model are excluded from training. This is useful when certain parts of
the model have been sufficiently trained or should remain unchanged for
the current task.

To specify which submodules to freeze, use the
``config.training.submodules_to_freeze`` field in the configuration. List
the names of submodules to be frozen. During model initialization, these
submodules will have their parameters frozen, ensuring they are not
updated during training.

For example, with the following configuration, the processor will be
frozen and only the encoder and decoder will be trained:

.. code:: yaml

   training:
      # start the training from a checkpoint of a previous run
      fork_run_id: '51a97d40a49e48d284494a3b5d87ef2b'
      load_weights_only: True

      submodules_to_freeze:
         - processor

Freezing can be particularly beneficial in scenarios such as fine-tuning,
when only specific components (e.g., the encoder, the decoder) need to
adapt to a new task while keeping others (e.g., the processor) fixed.

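Under the hood, freezing simply switches off gradient updates for every
parameter of the named submodules. The sketch below illustrates the idea
in plain PyTorch; ``TinyGraphModel`` and ``freeze_submodules`` are
hypothetical stand-ins used for illustration, not the framework's actual
classes or functions.

.. code:: python

   from torch import nn

   class TinyGraphModel(nn.Module):
       """Stand-in with the same top-level submodule names as the example above."""

       def __init__(self) -> None:
           super().__init__()
           self.encoder = nn.Linear(8, 16)
           self.processor = nn.Linear(16, 16)
           self.decoder = nn.Linear(16, 8)

   def freeze_submodules(model: nn.Module, submodules_to_freeze: list[str]) -> None:
       """Disable gradient updates for every parameter of the named submodules."""
       for name in submodules_to_freeze:
           for param in getattr(model, name).parameters():
               param.requires_grad = False

   model = TinyGraphModel()
   freeze_submodules(model, ["processor"])  # mirrors ``submodules_to_freeze: [processor]``
   assert not any(p.requires_grad for p in model.processor.parameters())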
