
[Enhancement] From-scratch model pre-training #19

@athewsey

Description


This sample currently demonstrates:

  • Fine-tuning existing models for downstream tasks (NER), and
  • Continuation pre-training from an existing model checkpoint using unlabelled data (see the short sketch after this list).
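For context, the core of continuation pre-training is simply loading published checkpoint weights before resuming the masked-LM objective on new unlabelled data. A minimal sketch, assuming the HuggingFace transformers library (the checkpoint name is illustrative):

```python
from transformers import LayoutLMForMaskedLM

# Continuation pre-training: start from existing checkpoint weights, then
# keep optimizing the masked-LM objective on your own unlabelled documents.
model = LayoutLMForMaskedLM.from_pretrained("microsoft/layoutlm-base-uncased")
```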

From-scratch pre-training is considerably more resource-intensive: for example, the LayoutXLM paper describes pre-training on ~30M documents using 64 V100 GPUs (i.e., 8x p3.16xlarge or p3dn.24xlarge instances running for several hours).

However, some users may still be interested in from-scratch pre-training - especially for low-resource languages or specialised domains - if tested example code were available. A rough sketch of what that could look like follows below. Please drop a 👍 or a comment if this is an enhancement you'd find useful!
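In case it helps anyone exploring this before a tested example lands: the essential difference from continuation pre-training is initializing the model from a fresh config (random weights) rather than a checkpoint, typically paired with a tokenizer trained on the target corpus. A rough sketch, assuming HuggingFace transformers; `train_dataset` and the hyperparameters are placeholders, and layout (bounding-box) features are omitted for brevity:

```python
from transformers import (
    DataCollatorForLanguageModeling,
    LayoutLMConfig,
    LayoutLMForMaskedLM,
    LayoutLMTokenizerFast,
    Trainer,
    TrainingArguments,
)

# From-scratch pre-training: build the model from a config, so ALL weights
# are randomly initialized (no checkpoint is loaded).
config = LayoutLMConfig(
    vocab_size=30522,  # would match a custom tokenizer for a new language/domain
    hidden_size=768,
    num_hidden_layers=12,
    num_attention_heads=12,
    max_position_embeddings=512,
    max_2d_position_embeddings=1024,  # range of normalized bounding-box coordinates
)
model = LayoutLMForMaskedLM(config)

# For a genuinely new language/domain you would train a tokenizer on the target
# corpus; the published one is re-used here purely for illustration.
tokenizer = LayoutLMTokenizerFast.from_pretrained("microsoft/layoutlm-base-uncased")

# Standard masked-LM collator with dynamic 15% token masking. NOTE: a real
# LayoutLM pre-training run would also need `bbox` tensors in each example;
# they are left out of this sketch.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="pretrain-from-scratch",
        per_device_train_batch_size=8,
    ),
    data_collator=collator,
    train_dataset=train_dataset,  # placeholder: tokenized unlabelled documents
)
trainer.train()
```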
