0.0.3
Some updates:
- Added some more known bugs on seeding and non-deterministic cuda behavior
- Fixed bugs in padding for second stage.
- Differentiated scheduler between training transformers (LLM) in second or end-to-end stage vs training encoders in first stage
- A bit of re-organization to the structure of the configs/args
- Preprocessing pipeline cleanups