Releases: instructlab/training
v0.10.3
v0.11
What's Changed
- ci: Remove workflow that doesn't utilize training library (medium, -mp) by @booxter in #478
- Obey the FSDP sharding option default by @Maxusmusti in #486
- Change default internal sharding strategy to HYBRID_SHARD by @Maxusmusti in #488
- chore: Update the large e2e job to use fallback logic for selecting EC2 instances by @courtneypacheco in #491
- moves deepspeed requirements into their own file; add deepspeed extras by @JamesKunstle in #455
- chore: introduce dummy workflow by @cdoern in #497
- ci: Search for necessary instance for smoke job in multiple AZs by @booxter in #481
- ci: Fix -sdk fake workflow failure on actionlint by @booxter in #501
- build(deps): Bump actions/setup-python from 5.5.0 to 5.6.0 by @dependabot in #493
- use instructlab `constraints-dev.txt` in e2e test by @ktdreyer in #499
- build(deps): Bump step-security/harden-runner from 2.11.1 to 2.12.0 by @dependabot in #490
- ci: Use tox-current-env to reuse prepared venv with torch by @booxter in #482
- fix: extend nccl timeout by @cdoern in #507
- always log storage by @RobotSail in #510
- deps: Remove caps on ROCm dependencies by @courtneypacheco in #517
- ci: don't trigger pull_request_target job on its own workflow by @booxter in #519
- Enable pylint 'unused-argument' check by @fynnsu in #528
New Contributors
Full Changelog: v0.10.0...v0.11
v0.10.2 - Remove ROCm dependency caps
What's Changed
Full Changelog: v0.10.1...v0.10.2
v0.10.1 - Updating Default FSDP Sharding
What's Changed
- ci: Remove workflow that doesn't utilize training library (medium, -mp) by @booxter in #478
- Obey the FSDP sharding option default (backport #486) by @mergify in #487
- Change default internal sharding strategy to HYBRID_SHARD (backport #488) by @mergify in #489
Full Changelog: v0.10.0...v0.10.1
v0.10.0 - Updated FSDP Mixed Precision and Liger Kernel Model Option Support
What's Changed
- disables e2e-nvidia-l4-x1 test by @JamesKunstle in #454
- ci: Fix unit test run due to no tests found to execute by @booxter in #466
- ci: Don't run smoke tests when only irrelevant files are touched by @booxter in #460
- ci: don't waste ec2 resources on unit tests by @booxter in #464
- ci: Trigger unit test run on tox.ini change by @booxter in #469
- ci: Fix path filter for unit tests for the workflow file by @booxter in #461
- chore: Don't install pytest dependencies for coverage reports by @booxter in #468
- chore: Remove spell checks from the repo by @booxter in #458
- chore: Don't set ec2_runner_variant for unit tests by @booxter in #475
- Remove CHANGELOG.md by @booxter in #457
- Fix FSDP mixed precision setting and loss w/ accelerate by @Maxusmusti in #465
- fixes non-granite model instantiation with Liger Kernel by @JamesKunstle in #476
- ci: Install torch before flash-attn by @booxter in #474
- ci: Use pull_request as trigger for unit tests by @booxter in #473
- ci: Run unit tests for all supported python version, 3.11+ by @booxter in #472
- chore: Require python3.11+ by @booxter in #470
- chore: Drop pytest-asyncio by @booxter in #467
- chore: don't trigger unit tests for cuda and rocm requirements changes by @booxter in #463
- build(deps): Bump step-security/harden-runner from 2.10.4 to 2.11.1 by @dependabot in #452
- build(deps): Bump machulav/ec2-github-runner from 2.3.8 to 2.3.9 by @dependabot in #450
- build(deps): Bump aws-actions/configure-aws-credentials from 4.0.2 to 4.1.0 by @dependabot in #451
Full Changelog: v0.9.0...v0.10.0
v0.9.0
What's Changed
- build(deps): Bump machulav/ec2-github-runner from 2.3.8 to 2.3.9 by @dependabot in #431
- build(deps): Bump step-security/harden-runner from 2.11.0 to 2.11.1 by @dependabot in #439
- Adds Liger Kernels as optional optimization by @JamesKunstle in #441
- fix: model.forward now accepts return_dict via kwargs by @booxter in #443
- Adds smoke test workflow and tests by @JamesKunstle in #424
- change pytest targets. `test-unit` and `test-smoke` to `unit` and `smoke` by @JamesKunstle in #453
Full Changelog: v0.8.0...v0.9.0
Training Release v0.8.1
What's Changed
Full Changelog: v0.8.0...v0.8.1
Training Release v0.8.0
What's Changed
- fixes unit ec2-name by @JamesKunstle in #409
- build(deps): Bump pypa/gh-action-pypi-publish from 1.12.3 to 1.12.4 by @dependabot in #410
- build(deps): Bump machulav/ec2-github-runner from 2.3.7 to 2.3.8 by @dependabot in #408
- syncs unit testing workflow setup with e2e setup by @JamesKunstle in #411
- ci: Add OpenAI keys into CI by @alimaredia in #415
- build(deps): Bump actions/setup-python from 5.3.0 to 5.4.0 by @dependabot in #418
- build(deps): Bump sarisia/actions-status-discord from 1.15.2 to 1.15.3 by @dependabot in #413
- ci: Don't require secrets in medium e2e test by @danmcp in #421
- Using AutoConfig to load model config file. by @abhi1092 in #416
- build(deps): Bump aws-actions/configure-aws-credentials from 4.0.2 to 4.0.3 by @dependabot in #417
- build(deps): Bump step-security/harden-runner from 2.10.4 to 2.11.0 by @dependabot in #426
- build(deps): Bump aws-actions/configure-aws-credentials from 4.0.3 to 4.1.0 by @dependabot in #425
- Generic notebook support by @Maxusmusti in #432
- build(deps): Bump actions/setup-python from 5.4.0 to 5.5.0 by @dependabot in #435
- Updating README and adding reasoning SFT example by @Maxusmusti in #436
- Updates data processing logic to remove dependency on hardcoded chat templates by @RobotSail in #428
New Contributors
Full Changelog: v0.7.0...v0.8.0
v0.7.0
What's Changed
- docs: include docs on installing deepspeed w/ cpuadam by @RobotSail in #333
- ci: Upload phase 1 & phase 2 training logs for loss graphs by @alimaredia in #356
- Add disk check after tests run by @danmcp in #361
- Updated token masking for new data "unmask" option (for pretraining samples) by @Maxusmusti in #357
- build(deps): Bump hynek/build-and-inspect-python-package from 2.10.0 to 2.11.0 by @dependabot in #366
- build(deps): Bump pypa/gh-action-pypi-publish from 1.12.2 to 1.12.3 by @dependabot in #364
- build(deps): Bump step-security/harden-runner from 2.10.1 to 2.10.2 by @dependabot in #355
- feat: add discord e2e status reporting by @RobotSail in #376
- Adjust to slack-github-action 2.0 api changes by @danmcp in #351
- build(deps): Bump slackapi/slack-github-action from 1.27.0 to 2.0.0 by @dependabot in #349
- adds pytest to tox via `py3-unit` by @JamesKunstle in #378
- build(deps): Bump rhysd/actionlint from 1.7.4 to 1.7.6 in /.github/workflows by @dependabot in #383
- build(deps): Bump DavidAnson/markdownlint-cli2-action from 18.0.0 to 19.0.0 by @dependabot in #381
- gh/actions unit test workflows by @JamesKunstle in #384
- changes Fast unit CI runner, m8g->m7i by @JamesKunstle in #389
- chore: Change default temporary write directory in all e2e CI jobs from `tmpfs` to `/home/tmp` by @courtneypacheco in #390
- feat: retain only last checkpoint directory by @leseb in #358
- fix: `--keep_last_checkpoint_only` does not accept any values by @courtneypacheco in #397
- build(deps): Bump rhysd/actionlint from 1.7.6 to 1.7.7 in /.github/workflows by @dependabot in #400
- build(deps): Bump hynek/build-and-inspect-python-package from 2.11.0 to 2.12.0 by @dependabot in #406
- build(deps): Bump actions/stale from 9.0.0 to 9.1.0 by @dependabot in #405
- build(deps): Bump sarisia/actions-status-discord from 1.15.1 to 1.15.2 by @dependabot in #403
- build(deps): Bump step-security/harden-runner from 2.10.2 to 2.10.4 by @dependabot in #402
- build(deps): Bump DavidAnson/markdownlint-cli2-action from 19.0.0 to 19.1.0 by @dependabot in #401
- Remove optimum dependency by @fabiendupont in #407
New Contributors
- @alimaredia made their first contribution in #356
- @courtneypacheco made their first contribution in #390
- @leseb made their first contribution in #358
Full Changelog: v0.6.1...v0.7.0
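The `--keep_last_checkpoint_only` fix in v0.7.0 (#397) describes a flag that takes no value. As a rough illustration of that behavior, here is a minimal sketch using Python's standard `argparse`. The parser, the `train-sketch` program name, and the `build_parser` helper are assumptions made for this example; the project's actual argument handling is not shown in these notes and may differ.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser for illustration only; not the project's real CLI.
    parser = argparse.ArgumentParser(prog="train-sketch")
    # store_true makes the option a pure switch:
    # present on the command line -> True, absent -> False.
    parser.add_argument("--keep_last_checkpoint_only", action="store_true")
    return parser


parser = build_parser()

# Flag given without a value: the switch is on.
enabled = parser.parse_args(["--keep_last_checkpoint_only"]).keep_last_checkpoint_only

# Flag omitted: the switch is off.
disabled = parser.parse_args([]).keep_last_checkpoint_only

print(enabled, disabled)
```

With a `store_true` action, passing a value after the flag (for example `--keep_last_checkpoint_only true`) is rejected by `argparse` as an unrecognized argument, which matches the "does not accept any values" wording of the entry.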
v0.6.1
What's Changed
- fix: disable loss exporting for medium training job by @RobotSail in #347
- build(deps): Bump DavidAnson/markdownlint-cli2-action from 17.0.0 to 18.0.0 by @dependabot in #348
- Update Dependencies to Move DeepSpeed to CUDA Extras by @Maxusmusti in #350
Full Changelog: v0.6.0...v0.6.1