Releases: pytorch/rl
v0.2.1: Faster parallel envs, fixes in transforms and M1 wheel fix
What's Changed
- [Feature] Warning for init_random_frames rounding in collectors by @matteobettini in #1616
- [Feature] Add support of non-pickable gym env by @duburcqa in #1615
- [BugFix] Add keys to GAE in PPO/A2C by @vmoens in #1618
- [BugFix] Fix gym benchmark by @vmoens in #1619
- [BugFix] Fix shape setting in CompositeSpec by @vmoens in #1620
- [Deprecation] Deprecate ambiguous device for memmap replay buffer by @vmoens in #1624
- [CI] Fix CI (python and cuda versions) by @vmoens in #1621
- [Feature] Max Value Writer by @albertbou92 in #1622
- [CI] Cython<3 for d4rl by @vmoens in #1634
- [BugFix] make cursor a torch.long tensor by @vmoens in #1639
- [BugFix] Gracefully handle C++ import error in TorchRL by @vmoens in #1640
- [Feature] step_and_maybe_reset in env by @vmoens in #1611
- [BugFix] Avoid overlapping temporary dirs during training by @vmoens in #1635
- [Feature] Exclude all private keys in collectors by @vmoens in #1644
- [BugFix] Fix tutos by @vmoens in #1648
- [Feature] Lazy imports for implement_for during torchrl import by @vmoens in #1646
- [Refactor] Put all buffers on CPU in examples by @vmoens in #1645
- [BugFix] Fix storage device by @vmoens in #1650
- [BugFix] Fix EXAMPLES.md by @vmoens in #1649
- [Release] 0.2.1 by @vmoens in #1642
New Contributors
Full Changelog: v0.2.0...v0.2.1
0.2.0: Faster collection, MARL compatibility and RLHF prototype
TorchRL 0.2.0
This release provides many new features and bug fixes.
TorchRL now publishes Apple Silicon compatible wheels.
We drop coverage of Python 3.7 in favour of Python 3.11.
New and updated algorithms
Most algorithms have been cleaned up and designed to reach (at least) SOTA results.
Compatibility with MARL settings has been drastically improved, and we provide a good number of MARL examples within the library.
A prototype RLHF training script is also proposed (#1597).
A whole new category of offline RL algorithms has been integrated: Decision Transformers.
- [Algorithm] Update offpolicy examples by @BY571 in #1206
- [Algorithm] Online Decision transformer by @BY571 in #1149
- [Algorithm] QMixer loss and multiagent models by @matteobettini in #1378
- [Algorithm] RLHF end-to-end, clean by @vmoens in #1597
- [Algorithm] Update A2C examples by @albertbou92 in #1521
- [Algorithm] Update DDPG Example by @BY571 in #1525
- [Algorithm] Update DT by @BY571 in #1560
- [Algorithm] Update PPO examples by @albertbou92 in #1495
- [Algorithm] Update SAC Example by @BY571 in #1524
- [Algorithm] Update TD3 Example by @BY571 in #1523
New features
One of the major new features of the library is the introduction of the terminated / truncated / done distinction at no cost within the library. All third-party and primary environments are now compatible with this, as well as losses and data collection primitives (collectors, etc.). This feature is also compatible with complex data structures, such as those found in MARL training pipelines.
All losses are now compatible with tensordict-free inputs, for a more generic deployment.
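To make the terminated / truncated / done distinction concrete, here is a minimal, library-independent sketch in plain Python. It assumes the common convention (as used by Gymnasium) that done is the logical OR of terminated and truncated, and that a value target should stop bootstrapping only when an episode terminates, not when it is merely truncated. The names td0_target and done_flag and all numbers are hypothetical and are not part of TorchRL's API.

```python
def done_flag(terminated, truncated):
    """Assumed convention: an episode is done if it terminated OR was truncated."""
    return terminated or truncated


def td0_target(reward, next_value, terminated, gamma=0.99):
    """One-step TD target that bootstraps unless the episode truly terminated."""
    bootstrap = 0.0 if terminated else 1.0
    return reward + gamma * next_value * bootstrap


# A time-limit truncation ends the trajectory but still bootstraps from next_value.
target_truncated = td0_target(1.0, 10.0, terminated=False)

# A true termination ends the trajectory and drops the bootstrap term.
target_terminated = td0_target(1.0, 10.0, terminated=True)
```

In both cases the done flag would be set, which is why a single done signal is not enough to compute the correct target.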
New transforms
Atari games can now benefit from an EndOfLifeTransform that allows the end-of-life signal to be used as a done state in the loss (#1605).
We provide a KL transform to add a KL factor to the reward in RLHF settings.
Action masking is made possible through the ActionMask transform (#1421).
VC1 is also integrated for better image embedding.
- [Feature] Allow sequential transforms to work offline by @vmoens in #1136
- [Feature] ClipTransform + rename min/maximum -> low/high by @vmoens in #1500
- [Feature] End-of-life transform by @vmoens in #1605
- [Feature] KL Transform for RLHF by @vmoens in #1196
- [Features] Conv3dNet and PermuteTransform by @xmaples in #1398
- [Feature, Refactor] Scale in ToTensorImage based on the dtype and new from_int parameter by @hyerra in #1208
- [Feature] CatFrames used as inverse by @BY571 in #1321
- [Feature] Masking actions by @vmoens in #1421
- [Feature] VC1 integration by @vmoens in #1211
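As an illustration of the idea behind action masking (#1421), the sketch below picks the greedy action among only those a boolean mask allows. It is written in plain Python and is not the ActionMask transform itself; the helper name masked_argmax and the example values are hypothetical.

```python
def masked_argmax(q_values, mask):
    """Return the index of the highest-valued action among those the mask allows."""
    allowed = [i for i, ok in enumerate(mask) if ok]
    if not allowed:
        raise ValueError("the mask must allow at least one action")
    return max(allowed, key=lambda i: q_values[i])


q_values = [0.2, 1.5, 0.7]
mask = [True, False, True]  # action 1 has the best value but is not allowed

best = masked_argmax(q_values, mask)
```

The same principle applies to sampling: forbidden actions are excluded before the choice is made rather than penalised afterwards.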
New models
We provide GRU alongside LSTM for POMDP training.
MARL model coverage is now richer, with MultiAgentMLP and MultiAgentCNN! Other improvements for MARL include coverage for nested keys in most places of the library (losses, data collection, environments...).
- [Feature] Support for GRU by @vmoens in #1586
- [Feature] TanhModule by @vmoens in #1213
- [Features] Conv3dNet and PermuteTransform by @xmaples in #1398
- [Feature] CNN version of MultiAgentMLP by @MarkHaoxiang in #1479
Other features (misc)
- [Feature] RLHF Rollouts (reopened) by @vmoens in #1329
- [Feature] Add CQL by @BY571 in #1239
- [Feature] Allow multiple (nested) action, reward, done keys in env, vec_env and collectors by @matteobettini in #1462
- [Feature] Auto-DoubleToFloat by @vmoens in #1442
- [Feature] CompositeSpec.lock by @vmoens in #1143
- [Feature] Device transform by @vmoens in #1472
- [Feature] Dispatch DiscreteSAC loss module by @Blonck in #1248
- [Feature] Dispatch PPO loss module by @Blonck in #1249
- [Feature] Dispatch REDQ loss module by @Blonck in #1251
- [Feature] Dispatch SAC loss module by @Blonck in #1244
- [Feature] Dispatch TD3 loss module by @Blonck in #1254
- [Feature] Dispatch for DDPG loss module by @Blonck in #1215
- [Feature] Dispatch for SAC loss module by @Blonck in #1223
- [Feature] Dispatch reinforce loss module by @Blonck in #1252
- [Feature] Distpatch IQL loss module by @Blonck in #1230
- [Feature] Fix DType casting lazy init by @vmoens in #1589
- [Feature] Heterogeneous Environments compatibility by @matteobettini in #1411
- [Feature] Log hparams from python dict by @matteobettini in #1517
- [Feature] MARL exploration e-greedy compatibility by @matteobettini in #1277
- [Feature] Make advantages compatible with Terminated, Truncated, Done by @vmoens in #1581
- [Feature] Make losses inherit from TDMBase by @vmoens in #1246
- [Feature] Making action masks compatible with q value modules and e-greedy by @matteobettini in #1499
- [Feature] Nested keys in OrnsteinUhlenbeckProcess by @matteobettini in #1305
- [Feature] Optional mapping of "state" in gym specs by @matteobettini in #1431
- [Feature] Parallel environments lazy heterogenous data compatibility by @matteobettini in #1436
- [Feature] Pettingzoo: add multiagent dimension to single agent groups by @matteobettini in #1550
- [Feature] RLHF Reward Model (reopened) by @vmoens in #1328
- [Feature] RLHF dataloading by @vmoens in #1309
- [Feature] RLHF networks by @apbard in #1319
- [Feature] Refactor categorical dists: Masked one-hot and pass-through gradients by @vmoens in #1488
- [Feature] ReplayBuffer.empty by @vmoens in #1238
- [Feature] Separate losses by @MateuszGuzek in #1240
- [Feature] Single call to value network in advantages [bis] by @vmoens in #1263
- [Feature] Single call to value network in advantages by @vmoens in #1256
- [Feature] TensorStorage by @vmoens in #1310
- [Feature] Threaded collection and parallel envs by @vmoens in #1559
- [Feature] Unbind specs by @vmoens in #1555
- [Feature] VMAS obs dict by @matteobettini in #1419
- [Feature] VMAS: choose between categorical or one-hot actions by @matteobettini in #1484
- [Feature] dispatch for DQNLoss by @vmoens in #1194
- [Feature] log histograms by @vmoens in #1306
- [Feature] make csv logger exist_ok on logging folder by @matteobettini in #1561
- [Feature] shifted for all adv by @vmoens in #1276
New environments and third-party improvements
We now cover SMAC-v2, PettingZoo, IsaacGymEnvs (prototype) and RoboHive. The D4RL dataset can now be used without the eponymous library, which permits training with more recent or older versions of gym.
- [Environment, Docs] SMACv2 and docs on action masking by @matteobettini in #1466
- [Environment] Petting zoo by @matteobettini in #1471
- [Feature] D4rl direct download by @MateuszGuzek in #1430
- [Feature] Gym 'vectorized' envs compatibility by @vmoens in #1519
- [Feature] Gym compatibility: Terminal and truncated by @vmoens in #1539
- [Feature] IsaacGymEnvs integration by @vmoens in #1443
- [Feature] RoboHive integration by @vmoens in #1119
Performance improvements
We provide several speed improvements, in particular for data collection.
v0.1.1
What's Changed
- [Feature] Stacking specs by @vmoens in #892
- [Feature] Multicollector interruptor by @albertbou92 in #963
- [BugFix] VMAS api fix by @matteobettini in #978
- [CI] Fix D4RL tests in CI by @vmoens in #976
- [CI] Fix CI by @vmoens in #982
- [Refactor] Binary spec inherits from discrete spec by @matteobettini in #984
- [Feature] _DataCollector -> DataCollectorBase by @vmoens in #985
- [Feature] Discrete SAC by @BY571 in #882
- [Refactor, Doc] Refactor refs to SafeModule to TensorDictModule unless necessary by @vmoens in #986
- [BugFix] Quickfix by @vmoens in #991
- [Feature] Add Dropout to MLP module by @BY571 in #988
- [Feature] Warn when collectors collect more frames than requested by @matteobettini in #989
- [BugFix] make "_reset", "step_count", and other done_based keys follow done_spec by @matteobettini in #981
- [Feature] Bandit datasets by @vmoens in #912
- [BugFix] Fix sampling in PPO tutorial by @vmoens in #996
- [Refactor] Refactor losses (value function, doc, input batch size) by @vmoens in #987
- [BugFix,Feature,Doc] Fix replay buffers sampling info, docstrings and iteration by @vmoens in #1003
- [Feature] Replace ValueError by warning in collectors when total_frames is not an exact multiple of frames_per_batch by @albertbou92 in #999
- [BugFix] Only call replay buffer transforms when there are by @vmoens in #1008
- [BugFix] Patch tests in 1008 by @vmoens in #1009
- [Feature] Multidim value functions by @vmoens in #1007
- [BugFix] Fix exploration (OU and Gaussian) by @vmoens in #1006
- [CI] Fix python version in habitat by @vmoens in #1010
- Advantages pass time_dim and docfix by @matteobettini in #1014
- [Refactor] Faster transformed distributions by @vmoens in #1017
- [WIP, CI] Upgrade cuda channel by @vmoens in #1019
- [BugFix] Fix collector reset with truncation by @vmoens in #1021
- [Refactor] Improve collector performance by @matteobettini in #1020
- [BugFix] Fix params and buffer casting for policies by @vmoens in #1022
- [Feature] PPO allow entropy logging when entropy_coeff is 0 by @matteobettini in #1025
- [Feature] Distributed data collector (ray) by @albertbou92 in #930
- [Refactor] Minor changes in tensordict construction by @vmoens in #1029
- [CI] Fix Brax 0.9.0 by @vmoens in #1011
- [Feature] Multiagent API in vmas by @matteobettini in #983
- [Feature] Benchmarking worflow by @vmoens in #1028
- [Benchmark] Fix adv benchmark by @vmoens in #1030
- [Doc] Refactor DDPG and DQN tutos to narrow the scope by @vmoens in #979
- Revert "[Doc] Refactor DDPG and DQN tutos to narrow the scope" by @vmoens in #1032
- [BugFix] Advantage normalisation in ClipPPOLoss is done after computing gain1 by @albertbou92 in #1033
- [BugFix] Codecov SHA error by @vmoens in #1035
- [Doc] DDPG and DQN refactoring -- Doc cleaning by @vmoens in #1036
- [BugFix,CI] Fix macos codecov install by @vmoens in #1039
- [BugFix] kwargs update in distributed collectors by @vmoens in #1040
- [Feature] make_composite_from_td by @vmoens in #1042
- [Refactor] Import envpool locally to avoid importing gym at root level by @vmoens in #1041
- [Minor] Fix a typo by @FrankTianTT in #1046
- [BugFix] Fix param tying in loss modules by @vmoens in #1037
- [Refactor] less ad-hoc disable_env_checker check by @vmoens in #1047
- [Refactor] Improve distributed collectors by @vmoens in #1044
- [Doc] Document tensordict modules by @vmoens in #1053
- [Doc] Minor changes to contributing.md by @vmoens in #1054
- [Doc] A bit more doc on modules by @vmoens in #1056
- [Refactor] Import enum and interaction_type utils by @Goldspear in #1055
- [Feature] Deduplicate calls to common layers in PPO by @vmoens in #1057
- [BugFix] CompositeSpec nested key deletion by @btx0424 in #1059
- [Feature] Add MaskedCategorical distribution by @xiaomengy in #1012
- [Refactor] resetting envs in collectors always passes the _reset entry by @vmoens in #1061
- [Refactor] Better integration of QValue tools by @vmoens in #1063
- MUJOCO_INSTALLATION.md: Fix typo by @traversaro in #1064
- [Refactor] Removes "reward" from root tensordicts by @vmoens in #1065
- [Test] Fix tests for older pytorch versions by @vmoens in #1066
- [Feature] Reward2go Transform by @BY571 in #1038
- [CI] Reduce tests by @vmoens in #1071
- [Feature] Skip existing for advantage modules by @vmoens in #1070
- [BugFix] Fix parallel env data passing on cuda by @vmoens in #1024
- [Refactor] Deprecate interaction_mode by @vmoens in #1067
- [Doc] Update KB: cannot find -lGL by @vmoens in #1073
- [Doc] fix figures display issues in documentation of actors.py by @DamienAllonsius in #1074
- [Example] PPO simplified example by @albertbou92 in #1004
- [Feature] Update td in step (not overwrite) by @vmoens in #1075
- [CI] Remove migrated CircleCI macOS jobs by @seemethere in #1069
- [Feature] Target Return Transform by @BY571 in #1045
- [Test] Fix tensorboard tests with ImageIO 2.26 by @vmoens in #1083
- [Feature] LSTMModule by @vmoens in #1084
- [BugFix] Change default of skip_existing to None by @tcbegley in #1082
- [Example] A2C simplified example by @albertbou92 in #1076
- [BugFix] Fix output_spec transform calls by @vmoens in #1091
- [Feature] Indexing Discrete and OneHot specs by @remidomingues in #1081
- [Refactor] Refactor DQN by @vmoens in #1085
- [Feature] Auto-init updaters and raise a warning if not present by @vmoens in #1092
- [BugFix] Remove false warnings in losses by @vmoens in #1096
- [CI, BugFix] Fix CI warnings and errors by @vmoens in #1100
- [Refactor] Update vmap imports to torch by @vmoens in #1102
- [Refactor] Make advantages non-differentiable by default (except in losses) by @vmoens in #1104
- [Feature] Indexing specs by @remidomingues in #1105
- [BugFix] Fix EnvPoool by @vmoens in #1106
- [Feature,Doc] QValue refactoring and QNet + RNN tuto by @vmoens in #1060
- [BugFix] Fix Gym imports by @vmoens in #1023
- [CI] pytest should not skip tests for dependencies by @rohitnig in #1048
- [BugFix, Doc] Fix tutos by @vmoens in #1107
- [CI] Fix tutos (2) by @vmoens in #1109
- [Doc] Fix doc rendering by @vmoens in #1112
- Added the entry for skip-tests in the environment.yml by @rohitnig in #1113
- [CI] Upgrade ubuntu version in GHA by @vmoens in #1116
- Fix in windows unit test by @mischab in #1099
- Revert "Fix in windows unit test" by @mischab in #1117
- [Nova] Lint job on GHA by @osalpekar in #1114
- [Nova] Remove CircleCI Wheels Builds by @osalpekar in #1121
- [BugFix] Set exploration mode to MODE in all losses by default by @vmoens in #1123
- [BugFix] Instruct the v...
v0.1.0 - Beta
First official beta release of the library!
What's Changed
- QuickFix Versioning by @fedebotu in #958
- Version 0.0.5 by @vmoens in #957
- [Minor] Warning when loading memmap storage on uninitialized td by @vmoens in #961
- [Refactor] Defaults split_trajs to False by @vmoens in #947
- [Feature] InitTracker transform by @vmoens in #962
- [Feature] RenameTransform by @vmoens in #964
- [Feature] Implicit Q-Learning (IQL) by @BY571 in #933
- [Refactor] Refactor data collectors constructors by @vmoens in #970
- [Feature, Refactor] Iterable replay buffers by @vmoens in #968
- [Doc] README rewrite by @vmoens in #971
- [Refactor] A less verbose torchrl by @vmoens in #973
- [Feature] torch.distributed collectors by @vmoens in #934
- [Feature] Offline datasets: D4RL by @vmoens in #928
Full Changelog: v0.0.5...v0.1.0
0.0.5
We change the env.step API, see #941 for more info.
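Going by the title of #941 ("Refactor the step to include reward and done in the 'next' tensordict"), the result of a step now carries reward and done under a "next" entry rather than at the root. The sketch below mimics that layout with plain nested dicts instead of tensordicts. The functions toy_step and advance, and the toy dynamics, are hypothetical and only illustrate the data layout, not the library's implementation.

```python
def toy_step(state, action):
    """Hypothetical env step: results of the transition are written under "next"."""
    next_obs = state["observation"] + action
    return {
        "observation": state["observation"],  # what the policy saw
        "action": action,                     # what the policy did
        "next": {
            "observation": next_obs,          # what resulted from the action
            "reward": float(next_obs),
            "done": next_obs >= 3,
        },
    }


def advance(step_output):
    """Promote the next observation to the root to prepare the following step."""
    return {"observation": step_output["next"]["observation"]}


out = toy_step({"observation": 0}, action=1)
state = advance(out)
```

With this layout, everything that results from an action sits in one place, and the root only holds the inputs to that action.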
What's Changed
- [BugFix] Fix dreamer training loop by @vmoens in #915
- [Doc] PPO Tutorial by @vmoens in #913
- [Doc] Create your pendulum tutorial by @vmoens in #911
- [BugFix] Deploy doc by @vmoens in #920
- [BugFix] Nvidia not found fix by @vmoens in #922
- [Feature] Rework to_one_hot and to_categorical to take a tensor as parameter by @riiswa in #816
- [Doc] Tutorial revamp by @vmoens in #926
- [BugFix] Fix EnvPool spec shapes by @vmoens in #932
- [BugFix] Fix CompositeSpec.to_numpy method by @riiswa in #931
- [CI] Do not run nightly workflows on forked repos by @XuehaiPan in #936
- [Refactor] set_default -> setdefault by @tcbegley in #935
- [BugFix] Step and maybe reset by @vmoens in #938
- [Doc] Minor doc improvements by @vmoens in #907
- [Doc] Add debug doc by @acohen13 in #940
- [BugFix] Propagate args to TransformedEnv's state_dict by @fedebotu in #944
- [BugFix] Vmas expanded specs by @matteobettini in #942
- [Quality] RB constuctors cleanup by @vmoens in #945
- [Doc] Refactor KB by @vmoens in #946
- [BugFix] Upgrade vision's functional import by @vmoens in #948
- [BugFix] Deprecate tensordict.set check skips in transforms by @vmoens in #951
- [BugFix] Upgrade tensordict deps by @vmoens in #953
- [CI] Fix windows CI by @vmoens in #954
- [Refactor] Refactor composite spec keys to match tensordict by @vmoens in #956
- [Refactor] Refactor the step to include reward and done in the 'next' tensordict by @vmoens in #941
New Contributors
- @XuehaiPan made their first contribution in #936
- @acohen13 made their first contribution in #940
- @fedebotu made their first contribution in #944
Full Changelog: v0.0.4...v0.0.5
v0.0.4-beta
What's Changed
- [CI, Doc] Update functorch source installation command by @zou3519 in #446
- [BugFix] TransformedEnv attributes inheritance by @vmoens in #467
- [Feature] Cleanup mocking envs init and new by @vmoens in #469
- [Tests] Adding tensordict __repr__ tests by @sladebot in #435
- [Logging]: implement MLFlow logging integration by @rayanht in #432
- [BugFix] MLFlow import fix by @vmoens in #473
- [BugFix] Fixed pip install by @brandonsj in #475
- [Features]: Changed _inplace_update cls parameter passing in __new__ by @nicolas-dufour in #464
- [Feature]: ModelBased Envs by @nicolas-dufour in #333
- [Feature] make ReplayBufferTrainer compatible with storing trajectories by @vmoens in #476
- [Tutorial] DQN tutorial by @vmoens in #474
- [Feature] reader hooks for GymLike by @vmoens in #478
- [BugFix] TensorSpec.zero(None) failure fix by @vmoens in #483
- [Feature]: Support for planners and CEM by @nicolas-dufour in #384
- [Feature] Replaced device_safe() with device by @ordinskiy in #485
- [Feature]: TensorDictPrimer transform by @nicolas-dufour in #456
- [Feature]: erase() method for torchrl.timeit by @nicolas-dufour in #480
- [Feature] Added support for single collector in sync_async_collector by @nicolas-dufour in #482
- [BugFix] removing unwanted device_safe() by @vmoens in #486
- [Refactoring] Refactored get_stats_random_rollout by @nicolas-dufour in #481
- [Feature] VIP Integration by @JasonMa2016 in #487
- [Refactoring] Minor tweaks to recorder and logger by @nicolas-dufour in #489
- [Feature]: Deactivate typechecks in envs by @nicolas-dufour in #490
- [BugFix] Vectorized td_lambda with gamma tensor does not match the serial version by @vmoens in #400
- [BugFix] Fix TensorDictPrimer init by @vmoens in #491
- [Feature] Optional auto-reset when done for collectors and batched envs by @vmoens in #492
- [BugFix] Defaulting passing_devices to None by @himjohntang in #477
- Revert "[BugFix] Defaulting passing_devices to None" by @vmoens in #494
- [BugFix] Multi-agent fixes by @vmoens in #488
- [BugFix] Defaulting passing_devices to None by @vmoens in #495
- [Feature] Lazy initialization of CatTensors by @vmoens in #497
- [Cleanup] Removing cuda 10.2 references by @vmoens in #498
- [BugFix] Migration to pytorch org by @vmoens in #499
- [Refactoring] Import at root to enable vmap monkey-patching by @vmoens in #500
- [BugFix] python version for linting checks by @vmoens in #502
- [Feature] Replay Buffers refactor by @bamaxw in #330
- [Feature] Rename step_tensordict in step_mdp by @romainjln in #512
- [Lint] re-instantiate F821 by @vmoens in #516
- [BugFix] run_type_checks for TransformedEnvs by @vmoens in #513
- [BugFix] making first_dim and last_dim negative in FlattenObservation when a parent is set by @vmoens in #511
- [Feature] Add info dict key-spec pairs to observation_spec by @tcbegley in #504
- [BugFix] Changing the dm_control import to fail if not installed by @zeenolife in #515
- [CI] Add coverage with codecov by @silvestrebahi in #523
- Revert "[CI] Add coverage with codecov" by @vmoens in #525
- [Quality] Use relative imports for local c++ deps by @apbard in #526
- [Feature] Nightly release by @vmoens in #519
- [Feature] Add make_tensordict() function by @sicong-huang in #522
- [Doc] Misc readme fixes by @GavinPHR in #532
- [BugFix] Replacing inference_mode decorator with no_grad to fix state_dict loading error by @GavinPHR in #530
- [BugFix] Transformed ParallelEnv meta data are broken when passing to device by @vmoens in #531
- [Doc] Add coverage banner by @vmoens in #533
- [BugFix] Fix colab link of coding_dqn.ipynb by @Benjamin-eecs in #543
- [BugFix] Fix optional imports by @vmoens in #535
- [BugFix] Restore missing keys in data collector output by @tcbegley in #521
- [Lint] reorganize imports by @apbard in #545
- [BugFix] Single-cpu compatibility by @vmoens in #548
- [BugFix] vision install and other deps in optdeps by @vmoens in #552
- [Feature] Implemented device argument for modules.models by @yushiyangk in #524
- [BugFix] Fix ellipsis indexing of 2d TensorDicts by @vmoens in #559
- [BugFix] Additive gaussian exploration spec fix by @vmoens in #560
- [BugFix] Disabling video step for wandb by @vmoens in #561
- [BugFix] Various device fix by @vmoens in #558
- [Feature] Allow collectors to accept regular modules as policies by @tcbegley in #546
- [BugFix] Fix push binary nightly action by @psolikov in #566
- [BugFix] TensorDict comparison by @vmoens in #567
- [BugFix] Fix SyncDataCollector reset by @jrobine in #571
- [Doc] Banners on README.md by @vmoens in #572
- [Feature] Log printing in alphabetical order when creating a replay buffer by @nikhlrao in #573
- [BugFix] Add eps to reward normalization by @vmoens in #574
- [BugFix] Fix argument for PPOLoss.get_entropy_bonus() by @vmoens in #578
- [Feature] Restructure torchrl/objectives by @sgrigory in #580
- [Docs] Documentation revamp by @vmoens in #581
- [Doc] Publishing on pytorch.org by @vmoens in #582
- Revert "[Doc] Publishing on pytorch.org" by @vmoens in #584
- [Doc] Publishing on pytorch.org by @vmoens in #585
- Revert "[Doc] Publishing on pytorch.org" by @vmoens in #586
- [Doc] Publishing on pytorch.org by @vmoens in #587
- [Feature] More restrictive tests on docstrings by @vmoens in #457
- [BugFix] Wrong stack import in tests by @vmoens in #590
- [Feature] Exclude "_" out_keys in tensordictmodel by @jlesuffleur in #589
- [Feature]: Dreamer support by @nicolas-dufour in #341
- [Doc] Missing doc for prototype RB by @vmoens in #595
- [Feature] Update list of supported libraries by @vmoens in #594
- [BugFix] Fix timeit count registration by @vmoens in #598
- [Naming] Renaming ProbabilisticTensorDictModule keys by @vmoens in #603
- [Feature] Categorical encoding for action space by @artkorenev in #593
- [BugFix] ReplayBuffer's storage now signal back when changes happen by @paulomarciano in #614
- [Doc] Typos in tensordict tutorial by @PaLeroy in #621
- [Doc] Integrate knowledge base in docs by @hatala91 in #622
- [Doc] Updating docs requirements by @vmoens in #624
- [Feature] Make torchrl runnable without functorch and with gym==0.13 by @vmoens in #386
- [Feature] Habitat integration by @vmoens in #514
- [Feature] Checkpointing by @vmoens in #549
- Add support for null dim argument in TensorDict.squeeze by @jgonik in #608
- [Version] Updating to torch 1.13 by @vmoens in #627
- [Feature] Sub-memmap tensors by @vmoens in #626
- [BugFix] copy_ changes the index if the dest and source memmap tensors share the same file location by @vmoens in #631
- [F...
v0.0.4
What's Changed
- [CI, Doc] Update functorch source installation command by @zou3519 in #446
- [BugFix] TransformedEnv attributes inheritance by @vmoens in #467
- [Feature] Cleanup mocking envs init and new by @vmoens in #469
- [Tests] Adding tensordict __repr__ tests by @sladebot in #435
- [Logging]: implement MLFlow logging integration by @rayanht in #432
- [BugFix] MLFlow import fix by @vmoens in #473
- [BugFix] Fixed pip install by @brandonsj in #475
- [Features]: Changed _inplace_update cls parameter passing in __new__ by @nicolas-dufour in #464
- [Feature]: ModelBased Envs by @nicolas-dufour in #333
- [Feature] make ReplayBufferTrainer compatible with storing trajectories by @vmoens in #476
- [Tutorial] DQN tutorial by @vmoens in #474
- [Feature] reader hooks for GymLike by @vmoens in #478
- [BugFix] TensorSpec.zero(None) failure fix by @vmoens in #483
- [Feature]: Support for planners and CEM by @nicolas-dufour in #384
- [Feature] Replaced device_safe() with device by @ordinskiy in #485
- [Feature]: TensorDictPrimer transform by @nicolas-dufour in #456
- [Feature]: erase() method for torchrl.timeit by @nicolas-dufour in #480
- [Feature] Added support for single collector in sync_async_collector by @nicolas-dufour in #482
- [BugFix] removing unwanted device_safe() by @vmoens in #486
- [Refactoring] Refactored get_stats_random_rollout by @nicolas-dufour in #481
- [Feature] VIP Integration by @JasonMa2016 in #487
- [Refactoring] Minor tweaks to recorder and logger by @nicolas-dufour in #489
- [Feature]: Deactivate typechecks in envs by @nicolas-dufour in #490
- [BugFix] Vectorized td_lambda with gamma tensor does not match the serial version by @vmoens in #400
- [BugFix] Fix TensorDictPrimer init by @vmoens in #491
- [Feature] Optional auto-reset when done for collectors and batched envs by @vmoens in #492
- [BugFix] Defaulting passing_devices to None by @himjohntang in #477
- Revert "[BugFix] Defaulting passing_devices to None" by @vmoens in #494
- [BugFix] Multi-agent fixes by @vmoens in #488
- [BugFix] Defaulting passing_devices to None by @vmoens in #495
- [Feature] Lazy initialization of CatTensors by @vmoens in #497
- [Cleanup] Removing cuda 10.2 references by @vmoens in #498
- [BugFix] Migration to pytorch org by @vmoens in #499
- [Refactoring] Import at root to enable vmap monkey-patching by @vmoens in #500
- [BugFix] python version for linting checks by @vmoens in #502
- [Feature] Replay Buffers refactor by @bamaxw in #330
- [Feature] Rename step_tensordict in step_mdp by @romainjln in #512
- [Lint] re-instantiate F821 by @vmoens in #516
- [BugFix] run_type_checks for TransformedEnvs by @vmoens in #513
- [BugFix] making first_dim and last_dim negative in FlattenObservation when a parent is set by @vmoens in #511
- [Feature] Add info dict key-spec pairs to observation_spec by @tcbegley in #504
- [BugFix] Changing the dm_control import to fail if not installed by @zeenolife in #515
- [CI] Add coverage with codecov by @silvestrebahi in #523
- Revert "[CI] Add coverage with codecov" by @vmoens in #525
- [Quality] Use relative imports for local c++ deps by @apbard in #526
- [Feature] Nightly release by @vmoens in #519
- [Feature] Add make_tensordict() function by @sicong-huang in #522
- [Doc] Misc readme fixes by @GavinPHR in #532
- [BugFix] Replacing inference_mode decorator with no_grad to fix state_dict loading error by @GavinPHR in #530
- [BugFix] Transformed ParallelEnv meta data are broken when passing to device by @vmoens in #531
- [Doc] Add coverage banner by @vmoens in #533
- [BugFix] Fix colab link of coding_dqn.ipynb by @Benjamin-eecs in #543
- [BugFix] Fix optional imports by @vmoens in #535
- [BugFix] Restore missing keys in data collector output by @tcbegley in #521
- [Lint] reorganize imports by @apbard in #545
- [BugFix] Single-cpu compatibility by @vmoens in #548
- [BugFix] vision install and other deps in optdeps by @vmoens in #552
- [Feature] Implemented device argument for modules.models by @yushiyangk in #524
- [BugFix] Fix ellipsis indexing of 2d TensorDicts by @vmoens in #559
- [BugFix] Additive gaussian exploration spec fix by @vmoens in #560
- [BugFix] Disabling video step for wandb by @vmoens in #561
- [BugFix] Various device fix by @vmoens in #558
- [Feature] Allow collectors to accept regular modules as policies by @tcbegley in #546
- [BugFix] Fix push binary nightly action by @psolikov in #566
- [BugFix] TensorDict comparison by @vmoens in #567
- [BugFix] Fix SyncDataCollector reset by @jrobine in #571
- [Doc] Banners on README.md by @vmoens in #572
- [Feature] Log printing in alphabetical order when creating a replay buffer by @nikhlrao in #573
- [BugFix] Add eps to reward normalization by @vmoens in #574
- [BugFix] Fix argument for PPOLoss.get_entropy_bonus() by @vmoens in #578
- [Feature] Restructure torchrl/objectives by @sgrigory in #580
- [Docs] Documentation revamp by @vmoens in #581
- [Doc] Publishing on pytorch.org by @vmoens in #582
- Revert "[Doc] Publishing on pytorch.org" by @vmoens in #584
- [Doc] Publishing on pytorch.org by @vmoens in #585
- Revert "[Doc] Publishing on pytorch.org" by @vmoens in #586
- [Doc] Publishing on pytorch.org by @vmoens in #587
- [Feature] More restrictive tests on docstrings by @vmoens in #457
- [BugFix] Wrong stack import in tests by @vmoens in #590
- [Feature] Exclude "_" out_keys in tensordictmodel by @jlesuffleur in #589
- [Feature]: Dreamer support by @nicolas-dufour in #341
- [Doc] Missing doc for prototype RB by @vmoens in #595
- [Feature] Update list of supported libraries by @vmoens in #594
- [BugFix] Fix timeit count registration by @vmoens in #598
- [Naming] Renaming ProbabilisticTensorDictModule keys by @vmoens in #603
- [Feature] Categorical encoding for action space by @artkorenev in #593
- [BugFix] ReplayBuffer's storage now signal back when changes happen by @paulomarciano in #614
- [Doc] Typos in tensordict tutorial by @PaLeroy in #621
- [Doc] Integrate knowledge base in docs by @hatala91 in #622
- [Doc] Updating docs requirements by @vmoens in #624
- [Feature] Make torchrl runnable without functorch and with gym==0.13 by @vmoens in #386
- [Feature] Habitat integration by @vmoens in #514
- [Feature] Checkpointing by @vmoens in #549
- Add support for null dim argument in TensorDict.squeeze by @jgonik in #608
- [Version] Updating to torch 1.13 by @vmoens in #627
- [Feature] Sub-memmap tensors by @vmoens in #626
- [BugFix] copy_ changes the index if the dest and source memmap tensors share the same file location by @vmoens in #631
- [F...
v0.0.4-alpha
What's Changed
- [CI, Doc] Update functorch source installation command by @zou3519 in #446
- [BugFix] TransformedEnv attributes inheritance by @vmoens in #467
- [Feature] Cleanup mocking envs init and new by @vmoens in #469
- [Tests] Adding tensordict __repr__ tests by @sladebot in #435
- [Logging]: implement MLFlow logging integration by @rayanht in #432
- [BugFix] MLFlow import fix by @vmoens in #473
- [BugFix] Fixed pip install by @brandonsj in #475
- [Features]: Changed _inplace_update cls parameter passing in __new__ by @nicolas-dufour in #464
- [Feature]: ModelBased Envs by @nicolas-dufour in #333
- [Feature] make ReplayBufferTrainer compatible with storing trajectories by @vmoens in #476
- [Tutorial] DQN tutorial by @vmoens in #474
- [Feature] reader hooks for GymLike by @vmoens in #478
- [BugFix] TensorSpec.zero(None) failure fix by @vmoens in #483
- [Feature]: Support for planners and CEM by @nicolas-dufour in #384
- [Feature] Replaced device_safe() with device by @ordinskiy in #485
- [Feature]: TensorDictPrimer transform by @nicolas-dufour in #456
- [Feature]: erase() method for torchrl.timeit by @nicolas-dufour in #480
- [Feature] Added support for single collector in sync_async_collector by @nicolas-dufour in #482
- [BugFix] removing unwanted device_safe() by @vmoens in #486
- [Refactoring] Refactored get_stats_random_rollout by @nicolas-dufour in #481
- [Feature] VIP Integration by @JasonMa2016 in #487
- [Refactoring] Minor tweaks to recorder and logger by @nicolas-dufour in #489
- [Feature]: Deactivate typechecks in envs by @nicolas-dufour in #490
- [BugFix] Vectorized td_lambda with gamma tensor does not match the serial version by @vmoens in #400
- [BugFix] Fix TensorDictPrimer init by @vmoens in #491
- [Feature] Optional auto-reset when done for collectors and batched envs by @vmoens in #492
- [BugFix] Defaulting `passing_devices` to `None` by @himjohntang in #477
- Revert "[BugFix] Defaulting `passing_devices` to `None`" by @vmoens in #494
- [BugFix] Multi-agent fixes by @vmoens in #488
- [BugFix] Defaulting `passing_devices` to `None` by @vmoens in #495
- [Feature] Lazy initialization of CatTensors by @vmoens in #497
- [Cleanup] Removing cuda 10.2 references by @vmoens in #498
- [BugFix] Migration to pytorch org by @vmoens in #499
- [Refactoring] Import at root to enable vmap monkey-patching by @vmoens in #500
- [BugFix] python version for linting checks by @vmoens in #502
- [Feature] Replay Buffers refactor by @bamaxw in #330
- [Feature] Rename `step_tensordict` in `step_mdp` by @romainjln in #512
- [Lint] re-instantiate F821 by @vmoens in #516
- [BugFix] run_type_checks for TransformedEnvs by @vmoens in #513
- [BugFix] making first_dim and last_dim negative in FlattenObservation when a parent is set by @vmoens in #511
- [Feature] Add info dict key-spec pairs to observation_spec by @tcbegley in #504
- [BugFix] Changing the dm_control import to fail if not installed by @zeenolife in #515
- [CI] Add coverage with codecov by @silvestrebahi in #523
- Revert "[CI] Add coverage with codecov" by @vmoens in #525
- [Quality] Use relative imports for local c++ deps by @apbard in #526
- [Feature] Nightly release by @vmoens in #519
- [Feature] Add make_tensordict() function by @sicong-huang in #522
- [Doc] Misc readme fixes by @GavinPHR in #532
- [BugFix] Replacing inference_mode decorator with no_grad to fix state_dict loading error by @GavinPHR in #530
- [BugFix] Transformed ParallelEnv meta data are broken when passing to device by @vmoens in #531
- [Doc] Add coverage banner by @vmoens in #533
- [BugFix] Fix colab link of coding_dqn.ipynb by @Benjamin-eecs in #543
- [BugFix] Fix optional imports by @vmoens in #535
- [BugFix] Restore missing keys in data collector output by @tcbegley in #521
- [Lint] reorganize imports by @apbard in #545
- [BugFix] Single-cpu compatibility by @vmoens in #548
- [BugFix] vision install and other deps in optdeps by @vmoens in #552
- [Feature] Implemented `device` argument for `modules.models` by @yushiyangk in #524
- [BugFix] Fix ellipsis indexing of 2d TensorDicts by @vmoens in #559
- [BugFix] Additive gaussian exploration spec fix by @vmoens in #560
- [BugFix] Disabling video step for wandb by @vmoens in #561
- [BugFix] Various device fix by @vmoens in #558
- [Feature] Allow collectors to accept regular modules as policies by @tcbegley in #546
- [BugFix] Fix push binary nightly action by @psolikov in #566
- [BugFix] TensorDict comparison by @vmoens in #567
- [BugFix] Fix SyncDataCollector reset by @jrobine in #571
- [Doc] Banners on README.md by @vmoens in #572
- [Feature] Log printing in alphabetical order when creating a replay buffer by @nikhlrao in #573
- [BugFix] Add eps to reward normalization by @vmoens in #574
- [BugFix] Fix argument for PPOLoss.get_entropy_bonus() by @vmoens in #578
- [Feature] Restructure torchrl/objectives by @sgrigory in #580
- [Docs] Documentation revamp by @vmoens in #581
- [Doc] Publishing on pytorch.org by @vmoens in #582
- Revert "[Doc] Publishing on pytorch.org" by @vmoens in #584
- [Doc] Publishing on pytorch.org by @vmoens in #585
- Revert "[Doc] Publishing on pytorch.org" by @vmoens in #586
- [Doc] Publishing on pytorch.org by @vmoens in #587
- [Feature] More restrictive tests on docstrings by @vmoens in #457
- [BugFix] Wrong stack import in tests by @vmoens in #590
- [Feature] Exclude `"_"` out_keys in tensordictmodel by @jlesuffleur in #589
- [Feature]: Dreamer support by @nicolas-dufour in #341
- [Doc] Missing doc for prototype RB by @vmoens in #595
- [Feature] Update list of supported libraries by @vmoens in #594
- [BugFix] Fix timeit count registration by @vmoens in #598
- [Naming] Renaming `ProbabilisticTensorDictModule` keys by @vmoens in #603
- [Feature] Categorical encoding for action space by @artkorenev in #593
- [BugFix] ReplayBuffer's storage now signal back when changes happen by @paulomarciano in #614
- [Doc] Typos in tensordict tutorial by @PaLeroy in #621
- [Doc] Integrate knowledge base in docs by @hatala91 in #622
- [Doc] Updating docs requirements by @vmoens in #624
- [Feature] Make torchrl runnable without functorch and with gym==0.13 by @vmoens in #386
- [Feature] Habitat integration by @vmoens in #514
- [Feature] Checkpointing by @vmoens in #549
- Add support for null `dim` argument in `TensorDict.squeeze` by @jgonik in #608
- [Version] Updating to torch 1.13 by @vmoens in #627
- [Feature] Sub-memmap tensors by @vmoens in #626
- [BugFix] `copy_` changes the index if the dest and source memmap tensors share the same file location by @vmoens in #631
- [F...
v0.0.3
The main changes introduced by this release are:
- dependency on the standalone tensordict repo;
- refactoring of the "next" API
What's Changed
- [Versioning] MacOs versioning and release bugfix by @vmoens in #247
- [Versioning] Setup metadata by @vmoens in #248
- [BugFix] Fix setup instructions by @vmoens in #250
- [BugFix] Fix a bug when segment_tree size is exactly 2^N by @xiaomengy in #251
- [Feature] Added test for RewardRescale transform by @nicolas-dufour in #252
- [Feature] Empty TensorDict population in loops by @vmoens in #253
- [BugFix] Memmap del bugfix by @vmoens in #254
- [Feature] Implement padding for tensordicts by @ajhinsvark in #257
- [BugFix]: recursion error when calling `permute(...).to_tensordict()` by @vmoens in #260
- [Feature] Differentiable PPOLoss for IRL by @vmoens in #240
- [BugFix]: avoid deleting true in_keys in TensorDictSequence by @vmoens in #261
- [Feature] Add issue and pull request template by @Benjamin-eecs in #263
- [Feature] Nested tensordicts by @vmoens in #256
- [Feature]: Index nested tensordicts using tuples by @vmoens in #262
- [Feature]: flatten nested tensordicts by @vmoens in #264
- [Test]: test nested CompositeSpec by @vmoens in #265
- [Test]: test squeezed TensorDict by @vmoens in #269
- [Doc] Added TensorDict tutorial by @nicolas-dufour in #255
- [Test]: TensorDict: test tensordict created on cuda and sub-tensordict indexed along 2nd dimension by @vmoens in #268
- Refactor the `torch.stack` with destination by @khmigor in #245
- [Feature]: faster meta-tensor API for TensorDict by @vmoens in #272
- [Feature]: Refactored logging to be able to support other loggers easily by @nicolas-dufour in #270
- Small tweaks to make the replay buffer code more consistent by @shagunsodhani in #275
- [BugFix]: Minor bugs in docstrings by @vmoens in #276
- [Doc]: TorchRL demo by @vmoens in #284
- [BugFix]: update wrong links in issue and pull request template by @Benjamin-eecs in #286
- [BugFix]: quickfix: force gym 0.24 installation until issue with rendering is resolved by @vmoens in #283
- [Doc]: remove pip install from CONTRIBUTING.md by @vmoens in #288
- [Feature]: faster safetanh transform via C++ bindings by @vmoens in #289
- [BugFix]: fix GLFW3 error when installing dm_control by @vmoens in #291
- [BugFix]: Fix examples by @vmoens in #290
- [Doc] Simplify PR template by @vmoens in #292
- [BugFix]: Replay buffer bugfixes by @vmoens in #294
- [Doc] MacOs M1 troubleshooting by @ramonmedel in #296
- [Feature]: Improving training efficiency by @vmoens in #293
- [Feature] Wandb logger by @nicolas-dufour in #274
- [QuickFix]: update issue and pr template by @Benjamin-eecs in #303
- [Test] tests for `BinarizeReward` by @srikanthmg85 in #302
- [BugFix]: L2-priority for PRB by @vmoens in #305
- [Feature] Transforms: `Compose.insert` and `TransformedEnv.insert_transform` by @rmartimov in #304
- [BugFix] Fix flaky test by waiting for procs instead of sleep by @nairbv in #306
- [BugFix] Fix a build warning, setuptools/distutils import order by @nairbv in #307
- ufmt issue if imports in order requested by distutils by @nairbv in #308
- [BugFix]: Conda to pip for circleci by @vmoens in #310
- [BugFix] Support list-based boolean masks for TensorDict by @benoitdescamps in #299
- [Feature] Truly invertible tensordict permutation of dimensions by @ramonmedel in #295
- [Doc] Tensordictmodule tutorial by @nicolas-dufour in #267
- [Feature] Rename _TensorDict into TensorDictBase by @yoavnavon in #316
- [Release]: v0.0.1b versioning by @vmoens in #317
- [Feature] Adding additional checks to `TensorDict.view` to remove unnecessary `ViewedTensorDict` object creation by @bamaxw in #319
- [BugFix]: Safe state normalization when std=0 by @vmoens in #323
- [BugFix]: gradient propagation in advantage estimates by @vmoens in #322
- [BugFix]: make training example gracefully exit by @vmoens in #326
- [Setup]: Exclude tutorials from wheels by @vmoens in #325
- [BugFix]: Tensor map for subtensordict.set_ by @vmoens in #324
- [Versioning]: Wheels v0.0.1c by @vmoens in #327
- [BugFix] Fixed compose which ignored inv_transforms of child by @nicolas-dufour in #328
- [BugFix] functorch installation in CircleCI by @vmoens in #336
- [Refactor] VecNorm inference API by @vmoens in #337
- [BugFix] TransformedEnv sets added Transforms into eval mode by @alexanderlobov in #331
- [Refactor] make to_tensordict() create a copy of the content by @nicolas-dufour in #334
- [CircleCI] Fix dm_control rendering by @vmoens in #339
- [BugFix]: joining processes when they're done by @vmoens in #311
- [Test] pass the OS error in case the file isn't closed by @tongbaojia in #344
- [Feature] Make default rollout tensordict contiguous by @vmoens in #343
- [BugFix] Clone memmap tensors on regular tensors and other replay buffer improvements by @vmoens in #340
- [CI] Using latest gym by @vmoens in #346
- [Doc] Coding your first DDPG tutorial by @vmoens in #345
- [Doc] Minor: typos in DDPG by @vmoens in #354
- [Feature] Register lambda and gamma in buffers by @vmoens in #353
- [Feature] Implement eq for TensorSpec by @omikad in #358
- [Doc] Multi-tasking tutorial by @vmoens in #352
- [Feature] Env refactoring for model based RL by @nicolas-dufour in #315
- [Feature]: Added support for TensorDictSequence module subsampling by @nicolas-dufour in #332
- [BugFix] Add lock to vec norm transform by @jaschmid-fb in #356
- [Perf]: Improve PPO training performance by @vmoens in #297
- [BugFix] Functorch-Tensordict bug fixes by @vmoens in #361
- Revert "[BugFix] Functorch-Tensordict bug fixes" by @vmoens in #362
- [BugFix] Functorch-Tensordict bug fixes by @vmoens in #363
- [Feature] CSVLogger (ABBANDONED) by @vmoens in #371
- [Feature] Support tensor-based decay in TD-lambda by @tcbegley in #360
- [Feature] CSVLogger by @vmoens in #372
- [BugFix] Fewer env instantiations for better mujoco rendering by @vmoens in #378
- [Feature] change imports of environment libraries (gym and dm_control) at lower levels by @guabao in #379
- [BugFix] Representation of indexed nested tensordict by @vmoens in #370
- [BugFix] In-place `__setitem__` for SubTensorDict by @vmoens in #369
- [Feature] Add `ProbabilisticTensorDictModule` dist key mapping support by @nicolas-dufour in #376
- [Feature]: R3M integration by @vmoens in #321
- [Feature] static_seed flag for envs, vectorized envs and collectors by @vmoens in #385
- [Feature] AdditiveGaussian exploration strategy by @vmoens in #388
- [Feature] Multi-images R3M by @vmoens in #389
- [Feature] Flatten multi-images in R3M by @vmoens in #391
- [Quality] Code cleanup for fbsync by @vmoens in #392
- [Feature] In-house functional modules for TorchRL using TensorDict by @vmoens in https://github.com/pytorch...
0.0.2a
What's Changed
- [BugFix] Fixed compose which ignored inv_transforms of child by @nicolas-dufour in #328
- [BugFix] functorch installation in CircleCI by @vmoens in #336
- [Refactor] VecNorm inference API by @vmoens in #337
- TransformedEnv sets added Transforms into eval mode by @alexanderlobov in #331
- [Refactor] make to_tensordict() create a copy of the content by @nicolas-dufour in #334
- [CircleCI] Fix dm_control rendering by @vmoens in #339
- [BugFix]: joining processes when they're done by @vmoens in #311
- [Test] pass the OS error in case the file isn't closed by @tongbaojia in #344
- [Feature] Make default rollout tensordict contiguous by @vmoens in #343
- [BugFix] Clone memmap tensors on regular tensors and other replay buffer improvements by @vmoens in #340
- [CI] Using latest gym by @vmoens in #346
- [Doc] Coding your first DDPG tutorial by @vmoens in #345
- [Doc] Minor: typos in DDPG by @vmoens in #354
- [Feature] Register lambda and gamma in buffers by @vmoens in #353
- [Feature] Implement eq for TensorSpec by @omikad in #358
- [Doc] Multi-tasking tutorial by @vmoens in #352
- [Feature] Env refactoring for model based RL by @nicolas-dufour in #315
- [Feature]: Added support for TensorDictSequence module subsampling by @nicolas-dufour in #332
- [BugFix] Add lock to vec norm transform by @jaschmid-fb in #356
- [Perf]: Improve PPO training performance by @vmoens in #297
- [BugFix] Functorch-Tensordict bug fixes by @vmoens in #361
- Revert "[BugFix] Functorch-Tensordict bug fixes" by @vmoens in #362
- [BugFix] Functorch-Tensordict bug fixes by @vmoens in #363
- [Feature] CSVLogger (ABBANDONED) by @vmoens in #371
- [Feature] Support tensor-based decay in TD-lambda by @tcbegley in #360
- [Feature] CSVLogger by @vmoens in #372
- [BugFix] Fewer env instantiations for better mujoco rendering by @vmoens in #378
- [Feature] change imports of environment libraries (gym and dm_control) at lower levels by @guabao in #379
- [BugFix] Representation of indexed nested tensordict by @vmoens in #370
- [BugFix] In-place `__setitem__` for SubTensorDict by @vmoens in #369
- [Feature] Add `ProbabilisticTensorDictModule` dist key mapping support by @nicolas-dufour in #376
- [Feature]: R3M integration by @vmoens in #321
- [Feature] static_seed flag for envs, vectorized envs and collectors by @vmoens in #385
- [Feature] AdditiveGaussian exploration strategy by @vmoens in #388
- [Feature] Multi-images R3M by @vmoens in #389
- [Feature] Flatten multi-images in R3M by @vmoens in #391
- [Quality] Code cleanup for fbsync by @vmoens in #392
- [Feature] In-house functional modules for TorchRL using TensorDict by @vmoens in #387
- [Quality] Code cleanup for fbsync by @vmoens in #397
- [Doc] Add charts to examples by @nicolas-dufour in #374
- [Feature] Vectorized GAE by @vmoens in #365
- [BugFix] Temporarily fix gym to 0.25.1 to fix CI by @vmoens in #411
- [Feature] Create a Squeeze transform and update Unsqueeze transform by @reachsumit in #408
- [Naming] Recurse kwarg to match pytorch by @matt-fff in #410
- [Feature] Add all implemented loggers to the init of loggers by @flinder in #402
- [BugFix] Fix gym 0.26 compatibility by @vmoens in #403
- [BugFix] Remove submodules by @vmoens in #414
- [Feature] lock tensordict when calling `share_memory_()` by @fdabek1 in #412
- [BugFix] Updated TensorDict.expand to work as Tensor.expand by @AnshulSehgal in #409
- [BugFix] Looser check for test_recorder assertion by @vmoens in #415
- [Feature] Allow spec to be passed directly to exploration wrappers by @vmoens in #418
- [BugFix] Collector revert to default exploration mode if empty string is passed by @vmoens in #421
- [Naming] Rename _TargetNetUpdate to TargetNetUpdater, making it public by @yushiyangk in #422
- [Doc] Re-run tutorials by @vmoens in #381
- Revert "[Doc] Re-run tutorials" (colab links broken) by @vmoens in #423
- [Feature] Switch back to latest gym by @vmoens in #425
- [Feature] TensorDict without device by @tcbegley in #413
- Updated the README.md file by @bashnick in #427
- [Feature] Adding support for initialising TensorDicts from nested dicts by @zeenolife in #404
- [Features] Make image_size a cfg param by @nicolas-dufour in #430
- Make TensorDict.expand accept Sequence arguments by @nicolasgriffiths in #424
- [Doc] Readme revamp for efficiency/modularity display by @vmoens in #382
- [Feature] New `biased_softplus` semantic to allow for minimum scale setting by @nicolas-dufour in #428
- [Tutorial] Re-run tutos by @vmoens in #434
- [BugFix] mixed device_safe vs device by @vmoens in #429
- [BugFix] Explicit params and buffers by @agrotov in #436
- [BugFix] Fixed Additive noise by @nicolas-dufour in #441
- [Tests] Test loggers video saving by @bashnick in #439
- Revert "[BugFix] Fixed Additive noise" by @vmoens in #442
- [Refactor] Rename TensorDictSequence to TensorDictSequential by @ronert in #440
- [Refactor] Refactoring `set*()` methods for `TensorDictBase` class by @zeenolife in #438
- [Cleanup] Removing gym-retro interface by @vmoens in #444
- [BugFix]: Fix additive noise by @nicolas-dufour in #447
- [BugFix] CatTensors: Prepended `next_` to the out_key by @ggimler3 in #449
- [BugFix] Fix AdditiveGaussian exploration tests by @vmoens in #450
- [BugFix] Wrong call to `device_safe` in replay buffer code by @vmoens in #454
- [BugFix] Add transform_observation_spec _R3MNet by @ymwdalex in #443
- [Doc] Add a knowledge base by @shagunsodhani in #375
- [Feature] Allow for actions and rewards to be in the reset tensordict by @vmoens in #458
- [Doc] Readme for knowledge base by @vmoens in #459
- [Feature] Added `batch_lock` attribute in EnvBase by @nicolas-dufour in #399
- [BugFix] deepcopy specs before transforming by @vmoens in #461
- [BugFix]: Fixed dm_control action type casting by @nicolas-dufour in #463
- [Versioning] Version 0.0.2a0 by @vmoens in #465
New Contributors
- @alexanderlobov made their first contribution in #331
- @tongbaojia made their first contribution in #344
- @omikad made their first contribution in #358
- @jaschmid-fb made their first contribution in #356
- @tcbegley made their first ...