Replies: 1 comment
-
Hi, I also had difficulty using MaskedCategorical (and MaskedOneHotCategorical -- I'm not sure of the difference) with a ProbabilisticActor using CompositeDistribution, and was not able to get it to work. I ended up using OneHotCategorical (https://pytorch.org/rl/0.6/reference/generated/torchrl.modules.OneHotCategorical.html#torchrl.modules.OneHotCategorical) and applied the masking myself in the forward pass (replacing masked locations with float('-inf')). Note that this may require passing both the input and the mask as separate inputs to your nn.Module, which to the best of my knowledge is not possible with nn.Sequential (so you must use the functional API instead).
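The manual masking described above can be sketched roughly like this. This is a minimal illustration, not the poster's actual code: the helper name `mask_invalid_logits` and the toy tensors are made up, and in a real actor the logits would come from the network's forward pass before being handed to OneHotCategorical.

```python
import torch

def mask_invalid_logits(logits: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    """Replace logits of invalid actions with -inf so softmax assigns them zero probability."""
    return logits.masked_fill(~mask, float("-inf"))

# Toy example: 4 discrete actions, action index 1 is currently invalid.
logits = torch.tensor([1.0, 2.0, 3.0, 4.0])
mask = torch.tensor([True, False, True, True])

probs = torch.softmax(mask_invalid_logits(logits, mask), dim=-1)
# The masked action gets probability exactly 0; the remaining
# probabilities renormalize over the valid actions.
```

The masked logits tensor can then be passed to the distribution constructor in place of the raw logits.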
-
I'm working on a reinforcement learning scenario with discrete action types and continuous parameters, using a ProbabilisticActor with a CompositeDistribution. Initially, I used Categorical for the discrete action type and masked invalid actions directly in the logits. With that approach, the KL divergence started to explode during training.
I'm now considering switching from Categorical to torchrl.modules.MaskedCategorical. However, the mask does not seem to be passed through correctly.
Question: Has anyone successfully used MaskedCategorical with a ProbabilisticActor and could share some hints?