1 file changed +8 -1 lines changed
@@ -31,7 +31,14 @@ class LLMEnv(EnvBase):
 integers representing a sequence of tokens.
 The action is also a string or a tensor of integers, which is concatenated to the previous observation to form the
 new observation.
-Prompts to the language model can be loaded when the environment is ``reset`` if the environment is created via :meth:`~from_dataloader`
+
+By default, this environment is meant to track history for a prompt. Users can append transforms to tailor
+this to their use case, such as Chain of Thought (CoT) reasoning or other custom processing.
+
+Users must append a transform to set the "done" condition, which would trigger the loading of the next prompt.
+
+Prompts to the language model can be loaded when the environment is ``reset`` if the environment is created via :meth:`~from_dataloader`
+
 Args:
 observation_key (NestedKey, optional): The key in the tensordict where the observation is stored. Defaults to
 ``"observation"``.
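
The behaviour the added docstring lines describe (history accumulates until something sets "done", and only then is the next prompt loaded on reset) can be illustrated with a small toy sketch. This is not torchrl code: `ToyHistoryEnv`, `append_done_fn`, and `rollout` are hypothetical names invented for illustration, and plain callables stand in for the transforms a user would append to the real environment.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterator, List, Optional, Tuple


@dataclass
class ToyHistoryEnv:
    """Toy stand-in (NOT torchrl's LLMEnv): appends each action to the observation."""

    prompts: Iterator[str]
    # Hypothetical hook standing in for an appended transform:
    # each callable maps the new observation to a done flag.
    done_fns: List[Callable[[str], bool]] = field(default_factory=list)
    observation: Optional[str] = None

    def append_done_fn(self, fn: Callable[[str], bool]) -> None:
        self.done_fns.append(fn)

    def reset(self) -> str:
        # A new prompt is loaded only on reset, as the docstring describes.
        self.observation = next(self.prompts)
        return self.observation

    def step(self, action: str) -> Tuple[str, bool]:
        if self.observation is None:
            raise RuntimeError("call reset() before step()")
        # History tracking: the action is concatenated to the previous observation.
        self.observation = self.observation + action
        # With no done function registered, done is always False.
        done = any(fn(self.observation) for fn in self.done_fns)
        return self.observation, done


def rollout(env: ToyHistoryEnv, actions: List[str]) -> List[Tuple[str, bool]]:
    """Step through actions, resetting (loading the next prompt) whenever done is set."""
    trace = []
    env.reset()
    for action in actions:
        obs, done = env.step(action)
        trace.append((obs, done))
        if done:
            env.reset()
    return trace


if __name__ == "__main__":
    env = ToyHistoryEnv(iter(["Q1: ", "Q2: "]))
    env.append_done_fn(lambda obs: obs.endswith("<eos>"))
    for obs, done in rollout(env, ["a", "b<eos>", "c"]):
        print(repr(obs), done)
```

Without a registered done function the toy environment never moves past the first prompt and its observation keeps growing, which is the reason the docstring asks users to supply the "done" condition themselves.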