generated from opentensor/bittensor-subnet-template
-
Notifications
You must be signed in to change notification settings - Fork 23
Open
Description
- Judge score uses
StreamedSyntheticPartialDataset
, which cuts off the messages randomly, so the last 'role' of the conversation to generate from can be bothuser
andassistant
:
allowed_len = min(len(messages), self.max_messages)
if random.random() < self._cut_message_chain_early:
# Choose a random cutoff between at least half of allowed_len and allowed_len
min_cut = max(1, allowed_len // 2)
cutoff = random.randint(min_cut, allowed_len)
else:
cutoff = allowed_len
truncated_messages = messages[:cutoff]
- The
original_conversation
is created by appendingtruncated_messages
with the LAST assistant response, not the next assistant response, so the response may be irrelevant to the user message. This doesn't make any sense. For example, if the current user message is "How are you today?", what's the point of generating a better answer than "Alright, see you later!" (the last assistant message)?
Have you guys considered these problems?
BKM1804, nntoan209 and duongna21
Metadata
Metadata
Assignees
Labels
No labels