Thank you for sharing the DLRM implementation, which has significantly clarified the M-Falcon methodology mentioned in the paper ❤
Understanding of M-Falcon's Attention Mask
From my understanding, M-Falcon utilizes the attention mask to control the visibility of historical items for multiple targets, ensuring efficient training and inference. For example, in a decoder-only approach with a sequence length of 4, the attention mask would look like:
T, F, F, F
T, T, F, F
T, T, T, F
T, T, T, T
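For reference, here is a minimal sketch (my own illustration in PyTorch, not code from this repository) of how such a causal mask could be built, with True meaning the column position is visible to the row position:

```python
import torch

# Minimal sketch (illustration only): the standard causal mask shown above.
seq_len = 4
causal_mask = torch.tril(torch.ones(seq_len, seq_len)).bool()
print(causal_mask)
# tensor([[ True, False, False, False],
#         [ True,  True, False, False],
#         [ True,  True,  True, False],
#         [ True,  True,  True,  True]])
```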
With M-Falcon applied to a pairwise ranking task, using a sequence length of 4 (2 history items followed by 2 target items), the attention mask is as follows:
T, F, F, F
T, T, F, F
T, T, T, F
T, T, F, T
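To make this concrete, here is a hypothetical sketch (my own construction, not the repository's code; num_history and num_targets are illustrative names) of how this pairwise M-Falcon mask could be built:

```python
import torch

# Hypothetical sketch: history positions stay causal; each target attends to
# the full history and to itself, but not to the other target.
num_history, num_targets = 2, 2
seq_len = num_history + num_targets

mask = torch.tril(torch.ones(seq_len, seq_len)).bool()
# Replace the target-vs-target block with an identity block so the targets
# cannot see each other.
mask[num_history:, num_history:] = torch.eye(num_targets).bool()
print(mask)
# tensor([[ True, False, False, False],
#         [ True,  True, False, False],
#         [ True,  True,  True, False],
#         [ True,  True, False,  True]])
```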
Issue with Listwise Ranking Attention Mask
However, in the context of a listwise ranking task, I would expect the attention mask, again with a sequence length of 4 (2 history items followed by 2 target items), to be:
T, F, F, F
T, T, F, F
T, T, T, T
T, T, T, T
This configuration allows all target items to see each other, which is essential for effective listwise ranking.
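Again as a hypothetical sketch (my own construction), the listwise variant differs only in the target-vs-target block, which becomes fully visible:

```python
import torch

# Hypothetical sketch of the listwise mask: history stays causal, while every
# target attends to the full history and to all targets (including itself).
num_history, num_targets = 2, 2
seq_len = num_history + num_targets

mask = torch.tril(torch.ones(seq_len, seq_len)).bool()
mask[num_history:, num_history:] = True  # targets can see each other
print(mask)
# tensor([[ True, False, False, False],
#         [ True,  True, False, False],
#         [ True,  True,  True,  True],
#         [ True,  True,  True,  True]])
```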
Observed Behavior in Current Implementation
In the current DLRM implementation, it appears that the default (causal) attention mask is used instead of the M-Falcon-style mask I would expect for listwise ranking. Under the default mask, target items placed earlier in the sequence (e.g., higher-scored retrieval candidates) cannot attend to those placed later (lower-scored ones), which might inadvertently hurt the performance of the ranking task.
Inquiry
Is there a specific reason why the default attention mask is used for listwise ranking instead of the M-Falcon-designed mask that allows all target items to see each other? If this is unintended behavior, I wanted to bring it to your attention in case it affects the performance of listwise ranking tasks.
Thank you once again for your excellent work and for providing such a valuable resource to the community!