
Learning to Rank: Weights and Label difference normalization in pairwise full query ranking. #11424


Open
jaguerrerod opened this issue Apr 22, 2025 · 2 comments


@jaguerrerod

There are two distinct categories of use cases in Learning to Rank (LTR):

1. Ranking Relevant Items Within a Query

This is the standard scenario in information retrieval, such as search engine result ranking or some recommendation systems. Its main characteristics include:

  • Use of relevance-based metrics focused on top-ranked items, such as MAP or NDCG.
  • Position bias correction mechanisms.
  • Truncation of candidate pairs based on the most relevant items (according to the labels or predictions).
  • Other types of normalizations specific to this context.

2. Full Ranking of a Dataset

Another important and often overlooked use case is the complete ranking of all elements in a dataset. This can be framed as LTR with a single query (or several queries representing different periods in time series datasets). It is applicable to problems where the evaluation metric is, for instance, Spearman correlation, and even to binary classification problems with AUC as the metric.

The nature of this use case makes many LTR implementations unsuitable (for example, LightGBM does not support it well).

XGBoost, however, does support LTR through the rank:pairwise objective. Still, there are some impactful aspects that could be improved:

Weights

In LTR, weights are always considered at the query level.
But what happens in pairwise use cases where there is only one query, or when multiple queries exist but we want to assign instance-level weights?

Since the same weight parameter in the DMatrix constructor serves both cases, this behavior should be generalized. It should be possible to:

  • Provide weights of length equal to the number of queries (to be applied per group), or
  • Provide weights of length equal to the number of observations (to be applied per instance).

A consistent internal approach (aligned with other objectives) would be:

  • Always interpret weight as per-instance, and
  • If a per-query weight array is passed (with length equal to the number of queries), internally expand it into a vector matching the number of instances by repeating the group weight according to the group size.
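A minimal sketch of this expansion (a hypothetical helper, not an existing XGBoost internal), assuming group sizes are available from the query group information:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: expand per-query weights into per-instance weights.
// group_sizes[g] is the number of instances in query g; each group's weight
// is repeated group-size times so the result has one entry per instance.
std::vector<float> ExpandGroupWeights(std::vector<float> const& group_weights,
                                      std::vector<std::size_t> const& group_sizes) {
  std::vector<float> instance_weights;
  for (std::size_t g = 0; g < group_weights.size(); ++g) {
    instance_weights.insert(instance_weights.end(), group_sizes[g], group_weights[g]);
  }
  return instance_weights;
}
```

With this convention, a weight vector whose length equals the number of queries is expanded once up front, and the rest of the objective only ever sees per-instance weights.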

Label Difference Normalization

In full-dataset ranking scenarios, labels are often quasi-continuous or have high granularity.
(It is up to the user to discretize or bin the labels if needed.)

Pairs with similar labels are generally less informative than those with very different labels. Therefore, introducing a normalization based on label difference is a natural and useful idea.

Assuming labels are preprocessed to lie within the [0, 1] percentile scale, the following logic (from XGBoost source code):

https://github.com/dmlc/xgboost/blob/4e24639d7de3d8e0aae0ae0ab061c14f704c0c35/src/objective/lambdarank_obj.h#L123C3-L125C4

can be generalized as follows:

```cpp
if (norm_by_diff && best_score != worst_score) {
  if (param_.IsMean()) {
    // Proposed: scale the pair's contribution by its label difference.
    // Labels are assumed to be percentiles in [0, 1], so the factor
    // pow(|y_high - y_low|, label_diff_normalization) is also in [0, 1].
    delta_metric *= std::pow(std::abs(y_high - y_low), label_diff_normalization);
  } else {
    // Current behavior from the linked source: normalize by score difference.
    delta_metric /= (delta_score + 0.01);
  }
}
```

Here label_diff_normalization is a user-defined parameter with a default value of 0.

Since y_high and y_low are percentiles, their absolute difference is bounded in [0, 1].

  • When label_diff_normalization == 0, delta_metric remains unchanged.
  • As label_diff_normalization increases, delta_metric decreases, effectively penalizing pairs with similar labels.
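To make the effect concrete, here is a small standalone illustration (not XGBoost code) tabulating the proposed factor pow(|y_high - y_low|, p) for percentile labels and a few exponents p:

```cpp
#include <cmath>
#include <cstdio>

// Illustration of the proposed scaling factor pow(|y_high - y_low|, p).
// With p == 0 the factor is always 1 (current behavior is preserved);
// larger p shrinks the contribution of pairs with similar labels faster.
int main() {
  double const diffs[] = {0.05, 0.25, 0.50, 0.95};  // |y_high - y_low|
  double const exponents[] = {0.0, 0.5, 1.0, 2.0};  // label_diff_normalization
  for (double p : exponents) {
    for (double d : diffs) {
      std::printf("p = %.1f  |y_high - y_low| = %.2f  factor = %.4f\n",
                  p, d, std::pow(d, p));
    }
  }
  return 0;
}
```

For example, with p = 1 a pair whose labels differ by 0.05 keeps only 5% of its unscaled delta_metric, while a pair differing by 0.95 keeps 95% of it.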
@trivialfis
Member

Thank you for raising the issue.

> In full-dataset ranking scenarios, labels are often quasi-continuous or have high granularity.

May I ask what kind of problem you are trying to solve or what type of data you are trying to model? Just asking out of personal curiosity.

@jaguerrerod
Author

Thanks for your answer.

The data I work with is a time series where it is important to sort all records within certain temporal segments. The response is continuous with some degree of discretization, and the metric used to evaluate the model's fit is Spearman correlation.
In this case, since the response is continuous, it is useful to weight the pairs, giving more importance when the response difference within a pair is large and less when it is small.
To use the proposed formulation, I transform the response into percentiles within each query so that it falls within the [0, 1] range.
In my use case, some elements are more relevant than others, so I apply weights and use the weighted version of Spearman correlation.
Currently, since XGBoost only supports assigning weights at the query level, not at the observation level, I have to simulate weights by repeating certain observations 2, 3, or even 10 times so that the model accounts for an approximation of the intended weight. This is inefficient in both training time and GPU memory usage.
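For reference, a minimal sketch of the two preprocessing steps described above (hypothetical helpers, not XGBoost APIs): the per-query percentile transform, and the row-repetition workaround that approximates per-instance weights:

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <numeric>
#include <vector>

// Rank-transform one query's responses into (0, 1] percentiles
// (ties are broken arbitrarily in this sketch).
std::vector<double> ToPercentiles(std::vector<double> const& y) {
  std::vector<std::size_t> order(y.size());
  std::iota(order.begin(), order.end(), 0);
  std::sort(order.begin(), order.end(),
            [&](std::size_t a, std::size_t b) { return y[a] < y[b]; });
  std::vector<double> pct(y.size());
  for (std::size_t rank = 0; rank < order.size(); ++rank) {
    pct[order[rank]] = static_cast<double>(rank + 1) / order.size();
  }
  return pct;
}

// Approximate per-instance weights by repeating each row round(w) times;
// the returned indices materialize the duplicated training set. This is
// the inefficiency described above: memory and time grow with the weights.
std::vector<std::size_t> RepeatIndices(std::vector<double> const& weights) {
  std::vector<std::size_t> idx;
  for (std::size_t i = 0; i < weights.size(); ++i) {
    auto copies = static_cast<std::size_t>(std::lround(weights[i]));
    for (std::size_t c = 0; c < copies; ++c) idx.push_back(i);
  }
  return idx;
}
```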
