Skip to content

SOLR-17760: solving bug in LTR dense/sparse format #3354

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 14 commits into
base: main
Choose a base branch
from

Conversation

aruggero
Copy link
Contributor

https://issues.apache.org/jira/browse/SOLR-17760

Description

A bug in the LTR dense/sparse format has been fixed with this PR.
Dense/sparse format was not working for some features (e.g. FieldValueFeature) since default values were returned in the sparse format.
A check if the scored feature has the default value has been added. If it is, the value is excluded from logging in the sparse format.

Solution

The if (featuresInfo[featureId].getValue() != scFW.getDefaultValue()) condition has been added in both the org.apache.solr.ltr.LTRScoringQuery.ModelWeight.ModelScorer.SparseModelScorer#score method of the SparseModelScorer and in the org.apache.solr.ltr.LTRScoringQuery.ModelWeight.ModelScorer.DenseModelScorer#score method of the DenseModelScorer in the LTRScoringQuery class.

Also, the isUsed variable of the org.apache.solr.ltr.LTRScoringQuery.FeatureInfo class has been renamed to isDefaultValue for more clarity.

Finally, the documentation has been updated, integrating the defaultValue parameter, which is a parameter supported by all the LTR features but was not present and explained in the reference guide.

Tests

TestSelectiveWeightCreation and TestFieldValueFeature have been adapted to reflect the fixed behaviour.

Checklist

Please review the following and check all that apply:

  • I have reviewed the guidelines for How to Contribute and my code conforms to the standards described there to the best of my ability.
  • I have created a Jira issue and added the issue ID to my pull request title.
  • I have given Solr maintainers access to contribute to my PR branch. (optional but recommended, not available for branches on forks living under an organisation)
  • I have developed this patch against the main branch.
  • I have run ./gradlew check.
  • I have added tests for my changes.
  • I have added documentation for the Reference Guide

@github-actions github-actions bot added documentation Improvements or additions to documentation module:ltr tests labels May 16, 2025
@aruggero aruggero force-pushed the bug/SOLR-17760-dense_sparse branch from 35643d9 to 7a672c3 Compare May 19, 2025 07:20
solr/CHANGES.txt Outdated
@@ -54,7 +54,7 @@ Optimizations

Bug Fixes
---------------------
(No changes)
* SOLR-16667: Fixed dense/sparse representation in LTR module. (Anna Ruggero)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this will likely go into 9.9, please change it (we can also do it just before the merge).
If I merge it I'll also add 'via Alessandro Benedetti' that indicates the committer that reviewed and merged

Copy link
Contributor

@alessandrobenedetti alessandrobenedetti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some minor changes should be done, reverting tests unless necessary and fixing the CHANGES.TXT

@@ -250,6 +250,31 @@ public void testIfADocumentDoesntHaveAFieldDefaultValueIsReturned() throws Excep
assertJQ("/query" + query.toQueryString(), "/response/numFound/==1");
assertJQ("/query" + query.toQueryString(), "/response/docs/[0]/id=='42'");

final String docs0fv_dense_csv =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are all these tests just refactoring? It's hard to review, and if there's no actual test change, I would strongly encourage leaving them as they were.
When bug-fixing, the less we touch the tests, the better, so it's clear nothing has been broken by the bugfix,

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is needed since, depending on the sparse/dense format, we could have default values to be returned.
Before the bug fix, zero values were returned even with the sparse format.
With these changes, we are aligning the expected output of the tests to the randomly selected feature format.

@aruggero aruggero force-pushed the bug/SOLR-17760-dense_sparse branch from 1f7df5d to b71fe4c Compare May 20, 2025 07:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to documentation module:ltr tests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants