-
Notifications
You must be signed in to change notification settings - Fork 95
Add setting for number of documents stored by HybridCollapsingTopDocsCollector #1471
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: Ryan Bogan <rbogan@amazon.com>
…Collector Signed-off-by: Ryan Bogan <rbogan@amazon.com>
Open to feedback on default value and upper limit for the new setting |
Signed-off-by: Ryan Bogan <rbogan@amazon.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please add test for checking user provided value for new setting, like what is behavior if they put some invalid string or float value.
|
||
public static final Setting<Integer> HYBRID_COLLAPSE_DOCS_PER_GROUP = Setting.intSetting( | ||
"index.neural_search.hybrid_collapse_docs_per_group", | ||
10, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe that is default value for this setting, is this based on some experimental data? does it give good balance between accuracy and performance?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm changing this to default to the current behavior, which is using the size of the query.
@@ -47,4 +47,13 @@ public final class NeuralSearchSettings { | |||
Setting.Property.IndexScope, | |||
Setting.Property.Dynamic | |||
); | |||
|
|||
public static final Setting<Integer> HYBRID_COLLAPSE_DOCS_PER_GROUP = Setting.intSetting( | |||
"index.neural_search.hybrid_collapse_docs_per_group", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This setting name is lacking visibility that it's per sub-query. Can we add that context, maybe something like hybrid_collapse_docs_per_group_per_subquery
or a bit less informative hybrid_collapse_max_docs_per_group
Signed-off-by: Ryan Bogan <rbogan@amazon.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me, thank you Ryan
Description
Adds a setting that allows users to control the number of documents that are stored in each queue within HybridCollapsingTopDocsCollector. Lowering this setting prioritizes latency over recall, while raising the setting does the opposite.
Related Issues
Resolves #1381
Check List
--signoff
.By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.