-
Notifications
You must be signed in to change notification settings - Fork 744
[SOLR-17799] Replace segment merge metrics using gauges with consistent counters #3416
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Will update documentation and add tests after we confirm this is the pattern of metrics/naming that we want to use |
+1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Its a bit hard to review this without seeing the output changes in parallel. Could you post the before and after?
I have some concern for backwards compatibility. I am a fan of this technically being on by default but I think it is too late for that. We should probably keep it optional as before, otherwise we run the risk of blowing up cardinality for clouds with many cores without them knowing the moment they call the /admin/metrics endpoint without filtering.
Also as this is probably going for 9.x, we would probably need to keep the old metrics there and its format for backwards compat reasons as well. Just add the news ones in parallel.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A bit surprised this isn't implementer of SolrMetricProducer
and all metrics are initialized in the constructor. Maybe the SolrMetricProducer
was create later on and this never moved. I think we should move this to implement that interface which basically does your initMetrics()
.
private final Map<String, Counter> majorMergeStartedMetrics = new HashMap<>(); | ||
private final Map<String, Counter> minorMergeStartedMetrics = new HashMap<>(); | ||
private final Map<String, Counter> majorMergeFinishedMetrics = new HashMap<>(); | ||
private final Map<String, Counter> minorMergeFinishedMetrics = new HashMap<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see you are updating these with updateMergeMetrics
. But this isn't thread safe as these can be modified concurrently from multiple threads from a merge. This should probably be a ConcurrentHashMap
?
private static final String MERGE_METRIC_PATH_COMPONENT = "merge"; | ||
private static final String MERGE_METRIC_PATH_COMPONENT_MAJOR = "major"; | ||
private static final String MERGE_METRIC_PATH_COMPONENT_MINOR = "minor"; | ||
private static final String MERGE_METRIC_PATH_COMPONENT_FINISHED = "finished"; | ||
private static final String MERGE_METRIC_PATH_COMPONENT_STARTED = "started"; | ||
|
||
private static final String MERGES_METRIC_NAME = "merges"; | ||
private static final String MERGE_DOCS_METRIC_NAME = "docs"; | ||
private static final String MERGE_DELETED_DOCS_METRIC_NAME = "deletedDocs"; | ||
private static final String MERGE_SEGMENTS_METRIC_NAME = "segments"; | ||
private static final String MERGE_TIMES_METRIC_NAME = "mergeTimes"; | ||
private static final String MERGE_FLUSH_METRIC_NAME = "flushes"; | ||
private static final String MERGE_ERRORS_METRIC_NAME = "errors"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm usually a fan of constant static strings but in this case there are so many, I think I preferred it the other way of just in-lining them as I found it easier to read that it went something like
solrMetricsContext.gauge(
() -> runningMajorMerges.get(),
true,
"running",
SolrInfoBean.Category.INDEX.toString(),
"merge",
"major");
running.index.merge.major
. I don't feel strongly about it though so if you prefer this way thats fine I guess.
runningMinorMergesDocs.addAndGet(-totalNumDocs); | ||
runningMinorMergesSegments.addAndGet(-segmentsCount); | ||
} | ||
updateMergeMetrics(totalNumDocs, deletedDocs, segmentsCount, true, context); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be negative totalNumDocs
and -segmentsCount
?
iw.forceMerge(1, true); | ||
|
||
assertEquals( | ||
"should be mulitple merge operations", | ||
initialMerges + 1, | ||
minorMergeStartedMetrics.getCount()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this technically just 1 merge from line 123? Why would initial be higher than 0? Is it because minor merges on itself with less docs?
Just want to react to something:
This is already a risk. Any real-world use of fetching /admin/metrics is going to have a filtering parameter. It'd be reasonable to detect that no filter was given and then potentially return an error. If the user is just kicking the tires, they may want to ignore the filter requirement so the error message might contain a trivial suggestion like to see all Solr metrics. |
https://issues.apache.org/jira/browse/SOLR-17799
Description
Replace existing segment merge metrics with consistent counters
Solution
INDEX.merge.major.started.docs
INDEX.merge.major.started.deletedDocs
INDEX.merge.major.started.segments
INDEX.merge.minor.started.docs
INDEX.merge.minor.started.deletedDocs
INDEX.merge.minor.started.segments
INDEX.merge.major.finished.docs
INDEX.merge.major.finished.deletedDocs
INDEX.merge.major.finished.segments
INDEX.merge.minor.finished.docs
INDEX.merge.minor.finished.deletedDocs
INDEX.merge.minor.finished.segments
Tests
Checklist
Please review the following and check all that apply:
main
branch../gradlew check
.