- 
                Notifications
    You must be signed in to change notification settings 
- Fork 0
Add metrics support for imageprefetch and nodeimageset controllers #31
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adds Prometheus metrics support to both the imageprefetch and nodeimageset controllers, enabling monitoring of image prefetching operations across the cluster.
Key changes include:
- New metrics package exposing gauges for image availability, pull success/failure counts, image metadata, size, and prefetch duration
- Enhanced image pull tracking to capture timing and size information
- Integration of metrics collection in both controller reconciliation loops
Reviewed Changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description | 
|---|---|
| internal/metrics/metrics.go | Defines Prometheus gauge vectors for tracking image prefetch operations and a registration function | 
| internal/imgmanager/image_puller.go | Adds timing and size tracking to ImagePullStatus and exposes getter methods for metrics | 
| internal/imgmanager/fake_containerd.go | Updates mock to return image size from PullImage | 
| internal/imgmanager/containerd.go | Modifies PullImage to return image size alongside error | 
| internal/controller/nodeimageset_controller.go | Integrates metrics recording during reconciliation and cleanup on deletion | 
| internal/controller/imageprefetch_controller.go | Adds metrics setters/removers and calls them during reconciliation and finalization | 
| go.mod | Moves prometheus/client_golang from indirect to direct dependency | 
| config/manager/nodeimageset-controller.yaml | Adds metrics server configuration flags | 
| config/manager/imageprefetch-controller.yaml | Adds metrics server configuration flags | 
| cmd/nodeimageset-controller/main.go | Registers metrics with controller-runtime metrics registry | 
| cmd/imageprefetch-controller/main.go | Registers metrics with controller-runtime metrics registry | 
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
ce401bc    to
    e0628e1      
    Compare
  
    Signed-off-by: zeroalphat <taichi-takemura@cybozu.co.jp>
e0628e1    to
    fb97b00      
    Compare
  
    | // Delete all metrics for this NodeImageSet | ||
| metrics.ImageInfoVec.DeletePartialMatch(map[string]string{"name": nodeImageSet.Name}) | ||
| metrics.ImageSizeBytesVec.DeletePartialMatch(map[string]string{"name": nodeImageSet.Name}) | ||
| metrics.ImagePrefetchDurationSecondsVec.DeletePartialMatch(map[string]string{"name": nodeImageSet.Name}) | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Deleting metrics should be done before removing the finalizer.
If an error occurs during the update, metrics may leak depending on the error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you.
I fixed the code to ensure that deleting metrics is done before removing the finalizer.
        
          
                internal/metrics/metrics.go
              
                Outdated
          
        
      | ReadyVec = prometheus.NewGaugeVec(prometheus.GaugeOpts{ | ||
| Namespace: metricsNamespace, | ||
| Name: "imageprefetch_ready", | ||
| Help: "1 if the ImagePrefetch resource is ready, 0 otherwise", | ||
| }, []string{"namespace", "name"}) | ||
|  | ||
| ImagePulledNodesVec = prometheus.NewGaugeVec(prometheus.GaugeOpts{ | ||
| Namespace: metricsNamespace, | ||
| Name: "imageprefetch_image_pull_successful_nodes", | ||
| Help: "Number of nodes where images have been successfully prefetched for thins Imageprefetch", | ||
| }, []string{"namespace", "name"}) | ||
|  | ||
| ImagePullFailedNodesVec = prometheus.NewGaugeVec(prometheus.GaugeOpts{ | ||
| Namespace: metricsNamespace, | ||
| Name: "imageprefetch_image_pull_failed_nodes", | ||
| Help: "Number of nodes where image prefetching has failed for this ImagePrefetch", | ||
| }, []string{"namespace", "name"}) | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's better to make the label's key and value like kube-state-metrics.
The key is the resource kind, and the label is the resource name.
So then, label_replace becomes unnecessary when combined with other metrics in MetricsQL.
| ReadyVec = prometheus.NewGaugeVec(prometheus.GaugeOpts{ | |
| Namespace: metricsNamespace, | |
| Name: "imageprefetch_ready", | |
| Help: "1 if the ImagePrefetch resource is ready, 0 otherwise", | |
| }, []string{"namespace", "name"}) | |
| ImagePulledNodesVec = prometheus.NewGaugeVec(prometheus.GaugeOpts{ | |
| Namespace: metricsNamespace, | |
| Name: "imageprefetch_image_pull_successful_nodes", | |
| Help: "Number of nodes where images have been successfully prefetched for thins Imageprefetch", | |
| }, []string{"namespace", "name"}) | |
| ImagePullFailedNodesVec = prometheus.NewGaugeVec(prometheus.GaugeOpts{ | |
| Namespace: metricsNamespace, | |
| Name: "imageprefetch_image_pull_failed_nodes", | |
| Help: "Number of nodes where image prefetching has failed for this ImagePrefetch", | |
| }, []string{"namespace", "name"}) | |
| ReadyVec = prometheus.NewGaugeVec(prometheus.GaugeOpts{ | |
| Namespace: metricsNamespace, | |
| Name: "imageprefetch_ready", | |
| Help: "1 if the ImagePrefetch resource is ready, 0 otherwise", | |
| }, []string{"namespace", "imageprefetch"}) | |
| ImagePulledNodesVec = prometheus.NewGaugeVec(prometheus.GaugeOpts{ | |
| Namespace: metricsNamespace, | |
| Name: "imageprefetch_image_pull_successful_nodes", | |
| Help: "Number of nodes where images have been successfully prefetched for thins Imageprefetch", | |
| }, []string{"namespace", "imageprefetch"}) | |
| ImagePullFailedNodesVec = prometheus.NewGaugeVec(prometheus.GaugeOpts{ | |
| Namespace: metricsNamespace, | |
| Name: "imageprefetch_image_pull_failed_nodes", | |
| Help: "Number of nodes where image prefetching has failed for this ImagePrefetch", | |
| }, []string{"namespace", "imageprefetch"}) | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I fixed it to change the metrics labels to be like kube-state-metrics.
        
          
                internal/metrics/metrics.go
              
                Outdated
          
        
      | ImageInfoVec = prometheus.NewGaugeVec(prometheus.GaugeOpts{ | ||
| Namespace: metricsNamespace, | ||
| Name: "nodeimageset_image_info", | ||
| Help: "Information about NodeImageSet image", | ||
| }, []string{"name", "image_name", "registry_policy", "node_name"}) | ||
|  | ||
| ImageSizeBytesVec = prometheus.NewGaugeVec(prometheus.GaugeOpts{ | ||
| Namespace: metricsNamespace, | ||
| Name: "nodeimageset_image_size_bytes", | ||
| Help: "Size of images in NodeImageSets in bytes", | ||
| }, []string{"name", "image_name", "node_name"}) | ||
|  | ||
| ImagePrefetchDurationSecondsVec = prometheus.NewGaugeVec(prometheus.GaugeOpts{ | ||
| Namespace: metricsNamespace, | ||
| Name: "nodeimageset_image_prefetch_duration_seconds", | ||
| Help: "Duration taken to prefetch images in NodeImageSets in seconds", | ||
| }, []string{"name", "image_name", "node_name"}) | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto.
It is better to use similar label keys to kube-state-metrics.
| ImageInfoVec = prometheus.NewGaugeVec(prometheus.GaugeOpts{ | |
| Namespace: metricsNamespace, | |
| Name: "nodeimageset_image_info", | |
| Help: "Information about NodeImageSet image", | |
| }, []string{"name", "image_name", "registry_policy", "node_name"}) | |
| ImageSizeBytesVec = prometheus.NewGaugeVec(prometheus.GaugeOpts{ | |
| Namespace: metricsNamespace, | |
| Name: "nodeimageset_image_size_bytes", | |
| Help: "Size of images in NodeImageSets in bytes", | |
| }, []string{"name", "image_name", "node_name"}) | |
| ImagePrefetchDurationSecondsVec = prometheus.NewGaugeVec(prometheus.GaugeOpts{ | |
| Namespace: metricsNamespace, | |
| Name: "nodeimageset_image_prefetch_duration_seconds", | |
| Help: "Duration taken to prefetch images in NodeImageSets in seconds", | |
| }, []string{"name", "image_name", "node_name"}) | |
| ImageInfoVec = prometheus.NewGaugeVec(prometheus.GaugeOpts{ | |
| Namespace: metricsNamespace, | |
| Name: "nodeimageset_image_info", | |
| Help: "Information about NodeImageSet image", | |
| }, []string{"nodeimageset", "image", "registry_policy", "node"}) | |
| ImageSizeBytesVec = prometheus.NewGaugeVec(prometheus.GaugeOpts{ | |
| Namespace: metricsNamespace, | |
| Name: "nodeimageset_image_size_bytes", | |
| Help: "Size of images in NodeImageSets in bytes", | |
| }, []string{"nodeimageset", "image", "node"}) | |
| ImagePrefetchDurationSecondsVec = prometheus.NewGaugeVec(prometheus.GaugeOpts{ | |
| Namespace: metricsNamespace, | |
| Name: "nodeimageset_image_prefetch_duration_seconds", | |
| Help: "Duration taken to prefetch images in NodeImageSets in seconds", | |
| }, []string{"nodeimageset", "image", "node"}) | 
Signed-off-by: zeroalphat <taichi-takemura@cybozu.co.jp>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
No description provided.