bug: The prometheus format output is not standard #5386

@Spoutnik97

Describe the bug

I am trying to scrape the BentoML /metrics route with Fluent Bit.
The Fluent Bit prometheus_scrape input throws an error: [2025/06/19 14:19:35] [error] [input:prometheus_scrape:prometheus_scrape.0] error decoding Prometheus Text format

The issue seems to come from the order of the histogram samples.

All the _sum samples appear at the beginning of the metric, followed by the _bucket and _count samples for each label set.
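The ordering described above can be made visible with a short script. This is a minimal sketch using only the Python standard library; `suffix_runs` and the shortened `example` text are hypothetical names and data written for illustration, and are not part of BentoML or Fluent Bit.

```python
import re

# Matches a sample line such as: name{label="value"} 1.0
SAMPLE = re.compile(r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+\S+')

def suffix_runs(text, family):
    """Return the sequence of sample suffixes for one metric family,
    collapsing consecutive repeats into a single entry."""
    runs = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        m = SAMPLE.match(line)
        if not m or not m.group("name").startswith(family + "_"):
            continue  # not a sample of this family
        suffix = m.group("name")[len(family):]
        if not runs or runs[-1] != suffix:
            runs.append(suffix)
    return runs

# Shortened, made-up data laid out the same way as the reported output.
example = '''# TYPE prediction_time_seconds histogram
prediction_time_seconds_sum{company_id="a"} 1.5
prediction_time_seconds_sum{company_id="b"} 2.5
prediction_time_seconds_bucket{company_id="a",le="+Inf"} 3.0
prediction_time_seconds_count{company_id="a"} 3.0
prediction_time_seconds_bucket{company_id="b",le="+Inf"} 4.0
prediction_time_seconds_count{company_id="b"} 4.0
'''

print(suffix_runs(example, "prediction_time_seconds"))
# → ['_sum', '_bucket', '_count', '_bucket', '_count']
```

A result that starts with `_sum` and then alternates `_bucket` and `_count` shows that the `_sum` samples are separated from the rest of their label set.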

To reproduce

  1. Deploy a basic BentoML container with metrics enabled
  2. Install fluent-bit (brew install fluent-bit on macOS)
  3. Create a basic configuration file, fluent-bit.conf:
[SERVICE]
    Flush         2
    Log_level     debug
    Daemon        off
    HTTP_Server   on
    HTTP_Listen   0.0.0.0
    HTTP_PORT     2020

[INPUT]
    Name                  prometheus_scrape
    Tag                   local_metrics
    Scrape_interval       2s
    Host                  localhost
    Port                  8080
    Metrics_path          /test-metrics.txt

[OUTPUT]
    Name                  stdout
    Match                 *
    Format                json_lines
  4. Create a test-metrics.txt file with the content of the metrics below
  5. Launch a basic HTTP server: python3 -m http.server 8080
  6. Launch fluent-bit: fluent-bit -c fluent-bit.conf

Content of the test-metrics.txt file working :

# HELP prediction_time_seconds Time taken for predictions
# TYPE prediction_time_seconds histogram
prediction_time_seconds_sum{company_id="96a16b00-d289-45e6-856c-b45d7b83a09d",endpoint="predict_process_collection_and_costs"} 56.312395095825195
prediction_time_seconds_sum{company_id="be545906-1849-4c10-a331-6fffc88aa3ba",endpoint="predict_process_collection_and_costs"} 2.419936180114746
prediction_time_seconds_sum{company_id="c0cc5509-1249-4b1a-958b-af1dac4af697",endpoint="predict_process_collection_and_costs"} 0.5229167938232422
prediction_time_seconds_sum{company_id="1a3dd9b6-ba28-408d-aa7f-bb27e2d00f46",endpoint="predict_process_collection_and_costs"} 4.157390356063843
prediction_time_seconds_sum{company_id="62b6fe8f-6dce-407c-9e6b-8c588a2d9501",endpoint="predict_process_collection_and_costs"} 8.153648376464844
prediction_time_seconds_sum{company_id="1b700fec-6f92-484e-8243-7cb1a47e7afc",endpoint="predict_process_collection_and_costs"} 0.32573604583740234
prediction_time_seconds_sum{company_id="e3874ca4-3ea0-46d7-8e8c-359065b0fab9",endpoint="predict_process_collection_and_costs"} 1.031454086303711
prediction_time_seconds_bucket{company_id="96a16b00-d289-45e6-856c-b45d7b83a09d",endpoint="predict_process_collection_and_costs",le="0.1"} 0.0
prediction_time_seconds_bucket{company_id="96a16b00-d289-45e6-856c-b45d7b83a09d",endpoint="predict_process_collection_and_costs",le="0.5"} 219.0
prediction_time_seconds_bucket{company_id="96a16b00-d289-45e6-856c-b45d7b83a09d",endpoint="predict_process_collection_and_costs",le="1.0"} 220.0
prediction_time_seconds_bucket{company_id="96a16b00-d289-45e6-856c-b45d7b83a09d",endpoint="predict_process_collection_and_costs",le="2.0"} 220.0
prediction_time_seconds_bucket{company_id="96a16b00-d289-45e6-856c-b45d7b83a09d",endpoint="predict_process_collection_and_costs",le="5.0"} 220.0
prediction_time_seconds_bucket{company_id="96a16b00-d289-45e6-856c-b45d7b83a09d",endpoint="predict_process_collection_and_costs",le="10.0"} 220.0
prediction_time_seconds_bucket{company_id="96a16b00-d289-45e6-856c-b45d7b83a09d",endpoint="predict_process_collection_and_costs",le="30.0"} 220.0
prediction_time_seconds_bucket{company_id="96a16b00-d289-45e6-856c-b45d7b83a09d",endpoint="predict_process_collection_and_costs",le="60.0"} 220.0
prediction_time_seconds_bucket{company_id="96a16b00-d289-45e6-856c-b45d7b83a09d",endpoint="predict_process_collection_and_costs",le="+Inf"} 220.0
prediction_time_seconds_count{company_id="96a16b00-d289-45e6-856c-b45d7b83a09d",endpoint="predict_process_collection_and_costs"} 220.0
prediction_time_seconds_bucket{company_id="be545906-1849-4c10-a331-6fffc88aa3ba",endpoint="predict_process_collection_and_costs",le="0.1"} 0.0
prediction_time_seconds_bucket{company_id="be545906-1849-4c10-a331-6fffc88aa3ba",endpoint="predict_process_collection_and_costs",le="0.5"} 8.0
prediction_time_seconds_bucket{company_id="be545906-1849-4c10-a331-6fffc88aa3ba",endpoint="predict_process_collection_and_costs",le="1.0"} 8.0
prediction_time_seconds_bucket{company_id="be545906-1849-4c10-a331-6fffc88aa3ba",endpoint="predict_process_collection_and_costs",le="2.0"} 8.0
prediction_time_seconds_bucket{company_id="be545906-1849-4c10-a331-6fffc88aa3ba",endpoint="predict_process_collection_and_costs",le="5.0"} 8.0
prediction_time_seconds_bucket{company_id="be545906-1849-4c10-a331-6fffc88aa3ba",endpoint="predict_process_collection_and_costs",le="10.0"} 8.0
prediction_time_seconds_bucket{company_id="be545906-1849-4c10-a331-6fffc88aa3ba",endpoint="predict_process_collection_and_costs",le="30.0"} 8.0
prediction_time_seconds_bucket{company_id="be545906-1849-4c10-a331-6fffc88aa3ba",endpoint="predict_process_collection_and_costs",le="60.0"} 8.0
prediction_time_seconds_bucket{company_id="be545906-1849-4c10-a331-6fffc88aa3ba",endpoint="predict_process_collection_and_costs",le="+Inf"} 8.0
prediction_time_seconds_count{company_id="be545906-1849-4c10-a331-6fffc88aa3ba",endpoint="predict_process_collection_and_costs"} 8.0
prediction_time_seconds_bucket{company_id="c0cc5509-1249-4b1a-958b-af1dac4af697",endpoint="predict_process_collection_and_costs",le="0.1"} 0.0
prediction_time_seconds_bucket{company_id="c0cc5509-1249-4b1a-958b-af1dac4af697",endpoint="predict_process_collection_and_costs",le="0.5"} 2.0
prediction_time_seconds_bucket{company_id="c0cc5509-1249-4b1a-958b-af1dac4af697",endpoint="predict_process_collection_and_costs",le="1.0"} 2.0
prediction_time_seconds_bucket{company_id="c0cc5509-1249-4b1a-958b-af1dac4af697",endpoint="predict_process_collection_and_costs",le="2.0"} 2.0
prediction_time_seconds_bucket{company_id="c0cc5509-1249-4b1a-958b-af1dac4af697",endpoint="predict_process_collection_and_costs",le="5.0"} 2.0
prediction_time_seconds_bucket{company_id="c0cc5509-1249-4b1a-958b-af1dac4af697",endpoint="predict_process_collection_and_costs",le="10.0"} 2.0
prediction_time_seconds_bucket{company_id="c0cc5509-1249-4b1a-958b-af1dac4af697",endpoint="predict_process_collection_and_costs",le="30.0"} 2.0
prediction_time_seconds_bucket{company_id="c0cc5509-1249-4b1a-958b-af1dac4af697",endpoint="predict_process_collection_and_costs",le="60.0"} 2.0
prediction_time_seconds_bucket{company_id="c0cc5509-1249-4b1a-958b-af1dac4af697",endpoint="predict_process_collection_and_costs",le="+Inf"} 2.0
prediction_time_seconds_count{company_id="c0cc5509-1249-4b1a-958b-af1dac4af697",endpoint="predict_process_collection_and_costs"} 2.0
prediction_time_seconds_bucket{company_id="1a3dd9b6-ba28-408d-aa7f-bb27e2d00f46",endpoint="predict_process_collection_and_costs",le="0.1"} 0.0
prediction_time_seconds_bucket{company_id="1a3dd9b6-ba28-408d-aa7f-bb27e2d00f46",endpoint="predict_process_collection_and_costs",le="0.5"} 15.0
prediction_time_seconds_bucket{company_id="1a3dd9b6-ba28-408d-aa7f-bb27e2d00f46",endpoint="predict_process_collection_and_costs",le="1.0"} 15.0
prediction_time_seconds_bucket{company_id="1a3dd9b6-ba28-408d-aa7f-bb27e2d00f46",endpoint="predict_process_collection_and_costs",le="2.0"} 15.0
prediction_time_seconds_bucket{company_id="1a3dd9b6-ba28-408d-aa7f-bb27e2d00f46",endpoint="predict_process_collection_and_costs",le="5.0"} 15.0
prediction_time_seconds_bucket{company_id="1a3dd9b6-ba28-408d-aa7f-bb27e2d00f46",endpoint="predict_process_collection_and_costs",le="10.0"} 15.0
prediction_time_seconds_bucket{company_id="1a3dd9b6-ba28-408d-aa7f-bb27e2d00f46",endpoint="predict_process_collection_and_costs",le="30.0"} 15.0
prediction_time_seconds_bucket{company_id="1a3dd9b6-ba28-408d-aa7f-bb27e2d00f46",endpoint="predict_process_collection_and_costs",le="60.0"} 15.0
prediction_time_seconds_bucket{company_id="1a3dd9b6-ba28-408d-aa7f-bb27e2d00f46",endpoint="predict_process_collection_and_costs",le="+Inf"} 15.0
prediction_time_seconds_count{company_id="1a3dd9b6-ba28-408d-aa7f-bb27e2d00f46",endpoint="predict_process_collection_and_costs"} 15.0
prediction_time_seconds_bucket{company_id="62b6fe8f-6dce-407c-9e6b-8c588a2d9501",endpoint="predict_process_collection_and_costs",le="0.1"} 0.0
prediction_time_seconds_bucket{company_id="62b6fe8f-6dce-407c-9e6b-8c588a2d9501",endpoint="predict_process_collection_and_costs",le="0.5"} 31.0
prediction_time_seconds_bucket{company_id="62b6fe8f-6dce-407c-9e6b-8c588a2d9501",endpoint="predict_process_collection_and_costs",le="1.0"} 32.0
prediction_time_seconds_bucket{company_id="62b6fe8f-6dce-407c-9e6b-8c588a2d9501",endpoint="predict_process_collection_and_costs",le="2.0"} 32.0
prediction_time_seconds_bucket{company_id="62b6fe8f-6dce-407c-9e6b-8c588a2d9501",endpoint="predict_process_collection_and_costs",le="5.0"} 32.0
prediction_time_seconds_bucket{company_id="62b6fe8f-6dce-407c-9e6b-8c588a2d9501",endpoint="predict_process_collection_and_costs",le="10.0"} 32.0
prediction_time_seconds_bucket{company_id="62b6fe8f-6dce-407c-9e6b-8c588a2d9501",endpoint="predict_process_collection_and_costs",le="30.0"} 32.0
prediction_time_seconds_bucket{company_id="62b6fe8f-6dce-407c-9e6b-8c588a2d9501",endpoint="predict_process_collection_and_costs",le="60.0"} 32.0
prediction_time_seconds_bucket{company_id="62b6fe8f-6dce-407c-9e6b-8c588a2d9501",endpoint="predict_process_collection_and_costs",le="+Inf"} 32.0
prediction_time_seconds_count{company_id="62b6fe8f-6dce-407c-9e6b-8c588a2d9501",endpoint="predict_process_collection_and_costs"} 32.0
prediction_time_seconds_bucket{company_id="1b700fec-6f92-484e-8243-7cb1a47e7afc",endpoint="predict_process_collection_and_costs",le="0.1"} 0.0
prediction_time_seconds_bucket{company_id="1b700fec-6f92-484e-8243-7cb1a47e7afc",endpoint="predict_process_collection_and_costs",le="0.5"} 1.0
prediction_time_seconds_bucket{company_id="1b700fec-6f92-484e-8243-7cb1a47e7afc",endpoint="predict_process_collection_and_costs",le="1.0"} 1.0
prediction_time_seconds_bucket{company_id="1b700fec-6f92-484e-8243-7cb1a47e7afc",endpoint="predict_process_collection_and_costs",le="2.0"} 1.0
prediction_time_seconds_bucket{company_id="1b700fec-6f92-484e-8243-7cb1a47e7afc",endpoint="predict_process_collection_and_costs",le="5.0"} 1.0
prediction_time_seconds_bucket{company_id="1b700fec-6f92-484e-8243-7cb1a47e7afc",endpoint="predict_process_collection_and_costs",le="10.0"} 1.0
prediction_time_seconds_bucket{company_id="1b700fec-6f92-484e-8243-7cb1a47e7afc",endpoint="predict_process_collection_and_costs",le="30.0"} 1.0
prediction_time_seconds_bucket{company_id="1b700fec-6f92-484e-8243-7cb1a47e7afc",endpoint="predict_process_collection_and_costs",le="60.0"} 1.0
prediction_time_seconds_bucket{company_id="1b700fec-6f92-484e-8243-7cb1a47e7afc",endpoint="predict_process_collection_and_costs",le="+Inf"} 1.0
prediction_time_seconds_count{company_id="1b700fec-6f92-484e-8243-7cb1a47e7afc",endpoint="predict_process_collection_and_costs"} 1.0
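One way to test whether the ordering is really the cause is to regroup the samples so that each label set's `_bucket`, `_sum` and `_count` lines sit together, then serve the regrouped file instead. The sketch below is only an assumption-laden helper for that experiment: `regroup_histogram` is a hypothetical name, it assumes the text holds a single histogram family, and nothing in this report confirms that Fluent Bit accepts the regrouped output.

```python
import re

# Matches a sample line such as: name{label="value"} 1.0
SAMPLE = re.compile(r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+\S+')
# Matches the le label so bucket lines share a key with their _sum/_count lines.
LE_LABEL = re.compile(r'(?:^|,)le="[^"]*"')
# Order used within each label set (an assumption about what the parser expects).
ORDER = ("_bucket", "_sum", "_count")

def regroup_histogram(text, family):
    """Rewrite exposition text so each label set's samples are contiguous.

    Assumes `text` contains one histogram family; every other non-blank
    line (HELP, TYPE, unrelated samples) is moved to the top unchanged.
    """
    passthrough = []
    groups = {}  # label set without le -> lines per suffix, in first-seen order
    for line in text.splitlines():
        if not line.strip():
            continue
        m = SAMPLE.match(line)
        name = m.group("name") if m else ""
        suffix = name[len(family):] if name.startswith(family + "_") else ""
        if suffix not in ORDER:
            passthrough.append(line)
            continue
        key = LE_LABEL.sub("", m.group("labels") or "").strip(",")
        groups.setdefault(key, {s: [] for s in ORDER})[suffix].append(line)
    out = list(passthrough)
    for parts in groups.values():
        for s in ORDER:
            out.extend(parts[s])
    return "\n".join(out) + "\n"

# Made-up two-label-set input in the same layout as the file above.
src = (
    '# TYPE h histogram\n'
    'h_sum{a="1"} 1.0\n'
    'h_sum{a="2"} 2.0\n'
    'h_bucket{a="1",le="+Inf"} 3.0\n'
    'h_count{a="1"} 3.0\n'
    'h_bucket{a="2",le="+Inf"} 4.0\n'
    'h_count{a="2"} 4.0\n'
)
print(regroup_histogram(src, "h"))
```

To try it on the real data, read test-metrics.txt, pass its content with the family name prediction_time_seconds, and write the result to a second file served by the same HTTP server.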

Expected behavior

No response

Environment

bentoml==1.3.20
python>=3.10

Labels: bug
