System Info
Cargo version: cargo 1.88.0 (873a06493 2025-05-10)
OS Version: Fedora 42
CPU: AMD Ryzen 7 5700U
GPU: Integrated Graphics
Information
- Docker
- The CLI directly
Tasks
- An officially supported command
- My own modifications
Reproduction
Steps to reproduce the error:
text-embeddings-router --model-id BAAI/bge-reranker-v2-m3 --port 8085
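For comparison, the crash could also be cross-checked against the official CPU container to see whether it is specific to the locally built binary; a minimal sketch, assuming a published CPU image tag (substitute the actual release tag):
docker run -p 8085:80 -v $PWD/data:/data ghcr.io/huggingface/text-embeddings-inference:cpu-latest --model-id BAAI/bge-reranker-v2-m3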
The observed behavior:
2025-07-10T06:06:30.967571Z INFO text_embeddings_router: router/src/main.rs:189: Args { model_id: "BAA*/***-********-*2-m3", revision: None, tokenization_workers: None, dtype: None, pooling: None, max_concurrent_requests: 512, max_batch_tokens: 16384, max_batch_requests: None, max_client_batch_size: 32, auto_truncate: false, default_prompt_name: None, default_prompt: None, hf_api_token: None, hf_token: None, hostname: "miata", port: 8085, uds_path: "/tmp/text-embeddings-inference-server", huggingface_hub_cache: None, payload_limit: 2000000, api_key: None, json_output: false, disable_spans: false, otlp_endpoint: None, otlp_service_name: "text-embeddings-inference.server", prometheus_port: 9000, cors_allow_origin: None }
2025-07-10T06:06:31.014787Z INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:20: Starting download
2025-07-10T06:06:31.014813Z INFO download_artifacts:download_pool_config: text_embeddings_core::download: core/src/download.rs:53: Downloading `1_Pooling/config.json`
2025-07-10T06:06:31.200781Z WARN download_artifacts: text_embeddings_core::download: core/src/download.rs:26: Download failed: request error: HTTP status client error (404 Not Found) for url (https://huggingface.co/BAAI/bge-reranker-v2-m3/resolve/main/1_Pooling/config.json)
2025-07-10T06:06:32.144008Z INFO download_artifacts:download_new_st_config: text_embeddings_core::download: core/src/download.rs:77: Downloading `config_sentence_transformers.json`
2025-07-10T06:06:32.277826Z WARN download_artifacts: text_embeddings_core::download: core/src/download.rs:36: Download failed: request error: HTTP status client error (404 Not Found) for url (https://huggingface.co/BAAI/bge-reranker-v2-m3/resolve/main/config_sentence_transformers.json)
2025-07-10T06:06:32.277939Z INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:40: Downloading `config.json`
2025-07-10T06:06:32.608462Z INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:43: Downloading `tokenizer.json`
2025-07-10T06:06:34.075790Z INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:47: Model artifacts downloaded in 3.061005516s
2025-07-10T06:06:34.847050Z WARN text_embeddings_router: router/src/lib.rs:189: Could not find a Sentence Transformers config
2025-07-10T06:06:34.847075Z INFO text_embeddings_router: router/src/lib.rs:193: Maximum number of tokens per request: 8192
2025-07-10T06:06:34.847328Z INFO text_embeddings_core::tokenization: core/src/tokenization.rs:38: Starting 16 tokenization workers
2025-07-10T06:06:40.242017Z INFO text_embeddings_router: router/src/lib.rs:235: Starting model backend
2025-07-10T06:06:40.242366Z INFO text_embeddings_backend: backends/src/lib.rs:507: Downloading `model.safetensors`
2025-07-10T06:08:51.591003Z INFO text_embeddings_backend: backends/src/lib.rs:391: Model weights downloaded in 131.348634368s
2025-07-10T06:08:51.591974Z INFO text_embeddings_backend_candle: backends/candle/src/lib.rs:251: Starting Bert model on Cpu
Intel oneMKL ERROR: Parameter 10 was incorrect on entry to SGEMM .
Intel oneMKL ERROR: Parameter 10 was incorrect on entry to SGEMM .
Intel oneMKL ERROR: Parameter 10 was incorrect on entry to SGEMM .
Intel oneMKL ERROR: Parameter 10 was incorrect on entry to SGEMM .
Intel oneMKL ERROR: Parameter 10 was incorrect on entry to SGEMM .
Intel oneMKL ERROR: Parameter 10 was incorrect on entry to SGEMM .
Intel oneMKL ERROR: Parameter 10 was incorrect on entry to SGEMM .
2025-07-10T06:08:53.828476Z INFO text_embeddings_router: router/src/lib.rs:252: Warming up model
Segmentation fault (core dumped)
After that error, I tried to run the command again:
╭─ 💁 henrique_1 at 💻 miata in 📁 ~/development/text-embeddings-inference on (🌿 main ⌀1 ✗)
╰λ text-embeddings-router --model-id BAAI/bge-reranker-v2-m3 --port 8085
2025-07-10T06:29:24.511923Z INFO text_embeddings_router: router/src/main.rs:189: Args { model_id: "BAA*/***-********-*2-m3", revision: None, tokenization_workers: None, dtype: None, pooling: None, max_concurrent_requests: 512, max_batch_tokens: 16384, max_batch_requests: None, max_client_batch_size: 32, auto_truncate: false, default_prompt_name: None, default_prompt: None, hf_api_token: None, hf_token: None, hostname: "miata", port: 8085, uds_path: "/tmp/text-embeddings-inference-server", huggingface_hub_cache: None, payload_limit: 2000000, api_key: None, json_output: false, disable_spans: false, otlp_endpoint: None, otlp_service_name: "text-embeddings-inference.server", prometheus_port: 9000, cors_allow_origin: None }
2025-07-10T06:29:24.542832Z INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:20: Starting download
2025-07-10T06:29:24.542857Z INFO download_artifacts:download_pool_config: text_embeddings_core::download: core/src/download.rs:53: Downloading `1_Pooling/config.json`
2025-07-10T06:29:24.752917Z WARN download_artifacts: text_embeddings_core::download: core/src/download.rs:26: Download failed: request error: HTTP status client error (404 Not Found) for url (https://huggingface.co/BAAI/bge-reranker-v2-m3/resolve/main/1_Pooling/config.json)
2025-07-10T06:29:25.739465Z INFO download_artifacts:download_new_st_config: text_embeddings_core::download: core/src/download.rs:77: Downloading `config_sentence_transformers.json`
2025-07-10T06:29:25.875486Z WARN download_artifacts: text_embeddings_core::download: core/src/download.rs:36: Download failed: request error: HTTP status client error (404 Not Found) for url (https://huggingface.co/BAAI/bge-reranker-v2-m3/resolve/main/config_sentence_transformers.json)
2025-07-10T06:29:25.875523Z INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:40: Downloading `config.json`
2025-07-10T06:29:25.875770Z INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:43: Downloading `tokenizer.json`
2025-07-10T06:29:25.875824Z INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:47: Model artifacts downloaded in 1.332994198s
2025-07-10T06:29:26.472711Z WARN text_embeddings_router: router/src/lib.rs:189: Could not find a Sentence Transformers config
2025-07-10T06:29:26.472733Z INFO text_embeddings_router: router/src/lib.rs:193: Maximum number of tokens per request: 8192
2025-07-10T06:29:26.472947Z INFO text_embeddings_core::tokenization: core/src/tokenization.rs:38: Starting 16 tokenization workers
2025-07-10T06:29:30.886601Z INFO text_embeddings_router: router/src/lib.rs:235: Starting model backend
2025-07-10T06:29:30.887156Z INFO text_embeddings_backend: backends/src/lib.rs:507: Downloading `model.safetensors`
2025-07-10T06:29:30.887816Z INFO text_embeddings_backend: backends/src/lib.rs:391: Model weights downloaded in 661.468µs
2025-07-10T06:29:30.892915Z INFO text_embeddings_backend_candle: backends/candle/src/lib.rs:251: Starting Bert model on Cpu
Segmentation fault (core dumped)
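Since the process dies with a segmentation fault and a core dump is written, a backtrace would likely narrow down the faulting SGEMM call; a minimal sketch on Fedora, assuming systemd-coredump is handling core files:
coredumpctl list text-embeddings-router
coredumpctl gdb text-embeddings-router
(gdb) bt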
Expected behavior
It was expected that the command would run and the Text Embeddings Inference server would start.
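For reference, once the server is up this is the kind of request it should serve for a reranker model; a minimal sketch against the /rerank route on the port used above, assuming the default request schema:
curl 127.0.0.1:8085/rerank -X POST -H 'Content-Type: application/json' -d '{"query": "What is Deep Learning?", "texts": ["Deep Learning is not...", "Deep learning is..."]}'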