
Commit d4d3ed6

papgmez and crmne authored
Support Embedding Dimensions (#73)
This PR implements the ability to specify custom dimensions when generating embeddings, as requested in issue #47.

### What's included

- Added support for passing a `dimensions` parameter to the `embed` method
- Implemented dimensions handling in both the OpenAI and Gemini providers
- Added tests to verify that the dimensions parameter works correctly
- Optimized the Gemini provider's `embed` method to reduce unnecessary API calls when embedding texts, resulting in lower token usage. It now uses the `batchEmbedContents` endpoint in a single request, for both single and multiple text embeddings.
- Modernized the Gemini embeddings code following the dependency inversion principle, as already implemented in `openai/embeddings.rb`
- Removed the `promptTokenCount` handling, since the Gemini embeddings API response does not contain that attribute

### Implementation notes

I've decided to implement only the per-request dimension configuration, not the global configuration option initially proposed in the issue, because each embedding model has its own default dimensions, which would make a global setting potentially confusing.

With this implementation, users can set the embedding dimensions like:

```ruby
embedding = RubyLLM.embed(
  "Ruby is a programmer's best friend",
  model: "text-embedding-3-small",
  dimensions: 512
)
```

### References

- OpenAI API docs: https://platform.openai.com/docs/api-reference/embeddings
- Gemini API docs: https://ai.google.dev/api/embeddings

Resolves #47

---------

Co-authored-by: Carmine Paolino <carmine@paolino.me>
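The Gemini batching change means an array input now maps to one `batchEmbedContents` request instead of one call per text; a minimal sketch of the batched call (the model name here is illustrative, not taken from this PR):

```ruby
# All three texts go out in a single batchEmbedContents request;
# "text-embedding-004" is an illustrative Gemini embedding model.
embeddings = RubyLLM.embed(
  ["Ruby", "Python", "Rust"],
  model: "text-embedding-004",
  dimensions: 256
)
embeddings.vectors.length # => 3 (one 256-dimensional vector per input)
```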
1 parent 5b06439 · commit d4d3ed6

14 files changed (+11,228, -5,632 lines)

docs/guides/embeddings.md

Lines changed: 38 additions & 0 deletions
````diff
@@ -92,6 +92,44 @@ end
 
 Refer to the [Working with Models Guide]({% link guides/models.md %}) for details on finding available embedding models and their capabilities.
 
+## Choosing Dimensions
+
+Each embedding model has its own default output dimensions. For example, OpenAI's `text-embedding-3-small` outputs 1536 dimensions by default, while `text-embedding-3-large` outputs 3072 dimensions. RubyLLM allows you to specify these dimensions per request:
+
+```ruby
+embedding = RubyLLM.embed(
+  "This is a test sentence",
+  model: "text-embedding-3-small",
+  dimensions: 512
+)
+```
+
+This is particularly useful when:
+- Working with vector databases that have specific dimension requirements
+- Ensuring consistent dimensionality across different requests
+- Optimizing storage and query performance in your vector database
+
+Note that not all models support custom dimensions. If you specify dimensions that aren't supported by the chosen model, RubyLLM will use the model's default dimensions.
+
+## Using Embedding Results
+
+### Vector Properties
+
+The embedding result contains useful information:
+
+```ruby
+embedding = RubyLLM.embed("Example text")
+
+# The vector representation
+puts embedding.vectors.class       # => Array
+puts embedding.vectors.first.class # => Float
+
+# The vector dimensions
+puts embedding.vectors.length      # => 1536
+
+# The model used
+puts embedding.model # => "text-embedding-3-small"
+```
 ## Using Embedding Results
 
 A primary use case for embeddings is measuring the semantic similarity between texts. Cosine similarity is a common metric.
````
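The closing context line mentions cosine similarity; for reference, a minimal pure-Ruby sketch of that metric over two embedding vectors (the `cosine_similarity` helper is ours, not part of the guide):

```ruby
# Cosine similarity: dot(a, b) / (|a| * |b|); 1.0 means identical direction.
def cosine_similarity(a, b)
  dot = a.zip(b).sum { |x, y| x * y }
  dot / (Math.sqrt(a.sum { |x| x * x }) * Math.sqrt(b.sum { |x| x * x }))
end

v1 = RubyLLM.embed("Ruby is elegant").vectors
v2 = RubyLLM.embed("Ruby is beautiful").vectors
puts cosine_similarity(v1, v2)
```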

lib/ruby_llm/embedding.rb

Lines changed: 2 additions & 2 deletions
```diff
@@ -12,14 +12,14 @@ def initialize(vectors:, model:, input_tokens: 0)
       @input_tokens = input_tokens
     end
 
-    def self.embed(text, model: nil, provider: nil, context: nil)
+    def self.embed(text, model: nil, provider: nil, context: nil, dimensions: nil)
       config = context&.config || RubyLLM.config
       model_id = model || config.default_embedding_model
       Models.find(model_id, provider)
 
       provider = Provider.for(model_id)
       connection = context ? context.connection_for(provider) : provider.connection(config)
-      provider.embed(text, model: model_id, connection:)
+      provider.embed(text, model: model_id, connection:, dimensions:)
     end
   end
 end
```
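Since `dimensions:` defaults to `nil` and both providers `.compact` their payloads, omitting it keeps the old behavior; a minimal sketch of the two call shapes:

```ruby
# nil dimensions drop out of the request payload, so the model default applies
default_dims = RubyLLM::Embedding.embed("hello", model: "text-embedding-3-small")

# an explicit value is forwarded through provider.embed to the API
custom_dims = RubyLLM::Embedding.embed("hello", model: "text-embedding-3-small", dimensions: 256)
```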

lib/ruby_llm/provider.rb

Lines changed: 4 additions & 4 deletions
```diff
@@ -31,10 +31,10 @@ def list_models(connection:)
       parse_list_models_response response, slug, capabilities
     end
 
-    def embed(text, model:, connection:)
-      payload = render_embedding_payload(text, model:)
-      response = connection.post embedding_url, payload
-      parse_embedding_response response
+    def embed(text, model:, connection:, dimensions:)
+      payload = render_embedding_payload(text, model:, dimensions:)
+      response = connection.post(embedding_url(model:), payload)
+      parse_embedding_response(response, model:)
     end
 
     def paint(prompt, model:, size:, connection:)
```
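With this change, `embed` is a template method: providers only supply `embedding_url`, `render_embedding_payload`, and `parse_embedding_response`. A hypothetical sketch of a provider satisfying that contract (the `Acme` module, its endpoint path, and its response field are invented for illustration):

```ruby
# Hypothetical provider; module name, endpoint, and response shape are invented.
module RubyLLM
  module Providers
    module Acme
      module_function

      def embedding_url(model:)
        "v1/#{model}/embeddings" # hypothetical endpoint
      end

      def render_embedding_payload(text, model:, dimensions:)
        # .compact drops dimensions when the caller passed nil
        { model: model, input: Array(text), dimensions: dimensions }.compact
      end

      def parse_embedding_response(response, model:)
        vectors = response.body['vectors'] # hypothetical response field
        vectors in [vectors] # unwrap single-text responses
        Embedding.new(vectors:, model:, input_tokens: 0)
      end
    end
  end
end
```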

lib/ruby_llm/providers/gemini/embeddings.rb

Lines changed: 18 additions & 34 deletions
```diff
@@ -5,47 +5,31 @@ module Providers
     module Gemini
       # Embeddings methods for the Gemini API integration
       module Embeddings
-        # Must be public for Provider module
-        def embed(text, model:, connection:) # rubocop:disable Metrics/AbcSize,Metrics/MethodLength
-          payload = {
-            content: {
-              parts: format_text_for_embedding(text)
-            }
-          }
+        module_function
 
-          url = "models/#{model}:embedContent"
-          response = connection.post url, payload
+        def embedding_url(model:)
+          "models/#{model}:batchEmbedContents"
+        end
+
+        def render_embedding_payload(text, model:, dimensions:)
+          { requests: [text].flatten.map { |t| single_embedding_payload(t, model:, dimensions:) } }
+        end
 
-          if text.is_a?(Array)
-            # We need to make separate calls for each text with Gemini
-            embeddings = text.map do |t|
-              single_payload = { content: { parts: [{ text: t.to_s }] } }
-              single_response = connection.post url, single_payload
-              single_response.body.dig('embedding', 'values')
-            end
+        def parse_embedding_response(response, model:)
+          vectors = response.body['embeddings']&.map { |e| e['values'] }
+          vectors in [vectors]
 
-            Embedding.new(
-              vectors: embeddings,
-              model: model,
-              input_tokens: response.body.dig('usageMetadata', 'promptTokenCount') || 0
-            )
-          else
-            Embedding.new(
-              vectors: response.body.dig('embedding', 'values'),
-              model: model,
-              input_tokens: response.body.dig('usageMetadata', 'promptTokenCount') || 0
-            )
-          end
+          Embedding.new(vectors:, model:, input_tokens: 0)
         end
 
         private
 
-        def format_text_for_embedding(text)
-          if text.is_a?(Array)
-            text.map { |t| { text: t.to_s } }
-          else
-            [{ text: text.to_s }]
-          end
+        def single_embedding_payload(text, model:, dimensions:)
+          {
+            model: "models/#{model}",
+            content: { parts: [{ text: text.to_s }] },
+            outputDimensionality: dimensions
+          }.compact
         end
       end
     end
```
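For reference, the payload `render_embedding_payload` builds for two texts with `dimensions: 256` has this shape (derived from the code above; the model name is illustrative):

```ruby
# One requests entry per input text, all sent in a single batchEmbedContents call
{
  requests: [
    { model: 'models/text-embedding-004',
      content: { parts: [{ text: 'foo' }] },
      outputDimensionality: 256 },
    { model: 'models/text-embedding-004',
      content: { parts: [{ text: 'bar' }] },
      outputDimensionality: 256 }
  ]
}
```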

lib/ruby_llm/providers/openai/embeddings.rb

Lines changed: 8 additions & 12 deletions
```diff
@@ -7,31 +7,27 @@ module OpenAI
       module Embeddings
         module_function
 
-        def embedding_url
+        def embedding_url(...)
           'embeddings'
         end
 
-        def render_embedding_payload(text, model:)
+        def render_embedding_payload(text, model:, dimensions:)
           {
             model: model,
-            input: text
-          }
+            input: text,
+            dimensions: dimensions
+          }.compact
         end
 
-        def parse_embedding_response(response)
+        def parse_embedding_response(response, model:)
           data = response.body
-          model_id = data['model']
           input_tokens = data.dig('usage', 'prompt_tokens') || 0
           vectors = data['data'].map { |d| d['embedding'] }
 
           # If we only got one embedding, return it as a single vector
-          vectors = vectors.first if vectors.size == 1
+          vectors in [vectors]
 
-          Embedding.new(
-            vectors: vectors,
-            model: model_id,
-            input_tokens: input_tokens
-          )
+          Embedding.new(vectors:, model:, input_tokens:)
         end
       end
     end
```
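The `vectors in [vectors]` line replaces the old `vectors.first if vectors.size == 1` with a Ruby 3 pattern match that rebinds `vectors` to its sole element; a standalone illustration:

```ruby
vectors = [[0.1, 0.2, 0.3]]
vectors in [vectors] # one-element array: rebinds vectors to the inner vector
vectors # => [0.1, 0.2, 0.3]

vectors = [[0.1], [0.2]]
vectors in [vectors] # no match for two elements; vectors keeps its value
vectors # => [[0.1], [0.2]]
```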
