Commit b1d3f75

Implement Integrated Inference (#103)
## Problem

The integrated inference API was part of the `2025-01` release, and the code has been generated for Go, but the interface has not been wired up for use in the client.

## Solution

Implement Integrated Inference:

- Add `CreateIndexForModel` to `Client` struct.
- Add `UpsertRecords` and `SearchRecords` to `IndexConnection` struct.
- Add new types for working with integrated inference:
  - `CreateIndexForModelRequest`, `CreateIndexForModelEmbed`
  - `IntegratedRecord`
  - `SearchRecordsRequest`, `SearchRecordsQuery`, `SearchRecordsRerank`, `SearchRecordsVector`
  - `Hit`, `SearchRecordsResponse`, `SearchUsage`

Note: I've included some integration test refactoring in this PR. I was running into a lot of flakiness unrelated to this change.

## Type of Change

- [ ] Bug fix (non-breaking change which fixes an issue)
- [X] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
- [ ] This change requires a documentation update
- [ ] Infrastructure change (CI configs, etc)
- [ ] Non-code change (docs, etc)
- [ ] None of the above: (explain here)

## Test Plan

The following example demonstrates creating an integrated index for a specific model, upserting some records, and then searching against those records.

```go
package main

import (
    "context"
    "fmt"
    "github.com/pinecone-io/go-pinecone/v3/pinecone"
    "log"
    "os"
)

func main() {
    ctx := context.Background()

    clientParams := pinecone.NewClientParams{
        ApiKey: os.Getenv("PINECONE_API_KEY"),
    }

    pc, err := pinecone.NewClient(clientParams)
    if err != nil {
        log.Fatalf("Failed to create Client: %v", err)
    } else {
        fmt.Println("Successfully created a new Client object!")
    }

    // Create a serverless index that embeds the "chunk_text" field of each record.
    index, err := pc.CreateIndexForModel(ctx, &pinecone.CreateIndexForModelRequest{
        Name:   "my-integrated-index",
        Cloud:  "aws",
        Region: "us-east-1",
        Embed: pinecone.CreateIndexForModelEmbed{
            Model:    "multilingual-e5-large",
            FieldMap: map[string]interface{}{"text": "chunk_text"},
        },
    })
    if err != nil {
        log.Fatalf("Failed to create serverless integrated index: %v", err)
    } else {
        fmt.Printf("Successfully created serverless integrated index: %s", index.Name)
    }

    idxConnection, err := pc.Index(pinecone.NewIndexConnParams{Host: index.Host, Namespace: "my-namespace"})
    if err != nil {
        log.Fatalf("Failed to create IndexConnection for Host: %v: %v", index.Host, err)
    }

    records := []*pinecone.IntegratedRecord{
        {
            "_id":        "rec1",
            "chunk_text": "Apple's first product, the Apple I, was released in 1976 and was hand-built by co-founder Steve Wozniak.",
            "category":   "product",
        },
        {
            "_id":        "rec2",
            "chunk_text": "Apples are a great source of dietary fiber, which supports digestion and helps maintain a healthy gut.",
            "category":   "nutrition",
        },
        {
            "_id":        "rec3",
            "chunk_text": "Apples originated in Central Asia and have been cultivated for thousands of years, with over 7,500 varieties available today.",
            "category":   "cultivation",
        },
        {
            "_id":        "rec4",
            "chunk_text": "In 2001, Apple released the iPod, which transformed the music industry by making portable music widely accessible.",
            "category":   "product",
        },
        {
            "_id":        "rec5",
            "chunk_text": "Apple went public in 1980, making history with one of the largest IPOs at that time.",
            "category":   "milestone",
        },
        {
            "_id":        "rec6",
            "chunk_text": "Rich in vitamin C and other antioxidants, apples contribute to immune health and may reduce the risk of chronic diseases.",
            "category":   "nutrition",
        },
        {
            "_id":        "rec7",
            "chunk_text": "Known for its design-forward products, Apple's branding and market strategy have greatly influenced the technology sector and popularized minimalist design worldwide.",
            "category":   "influence",
        },
        {
            "_id":        "rec8",
            "chunk_text": "The high fiber content in apples can also help regulate blood sugar levels, making them a favorable snack for people with diabetes.",
            "category":   "nutrition",
        },
    }

    // Embed and upsert the records into the connection's namespace.
    err = idxConnection.UpsertRecords(ctx, records)
    if err != nil {
        log.Fatalf("Failed to upsert records. Error: %v", err)
    }

    // Embed the query text and return the 5 most similar records.
    res, err := idxConnection.SearchRecords(ctx, &pinecone.SearchRecordsRequest{
        Query: pinecone.SearchRecordsQuery{
            TopK: 5,
            Inputs: &map[string]interface{}{
                "text": "Disease prevention",
            },
        },
    })
    if err != nil {
        log.Fatalf("Failed to search records: %v", err)
    }
    fmt.Printf("Search results: %+v\n", res)
}
```
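The README changes in this commit state that each record needs a unique `_id` and a field matching the index's `FieldMap`. A client-side check of those two rules before calling `UpsertRecords` might look like the minimal sketch below. `Record` and `validateRecords` are hypothetical names used only for illustration and are not part of go-pinecone; treating `_id` and the text field as non-empty strings is also an assumption of this sketch.

```go
package main

import "fmt"

// Record is a local stand-in for the client's IntegratedRecord type
// (an assumption for this sketch; nothing is imported from go-pinecone).
type Record map[string]interface{}

// validateRecords checks that every record has a unique, non-empty string
// "_id" and a non-empty string in textField, the record field that the
// index's field map points at (for example "chunk_text").
func validateRecords(records []Record, textField string) error {
	seen := make(map[string]bool, len(records))
	for i, rec := range records {
		id, ok := rec["_id"].(string)
		if !ok || id == "" {
			return fmt.Errorf("record %d: missing or non-string _id", i)
		}
		if seen[id] {
			return fmt.Errorf("record %d: duplicate _id %q", i, id)
		}
		seen[id] = true

		text, ok := rec[textField].(string)
		if !ok || text == "" {
			return fmt.Errorf("record %q: missing text field %q", id, textField)
		}
	}
	return nil
}

func main() {
	records := []Record{
		{"_id": "rec1", "chunk_text": "Apples are a great source of dietary fiber.", "category": "nutrition"},
		{"_id": "rec1", "chunk_text": "Apple went public in 1980.", "category": "milestone"},
	}
	if err := validateRecords(records, "chunk_text"); err != nil {
		// Prints: invalid batch: record 1: duplicate _id "rec1"
		fmt.Println("invalid batch:", err)
	}
}
```

Running a check like this locally surfaces malformed batches before any network call is made; the service remains the authority on what it accepts.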
1 parent faba978 commit b1d3f75

File tree

11 files changed

+1232
-183
lines changed

README.md

Lines changed: 279 additions & 24 deletions
@@ -218,6 +218,53 @@ func main() {
 }
 ```
 
+**Create a serverless integrated index**
+
+Integrated inference requires a serverless index configured for a specific embedding model. You can either create a new index for a model, or configure an existing index for a model. To create an index that accepts source text and converts it to vectors automatically using an embedding model hosted by Pinecone, use the `Client.CreateIndexForModel` method:
+
+```go
+package main
+
+import (
+    "context"
+    "fmt"
+    "github.com/pinecone-io/go-pinecone/v3/pinecone"
+    "log"
+    "os"
+)
+
+func main() {
+    ctx := context.Background()
+
+    clientParams := pinecone.NewClientParams{
+        ApiKey: os.Getenv("PINECONE_API_KEY"),
+    }
+
+    pc, err := pinecone.NewClient(clientParams)
+    if err != nil {
+        log.Fatalf("Failed to create Client: %v", err)
+    } else {
+        fmt.Println("Successfully created a new Client object!")
+    }
+
+    index, err := pc.CreateIndexForModel(ctx, &pinecone.CreateIndexForModelRequest{
+        Name: "my-integrated-index",
+        Cloud: "aws",
+        Region: "us-east-1",
+        Embed: pinecone.CreateIndexForModelEmbed{
+            Model: "multilingual-e5-large",
+            FieldMap: map[string]interface{}{"text": "chunk_text"},
+        },
+    })
+
+    if err != nil {
+        log.Fatalf("Failed to create serverless integrated index: %v", err)
+    } else {
+        fmt.Printf("Successfully created serverless integrated index: %s", index.Name)
+    }
+}
+```
+
 **Create a pod-based index**
 
 The following example creates a pod-based index with a metadata configuration. If no metadata configuration is
@@ -475,6 +522,17 @@ func main() {
     if err != nil {
         fmt.Printf("Failed to configure index: %v\n", err)
     }
+
+    // To convert an existing serverless index into an integrated index
+    model := "multilingual-e5-large"
+    _, err = pc.ConfigureIndex(ctx, "my-serverless-index", pinecone.ConfigureIndexParams{
+        Embed: &pinecone.ConfigureIndexEmbed{
+            FieldMap: &map[string]interface{}{
+                "text": "my-text-field",
+            },
+            Model: &model,
+        },
+    })
 }
 ```
 
@@ -572,9 +630,7 @@ func main() {
 
 ### Upsert vectors
 
-The following example upserts
-vectors ([both dense and sparse](https://docs.pinecone.io/guides/data/upsert-sparse-dense-vectors)) and metadata
-to `example-index`.
+The following example upserts dense vectors and metadata to `example-index`.
 
 ```go
 package main
@@ -617,35 +673,113 @@ func main() {
     }
     metadata, err := structpb.NewStruct(metadataMap)
 
-    sparseValues := pinecone.SparseValues{
-        Indices: []uint32{0, 1, 2, 3, 4, 5, 6, 7},
-        Values: []float32{1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0},
-    }
-
     vectors := []*pinecone.Vector{
         {
             Id: "A",
             Values: []float32{0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1},
             Metadata: metadata,
-            SparseValues: &sparseValues,
         },
         {
             Id: "B",
             Values: []float32{0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2},
             Metadata: metadata,
-            SparseValues: &sparseValues,
         },
         {
             Id: "C",
             Values: []float32{0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3},
             Metadata: metadata,
-            SparseValues: &sparseValues,
         },
         {
             Id: "D",
             Values: []float32{0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4},
             Metadata: metadata,
-            SparseValues: &sparseValues,
+        },
+    }
+
+    count, err := idxConnection.UpsertVectors(ctx, vectors)
+    if err != nil {
+        log.Fatalf("Failed to upsert vectors: %v", err)
+    } else {
+        fmt.Printf("Successfully upserted %d vector(s)", count)
+    }
+}
+```
+
+The following example upserts sparse vectors and metadata to `example-sparse-index`.
+
+```go
+package main
+
+import (
+    "context"
+    "fmt"
+    "github.com/pinecone-io/go-pinecone/v3/pinecone"
+    "google.golang.org/protobuf/types/known/structpb"
+    "log"
+    "os"
+)
+
+func main() {
+    ctx := context.Background()
+
+    clientParams := pinecone.NewClientParams{
+        ApiKey: os.Getenv("PINECONE_API_KEY"),
+    }
+
+    pc, err := pinecone.NewClient(clientParams)
+    if err != nil {
+        log.Fatalf("Failed to create Client: %v", err)
+    } else {
+        fmt.Println("Successfully created a new Client object!")
+    }
+
+    idx, err := pc.DescribeIndex(ctx, "example-sparse-index")
+    if err != nil {
+        log.Fatalf("Failed to describe index \"%v\": %v", "example-sparse-index", err)
+    }
+
+    idxConnection, err := pc.Index(pinecone.NewIndexConnParams{Host: idx.Host})
+    if err != nil {
+        log.Fatalf("Failed to create IndexConnection for Host: %v: %v", idx.Host, err)
+    }
+
+    metadataMap := map[string]interface{}{
+        "genre": "classical",
+    }
+    metadata, err := structpb.NewStruct(metadataMap)
+
+    vectors := []*pinecone.Vector{
+        {
+            Id: "A",
+            Metadata: metadata,
+            SparseValues: &pinecone.SparseValues{
+                Indices: []uint32{0, 1, 2, 3, 4, 5, 6, 7},
+                Values: []float32{1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0},
+            },
+        },
+        {
+            Id: "B",
+            Metadata: metadata,
+            SparseValues: &pinecone.SparseValues{
+                Indices: []uint32{0, 1, 2, 3, 4, 5, 6, 7},
+                Values: []float32{3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 8.0},
+            },
+        },
+        {
+            Id: "C",
+            Metadata: metadata,
+            SparseValues: &pinecone.SparseValues{
+                Indices: []uint32{0, 1, 2, 3, 4, 5, 6, 7},
+                Values: []float32{4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 8.0, 7.0},
+            },
+        },
+        {
+            Id: "D",
+            Metadata: metadata,
+            SparseValues: &pinecone.SparseValues{
+                Indices: []uint32{0, 1, 2, 3, 4, 5, 6, 7},
+                Values: []float32{5.0, 6.0, 7.0, 8.0, 9.0, 8.0, 7.0, 6.0},
+            },
         },
     }
 
@@ -727,10 +861,7 @@ You can [start, cancel, and check the status](https://docs.pinecone.io/guides/da
 
 #### Query by vector values
 
-The following example queries the index `example-index` with vector values and metadata filtering. Note: you can
-also query by sparse values;
-see [sparse-dense documentation](https://docs.pinecone.io/guides/data/query-sparse-dense-vectors)
-for examples.
+The following example queries the index `example-index` with dense vector values and metadata filtering.
 
 ```go
 package main
@@ -1452,17 +1583,10 @@ func main() {
 ## Inference
 
 The `Client` object has an `Inference` namespace which allows interacting with
-Pinecone's [Inference API](https://docs.pinecone.io/reference/api/2024-07/inference/generate-embeddings). The Inference
+Pinecone's [Inference API](https://docs.pinecone.io/guides/inference/generate-embeddings). The Inference
 API is a service that gives you access to embedding models hosted on Pinecone's infrastructure. Read more
 at [Understanding Pinecone Inference](https://docs.pinecone.io/guides/inference/understanding-inference).
 
-**Notes:**
-
-Models currently supported:
-
-- Embedding: [multilingual-e5-large](https://docs.pinecone.io/guides/inference/understanding-inference#embedding-models)
-- Reranking: [bge-reranker-v2-m3](https://docs.pinecone.io/models/bge-reranker-v2-m3)
-
 ### Create Embeddings
 
 Send text to Pinecone's inference API to generate embeddings for documents and queries.
@@ -1575,6 +1699,137 @@ indicating higher relevance.
     fmt.Printf("rerank response: %+v", rerankResponse)
 ```
 
+### Integrated Inference
+
+When using an index with integrated inference, embedding and reranking operations are tied to index operations and do not require extra steps. This allows working with an index that accepts source text and converts it to vectors automatically using an embedding model hosted by Pinecone.
+
+Integrated inference requires a serverless index configured for a specific embedding model. You can either create a new index for a model or configure an existing index for a model. See **Create a serverless integrated index** above for specifics on creating these indexes.
+
+Once you have an index configured for a specific embedding model, use the `IndexConnection.UpsertRecords` method to convert your source data to embeddings and upsert them into a namespace.
+
+**Upsert integrated records**
+
+Note the following requirements for each record:
+
+- Each record must contain a unique `_id`, which will serve as the record identifier in the index namespace.
+- Each record must contain a field with the data for embedding. This field must match the `FieldMap` specified when creating the index.
+- Any additional fields in the record will be stored in the index and can be returned in search results or used to filter search results.
+
+```go
+    ctx := context.Background()
+
+    clientParams := pinecone.NewClientParams{
+        ApiKey: "YOUR_API_KEY",
+    }
+
+    pc, err := pinecone.NewClient(clientParams)
+
+    if err != nil {
+        log.Fatalf("Failed to create Client: %v", err)
+    }
+
+    idx, err := pc.DescribeIndex(ctx, "your-index-name")
+
+    if err != nil {
+        log.Fatalf("Failed to describe index \"%s\". Error:%s", "your-index-name", err)
+    }
+
+    idxConnection, err := pc.Index(pinecone.NewIndexConnParams{Host: idx.Host, Namespace: "my-namespace"})
+
+    records := []*pinecone.IntegratedRecord{
+        {
+            "_id": "rec1",
+            "chunk_text": "Apple's first product, the Apple I, was released in 1976 and was hand-built by co-founder Steve Wozniak.",
+            "category": "product",
+        },
+        {
+            "_id": "rec2",
+            "chunk_text": "Apples are a great source of dietary fiber, which supports digestion and helps maintain a healthy gut.",
+            "category": "nutrition",
+        },
+        {
+            "_id": "rec3",
+            "chunk_text": "Apples originated in Central Asia and have been cultivated for thousands of years, with over 7,500 varieties available today.",
+            "category": "cultivation",
+        },
+        {
+            "_id": "rec4",
+            "chunk_text": "In 2001, Apple released the iPod, which transformed the music industry by making portable music widely accessible.",
+            "category": "product",
+        },
+        {
+            "_id": "rec5",
+            "chunk_text": "Apple went public in 1980, making history with one of the largest IPOs at that time.",
+            "category": "milestone",
+        },
+        {
+            "_id": "rec6",
+            "chunk_text": "Rich in vitamin C and other antioxidants, apples contribute to immune health and may reduce the risk of chronic diseases.",
+            "category": "nutrition",
+        },
+        {
+            "_id": "rec7",
+            "chunk_text": "Known for its design-forward products, Apple's branding and market strategy have greatly influenced the technology sector and popularized minimalist design worldwide.",
+            "category": "influence",
+        },
+        {
+            "_id": "rec8",
+            "chunk_text": "The high fiber content in apples can also help regulate blood sugar levels, making them a favorable snack for people with diabetes.",
+            "category": "nutrition",
+        },
+    }
+
+    err = idxConnection.UpsertRecords(ctx, records)
+    if err != nil {
+        log.Fatalf("Failed to upsert records. Error: %v", err)
+    }
+```
+
+**Search integrated records**
+
+Use the `IndexConnection.SearchRecords` method to convert a query to a vector embedding and then search your namespace for the most semantically similar records, along with their similarity scores.
+
+```go
+    res, err := idxConnection.SearchRecords(ctx, &pinecone.SearchRecordsRequest{
+        Query: pinecone.SearchRecordsQuery{
+            TopK: 5,
+            Inputs: &map[string]interface{}{
+                "text": "Disease prevention",
+            },
+        },
+    })
+    if err != nil {
+        log.Fatalf("Failed to search records: %v", err)
+    }
+    fmt.Printf("Search results: %+v\n", res)
+```
+
+To rerank initial search results based on relevance to the query, add the rerank parameter, including the [reranking model](https://docs.pinecone.io/guides/inference/understanding-inference#reranking-models) you want to use, the number of reranked results to return, and the fields to use for reranking, if different from the main query.
+
+For example, repeat the search for the 5 documents most semantically related to the query, “Disease prevention”, but this time rerank the results and return only the 2 most relevant documents:
+
+```go
+    topN := int32(2)
+    res, err := idxConnection.SearchRecords(ctx, &pinecone.SearchRecordsRequest{
+        Query: pinecone.SearchRecordsQuery{
+            TopK: 5,
+            Inputs: &map[string]interface{}{
+                "text": "Disease prevention",
+            },
+        },
+        Rerank: &pinecone.SearchRecordsRerank{
+            Model: "bge-reranker-v2-m3",
+            TopN: &topN,
+            RankFields: []string{"chunk_text"},
+        },
+        Fields: &[]string{"chunk_text", "category"},
+    })
+    if err != nil {
+        log.Fatalf("Failed to search records: %v", err)
+    }
+    fmt.Printf("Search results: %+v\n", res)
+```
+
 ## Support
 
 To get help using go-pinecone you can file an issue on [GitHub](https://github.com/pinecone-io/go-pinecone/issues),
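
The relationship between `TopK` and `TopN` in the rerank example from the README diff can be pictured as a two-stage pipeline: the first stage keeps the `TopK` closest records, and the reranker then reorders only that subset and returns at most `TopN` of them. The sketch below illustrates this with invented scores and hypothetical names (`candidate`, `searchThenRerank`); it does not call Pinecone and assumes only the behavior described in the README text, not the service's internal implementation.

```go
package main

import (
	"fmt"
	"sort"
)

// candidate is a hypothetical hit; both scores are invented for illustration.
type candidate struct {
	ID          string
	Similarity  float64 // first-stage score (vector similarity)
	RerankScore float64 // second-stage score (reranking model)
}

// clamp limits n to the range [0, limit].
func clamp(n, limit int) int {
	if n < 0 {
		return 0
	}
	if n > limit {
		return limit
	}
	return n
}

// searchThenRerank keeps the topK candidates by similarity, reorders that
// subset by rerank score, and returns at most topN of them.
func searchThenRerank(all []candidate, topK, topN int) []candidate {
	hits := append([]candidate(nil), all...) // copy so the input is not reordered
	sort.SliceStable(hits, func(i, j int) bool { return hits[i].Similarity > hits[j].Similarity })
	hits = hits[:clamp(topK, len(hits))]
	sort.SliceStable(hits, func(i, j int) bool { return hits[i].RerankScore > hits[j].RerankScore })
	return hits[:clamp(topN, len(hits))]
}

func main() {
	all := []candidate{
		{"rec1", 0.90, 0.10},
		{"rec2", 0.85, 0.70},
		{"rec3", 0.80, 0.95},
		{"rec4", 0.40, 0.99}, // best rerank score, but outside the first-stage top 3
	}
	for _, c := range searchThenRerank(all, 3, 2) {
		fmt.Println(c.ID)
	}
	// Prints rec3, then rec2.
}
```

One consequence worth noting: a record that the reranker would score highly never reaches it if the first stage excludes it, so `TopK` bounds what `TopN` can return.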

go.mod

Lines changed: 3 additions & 3 deletions
@@ -16,9 +16,9 @@ require (
 	github.com/apapsch/go-jsonmerge/v2 v2.0.0 // indirect
 	github.com/davecgh/go-spew v1.1.1 // indirect
 	github.com/pmezard/go-difflib v1.0.0 // indirect
-	golang.org/x/net v0.25.0 // indirect
-	golang.org/x/sys v0.20.0 // indirect
-	golang.org/x/text v0.15.0 // indirect
+	golang.org/x/net v0.33.0 // indirect
+	golang.org/x/sys v0.30.0 // indirect
+	golang.org/x/text v0.22.0 // indirect
 	google.golang.org/genproto/googleapis/rpc v0.0.0-20240528184218-531527333157 // indirect
 	gopkg.in/yaml.v3 v3.0.1 // indirect
 )
