
Commit 9239ea6

DOC-5073 fixed 2nd/3rd person inconsistencies
1 parent b5e40b7 commit 9239ea6

File tree

1 file changed: +13 −14 lines changed

content/develop/clients/redis-py/vecsets.md

@@ -56,7 +56,7 @@ import numpy as np

 The first of these imports is the
 `SentenceTransformer` class, which generates an embedding from a section of text.
-Here, we create an instance of `SentenceTransformer` that uses the
+This example uses an instance of `SentenceTransformer` with the
 [`all-MiniLM-L6-v2`](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2)
 model for the embeddings. This model generates vectors with 384 dimensions, regardless
 of the length of the input text, but note that the input is truncated to 256
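Outside the diff context, a minimal sketch of the setup this passage describes (the model name comes from the link above; the sample input string is illustrative):

```python
from sentence_transformers import SentenceTransformer

# The model described above: 384-dimensional output for any input,
# with input truncated to 256 tokens.
model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

embedding = model.encode("A short biographical description.")
print(embedding.shape)  # (384,)
```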
@@ -71,8 +71,8 @@ model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

 ## Create the data

-For the example, we will use a dictionary of data that contains brief
-descriptions of some famous people:
+The example data is contained in a dictionary with some brief
+descriptions of famous people:

 ```python
 peopleData = {
@@ -146,11 +146,11 @@ The code below uses the dictionary's
 view to iterate through all the key-value pairs and add corresponding
 elements to a vector set called `famousPeople`.

-We use the
+Use the
 [`encode()`](https://sbert.net/docs/package_reference/sentence_transformer/SentenceTransformer.html#sentence_transformers.SentenceTransformer.encode)
 method of `SentenceTransformer` to generate the
 embedding as an array of `float32` values. The `tobytes()` method converts
-the array to a byte string that we pass to the
+the array to a byte string that you can pass to the
 [`vadd()`]({{< relref "/commands/vadd" >}}) command to set the embedding.
 Note that `vadd()` can also accept a list of `float` values to set the
 vector, but the byte string format is more compact and saves a little
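A runnable sketch of the add loop this hunk revises. The two `peopleData` entries are hypothetical stand-ins for the doc's full dictionary (truncated in the hunk above), and the raw `VADD` call via `execute_command()` is shown as a client-version-agnostic equivalent of the `vadd()` helper the prose links to:

```python
import numpy as np
import redis
from sentence_transformers import SentenceTransformer

r = redis.Redis()
model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

# Hypothetical stand-in for the doc's full peopleData dictionary.
peopleData = {
    "Marie Curie": "Pioneering researcher of radioactivity.",
    "Linus Pauling": "Chemist noted for work on the chemical bond.",
}

for name, details in peopleData.items():
    # encode() returns a NumPy array; cast to float32 and serialize.
    emb = model.encode(details).astype(np.float32)
    # VADD key FP32 <blob> element -- sets the element's embedding.
    r.execute_command("VADD", "famousPeople", "FP32", emb.tobytes(), name)
```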
@@ -183,9 +183,9 @@ for name, details in peopleData.items():

 ## Query the vector set

-We can now query the data in the set. The basic approach is to use the
+You can now query the data in the set. The basic approach is to use the
 `encode()` method to generate another embedding vector for the query text.
-(This is the same method we used when we added the elements to the set.) Then, we pass
+(This is the same method used to add the elements to the set.) Then, pass
 the query vector to [`vsim()`]({{< relref "/commands/vsim" >}}) to return elements
 of the set, ranked in order of similarity to the query.

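A sketch of the query step, continuing the add-loop sketch above, with the same hedges: `execute_command()` stands in for the client's `vsim()`, and the query string `"actors"` is inferred from the `'actors (2)'` label in a later hunk:

```python
# Embed the query text with the same model used for the elements.
query_value = "actors"
emb = model.encode(query_value).astype(np.float32)

# VSIM key FP32 <blob> -- ranks elements by similarity to the query.
results = r.execute_command("VSIM", "famousPeople", "FP32", emb.tobytes())
print(f"'{query_value}': {results}")  # element names, most similar first
```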
@@ -211,8 +211,8 @@ This returns the following list of elements (formatted slightly for clarity):
 ```

 The first two people in the list are the two actors, as expected, but none of the
-people from Linus Pauling onward was especially well-known for acting (and we certainly
-didn't include any information about that in the short description text).
+people from Linus Pauling onward was especially well-known for acting (and there certainly
+isn't any information about that in the short description text).
 As it stands, the search attempts to rank all the elements in the set, based
 on the information contained in the embedding model.
 You can use the `count` parameter of `vsim()` to limit the list of elements
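The `count` parameter maps to the `COUNT` option of the underlying command. Continuing the sketch above, this restricts the reply to the two closest elements (the variable name and print label come from the next hunk's context line):

```python
# COUNT limits the reply to the closest matches.
two_actors_results = r.execute_command(
    "VSIM", "famousPeople", "FP32", emb.tobytes(), "COUNT", 2
)
print(f"'actors (2)': {two_actors_results}")
```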
@@ -234,10 +234,9 @@ print(f"'actors (2)': {two_actors_results}")
 The reason for using text embeddings rather than simple text search
 is that the embeddings represent semantic information. This allows a query
 to find elements with a similar meaning even if the text is
-different. For example, we
-don't use the word "entertainer" in any of the descriptions but
-if we use it as a query, the actors and musicians are ranked highest
-in the results list:
+different. For example, the word "entertainer" doesn't appear in any of the
+descriptions but if you use it as a query, the actors and musicians are ranked
+highest in the results list:

 ```py
 query_value = "entertainer"
@@ -253,7 +252,7 @@ print(f"'entertainer': {entertainer_results}")
 # 'Paul Erdos', 'Maryam Mirzakhani', 'Marie Curie']
 ```

-Similarly, if we use "science" as a query, we get the following results:
+Similarly, if you use "science" as a query, you get the following results:

 ```
 'science': ['Marie Curie', 'Linus Pauling', 'Maryam Mirzakhani',
