content/develop/use/patterns/indexes/index.md
Lines changed: 19 additions & 39 deletions
@@ -9,9 +9,7 @@ categories:
 - oss
 - kubernetes
 - clients
-description: 'Building secondary indexes in Redis
-
-'
+description: Building secondary indexes in Redis
 linkTitle: Secondary indexing
 title: Secondary indexing
 weight: 1
@@ -31,8 +29,7 @@ users that need to perform complex queries on data should understand if they
 are better served by a relational store. However often, especially in caching
 scenarios, there is the explicit need to store indexed data into Redis in order to speedup common queries which require some form of indexing in order to be executed.
 
-Simple numerical indexes with sorted sets
-===
+## Simple numerical indexes with sorted sets
 
 The simplest secondary index you can create with Redis is by using the
 sorted set data type, which is a data structure representing a set of
@@ -79,8 +76,7 @@ reversed order, which is often useful when data is indexed in a given
 direction (ascending or descending) but we want to retrieve information
 the other way around.
 
-Using objects IDs as associated values
----
+### Using objects IDs as associated values
 
 In the above example we associated names to ages. However in general we
 may want to index some field of an object which is stored elsewhere.
@@ -111,8 +107,7 @@ In the next examples we'll almost always use IDs as values associated with
 the index, since this is usually the more sounding design, with a few
 exceptions.
 
-Updating simple sorted set indexes
----
+### Updating simple sorted set indexes
 
 Often we index things which change over time. In the above
 example, the age of the user changes every year. In such a case it would
@@ -133,8 +128,7 @@ to execute the following two commands:
 The operation may be wrapped in a [`MULTI`]({{< relref "/commands/multi" >}})/[`EXEC`]({{< relref "/commands/exec" >}}) transaction in order to
 make sure both fields are updated or none.
 
-Turning multi dimensional data into linear data
----
+### Turning multi dimensional data into linear data
 
 Indexes created with sorted sets are able to index only a single numerical
 value. Because of this you may think it is impossible to index something
@@ -151,8 +145,7 @@ linear score of a sorted set to many small *squares* in the earth surface.
 By doing an 8+1 style center plus neighborhoods search it is possible to
 retrieve elements by radius.
 
-Limits of the score
----
+### Limits of the score
 
 Sorted set elements scores are double precision floats. It means that
 they can represent different decimal or integer values with different
@@ -165,8 +158,7 @@ When representing much larger numbers, you need a different form of indexing
 that is able to index numbers at any precision, called a lexicographical
 index.
 
-Lexicographical indexes
-===
+## Lexicographical indexes
 
 Redis sorted sets have an interesting property. When elements are added
 with the same score, they are sorted lexicographically, comparing the
@@ -229,8 +221,7 @@ string and the infinitely positive string, which are `-` and `+`.
 
 That's it basically. Let's see how to use these features to build indexes.
 
-A first example: completion
----
+### A first example: completion
 
 An interesting application of indexing is completion. Completion is what
 happens when you start typing your query into a search engine: the user
@@ -256,8 +247,7 @@ as start, and the same string plus a trailing byte set to 255, which is `\xff` i
 
 Note that we don't want too many items returned, so we may use the **LIMIT** option in order to reduce the number of results.
 
-Adding frequency into the mix
----
+### Adding frequency into the mix
 
 The above approach is a bit naive, because all the user searches are the same
 in this way. In a real system we want to complete strings according to their
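The completion range referenced in the hunk above (the typed prefix as start, the same prefix plus a trailing `\xff` byte as end) can be modelled without a server. What follows is a minimal Python sketch, assuming an in-memory sorted list of byte strings stands in for a sorted set whose members all share one score; `complete` and its `limit` parameter are hypothetical names that only imitate the inclusive range and the **LIMIT** option.

```python
from bisect import bisect_left, bisect_right


def complete(index, prefix, limit=10):
    """Return up to `limit` members of `index` that start with `prefix`.

    `index` is assumed to be a list of bytes sorted byte by byte, which is
    how members with equal scores are ordered in a sorted set.
    """
    start = prefix            # inclusive lower bound: the prefix itself
    end = prefix + b"\xff"    # inclusive upper bound: prefix plus byte 255
    lo = bisect_left(index, start)
    hi = bisect_right(index, end)
    # Slicing to `limit` plays the role of the LIMIT option.
    return index[lo:hi][:limit]


index = sorted([b"banana", b"bit", b"bitcoin", b"bite", b"bitter", b"blue"])
print(complete(index, b"bit"))
print(complete(index, b"bit", limit=2))
```

The `\xff` upper bound works for UTF-8 text because byte 255 never occurs in valid UTF-8; arbitrary binary members would need a different upper bound.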
@@ -320,8 +310,7 @@ A refinement to this algorithm is to pick entries in the list according to
 their weight: the higher the score, the less likely entries are picked
 in order to decrement its score, or evict them.
 
-Normalizing strings for case and accents
----
+### Normalizing strings for case and accents
 
 In the completion examples we always used lowercase strings. However
 reality is much more complex than that: languages have capitalized names,
@@ -343,8 +332,7 @@ Basically we add another field that we'll extract and use only for
 visualization. Ranges will always be computed using the normalized strings
 instead. This is a common trick which has multiple applications.
 
-Adding auxiliary information in the index
----
+### Adding auxiliary information in the index
 
 When using a sorted set in a direct way, we have two different attributes
 for each object: the score, which we use as an index, and an associated
@@ -380,8 +368,7 @@ that the separator will never happen to be part of the key.
 For example if you use two null bytes as separator `"\0\0"`, you may
 want to always escape null bytes into two bytes sequences in your strings.
 
-Numerical padding
----
+### Numerical padding
 
 Lexicographical indexes may look like good only when the problem at hand
 is to index strings. Actually it is very simple to use this kind of index
@@ -410,8 +397,7 @@ decimal part with trailing zeroes like in the following list of numbers:
 00000002121241.34893482930000
 00999999999999.00000000000000
 
-Using numbers in binary form
----
+### Using numbers in binary form
 
 Storing numbers in decimal may use too much memory. An alternative approach
 is just to store numbers, for example 128 bit integers, directly in their
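The padded values shown as context in the hunk above (14 integer digits, a dot, 14 decimal digits) can be reproduced with a small helper. This is only a sketch under assumptions: `pad_number`, `INT_DIGITS` and `FRAC_DIGITS` are hypothetical names, the widths are read off the two sample values, and negative numbers are rejected rather than handled.

```python
from decimal import Decimal

INT_DIGITS = 14   # width of the integer part in the sample values
FRAC_DIGITS = 14  # width of the decimal part in the sample values


def pad_number(value: str) -> str:
    """Pad a non-negative decimal so string order matches numeric order."""
    number = Decimal(value)
    if number < 0:
        raise ValueError("negative numbers need an offset before padding")
    int_part, _, frac_part = format(number, "f").partition(".")
    if len(int_part) > INT_DIGITS or len(frac_part) > FRAC_DIGITS:
        raise ValueError("value does not fit the chosen widths")
    # Leading zeroes on the integer part, trailing zeroes on the decimal part.
    return int_part.rjust(INT_DIGITS, "0") + "." + frac_part.ljust(FRAC_DIGITS, "0")


values = ["999999999999", "2121241.3489348293", "10", "9.5"]
padded = sorted(pad_number(v) for v in values)
print(padded)
```

Sorting the padded strings byte by byte gives the same order as sorting the numbers, which is the property a lexicographical index relies on.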
@@ -423,8 +409,7 @@ the least significant bytes. This way when Redis compares the strings with
 Keep in mind that data stored in binary format is less observable for
 debugging, harder to parse and export. So it is definitely a trade off.
 
-Composite indexes
-===
+## Composite indexes
 
 So far we explored ways to index single fields. However we all know that
 SQL stores are able to create indexes using multiple fields. For example
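For the binary-form section touched by the previous two hunks, the ordering argument can be checked locally. The sketch below assumes unsigned 128 bit integers stored most significant byte first (big endian); `encode_u128` and `decode_u128` are hypothetical helper names, not part of any Redis client.

```python
def encode_u128(n: int) -> bytes:
    """Encode an unsigned 128 bit integer as 16 bytes, most significant first."""
    if not 0 <= n < 2**128:
        raise ValueError("value does not fit in 128 bits")
    return n.to_bytes(16, "big")


def decode_u128(raw: bytes) -> int:
    """Inverse of encode_u128."""
    return int.from_bytes(raw, "big")


values = [2**100, 1, 255, 256, 2**127]
encoded = sorted(encode_u128(v) for v in values)   # byte by byte comparison
print([decode_u128(raw) for raw in encoded])
```

Because every encoded value has the same length and starts with its most significant byte, a byte by byte comparison orders the strings exactly as the integers are ordered. A little endian encoding would not have this property.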
@@ -481,8 +466,7 @@ ID 90, regardless of the *current* fields values of the object, we just
 have to retrieve the hash value by object ID and [`ZREM`]({{< relref "/commands/zrem" >}}) it in the sorted
 set view.
 
-Representing and querying graphs using a hexastore
-===
+## Representing and querying graphs using a hexastore
 
 One cool thing about composite indexes is that they are handy in order
 to represent graphs, using a data structure which is called
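As background for the hexastore section whose heading changes above: a hexastore stores each subject, predicate, object relation under all six orderings of its three parts, so that any combination of known parts can be answered with a prefix range query. The following is a rough sketch only; the `spo:`-style key layout, the `hexastore_entries` name and the sample relation are assumptions made for illustration.

```python
from itertools import permutations


def hexastore_entries(subject: str, predicate: str, obj: str) -> list:
    """Build the six index strings for one subject/predicate/object relation."""
    parts = {"s": subject, "p": predicate, "o": obj}
    entries = []
    for order in permutations("spo"):
        tag = "".join(order)                      # e.g. "spo", "ops"
        fields = ":".join(parts[k] for k in order)
        entries.append(tag + ":" + fields)
    return entries


for entry in hexastore_entries("antirez", "is-friend-of", "matteocollina"):
    print(entry)
```

All six strings would be added to one sorted set with the same score, so a query such as "everything where the subject is antirez" becomes a lexicographical range over the `spo:antirez:` prefix.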
@@ -538,8 +522,7 @@ matteocollina.
 
 Make sure to check [Matteo Collina's slides about Levelgraph](http://nodejsconfit.levelgraph.io/) in order to better understand these ideas.
 
-Multi dimensional indexes
-===
+## Multi dimensional indexes
 
 A more complex type of index is an index that allows you to perform queries
 where two or more variables are queried at the same time for specific
@@ -703,8 +686,7 @@ For now, the good thing is that the complexity may be easily encapsulated
 inside a library that can be used in order to perform indexing and queries.
 One example of such library is [Redimension](https://github.com/antirez/redimension), a proof of concept Ruby library which indexes N-dimensional data inside Redis using the technique described here.
 
-Multi dimensional indexes with negative or floating point numbers
-===
+## Multi dimensional indexes with negative or floating point numbers
 
 The simplest way to represent negative values is just to work with unsigned
 integers and represent them using an offset, so that when you index, before
@@ -715,8 +697,7 @@ For floating point numbers, the simplest approach is probably to convert them
 to integers by multiplying the integer for a power of ten proportional to the
 number of digits after the dot you want to retain.
 
-Non range indexes
-===
+## Non range indexes
 
 So far we checked indexes which are useful to query by range or by single
 item. However other Redis data structures such as Sets or Lists can be used
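The two conversions mentioned in the context of the hunks above (an offset for negative values, and multiplication by a power of ten for floating point values) can be combined in one helper. This is a minimal sketch under assumptions: `to_unsigned`, the default of two retained digits and the default offset are all hypothetical choices, and values are passed as strings to avoid binary floating point surprises.

```python
from decimal import Decimal, ROUND_FLOOR


def to_unsigned(value: str, decimals: int = 2, offset: int = 1_000_000) -> int:
    """Map a signed decimal to a non-negative integer that keeps its ordering.

    The value is multiplied by 10**decimals, floored to an integer, and then
    shifted by `offset` (expressed in scaled units) so it cannot go negative.
    """
    scaled = Decimal(value).scaleb(decimals).to_integral_value(rounding=ROUND_FLOOR)
    result = int(scaled) + offset
    if result < 0:
        raise ValueError("value is below the range covered by the offset")
    return result


for sample in ["-12.34", "0", "12.34"]:
    print(sample, "->", to_unsigned(sample))
```

The same offset and number of digits must be applied to query bounds as well, otherwise the ranges would not line up with the indexed values.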
@@ -741,8 +722,7 @@ are added with [`LPUSH`]({{< relref "/commands/lpush" >}}) and trimmed with [`LT
 with just the latest N items encountered, in the same order they were
 seen.
 
-Index inconsistency
-===
+## Index inconsistency
 
 Keeping the index updated may be challenging, in the course of months
 or years it is possible that inconsistencies are added because of software