Commit 7028adb

Merge pull request #9239 from soyeric128/blog-index
docs: blog for indexing
2 parents 6ec6846 + 3b05feb commit 7028adb

3 files changed

+175
-0
lines changed

website/blog/2022-12-13-indexing.md

Lines changed: 175 additions & 0 deletions
@@ -0,0 +1,175 @@
---
title: Databend Indexing Explained
description: Indexing
slug: databend-indexing
date: 2022-12-13
tags: [databend, indexing]
authors:
- name: wubx
  url: https://github.com/wubx
  image_url: https://github.com/wubx.png
---

When working with Databend, you don't need to maintain indexes yourself. Databend uses the following indexing techniques to build and manage indexes automatically on the fly:

- Min/Max index
- Bloom index
- Cluster key

## Min/Max Index
The Min/Max Index is the key indexing technique for OLAP databases. The Databend Fuse Engine uses it as its main indexing method, building Min/Max indexes and storing them in snapshots, segments, and blocks. The following shows how the Min/Max Index works for a table in Databend.

First, use [SHOW CREATE TABLE](https://databend.rs/doc/sql-commands/show/show-create-table) to find the initial snapshot file created for the table:

```sql
SHOW CREATE TABLE ontime;

-- Output (abridged):
CREATE TABLE `ontime` (
  `Year` INT, -- First column
  ...
) ENGINE=FUSE SNAPSHOT_LOCATION='1/458/_ss/71b460c61fa943d1a391d3118ebd984c_v1.json'
```

Download the snapshot file and open it in a text editor such as VS Code:

```json
{
  "format_version": 1,
  "snapshot_id": "71b460c6-1fa9-43d1-a391-d3118ebd984c",
  "timestamp": "2022-11-29T03:44:03.419194Z",
  "prev_snapshot_id": null,
  "schema": {
    "fields": [
      ... -- Field definitions
    ],
    "metadata": {}
  },
  "summary": {
    "row_count": 90673588,
    "block_count": 200,
    "perfect_block_count": 0,
    "uncompressed_byte_size": 65821591614,
    "compressed_byte_size": 2761791374,
    "index_size": 1194623,
    "col_stats": {
      ...
      "0": { -- Min/Max indexes for the first column 'Year' in the table
        "min": {
          "Int64": 1987
        },
        "max": {
          "Int64": 2004
        },
        "null_count": 0,
        "in_memory_size": 362694352,
        "distinct_of_values": 0
      },
      ...
    }
  },
  "segments": [
    ...
    [
      "1/458/_sg/ddccbb022ba74387be0b41eefd16bbbe_v1.json",
      1
    ],
    ...
  ],
  "cluster_key_meta": null
}
```

The file above shows that the minimum value of the first column is `1987` and the maximum is `2004`. The indexes in a snapshot file tell Databend whether the data you want to retrieve can exist in the table at all. For example, the following query would return no data if Databend could not find a Min/Max interval in the snapshot that matches its filter:

```sql
select avg(DepDelay) from ontime where Year='2003';
```
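
The check behind this can be sketched in a few lines of Python. This is only an illustration of the idea, not Databend's implementation: `ColumnStats` and `may_contain` are hypothetical names, and the values are the `min` and `max` of column `0` from the snapshot shown earlier.

```python
from dataclasses import dataclass


@dataclass
class ColumnStats:
    """Hypothetical stand-in for one entry of `col_stats`."""
    min: int
    max: int


def may_contain(stats: ColumnStats, value: int) -> bool:
    # A value outside [min, max] cannot be present.
    # A value inside the interval may or may not be present.
    return stats.min <= value <= stats.max


# Min/Max of the `Year` column taken from the snapshot summary.
year_stats = ColumnStats(min=1987, max=2004)

print(may_contain(year_stats, 2003))  # inside the interval, so data must be checked
print(may_contain(year_stats, 2010))  # outside the interval, so the table can be skipped
```

Note that the index can only rule data out. A value that falls inside the interval still has to be looked up at the segment and block level.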

The Databend Fuse Engine stores the most important indexes in segment files. At the end of a snapshot file, you can find information about which segments are related to the snapshot. Here's a sample segment file:

```json
{
  "format_version": 1,
  "blocks": [
    { -- block ...
      ...
      "row_count": 556984,
      "block_size": 405612604,
      "file_size": 25302413,
      "col_stats": {
        ...
        "0": {
          "min": {
            "Int64": 2003
          },
          "max": {
            "Int64": 2003
          },
          "null_count": 0,
          "in_memory_size": 2227936,
          "distinct_of_values": 1
        },
        ...
      },
      "col_metas": {
        -- Used to record the start position and length of each column
      },
      "cluster_stats": null,
      "location": [
        "1/458/_b/e4f3795c79004f22b80ed5ee821edf23_v0.parquet",
        0
      ],
      "bloom_filter_index_location": [
        "1/458/_i_b_v2/e4f3795c79004f22b80ed5ee821edf23_v2.parquet",
        2
      ],
      "bloom_filter_index_size": 60207,
      "compression": "Lz4Raw"
      ...
    }
  ],
  "summary": {
    "row_count": 11243809,
    "block_count": 25,
    "perfect_block_count": 25,
    "uncompressed_byte_size": 8163837349,
    "compressed_byte_size": 339392734,
    "index_size": 1200133,
    "col_stats": {
      ...
      "0": {
        "min": {
          "Int64": 1988
        },
        "max": {
          "Int64": 2003
        },
        "null_count": 0,
        "in_memory_size": 44975236,
        "distinct_of_values": 0
      },
      ...
    }
  }
}
```

From the sample above, we can see that a segment file carries its own Min/Max index information in `summary`, and each block listed under `blocks` has its own as well. The Min/Max indexes are layered and distributed among snapshots, segments, and blocks like this:

![Alt text](../static/img/blog/index-1.png)

When retrieving data for a query, Databend starts from the snapshot indexes and locates the matching segments by their Min/Max intervals. It then looks up the indexes in each segment file to find the blocks where the required data is stored, and reads the data from the block files using the start positions recorded in `col_metas`. In short, Databend processes a query by using the Min/Max Index to find the right segments and blocks.
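
The layered lookup can be sketched as follows. This is a simplified model under stated assumptions, not Databend's code: the dictionary layout and the `covers` and `find_blocks` helpers are hypothetical, each level keeps a single `(min, max)` interval for the `Year` column, and only the first block location comes from the sample segment file (the other two are made up for illustration).

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (min, max), both inclusive


def covers(interval: Interval, value: int) -> bool:
    low, high = interval
    return low <= value <= high


def find_blocks(snapshot: dict, value: int) -> List[str]:
    """Walk snapshot -> segments -> blocks, pruning by Min/Max at each level."""
    if not covers(snapshot["stats"], value):
        return []  # the whole table is ruled out
    locations = []
    for segment in snapshot["segments"]:
        if not covers(segment["stats"], value):
            continue  # skip the segment without looking at its blocks
        for block in segment["blocks"]:
            if covers(block["stats"], value):
                locations.append(block["location"])
    return locations


snapshot = {
    "stats": (1987, 2004),
    "segments": [
        {
            "stats": (1988, 2003),
            "blocks": [
                {"stats": (2003, 2003),
                 "location": "1/458/_b/e4f3795c79004f22b80ed5ee821edf23_v0.parquet"},
                # Hypothetical block, not taken from the sample file.
                {"stats": (1988, 1990), "location": "hypothetical-block-b.parquet"},
            ],
        },
        {
            # Hypothetical segment, not taken from the sample file.
            "stats": (2004, 2004),
            "blocks": [
                {"stats": (2004, 2004), "location": "hypothetical-block-c.parquet"},
            ],
        },
    ],
}

print(find_blocks(snapshot, 2003))  # only the block whose interval is 2003..2003
print(find_blocks(snapshot, 1986))  # nothing: pruned at the snapshot level
```

Only the block files returned by the lookup need to be read, which is why narrow, non-overlapping intervals matter so much for query performance.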

## Bloom Index

For queries requiring an exact string match, Databend uses the Min/Max Index to find the candidate blocks first, and then consults the bloom index stored at `bloom_filter_index_location` to check whether a block may contain the value before retrieving data from it.
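
To show why such an index helps with exact matches, here is a minimal sketch of a classic Bloom filter. It is an assumption-laden illustration rather than Databend's implementation: the `BloomFilter` class, its sizes, the SHA-256 based hashing, and the sample string values are all hypothetical, and the actual filter format used by Databend may differ.

```python
import hashlib
from typing import Iterator


class BloomFilter:
    """Toy Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for readability

    def _positions(self, value: str) -> Iterator[int]:
        # Derive several bit positions by hashing the value with different seeds.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{value}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, value: str) -> None:
        for pos in self._positions(value):
            self.bits[pos] = 1

    def might_contain(self, value: str) -> bool:
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos] for pos in self._positions(value))


# Hypothetical string values stored in one block.
block_filter = BloomFilter()
for value in ["AA", "DL", "UA"]:
    block_filter.add(value)

print(block_filter.might_contain("DL"))  # True: the value was added
```

When `might_contain` returns `False`, the block can be skipped without reading its parquet file. When it returns `True`, the block still has to be read, because the filter can report false positives.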

For more information about the Bloom Index, see https://databend.rs/blog/xor-filter.

## Cluster Key

The Min/Max Index seems to work perfectly for Databend. In practice, however, data is usually written into a table out of order, so segments and blocks might be created with overlapping Min/Max intervals.

For example, you might need to access up to three parquet files for a query condition like `Age = 20 & Age = 35`. If `Age` is set as the cluster key, Databend sorts the data by the `Age` column and combines as many small parquet files as possible.

![Alt text](../static/img/blog/index-2.png)
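
The effect of sorting on the Min/Max intervals can be sketched with a small simulation. Everything here is hypothetical: the ages, the block size of three rows, and the `block_intervals` and `blocks_to_read` helpers are made up for illustration and do not reflect how Databend actually reclusters data.

```python
from typing import List, Tuple


def block_intervals(values: List[int], block_size: int) -> List[Tuple[int, int]]:
    """Split values into fixed-size blocks and return each block's (min, max)."""
    intervals = []
    for start in range(0, len(values), block_size):
        chunk = values[start:start + block_size]
        intervals.append((min(chunk), max(chunk)))
    return intervals


def blocks_to_read(intervals: List[Tuple[int, int]], low: int, high: int) -> int:
    """Count blocks whose interval overlaps the query range [low, high]."""
    return sum(1 for lo, hi in intervals if lo <= high and hi >= low)


# Hypothetical ages in arrival order (written out of order).
ages = [18, 40, 25, 20, 35, 60, 22, 50, 19]

unclustered = block_intervals(ages, block_size=3)
clustered = block_intervals(sorted(ages), block_size=3)

print(unclustered)  # [(18, 40), (20, 60), (19, 50)]
print(clustered)    # [(18, 20), (22, 35), (40, 60)]
print(blocks_to_read(unclustered, 20, 20), blocks_to_read(clustered, 20, 20))  # 3 1
```

With unsorted data every block's interval contains `20`, so all three blocks must be read. After sorting by `Age`, the intervals no longer overlap and a single block is enough.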

For more information about the cluster key, see https://databend.rs/doc/sql-commands/ddl/clusterkey/.

website/static/img/blog/index-1.png

160 KB

website/static/img/blog/index-2.png

75.7 KB