Multiple Clustering Indexes

Introduction

Covering indexes can result in orders of magnitude performance improvements for queries. Bradley's presentation on covering indexes describes what a covering key is, how it can effect performance, and why it works. However, the definition of covering indexes can get cumbersome since MySQL limits the number of columns in a key to 16 (32 on MariaDB).

Tokutek introduced multiple clustering indexes into MySQL to address these problems. Zardosht describes the multiple clustering indexes feature and how clustering indexes differ from covering indexes. Zardosht also describes the query versus update tradeoffs that exist with clustering indexes.

TokuDB now runs in MySQL 5.5, MariaDB 5.5, Percona Server 5.6, and MariaDB 10. Multiple clustering indexes are supported on all of these platforms. However, the user interface for clustering indexes is slightly different. We describe the differences here.

Tokutek's user interface for clustering indexes

Tokutek added the clustering attribute to the key definition and extended the MySQL grammar to use it. Tokutek also tweaked the query optimizer to understand that clustering indexes cover all of the columns defined in the table.

Here is how a TokuDB table is created with a secondary clustering index.

mysql> create table t (a int, b int, c int, key(a), clustering key(b)) engine=tokudb;

mysql> show create table t;
CREATE TABLE `t` (
  `a` int(11) DEFAULT NULL,
  `b` int(11) DEFAULT NULL,
  `c` int(11) DEFAULT NULL,
  KEY `a` (`a`),
  CLUSTERING KEY `b` (`b`)
) ENGINE=TokuDB DEFAULT CHARSET=latin1

Here is how a clustering index is added to an existing TokuDB table.

mysql> create clustering index 'c' on t(c);

mysql> show create table t;
CREATE TABLE `t` (
  `a` int(11) DEFAULT NULL,
  `b` int(11) DEFAULT NULL,
  `c` int(11) DEFAULT NULL,
  KEY `a` (`a`),
  CLUSTERING KEY `b` (`b`),
  CLUSTERING KEY `c` (`c`)
) ENGINE=TokuDB DEFAULT CHARSET=latin1

Percona Server's user interface for clustering indexes

The clustering key user interface in Percona Server has the same grammar as Tokutek's so the user interface is the same. However, Percona decided to not allow secondary clustering indexes to be created on InnoDB, which does not support them yet. In contrast, Tokutek's implementation silently ignores the clustering keywords for InnoDB. In fact the examples in the previous section were captured on Percona Server 5.6.

Percona Server will not create an InnoDB table with a secondary clustering index.

mysql> create table t (a int, b int, c int, key(a), clustering key(b)) engine=innodb;
ERROR 1478 (HY000): Table storage engine 'InnoDB' does not support the create option 'CLUSTERING'

Percona Server will not add a secondary clustering index to an InnoDB table because InnoDB does not support it.

mysql> create table t (a int, b int, c int, key(a)) engine=innodb;
Query OK, 0 rows affected (0.13 sec)

mysql> create clustering index b on t(b);
ERROR 1478 (HY000): Table storage engine 'InnoDB' does not support the create option 'CLUSTERING'

MariaDB's user interface for clustering indexes

When the MariaDB engineers ported TokuDB to MariaDB, they decided not to extend the grammar to support the clustering attribute, but rather to have TokuDB define a clustering index attribute. Index attributes is a feature specific to MariaDB. As a consequence, the user interface for clustering indexes in MariaDB's port of TokuDB is different than Tokutek's.

Here is how a TokuDB table is created with a clustering index on 'b'.

MariaDB [test]> create table t (a int, b int, c int, key(a), key(b) clustering=yes) engine=tokudb;

Note that the clustering attribute for the index on 'b'.

MariaDB [test]> show create table t;
CREATE TABLE `t` (
  `a` int(11) DEFAULT NULL,
  `b` int(11) DEFAULT NULL,
  `c` int(11) DEFAULT NULL,
  KEY `a` (`a`),
  KEY `b` (`b`) `clustering`=yes
) ENGINE=TokuDB DEFAULT CHARSET=latin1

Here is how a clustering index on 'c' is added to an existing table.

MariaDB [test]> create index c on t(c) clustering=yes;

MariaDB [test]> show create table t;                                                                                                                                                                                                                              
CREATE TABLE `t` (
  `a` int(11) DEFAULT NULL,
  `b` int(11) DEFAULT NULL,
  `c` int(11) DEFAULT NULL,
  KEY `a` (`a`),
  KEY `b` (`b`) `clustering`=yes,
  KEY `c` (`c`) `clustering`=yes
) ENGINE=TokuDB DEFAULT CHARSET=latin1 `compression`='tokudb_zlib'

The parser accepts 'clustering=yes|no|1|0', but does not accept 'clustering=y|n|true|false'.

MariaDB [test]> create index c on t(c) clustering=y;
ERROR 1912 (HY000): Incorrect value 'y' for option 'clustering'

Portability issues

The meta-data to store Tokutek's implementation of multiple clustering indexes and Maria's implementation is different. This makes portability between the two systems problematic without a conversion tool. I like Maria's implementation of index and table options as it allows the storage engine to define new capabilities without hacks to the common MySQL or MariaDB code. It would be nice if MySQL would also support this feature.

Multiple Clustering Indexes

Introduction

Tokutek's user interface for clustering indexes

Percona Server's user interface for clustering indexes

MariaDB's user interface for clustering indexes

Portability issues

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!