Multiple Clustering Indexes

Introduction

Covering indexes can result in orders of magnitude performance improvements for queries. Bradley's presentation describes what a covering key is, how it can effect performance, and why it works. However, the definition of covering indexes can get cumbersome since MySQL limits the number of columns in a key to 16 (32 on MariaDB).

Tokutek introduced multiple clustering indexes into MySQL to address these problems. Zardosht describes the multiple clustering indexes feature and how clustering indexes differ from covering indexes. Zardosht also describes the query versus update tradeoffs.

TokuDB now runs in MySQL 5.5, MariaDB 5.5, Percona Server 5.6, and MariaDB 10. Multiple clustering indexes are supported on all of these platforms. However, the user interface for multiple clustering indexes is slightly different. We describe the differences here.

Tokutek's user interface for clustering indexes

Tokutek added the clustering attribute to the key definition and extended the MySQL and MariaDB grammar to use it. We also tweaked the query optimizer to understand that clustering indexes cover all of the columns defined in the table.

Create a TokuDB table with a secondary clustering index.

mysql> create table t (x int, y int, z int, key(x), clustering key(y));

mysql> show create table t;
CREATE TABLE `t` (
  `x` int(11) DEFAULT NULL,
  `y` int(11) DEFAULT NULL,
  `z` int(11) DEFAULT NULL,
  KEY `x` (`x`),
  CLUSTERING KEY `y` (`y`)
) ENGINE=TokuDB DEFAULT CHARSET=latin1

Add a clustering index to an existing TokuDB table.

mysql> create clustering index z on t(z);

mysql> show create table t;
CREATE TABLE `t` (
  `x` int(11) DEFAULT NULL,
  `y` int(11) DEFAULT NULL,
  `z` int(11) DEFAULT NULL,
  KEY `x` (`x`),
  CLUSTERING KEY `y` (`y`),
  CLUSTERING KEY `z` (`z`)
) ENGINE=TokuDB DEFAULT CHARSET=latin1

Percona Server's user interface for clustering indexes

The clustering key user interface in Percona Server has the same grammar as Tokutek's so the user interface is the same as Tokutek's. However, Percona decided to not allow secondary clustering indexes to be created on InnoDB, which does not support them yet. In contrast, Tokutek's implementation silently ignores the clustering keywords for InnoDB. In fact the examples in the previous section were captured on Percona Server 5.6.

Percona Server will not create an InnoDB table with a secondary clustering index.

mysql> create table t (x int, y int, z int, clustering key(x)) engine=innodb;
ERROR 1478 (HY000): Table storage engine 'InnoDB' does not support the create option 'CLUSTERING'

Percona Server will not add a secondary clustering index to an InnoDB table because InnoDB does not support it.

mysql> create table t (x int, y int, z int, key(x)) engine=innodb;
Query OK, 0 rows affected (0.13 sec)

mysql> create clustering index z on t(z);
ERROR 1478 (HY000): Table storage engine 'InnoDB' does not support the create option 'CLUSTERING'

MariaDB's user interface for clustering indexes

When the MariaDB engineers ported TokuDB to MariaDB, they decided to not extended the grammar to support the clustering attribute, but rather to have TokUDB define a clustering table option using an clustering index attribute. Index attributes is a feature specific to MariaDB. As a consequence, the user interface for clustering indexes in MariaDB's port of TokuDB is different than Tokutek's.

Create a TokuDB table with a clustering index on 'y'.

MariaDB [test]> create table t (x int, y int, z int, key(x), key(y) clustering=yes);

Note that the clustering attribute is stored for the index on 'y'.

MariaDB [test]> show create table t;
CREATE TABLE `t` (
  `x` int(11) DEFAULT NULL,
  `y` int(11) DEFAULT NULL,
  `z` int(11) DEFAULT NULL,
  KEY `x` (`x`),
  KEY `y` (`y`) `clustering`=yes
) ENGINE=TokuDB DEFAULT CHARSET=latin1 `compression`='tokudb_zlib'

Add a clustering index on 'z'.

MariaDB [test]> create index z on t(z) clustering=yes;

MariaDB [test]> show create table t;                                                                                                                                                                                                                              
CREATE TABLE `t` (
  `x` int(11) DEFAULT NULL,
  `y` int(11) DEFAULT NULL,
  `z` int(11) DEFAULT NULL,
  KEY `x` (`x`),
  KEY `y` (`y`) `clustering`=yes,
  KEY `z` (`z`) `clustering`=yes
) ENGINE=TokuDB DEFAULT CHARSET=latin1 `compression`='tokudb_zlib'

In what I consider a quick of the implementation, 'yes' works, but 'y' does not work

MariaDB [test]> create index z on t(z) clustering=y;
ERROR 1912 (HY000): Incorrect value 'y' for option 'clustering'

and '1' works.

MariaDB [test]> create index z on t(z) clustering=1;

MariaDB [test]> show create table t;
CREATE TABLE `t` (
  `x` int(11) DEFAULT NULL,
  `y` int(11) DEFAULT NULL,
  `z` int(11) DEFAULT NULL,
  KEY `x` (`x`),
  KEY `y` (`y`) `clustering`=yes,
  KEY `z` (`z`) `clustering`=1
) ENGINE=TokuDB DEFAULT CHARSET=latin1 `compression`='tokudb_zlib'

In addition, MariaDB always stores the clustering attribute if it is included when the table is defined or altered.

MariaDB [test]> create index z on t(z) clustering=no;

MariaDB [test]> show create table t;
CREATE TABLE `t` (
  `x` int(11) DEFAULT NULL,
  `y` int(11) DEFAULT NULL,
  `z` int(11) DEFAULT NULL,
  KEY `x` (`x`),
  KEY `y` (`y`) `clustering`=yes,
  KEY `z` (`z`) `clustering`=no
) ENGINE=TokuDB DEFAULT CHARSET=latin1 `compression`='tokudb_zlib'

Portability issues

The meta-data to store Tokutek's implementation of multiple clustering indexes and Maria's implementation is different. This makes portability between the two systems problematic without a conversion tool. I like Maria's implementation of index and table options as it allows the storage engine to define new capabilities with hacks to the common MySQL or MariaDB code. It would be nice if MySQL would also support this feature.

Multiple Clustering Indexes

Introduction

Tokutek's user interface for clustering indexes

Percona Server's user interface for clustering indexes

MariaDB's user interface for clustering indexes

Portability issues

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!