Posted on

MariaDB 10.3 use case: Hidden PRIMARY KEY column for closed/legacy applications

Database symbolI really like MariaDB 10.3, it has a lot of interesting new features.  Invisible (hidden) columns, for instance.  Here is a practical use case:

You’re dealing with a legacy application, possibly you don’t even have the source code (or modifying is costly), and this app doesn’t have a PRIMARY KEY on each table. Consequences:

  • Even though InnoDB would have an internal hidden ID in this case (you can’t access that column at all), it affects InnoDB’s locking strategy – I don’t think this aspect is explicitly documented, but we’ve seen table-level locks in this scenario where we’d otherwise see more granular locking;
  • Galera cluster really wants PKs;
  • Asynchronous row-based replication can work without PKs, but it is more efficient with;

So, in a nutshell, your choices for performance and scaling are restricted.

On the other hand, you can’t just add a new ID column, because the application may:

  • use SELECT * and not be able to handle seeing the extra column, and/or
  • use INSERT without an explicit list of columns and then the server would chuck an error for the missing column.

The old way

Typically, what we used to do for cases like this is hunt for UNIQUE indexes that can be converted to PK. That’d be a clean operation with no effect on the application. We use a query like the following to help us find out if this is feasible:


SELECT
CONCAT(s.TABLE_SCHEMA,'.',s.TABLE_NAME) AS tblname,
INDEX_NAME AS idxname,
GROUP_CONCAT(s.COLUMN_NAME) AS cols,
GROUP_CONCAT(IF(s.NULLABLE='YES','X','-')) AS nullable
FROM INFORMATION_SCHEMA.STATISTICS s
WHERE s.table_schema NOT IN ('information_schema','performance_schema','mysql')
AND s.NON_UNIQUE=0
GROUP BY tblname,idxname
HAVING nullable LIKE '%X%'
ORDER BY tblname;

The output is like this:


+----------+---------+------+----------+
| tblname  | idxname | cols | nullable |
+----------+---------+------+----------+
| test.foo | a       | a,b  | -,X      |
+----------+---------+------+----------+

We see that table test.foo has a UNIQUE index over columns (a,b), but column b is NULLable.  A PK may not contain any NULLable columns, so it would not be a viable candidate.

All is not yet lost though, we can further check whether the column data actually contains NULLs. If it doesn’t, we could change the column to NOT NULL and thus solve that problem.  But strictly speaking, that’s more risky as we may not know for certain that the application will never use a NULL in that column. So that’s not ideal.

IF all tables without a PK have existing (or possible) UNIQUE indexes without any NULLable columns, we can resolve this issue. But as you can appreciate, that won’t always be the case.

With MariaDB 10.3 and INVISIBLE columns

Now for Plan B (or actually, our new plan A as it’s much nicer).  Let’s take a closer look at the foo table from the above example:


CREATE TABLE foo (
a int(11) NOT NULL,
b int(11) DEFAULT NULL,
UNIQUE KEY uidx (a,b)
) ENGINE=InnoDB

Our new solution:

ALTER TABLE foo ADD COLUMN id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY INVISIBLE FIRST

So we’re adding a AUTO_INCREMENT new column named ‘id’, technically first in the table, but we also flag it as invisible.

Normally, you won’t notice, see how we can use INSERT without an explicit (a,b) column list:


MariaDB [test]> INSERT INTO  foo VALUES (2,24);
Query OK, 1 row affected (0.007 sec)

MariaDB [test]> SELECT * FROM foo;
+---+------+
| a | b    |
+---+------+
| 1 |   20 |
| 2 |   24 |
+---+------+

So it won’t even show up with SELECT *.  Fabulous.  We can see the column only if we explicitly ask for it:


MariaDB [test]> SELECT id,a,b FROM foo;
+----+---+------+
| id | a |    b |
+----+---+------+
|  1 | 1 |   20 |
|  2 | 2 |   24 |
+----+---+------+

We solved our table structural issue by introducing a clean AUTO_INCREMENT id column covered by a PRIMARY KEY, while the application remains perfectly happy with what it can SELECT and INSERT. Total win!

Posted on