Friend and former colleague Roland Bouwman has written an excellent update on the GROUP BY implementation in MySQL.
MySQL’s implementation of GROUP BY has historically been quirky. Sometimes that quirkiness has been useful, but often it causes grief as SQL authors can make mistakes that are executed but don’t produce the results they want (or expect).
SELECT cat, COUNT(val) as cnt, othercol FROM tbl GROUP BY cat
The ‘cat‘ column is in the GROUP BY clause, the COUNT(val) is an aggregate, but the ‘othercol‘ column is … well… neither. What used to effectively happen is that the server would pick one othercol value from within each group. As I noted before, sometimes useful but often a pest as the server wouldn’t know if you just made a mistake or whether you actually intended to use this ‘feature’.
The long existent sql_mode option ONLY_FULL_GROUP_BY mitigated some of that (but not fully in all cases – see Roland’s explanation).
With the sql_mode option enabled, the server chucks an error for faulty constructs, the most common one being:
Error: 1055 SQLSTATE: 42000 (ER_WRONG_FIELD_WITH_GROUP)
Message: ‘col’ isn’t in GROUP BY
MySQL 5.7.5 and above have ONLY_FULL_GROUP_BY enabled by default, with an updated implementation, so GROUP BY will now generally behave properly.
We know that many existing applications often rely on the previously “standard” but technically quirky/faulty behaviour. So when upgrading to this version or beyond, it’s important to
- use and test in development and staging environment first. This also means upgrading your dev environment first.
- scan the code for GROUP BY use, and ensure that the queries are correct.
- have proper error handling and reporting in your application.
- in some cases disable the sql_mode ONLY_FULL_GROUP_BY option, but that’s really a last resort. It’s best to fix the application.