How MySQL Optimises ORDER BY

How MySQL Optimises ORDER BY



In some cases MySQL can uses index to satisfy an ORDER BY or GROUP BY request without doing any extra sorting.


The index can also be used even if the ORDER BY doesn’t match the index exactly, as long as all the unused index parts and all the extra are ORDER BY columns are constants in the WHERE clause. The following queries will use the index to resolve the ORDER BY / GROUP BY part:



SELECT * FROM t1 ORDER BY key_part1,key_part2,…
SELECT * FROM t1 WHERE key_part1=constant ORDER BY key_part2
SELECT * FROM t1 WHERE key_part1=constant GROUP BY key_part2
SELECT * FROM t1 ORDER BY key_part1 DESC,key_part2 DESC
SELECT * FROM t1 WHERE key_part1=1 ORDER BY key_part1 DESC,key_part2 DESC


Some cases where MySQL can not use indexes to resolve the ORDER BY: (Note that MySQL will still use indexes to find the rows that matches the WHERE clause):


* You are doing an ORDER BY on different keys: SELECT * FROM t1 ORDER BY key1,key2
* You are doing an ORDER BY using non-consecutive key parts. SELECT * FROM t1 WHERE key2=constant ORDER BY key_part2
* You are mixing ASC and DESC. SELECT * FROM t1 ORDER BY key_part1 DESC,key_part2 ASC
* The key used to fetch the rows are not the same one that is used to do the ORDER BY: SELECT * FROM t1 WHERE key2=constant ORDER BY key1
* You are joining many tables and the columns you are doing an ORDER BY on are not all from the first not-const table that is used to retrieve rows (This is the first table in the EXPLAIN output which doesn’t use a const row fetch method).
* You have different ORDER BY and GROUP BY expressions.
* The used table index is an index type that doesn’t store rows in order. (Like the HASH index in HEAP tables).


In the cases where MySQL have to sort the result, it uses the following algorithm:


* Read all rows according to key or by table scanning. Rows that don’t match the WHERE clause are skipped.
* Store the sort-key in a buffer (of size sort_buffer).
* When the buffer gets full, run a qsort on it and store the result in a temporary file. Save a pointer to the sorted block. (In the case where all rows fits into the sort buffer, no temporary file is created)
* Repeat the above until all rows have been read.
* Do a multi-merge of up to MERGEBUFF (7) regions to one block in another temporary file. Repeat until all blocks from the first file are in the second file.
* Repeat the following until there is less than MERGEBUFF2 (15) blocks left.
* On the last multi-merge, only the pointer to the row (last part of the sort-key) is written to a result file.
* Now the code in `sql/records.cc’ will be used to read through them in sorted order by using the row pointers in the result file. To optimise this, we read in a big block of row pointers, sort these and then we read the rows in the sorted order into a row buffer (read_rnd_buffer_size) .


You can with EXPLAIN SELECT … ORDER BY check if MySQL can use indexes to resolve the query. If you get Using filesort in the extra column, then MySQL can’t use indexes to resolve the ORDER BY. See section 5.2.1 EXPLAIN Syntax (Get Information About a SELECT).


If you want to have a higher ORDER BY speed, you should first see if you can get MySQL to use indexes instead of having to do an extra sorting phase. If this is not possible, then you can do:


* Increase the size of the sort_buffer_size variable.
* Increase the size of the read_rnd_buffer_size variable.
* Change tmpdir to point to a dedicated disk with lots of empty space. If you use MySQL 4.1 or later you can spread load between several physical disks by setting tmpdir to a list of paths separated by colon : (semicolon ; on Windows). They will be used in round-robin fashion. Note: These paths should end up on different physical disks, not different partitions of the same disk.


By default, MySQL sorts all GROUP BY x,y[,…] queries as if you specified ORDER BY x,y[,…] in the query as well. If you include the ORDER BY clause explicitly, MySQL optimises it away without any speed penalty, though the sorting still occurs. If a query includes GROUP BY but you want to avoid the overhead of sorting the result, you can supress sorting by specifying ORDER BY NULL:


INSERT INTO foo SELECT a,COUNT(*) FROM bar GROUP BY a ORDER BY NULL;


Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.