This is totally and 100% cheating. If someone catches you doing this, the SQL police will come after you and cite you for not using a self-join. Just so you know.
Let’s say you have a table that keeps track of users, colors, and timestamps, and you want to know the most recent color for each user. It might look something like:
There are two standard solutions.
SELECT `user`, `color` FROM ( SELECT `user`, MAX(`timestamp`) AS `timestamp` FROM `user_colors` GROUP BY `user` ) `alias1` INNER JOIN `user_colors` USING (`user`, `timestamp`)
Or you could do a left join and handle this with the criterion for joining.
SELECT `user`, `color` FROM `user_colors` AS `uc1` LEFT JOIN `user_colors` AS `uc2` ON (`uc1`.`user` = `uc2`.`user` AND `uc1`.`timestamp` < `uc2`.`timestamp`) WHERE `uc2`.`user` IS NULL
I can never remember how to do the second one, and full disclosure, it might be wrong.
I don’t have anything easy around to test it on. I frequently get the direction wrong
on the inequality. Sometimes I forget the
WHERE clause until I look at my results and
notice that they are totally not what I wanted.
Don’t read any further if you are worried about the SQL police.
For whatever reason, there is something about my tables and their size and their structure that makes these self-joins really slow. I have come up with a work-around.
SELECT `user`, SUBSTRING_INDEX( GROUP_CONCAT( `color` ORDER BY `timestamp` DESC SEPARATOR "," ), ",", 1 ) FROM `user_colors` GROUP BY `user`
GROUP_CONCAT makes a comma-separted list of all the colors for the user, in
descending order by timestamp. The
SUBSTRING_INDEX takes the part of the string
before the first comma. To be honest, I have no idea what sort of witchcraft the
query planner is doing to make this work, but on my tables it is faster than the
Note that a
GROUP_CONCAT will fail silently and horrifically if you have more than
a thousand characters in a group. There are ways around this, but the finer points
tend to depend on the system that you’re using.