JJB Blog

SQL CHALLENGE

Arrrrre you readyyyyy— for an es Q elllllllll— CHALLENGE!!!!!

I have a user table, let’s say with name and zipcode. I can do the following successfully:

  1. Select a random user with SELECT * FROM users ORDER BY RAND(UNIX_TIMESTAMP()) LIMIT 1
  2. Select all users, with only 1 from each zipcode, using SELECT * FROM users GROUP BY zipcode update: I simplified this as per Danny’s suggestion below

The only problem is, in case 2, it’s always the same user per-zipcode. I want to randomize which in the grouping gets pulled out. I would prefer that the solution not involve joining the table on itself, although if I don’t find another solution I will probably whip up the self-join solution just for fun.


5 Comments

Try this (tweaked since UNIX_TIMESTAMP() is optional, and WHERE is quicker than HAVING):

SELECT * FROM USERS WHERE zipcode > ” GROUP BY zipcode ORDER BY zipcode, RAND();

Posted by danny on 19 January 2007 @ 12pm

Hey, thanks for the attempt danny, but that still returns the same row every time :(

and now i am even more confused, because I thought that HAVING count(zipcode)>=1 was the part that squeezed the GROUP BY groups (each zip code cluster) down to 1.

Posted by John on 19 January 2007 @ 12pm

Actually no. using having count(zipcode) >=1 means it’ll return anything that has at least 1 zipcode in it. It doesn’t do much for filtering in this case.

Posted by Jason on 19 January 2007 @ 2pm

Whoah, I just tried it and verified that you are right. Well that makes for a more simple query to start out with. Hmmm

Posted by John on 19 January 2007 @ 3pm

Here is something else I came up with, which works properly on some datasets, but when I tried it where every row had the same zip, it came up with 0 rows, not 1. But anyway, it’s my current best solution.

SELECT DISTINCT * FROM users u1, users u2 where u1.zipcode != u2.zipcode ORDER BY RAND(UNIX_TIMESTAMP());

Posted by John on 23 January 2007 @ 5pm

Leave a Comment