select * from some_table
join (select rand() as random_id) on
some_table.the_id = random_id
If the intent is actually to select one, and only one, random row from the table than this Stack Overflow entry has a result that is in most cases in the direction I was thinking (LIMIT 1, and have some way of selecting a random entry, remembering that the ID field might be monotonic, but isn't assured to be continuous (have no gaps))
Unfortunately it seems that it depends on language extensions that are not portable.
It unfortunately also appears that the easiest query scales poorly to large tables as it works by assigning EACH row a random ID and then limiting it to one of those by some order (probably lowest assigned row).
A more effective approach might be a function or procedure based around a pre-recorded 'maximum' ID for the column (assuming it's vaguely monotonic and not a GUID), picking a random result between base and that ID, Verifying it exists, and if it doesn't trying again up to N times before using a fallback procedure. (E.G. after 100 random stabs that fail selecting ~500 records in the range of lastRND +/- 250 and using the ORDER BY rand() or equivalent on that sub-set). If even /that/ fails pick the next or previous record by which is closest.
select * from the_table
offset rand(select count(*) from the_table)
A corrected version based on being >= or <= the random number (picking the nearest) COULD work (in fact, that's the way my suggest final fallback works), but would still be biased.
That was the reason I suggested taking a limited number of stabs at random numbers within the allowed range; my example still has some slight bias (a little in the 'best case' fallback, and a lot in the absolute fallback), but it tries to deliver a truly random result first, many times.
EDIT:" For people who think I don't know what they mean, I do. I meant that the author of TFA obviously meant to write "Nondeterministic Functions in MySQL (e.g. Rand)", because the entire article is about all non-deterministic functions and just uses rand as an example.
Why do people match salmon colored items with rose ones?
Why do people sing off key?
They aren't themselves sensitive to the difference and don't think about it.
"e.g." is "exempli gratia" in latin; just meaning "for example".
The rule I expect for where clause expression evaluation is: standard short-circuit semantics. That calls to deterministic functions get memoized, or that constant sub-expressions get hoisted out, is a plus -- I do expect GCSE and optimization of deterministic functions from modern RDBMSes. I also expect that functions meant to be used in queries have no side effects -- that they insert/update/delete no rows on any tables.
WHERE ID = CAST(Rand() * 1000 as int)
In addition this query: SELECT TOP(100) rand() AS RandData from SomeTable
. . . .
A old trick to actually get random data on SQL Server that I think has been replaced with newer functionality is:
However if we move the NEWID.... stuff into the WHERE clause we end up switching to an index seek so back to being evaluated once.
SELECT column FROM table
ORDER BY RANDOM()