|
|
SQL Server T-SQL User-Defined Function of the Week
Retrieving a Random Record in SQL Server using udf_Num_RanInt
Volume 1 Number 21 April 8, 2003
Check out the UDF Frequently Asked Questions at:
http://www.novicksoftware.com/UdfOfWeek/UDF_FAQ.htm
To ask a UDF related question or to contribute a UDF,
send e-mail to udf@NovickSoftware.com
Sign up for this newsletter at:
http://www.novicksoftware.com/UdfOfWeek/UDFofWeekSignup.htm
I presume you've all started paying your execution time charges
to Microsoft? Oh, yea. I did promise not to do any more April
Fooling until next year. If you missed the April Fools edition,
you can go back and find it in the archives.
This issue's UDFs of the Week solves a problem posed to me last
week by a friend Gary, who runs the VB Pro User Group in Waltham,
MA. Gary wants to select a random row from one of his tables
and was hoping that SQL Server would have some support for this
task.
The idea of selecting a random row runs counter to the ideas
that are the basis of the relational database model. SQL Server
doesn't have any support for picking a random row.
That doesn't mean that there's no way to do what Gary wants. In
fact, there are several ways to pick random rows. The technique
used in this issue is based on having a identity column or a
similar sequential numeric key.
For an example, we'll use the Orders table in the Northwind
sample database, which has an OrderID column that fits the bill.
OrderIDs are nearly sequential. If we can live with less than
perfect randomness, a nearly sequential ID will suffice.
The first UDF in this issue is udf_Num_RandInt, which returns a
random integer from a numeric range. It depends on the
built-in RAND function. Since RAND is non-deterministic, it
can't be called directly from a UDF and we'll have to embed it
in a view in order to write the function. This technique was
used in the Volume 1 #2 to write udf_DT_CurrTime and it's
explained pretty well there: http://www.novicksoftware.com/
UDFofWeek/Vol1/T-SQL-UDF-Volume-1-Number-2-udf_DT_CurrTime.htm
Here's our view that returns a random number:
/-------- Start copying below this line ------------------------\
SET QUOTED_IDENTIFIER ON
GO
SET ANSI_NULLS ON
GO
CREATE VIEW Function_Assist_RAND
/*
* A view to return one row, with one column, a random number from
* the built-in function RAND(). This allows a user defined
* function to bypass the restriction on access to this
* non-deterministic function. However, since no seed is
* provided, it will return the same value for all invocations
* within a query as seen in this query:
select r1.[RAND], r2.[RAND] from Function_Assist_RAND as r1
cross join Function_Assist_RAND as r2
* Related Functions: udf_Num_RandInt
*
* Example:
select r1.[RAND] from Function_Assist_RAND
*
* © Copyright 2003 Andrew Novick http://www.NovickSoftware.com
* You may use this function in any of your SQL Server databases
* including databases that you sell, so long as they contain
* other unrelated database objects. You may not publish this
* View either in print or electronically.
* Originally Published in T-SQL UDF of the Week Vol 1 #21 4/8/03
http://www.NovickSoftware.com/UDFofWeek/UDFofWeek.com
****************************************************************/
AS
SELECT RAND() as [RAND]
GO
GRANT SELECT ON [dbo].[Function_Assist_RAND] TO [PUBLIC]
GO
\------- Stop copying above this line -------------------------/
Once the view is in place a UDF can now use the built-in RAND
function. Here's the CREATE FUNCTION script for udf_Num_RandInt.
It returns a random integer within a range:
/-------- Start copying below this line ------------------------\
SET QUOTED_IDENTIFIER ON
GO
SET ANSI_NULLS ON
GO
CREATE FUNCTION dbo.udf_Num_RandInt (
@MinNum int = 0 -- lowest value
,@MaxNum int = 100 -- highest value in range
) RETURNS INT -- int between @MinNum and @MaxNum inclusive
/*
* Returns a random integer between @MinNum and @MaxNum inclusive.
* Note that while the query "SELECT RAND(), RAND()" always
* returns that same two values,
select dbo.udf_NumRandIntFromRange (1, 100)
, dbo.udf_NumRandIntFromRange (1, 100)
* returns two different values. That's because the view is
* executed separately for each function invocation.
*
* Example:
select dbo.udf_NumRandIntFromRange (1, 100)
*
* © Copyright 2003 Andrew Novick http://www.NovickSoftware.com
* You may use this function in any of your SQL Server databases
* including databases that you sell, so long as they contain
* other unrelated database objects. You may not publish this
* UDF either in print or electronically.
* Published in T-SQL UDF of the Week Newsletter Vol 1 #21 4/8/03
http://www.NovickSoftware.com/UDFofWeek/UDFofWeek.htm
****************************************************************/
AS BEGIN
Declare @Result int
DECLARE @RAND float -- the random number
SELECT @RAND = [RAND] From Function_Assist_Rand
RETURN ROUND(@MinNum + (@RAND * (@MaxNum - @MinNum)), 0)
END
GO
GRANT EXEC on [dbo].[udf_Num_RandInt] to [PUBLIC]
GO
\------- Stop copying above this line -------------------------/
Before we go to further, try out udf_Num_RandInt:
/-------- Start copying below this line ------------------------\
SELECT dbo.udf_Num_RandInt (15, 47) as [Random Num 15-47]
go
\------- Stop copying above this line -------------------------/
(Results)
Random Num 15-47
----------------
19
Now that we have a way to get a random number from within a UDF
the next script creates a UDF that returns a random row from the
Northwind..Orders table and uses it to find the first OrderID
that's greater than or equal to the number selected. To make
that work we have to supply the range of numbers that OrderID
falls within.
Take a look at this CREATE FUNCTION script. I'll explain what
it's doing in the text that follows.
/-------- Start copying below this line ------------------------\
SET QUOTED_IDENTIFIER ON
GO
SET ANSI_NULLS ON
GO
CREATE FUNCTION dbo.udf_Order_RandomRow (
) RETURNS TABLE
-- WITH SCHEMABINDING omitted due to 3 part table reference
/*
* Returns a random row from the Northwind Orders table
*
* Example:
select * FROM dbo.udf_Orders_RandomRow ()
*
* © Copyright 2003 Andrew Novick http://www.NovickSoftware.com
* You may use this function in any of your SQL Server databases
* including databases that you sell, so long as they contain
* other unrelated database objects. You may not publish this
* UDF either in print or electronically.
* Published in T-SQL UDF of the Week Newsletter Vol 1 #21 4/8/03
http://www.NovickSoftware.com/UDFofWeek/UDFofWeek.htm
****************************************************************/
AS RETURN
SELECT TOP 1 *
FROM Northwind..Orders
WHERE OrderId > = (
SELECT dbo.udf_Num_RandInt ((SELECT MIN(OrderID)
FROM Northwind..orders)
, (SELECT MAX(orderId)
FROM Northwind..orders))
)
ORDER BY OrderID
GO
GRANT SELECT on [dbo].[udf_Orders_RandomRow] to [PUBLIC]
GO
\------- Stop copying above this line -------------------------/
udf_Orders_RandomRow picks out a random row based on the OrderID.
OrderID doesn't start from one or any predetermined origin so
the UDF has to select the minimum and maximum values. That's
actually pretty cool because we're using sub-queries for the
parameters of the function.
Lets take a look at the range of numbers that are in the OrderID
column:
/-------- Start copying below this line ------------------------\
SELECT Min(OrderID) as [Min OrderID]
, Max(OrderID) as [Max OrderID]
FROM Northwind..Orders
GO
\------- Stop copying above this line -------------------------/
(Resutls)
Min OrderID Max OrderID
----------- -----------
10248 11077
If you pull out the SELECT on udf_Orders_RandomRow
you can retrieve random integers in the range of orderIDs:
/-------- Start copying below this line ------------------------\
SELECT dbo.udf_Num_RandInt ((SELECT MIN(OrderID)
FROM Northwind..orders)
, (SELECT MAX(orderId)
FROM Northwind..orders))
GO
\------- Stop copying above this line -------------------------/
Try it a few times and you should always get numbers between
10248 and 11077.
Finally, we can retrieve the random row that we've been looking
for:
/-------- Start copying below this line ------------------------\
SELECT * from udf_Order_RandomRow()
GO
\------- Stop copying above this line -------------------------/
(Results - truncated on the right)
OrderID CustomerID EmployeeID OrderDate RequiredD
------- ---------- ----------- ------------------------ ---------
10717 FRANK 1 1997-10-24 00:00:00.000 1997-11-2
Of course, you'll get a different row each time you use
udf_Orders_RandomRow.
The UDF isn't general purpose. Instead it's very specific to the
Orders table. Due to the limitation on using Dynamic SQL in a
UDF and to other limitations it's not possible to create a
general purpose UDF that does the job that udf_Orders_RandomRow
takes care of. We have to live with creating a new UDF for
each table that we want to use with this technique.
Please don't forget to share this issue with anyone interested
in SQL Server. Thanks.
+--------------------------------------------------------------+
Do you have a great UDF that you'd like to share? Or maybe you
have a T-SQL problem that you think could be solved by a UDF
but you don't know how? Send them to:
UDF@NovickSoftware.com
and they might be published in this newsletter. I try and
respond to every request that I get.
Thanks,
Andrew Novick
+--------------------------------------------------------------+
|
|