Writing stored procedures
and other code in .Net languages is a very attractive idea to a lot
of developers. But is it a good idea? What’s the performance going
to be like? Is it going to bring my server to its knees?
I got lots of these
questions last weekend when I did two presentations on developing in
the .Net CLR in SQL Server 2005. The newly revised slides and
examples are on my web site in the presentations section. Here are
the links:
http://www.novicksoftware.com/Presentations/clr-programming-dotnet-sql-server-2005-yukon/clr-dotnet-sql-server-2005-yukon.htm
and
http://www.novicksoftware.com/Presentations/clr-user-defined-types-sql-2005/clr-user-defined-types-sql-2005.htm
If you’re interested in
the CLR or just in SQL Server 2005 in general and you’re near New
England, try and get to the Mini-Code Camp at the Microsoft office
in Waltham, MA on October 22nd, 2005. The title of the
presentation is SQL Server Programming: From 2000 to 2005. It’s
going to include lots of material for developers interested in
expanding their abilities to use SQL Server. The CLR (.net) will be
a big part of the presentation. Adam Machanic and I will be the
presenters. Best of all there’s no charge for the complete one day
event. If you want to register, do it soon, the event is already
two thirds filled. Here’s the registration page:
http://msevents.microsoft.com/CUI/EventDetail.aspx?EventID=1032279560&Culture=en-US
So back to the performance
question: are stored procedures and user-defined functions, written
in .Net languages faster or slower than the ones written in T-SQL?
The answer is: “it depends”. There are situations where either
alternative will come out on top. For this issue I’m going to
compare a pretty simple UDF written in T-SQL and an identical one
written in VB.Net.
I’ve used the function
udf_Txt_CharIndexRev for performance analysis in
Issue #10 of the T-SQL UDF of the Week. It was also featured in
Issue #32 of Volume 1 of the old newsletter. The purpose of the
function is to locate one string in another from the back. So it’s
like CHARINDEX but from the back. Here’s the CREATE FUNCTION
script:
CREATE FUNCTION
dbo.udf_Txt_CharIndexRev (
@SearchFor varchar(255) -- Sequence to be found
, @SearchIn varchar(8000) -- The string to be searched
) RETURNS int -- Position from the back of the string where
-- @SearchFor is found in @SearchIn
WITH SCHEMABINDING
/*
* Searches for a string in another string working from the back.
* It reports the position (relative to the front) of the first
* such expression it finds. If the expression is not found, it
* returns zero.
*
* Equivalent Template:
CASE
WHEN CHARINDEX(
, ) > 0
THEN LEN()
- CHARINDEX(REVERSE()
, REVERSE ()) + 1
ELSE 0 END
* Example:
select dbo.udf_Txt_CharIndexRev('\', 'C:\temp\abcd.txt')
*
* Test:
PRINT 'Test 1 ' + CASE WHEN 8=
dbo.udf_Txt_CharIndexRev ('\', 'C:\temp\abcd.txt')
THEN 'Worked' ELSE 'ERROR' END
PRINT 'Test 2 ' + CASE WHEN 0=
dbo.udf_Txt_CharIndexRev ('*', 'C:\tmp\d.txt')
THEN 'Worked' ELSE 'ERROR' END
-- 12345678901234567890
PRINT 'Test 3 ' + CASE WHEN 18=
dbo.udf_Txt_CharIndexRev('fj', 'f123 asdasfdfdfddfjas ')
THEN 'Worked' ELSE 'ERROR' END
*
* History
* WHEN WHO WHAT
* ---------- ---------- -----------------------------------------
* 2002-12-10 ASN Initial Coding
* 2003-06-16 ASN Fixed problem with Test 3 by using
DATALENGTH instead of LEN
****************************************************************/
AS BEGIN
DECLARE @Result int
, @StringLen int
, @ReverseIn varchar(8000)
, @ReverseFor varchar(255)
SELECT @ReverseIn = REVERSE (@SearchIn)
, @ReverseFor = REVERSE(@SearchFor)
, @StringLen = DATALENGTH(@SearchIn)
SELECT @Result = CHARINDEX(@ReverseFor, @ReverseIn)
-- return the position from the front of the
string
IF @Result > 0
SET @Result = @StringLen - (@Result
+ DATALENGTH(@SearchFor)
- 2
)
-- ENDIF
RETURN @Result
END
GO
GRANT EXEC on dbo.udf_Txt_CharIndexRev to PUBLIC
GO
The VB.Net version is named CharIndexRev and it
takes advantage of the built-in function InStrRev that implements
this function. InStrRev is from the Microsoft.VisualBasic
compatibility library, which consists of functions that existed in
Visual Basic 5, but which have different names in the .Net Common
Language Runtime (CLR). Here’s the VB.Net function:
Imports System
Imports System.Data.Sql
Imports System.Data.SqlClient
Imports System.Data.SqlTypes
Imports Microsoft.SqlServer.Server
Imports Microsoft.VisualBasic
Partial Public Class
UserDefinedFunctions
''' <summary>
''' Returns the position of the last occurrence of SearchFor in
SearchIn
''' </summary>
''' <param name="SearchFor">String being located</param>
''' <param name="SearchIn">String being searched</param>
''' <returns>Position, 1 based relative to the start of the string
''' of the last occurrence of SearchFor in SearchIn</returns>
''' <remarks></remarks>
<SqlFunction()> _
Public Shared Function CharIndexRev(ByVal
SearchFor As SqlString, _
ByVal SearchIn As
SqlString _
) As SqlInt32
Return InStrRev(CType(SearchIn,
String), CType(SearchFor,
String))
End Function
End Class
There’s a ton of
functionality in .Net just waiting to be used. Like
last issues’s RegExMatch, CharIndexRev is an example of using a
UDF to expose .Net functionality to T-SQL. This is the easiest way
to get a payoff from the CLR.
I ran both functions
against a one million row table of number strings in a manner
similar to Volume 1 #10. The data was pinned into memory and both
functions were run to search for a number in all million rows.
For good measure I also
took the function and unraveled it to TSQL that could run against
the same data. Using just the SQL against the same data is much
faster that invoking a UDF. It doesn’t matter if the UDF is written
in TSQL or .Net. In the Vol 1 #10 where the UDF was written in
T-SQL the difference was 140 times. This time I got similar
results.
Table 1 Comparison of T-SQL and CLR UDFs
|
|
Test |
CPU |
Elapsed |
|
Query
#0 |
Scan
Table |
0.8 |
0.4 |
|
Query
#1 |
udf_txt_CharIndexRev |
114.0 |
172.3 |
|
Query
#2 |
Equivalent SQL |
2.9 |
1.5 |
|
Query
#3 |
CharIndexRev (.Net) |
44.9 |
52.8 |
Table 1 shows the
results from my experiment. Note that all queries were run on a
single processor system with Hyperthreading, which is why the
Elapsed time can be less than the CPU time. The queries are:
- Query #0 shows the time to select all
million rows without any processing. So this is just the time
to read the data. (After it has been pinned into memory)
- Query #1 is the result from using the
T-SQL UDF udf_Txt_CharIndexRev on the million rows. This shows
the overhead of using a T-SQL UDF.
- Query #2 is the time to run the unwrapped
T-SQL on the same data.
- Query #3 shows the time to run over the
same million rows with the VB.Net version of CharIndex.
In this case, the .Net UDF
beats the T-SQL UDF by about three times That’s a fabulous
speedup. It was achieved because .Net is better suited to some
problems than T-SQL.
Obviously, straight T-SQL
is the fastest by far. It’s usually the preferred solution, if you
can achieve the desired functionality with T-SQL. In the case of a
reverse character string function T-SQL is adequate. Now go and try
implementing the RegExMatch function in T-SQL! It’s just not
practical.
Performance isn’t the only
consideration when deciding on a technology. As I build business
applications and software products I also have to consider:
- Programmer productivity: How long will it
take to construct.
- Maintainability: How easy is the code to
maintain over time.
- Conformance to standards
You may have additional considerations.