User-defined functions

from the Artful SQL Server & Access Tips List


Microsoft first implemented user-defined functions (UDFs) in SQL 2000. Probably you are already familiar with many built-in functions, for example GetDate() and User_Name(). You may not have noticed, though, that the master database has a collection of UDFs as well--some encrypted, others available for your inspection. To see the list of UDFs in SQL Management Studio, open System Databases, master, Programmability, then Functions.

SQL 2000 lets you write two kinds of UDF—-scalar functions and table functions. Scalar functions return a single value of any data type, much the same as user-defined functions in languages such as Visual Basic. Table functions return tables. I'll discuss both in this article. Before we begin our exploration, however, we must address two concepts: determinism and schema binding.

Determinism

A routine (built-in or user-defined) is deterministic if the same set of inputs to it will always yield exactly the same result. If that is not the case, the routine is non-deterministic.

Functions like DateAdd() are deterministic, since given the same set of parameters, they're guaranteed to return the same results. Functions such as GetDate() are non-deterministic, because while they're always called with the same parameters, they return different results every time. For UDFs the rules are a little more complicated, since their code could contain almost anything. A UDF is deterministic if:
  • the function is schema-bound (more on that in a moment),
  • all built-in or user-defined functions called by the user-defined function are deterministic,
  • the body of the function references no database objects outside the scope of the function, and
  • the function calls no extended stored procedures.
The schema-binding problem

Imagine a function fn_Problematic_Select(), which selects * from one or more tables. Imagine that after writing this function, you insert a new column in said table(s). Subsequent invocations of fn_Problematic_Select() will not show your new column.

So the question becomes, should you ever select * from table(s)? In my opinion, the answer is emphatically no. You may disagree. The way around this difficulty is schema binding, which binds the function to the objects it references, making it impossible to alter the referenced objects (tables, views, or other functions) unless the function is either dropped or modified without using the schema-binding directive.

You can use the schema-binding option only if all the following conditions are met:
  • Any views or user-defined functions referenced by the function are also schema-bound.
  • The objects referenced by the function are not referenced using a two-part name.
  • The function and the objects it references belong to the same database.
  • The user who created the function has the REFERENCES permission on all the database objects that the function references.
If any of these conditions fails, CREATE FUNCTION ... will fail.

What exactly can you do within a multi-statement function? Any of the following types of statement are permitted:
  • Assignment statements
  • Control-Flow statements
  • DECLARE statements defining local data variables
  • SELECT statements containing select lists with expressions that assign values to variables that are local to the function
  • Cursor operations referencing local cursors that are declared, opened, closed, and unallocated in the function. You can use FETCH statements to assign values to local variables using the INTO clause, but you cannot use FETCH statements that return data to the client
  • INSERT, UPDATE, and DELETE statements modifying table variables local to the function. That is, you can create a local table and do anything you want to it, but you cannot change data in external tables
  • EXECUTE statements that call extended stored procedures
Now that we have the rules for creating and using functions, let’s explore some scalar UDFs.

Scalar functions

In the Access Northwind sample database, the Order Details table has no column for the extended total value of the order. This is as it should be, the general principle being "do not store what can be calculated." However, when you need to calculate this value, you have to rewrite its formula over and over, and as you escalate (calculate the total extended amount for each order and then for each customer), the expression grows awkwardly. Here is a function that automates it:

CREATE FUNCTION dbo.fn_ExtendedAmount (
       @quantity int,
       @unitprice money,
       @discount float
       )
RETURNS money
AS
       BEGIN
              DECLARE @result money
              SET @result = 0
              SET @result = @quantity * @unitprice * (1 - @Discount)
       RETURN @result
       END

That function lets us write a query such as the following to sum the extended amounts for a given order.

SELECT OrderID, Sum(dbo.fn_ExtendedAmount(Quantity, UnitPrice, Discount)) AS [Order Total]
FROM dbo.[ORDER DETAILS]
GROUP BY OrderID


Better yet, let’s write a function to automate the calculation:

CREATE FUNCTION dbo.fn_OrderTotal (
       @OrderID int
       )
RETURNS money
AS
       BEGIN
              DECLARE @result money
              SET @result = 0
              SET @result = (SELECT  SUM(dbo.fn_ExtendedAmount(Quantity, UnitPrice, Discount)) AS [Order Total]
                           FROM dbo.[Order Details]
                           WHERE OrderID = @OrderID )
       RETURN @result
       END

Moving up the relational hierarchy, we now create a function to derive the total purchases of a given customer:

CREATE FUNCTION dbo.fn_CustomerTotal7 (
       @CustomerID varchar(10)
       )
RETURNS money
AS
       BEGIN
              DECLARE @result money
              SET @result = 0
              SET @result = (SELECT SUM(dbo.fn_OrderTotal(OrderID))
                           FROM ORDERS
                           WHERE CustomerID = @CustomerID)
       RETURN @result
       END

We can now call this function in a query like this:

Select DISTINCT CompanyName, dbo.fn_CustomerTotal(Orders.CustomerID) AS [Customer Total]
FROM Customers 
INNER JOIN Orders ON Customers.CustomerID = Orders.CustomerID

The query results look like this:



Calling built-in UDFs

The syntax is slightly different: the function name must be prefixed with a pair of colons. For example, if you try to call the function sp_helpcollations() in the normal way, you get the error “Invalid object name.” You must call it this way:

select * from ::fn_helpcollations()

Table functions

There are two types of table functions: in-line functions and standard table functions. We conclude with a list of the characteristics of the more powerful in-line functions, which form a subset of table functions:
  • The RETURNS statement contains only the keyword TABLE. You don’t have to specify the structure of the table since it derives from the SELECT statement.
  • There is no function body delimited by the BEGIN and END keywords.
  • The RETURN statement consists of a valid SELECT statement enclosed within parentheses.

Return to the Artful SQL Server & Access tips page