Build a data values dictionary with SQL Server

from the Artful SQL Server & Access Tips List


When a client asked one of my colleagues to construct a data values dictionary, the client didn't mean what developers think of when they hear that term. He was referring to a list of the distinct values in various columns and tables. I'll present the basic SQL I came up with to solve this problem.

First, let's talk about the requirements. You may not have an interest in knowing the distinct values for all columns. Typically, the "interesting" columns might be items such as ZipCode or ProductID in the SalesDetails table.

Obtaining the values is the easy part; it simply requires a SELECT DISTINCT. But since I need something a little more powerful and useful, I began by building a table called DataValuesDictionary, which you can see in Listing A.

The SELECT DISTINCT part of the clause will supply only the distinct values--you still need to populate the other columns. The simplest way to do this is to include these values directly in the SELECT statement, like this:

SELECT DISTINCT 
  'Customers' as TableName, 
  'CompanyName' as ColumnName,
  'Varchar(100)' as DataType, 
  CompanyName as DataValue 
FROM Customers

This is fine for one table and one or two columns but, in order to go further, I turned the SQL above into a stored procedure. This required a little footwork, primarily to get the single-quotes working correctly. The procedure is offered in Listing B.

You can see the fancy footwork in the assembly of the SQL statement. The line beginning "set @s" must wrap the parameters in single quotes and deal correctly with the commas between the column names.

Note: For brevity, I did not include the code required to convert all column types to varchar. A DateTime column will work, but a Money column won't. In the unlikely event that you really want to add such a column's values to the dictionary, add code to test the DataType parameter and a block that CASTs or CONVERTs the data type to varchar.

Now that you have a data values dictionary, there are a number of reporting tools that will let you deliver an attractive printout to clients.

-- LISTING A
if exists (
  select * from dbo.sysobjects 
  where id = object_id(N'[dbo].[DataValuesDictionary]') 
    and OBJECTPROPERTY(id, N'IsUserTable') = 1
)
drop table [dbo].[DataValuesDictionary]
GO

CREATE TABLE [dbo].[DataValuesDictionary] (
    [PK] [int] IDENTITY (1, 1) NOT NULL ,
    [TableName] [varchar] (100)  NOT NULL ,
    [ColumnName] [varchar] (100)  NOT NULL ,
    [DataType] [varchar] (20)  NOT NULL ,
    [DataValue] [varchar] (100)  NOT NULL
) ON [PRIMARY]
GO

-- LISTING B
-- =============================================
-- Create procedure basic template
-- =============================================
-- creating the stored procedure
IF EXISTS (SELECT name 
       FROM   sysobjects 
       WHERE  name = N'DataValuesDictionary_Add_ap' 
       AND       type = 'P')
    DROP PROCEDURE DataValuesDictionary_Add_ap
GO

CREATE PROCEDURE DataValuesDictionary_Add_ap 
    (
    @TableName varchar(100), 
    @ColumnName varchar(100),
    @DataType varchar(20) 
    )
AS
    declare @s varchar(200)
    set @s = 'select distinct ''' + @TableName + ''', 
             ''' + @ColumnName + ''', 
             ''' + @DataType + ''', 
             ' + @ColumnName + 
             ' from ' + @TableName
    print @s
    DELETE DataValuesDictionary WHERE TableName = @TableName
    INSERT DataValuesDictionary EXECUTE (@s)

-- =============================================
-- example to execute the stored procedure
-- =============================================
-- EXECUTE DataValuesDictionary_Add_ap 'Customers', 'CompanyName','Varchar(100)'
-- SELECT * 
-- FROM DataValuesDictionary
-- WHERE TableName = 'Customers'
-- AND ColumnName = 'CompanyName'


Return to the Artful SQL Server & Access Tips page