When a client asked one of my colleagues to construct a data values dictionary, the client didn't mean what developers think of when they hear that term. He was referring to a list of the distinct values in various columns and tables. I'll present the basic SQL I came up with to solve this problem. First, let's talk about the requirements. You may not have an interest in knowing the distinct values for all columns. Typically, the "interesting" columns might be items such as ZipCode or ProductID in the SalesDetails table. Obtaining the values is the easy part; it simply requires a SELECT DISTINCT. But since I need something a little more powerful and useful, I began by building a table called DataValuesDictionary, which you can see in Listing A. The SELECT DISTINCT part of the clause will supply only the distinct values--you still need to populate the other columns. The simplest way to do this is to include these values directly in the SELECT statement, like this:
This is fine for one table and one or two columns but, in order to go further, I turned the SQL above into a stored procedure. This required a little footwork, primarily to get the single-quotes working correctly. The procedure is offered in Listing B. You can see the fancy footwork in the assembly of the SQL statement. The line beginning "set @s" must wrap the parameters in single quotes and deal correctly with the commas between the column names. Note: For brevity, I did not include the code required to convert all column types to varchar. A DateTime column will work, but a Money column won't. In the unlikely event that you really want to add such a column's values to the dictionary, add code to test the DataType parameter and a block that CASTs or CONVERTs the data type to varchar. Now that you have a data values dictionary, there are a number of reporting tools that will let you deliver an attractive printout to clients.
|