If we were to take normalization to its extreme, we would model all columns as rows, so we (or our client) could freely add columns to our tables. In practice, column specs may remain unchanged for years. But in some situations, you should think about representing these attributes as rows rather than columns.
Let me present an example from a large database that I helped remodel. The application was an enterprise solution tailored to pulp and paper mills. Even though all clients are interested in certain attributes of a given product, each client might have an interest in an attribute that no one else cares about.
One solution is to include a lot of user-definable columns in the Products table, and then provide a means for the client to "name" said columns, indicate their data type, and so on. There are quite a few commercial products (e.g., Act!) that use this approach.
Our team decided not to go that route; instead, we chose to model all the "columns" as rows. Then, we gathered the collection of rows corresponding to any given product and assembled them as virtual columns.
Here's a simple example to show you how to do this. We'll begin with a table called Things, and then add another table called Properties. Finally we'll build an associative table called ThingProperties:
CREATE TABLE [dbo].[things] (
[Thing] [varchar] (20) NOT NULL
ALTER TABLE [dbo].[things] WITH NOCHECK ADD
CONSTRAINT [PK_things] PRIMARY KEY CLUSTERED
CREATE TABLE [dbo].[properties] (
[Property] [varchar] (20) NOT NULL
ALTER TABLE [dbo].[properties] WITH NOCHECK ADD
CONSTRAINT [PK_properties] PRIMARY KEY CLUSTERED
if exists (select * from dbo.sysobjects where id =
object_id(N'[dbo].[thingproperties]') and OBJECTPROPERTY(id, N'IsUserTable') = 1)
drop table [dbo].[thingproperties]
CREATE TABLE [dbo].[thingproperties] (
[ThingPropertyID] [int] IDENTITY (1, 1) NOT NULL ,
[Thing] [varchar] (20) NOT NULL ,
[Property] [varchar] (20) NOT NULL ,
[Value] [varchar] (20) NOT NULL
) ON [PRIMARY]
We'll populate the Thing table with the values X, Y, and Z. Then, we'll populate the Properties table with the values COLOR, DESIGNER, SHAPE, and SIZE:
ThingPropertyID Thing Property Value
1 X COLOR GREEN
2 X COLOR RED
3 X SHAPE TRIANGLE
4 X SIZE MEDIUM
5 X SIZE SMALL
6 Y COLOR BLUE
7 Y COLOR GREEN
8 Y SHAPE CIRCLE
9 Y SIZE MEDIUM
10 Z COLOR BLUE
11 Z COLOR RED
12 Z SHAPE TRIANGLE
13 Z SIZE LARGE
14 Z SIZE SMALL
In a simple case, such as listing the properties and their values for any given thing, it's a simple SELECT. Let's suppose, however, that we want a list of all the small red triangular things. The table above reveals that only X and Y have the three desired property values. How do we get this list? The answer is to open the table once for each property that is specified, which in this case is three. And then, join the second and third instances to the first on the thing column:
q0.Thing, q0.Property, q0.Value,
ThingProperties AS q2
INNER JOIN (ThingProperties AS q0
INNER JOIN ThingProperties AS q1 ON q0.Thing = q1.Thing)
ON q2.Thing = q0.Thing
(((q0.Property)='COLOR') AND ((q0.Value)='RED') AND
((q1.Property)='SHAPE') AND ((q1.Value)='TRIANGLE') AND
((q2.Property)='SIZE') AND ((q2.Value)='SMALL'));
The great thing about this SQL statement is its formulaic nature. For each property value requested, open the table once. This statement is easy for a front-end application to manufacture. Alternatively, you could turn the statement into a collection of stored procedures, each taking n pairs of properties and values; then, have your front end invoke the appropriate one, depending on the number of pairs passed.
While not every application requires this kind of flexibility, if you find yourself having to deliver it, this is how to do it.