If we were to take normalization to its extreme, we would model all columns as rows, so we (or our client) could freely add columns to our tables. In practice, column specs may remain unchanged for years. But in some situations, you should think about representing these attributes as rows rather than columns. Let me present an example from a large database that I helped remodel. The application was an enterprise solution tailored to pulp and paper mills. Even though all clients are interested in certain attributes of a given product, each client might have an interest in an attribute that no one else cares about. One solution is to include a lot of user-definable columns in the Products table, and then provide a means for the client to "name" said columns, indicate their data type, and so on. There are quite a few commercial products (e.g., Act!) that use this approach. Our team decided not to go that route; instead, we chose to model all the "columns" as rows. Then, we gathered the collection of rows corresponding to any given product and assembled them as virtual columns. Here's a simple example to show you how to do this. We'll begin with a table called Things, and then add another table called Properties. Finally we'll build an associative table called ThingProperties:
We'll populate the Thing table with the values X, Y, and Z. Then, we'll populate the Properties table with the values COLOR, DESIGNER, SHAPE, and SIZE:
In a simple case, such as listing the properties and their values for any given thing, it's a simple SELECT. Let's suppose, however, that we want a list of all the small red triangular things. The table above reveals that only X and Y have the three desired property values. How do we get this list? The answer is to open the table once for each property that is specified, which in this case is three. And then, join the second and third instances to the first on the thing column:
The great thing about this SQL statement is its formulaic nature. For each property value requested, open the table once. This statement is easy for a front-end application to manufacture. Alternatively, you could turn the statement into a collection of stored procedures, each taking n pairs of properties and values; then, have your front end invoke the appropriate one, depending on the number of pairs passed. While not every application requires this kind of flexibility, if you find yourself having to deliver it, this is how to do it. |