Peter Brawley and Arthur FullerIt's a 10-step:
(lastname,firstname,SSN), Jane's primary key may change when she marries, messing up table relationships that depend on that key value. Note too that in some countries including Canada, certain external unique identifiers such as Social Insurance Number legally can be used only for purposes relating to the government.
Sometimes there's a combination of unique never-changing real-world column values
that will serve as an adequate primary key. More often the safest bet is a
auto_increment integer ID column. Using such a
unique auto-generated arbitrary number to identify all table rows
guarantees row uniqueness and removes the need ever to edit such primary key values.
In MySQL, declare such keys
integer_type unsigned primary key auto_increment
where integer_type is one of ...
4c: Make each table self-consistent. For example it seems obvious that a customer has an address. Unless the app will have homeless customers, that's usually the case. It seems natural then to give the customer table some address columns and leave it at that.
Combining addresses with other identifying info can be problematic. Changing an address doesn't change the customer; they're two different sorts of things. Many people and organisations have mutliple addresses (branch offices; school; summer, a particular job or project &c), not to mention the very affluent who have homes in several countries.
In database-speak, the relationship from customers to addresses is 1:many, so in many cases the customer table with address columns needs to become two tables: a customer table and an address table with a customer foreign key. Likewise an orders table with order item columns becomes an orders table and an order items table with an orders foreign key. Simpler, less redundant, more reportable, more maintainable.
4d: Give each child table a column that references its parent table,
to automate the database logic that keeps these table relationships intact. Such a column
is a foreign key. Do not carry IDs up to grandparents; that would introduce redundancy,
complicating data management and queries.
4e: Configure lookup tables not covered by steps 4c and 4d. A child can have multiple
parents. Any table in the chain may also belong to a table that's not in a main chain. For
example counties, schools and departments may each have a
that points to a computer services table, and a
staffID column pointing to a staff
CREATE TABLE school (
ID int auto_increment primary key,
name varchar(50) not null Default 'Not known',
universityID int not null references university(ID),
computerservicesID int not null references computerservices(ID),
deanstaffID int not null references staff(ID),
That's how we model schools that share a computer service, staff who work in multiple schools &c. The diagram begins to get more interesting ...
university <----- computerservices
| | |
school <---------------+ |
4f: Add bridge tables for many-many relationships: Students take multiple tests
and each test is taken by multiple students; that's a many:many relationship. To
implement it, you need a bridge table, a
students_testresults table that has
its own primary key column, plus a foreign key column pointing to a
students table row,
and a foreign key column pointing to a particular
testresults table row.
Steps 4a through 4f bring your database to Third Normal Form, roughly. We can get rid of yet more redundancy, e.g., by creating a
table, replacing the
name column in
schools with a
column that points to the
schoolnames table row for that name. Now no name is stored twice.
Whether to take normalisation that far depends on how important the redundancy is for queries, and for
available storage space. Third Normal Form usually suffices.
5 The Actions
In the Requirement (Step 1), underline or otherwise mark all the verbs to get a first outline of what actions will have to be performed on the Tables. Then organise those actions into query specifications (e.g., a shipping clerk begins by looking up an order by date, customer or order number).
6 Build the database from the normalised model
You no longer need to do this by hand, e.g., with MySQL, a button click gets it done in MySQL Workbench.
It's the first test of your model: does it parse, can your RDBMS actually buold the
database you defined? Inevitably, you'll find errors, inconsistencies, omissions. Rinse and repeat.
7 Seed the database with data
Populate your lookup tables, then insert test data into the database.
8 The Queries
This is the second real test of your database. Translate every Action from Step 5 into a query. Notice that this includes all anticipated reports!
9 Test loop and revise as need be
You'll find that Step 8 uncovers some aspects of the database model that need touching up. Loop back as far as need be to plug holes and correct errors.
Write no application code till you arrive here.
For more, see ...
Back to Artful ...Last updated 2 Jul 2020