Normalisation: fine points

from the Artful MySQL Tips List

1NF = an unordered table of unordered atomic columns with no repeat rows

2NF = 1NF + all nonkey columns are fully functionally dependent on the PK

3NF = 2NF + all nonkey columns are mutually independent

BCNF = 3NF + every determinant (column on which some other column depends) is a candidate key (rowwise unique column combination, a column or column combo with no nulls & no dupes)

4NF = 3NF + all multivalued dependencies (MVDs) but one are projected to new tables (MVD = cols B & C depend on A but C does not depend on B).

5NF = 4NF + no cyclic dependencies ("join dependencies", JD).

Suppose you project a table with cyclic dependencies to N new tables, ie keep doing projections till there are no more non-key dependencies. N-decomposable means that having done these projections, recomposing the original table from the children gets you exactly the original table.

Another way of presenting it: cyclic dependency or JD exists in a table if for any i and j, if

    (Ai, Bi, Cj), eg A=Imre B=Izzy C=Jussi
    (Ai, Bj, Ci), eg A=Imre B=Jock C=Iago
    (Aj, Bi, Ci), eg A=Jan  B=Izzy C=Iago
all occur, then
    (Ai, Bi, Ci ) eg A=Imre B=Izzy C=Iago
also occurs, without which projection followed by reconstitution would create spurious rows.

N-decomposability is clarified by pointing at where it isn't. Consider this cribbed example:

  PID      SID      Course
  Emile    Wendy    CSC330
  Emile    Joan     CSC111
  Merrie   Wendy    CSC111

Now projecting PSC to obtain PS(PID,SID), SC(SID,Course), and PC(PID,Course), ie every projection of degree 2, we get:

       PS                SC                  PC
  PID     SID        SID    Course      PID      Course
  Emile   Wendy      Wendy  CSC330      Emile    CSC330
  Emile   Joan       Joan   CSC111      Emile    CSC111
  Merrie  Wendy      Wendy  CSC111      Merrie   CSC111

But neither the projections PS and SC (nor any two of them) are equivalent to PSC: joining them over the common domain SID yields mythical rows (marked *):

            PS_SC                    PS_SC_PC
    ~~~~~~~~~~~~~~~~~~~~~      ~~~~~~~~~~~~~~~~~~~~~~
    PID     SID    Course      PID     SID     Course
    Emile   Wendy  CSC330      Emile   Wendy   CSC330
    Emile   Wendy  CSC111*     Emile   Wendy   CSC111*
    Emile   Joan   CSC111      Emile   Joan    CSC111
    Merrie  Wendy  CSC330*     Merrie  Wendy   CSC111
    Merrie  Wendy  CSC111
Bottom line:
    BCNF = every FD is a consequence of the candidate keys
    4NF = every MVD is a consequence of the candidate keys
    5NF = every JD is a consequence of the candidate keys

Last updated 23 Nov 2021

Return to the Artful MySQL Tips page