EAV designs and queries

from the Artful MySQL Tips List


EAV systems are meant to model sparse databases--systems where the number of possible attributes for any entity overwhelms the number of attribute values. A typical example is a medical records system. For general introductions to EAV systems see

http://ycmi.med.yale.edu/nadkarni/Introduction%20to%20EAV%20systems.htm
http://en.wikipedia.org/wiki/Entity-Attribute-Value_model
http://www.devshed.com/c/a/MySQL/Database-Design-Using-KeyValue-Tables

For more discussion on EAV queries in general, see:

http://www.pubmedcentral.nih.gov/articlerender.fcgi?tool=pubmed&pubmedid=9824799

EAV queries are notorious for difficulty and inefficiency, but here is a delightfully clear EAV example posted in the MySQL Newbie forum by a user with the monicker "laptop alias". You track orders and their options in this single EAV table:


CREATE TABLE order_info( 

  order_id INT NOT NULL,

  product_id INT NOT NULL,

  option_attribute VARCHAR(20),

  option_value VARCHAR(20),

  PRIMARY KEY(order_id,product_id,option_attribute,option_value)

);

INSERT INTO order_info VALUES 

(1,10,'Colour','Red'),(1,10,'Size','Large'),(1,10,'Sleeve','Short'),

(2,10,'Colour','Blue'),(2,10,'Size','Small'),(2,10,'Sleeve','Long');



How would you report that data with options for each order on one line?


SELECT 

  order_id   AS `Order`,

  product_id AS Product,

  MAX(CASE WHEN option_attribute = 'Colour' THEN option_value END) AS Colour, 

  MAX(CASE WHEN option_attribute = 'Size'   THEN option_value END) AS Size,

  MAX(CASE WHEN option_attribute = 'Sleeve' THEN option_value END) AS Sleeve

FROM order_info 

GROUP BY order_id, product_id;

+-------+---------+--------+-------+--------+

| Order | Product | Colour | Size  | Sleeve |

+-------+---------+--------+-------+--------+

|     1 |      10 | Red    | Large | Short  |

|     2 |      10 | Blue   | Small | Long   |

+-------+---------+--------+-------+--------+



The unmentioned trick is that to write this query, you need the results of ...


SELECT DISTINCT option_attribute FROM order_info;

+------------------+

| option_attribute |

+------------------+

| Colour           |

| Size             |

| Sleeve           |

+------------------+



Here is another small example, this one using nested queries, that we ported to MySQL from SQL Server. The problem is to write a query for these two simple EAV tables ...


DROP TABLE IF EXISTS tracking,trackingdata;

CREATE TABLE tracking (

  id int primary key auto_increment,

  keyname char(16),

  keyvalue char(32) 

);

insert into tracking VALUES (1,'LOCATION','Bedroom');

insert into tracking VALUES (2,'LOCATION','Dining Room');

insert into tracking VALUES (3,'LOCATION','Bathroom');

insert into tracking VALUES (4,'LOCATION','courtyard');

insert into tracking VALUES (5,'EVENT',   'verbal aggression');

insert into tracking VALUES (6,'EVENT',   'peer');

insert into tracking VALUES (7,'EVENT',   'bad behavior');

insert into tracking VALUES (8,'EVENT',   'other');



CREATE TABLE trackingdata (

 id int primary key auto_increment,

 bdsid int,

 keyname char(16),

 keyvalue char(128)

);

insert into trackingdata VALUES (1, 1, 'LOCATION', 'Bedroom');

insert into trackingdata VALUES (2, 1, 'EVENT',    'other');

insert into trackingdata VALUES (3, 1, 'EVENT',    'bad behavior');

insert into trackingdata VALUES (4, 2, 'LOCATION', 'Bedroom');

insert into trackingdata VALUES (5, 2, 'EVENT',    'other');

insert into trackingdata VALUES (6, 2, 'EVENT',    'verbal aggression');

insert into trackingdata VALUES (7, 3, 'LOCATION', 'courtyard');

insert into trackingdata VALUES (8, 3, 'EVENT',    'other');

insert into trackingdata VALUES (9, 3, 'EVENT',    'peer');



... which lists all defined events at all defined locations with their counts:


Select 

  Locations.LocationValue As Location, 

  Events.EventValue As Event,

  (

    Select Count(*) From (

      Select LocationData.LocationValue, EventData.EventValue

      From (

        Select TD1.bdsid, TD1.keyvalue As LocationValue

        From TrackingData As TD1

        Where TD1.keyname = 'LOCATION'

      ) As LocationData

      Join (

        Select TD2.bdsid, TD2.keyvalue As EventValue

        From TrackingData As TD2

        Where TD2.keyname = 'EVENT'

      ) As EventData On LocationData.bdsId = EventData.bdsId

    ) As CollatedEventData

    Where CollatedEventData.LocationValue = Locations.LocationValue

      And CollatedEventData.EventValue = Events.EventValue

  ) AS Count

From (

  Select T1.keyvalue As LocationValue

  From Tracking As T1

  Where T1.keyname = 'LOCATION'

) As Locations,

(

  Select T2.keyvalue As EventValue

  From Tracking As T2

  Where T2.keyname = 'EVENT'

) As Events

Order By Locations.LocationValue, Events.EventValue 



If this doesn't discourage you, frequent MySQL forum contributor Rick James has assembled a list of relevant forum posts on EAV:
http://forums.mysql.com/read.php?10,355723 (Searching by using an index table, efficient queries?)
http://forums.mysql.com/read.php?125,354578,354578
http://forums.mysql.com/read.php?10,248799,249039
http://forums.mysql.com/read.php?10,359004 (GROUP_CONCAT question)
http://forums.mysql.com/read.php?10,358797 (joined query help needed)
http://forums.mysql.com/read.php?10,358687 (Update query)
http://forums.mysql.com/read.php?10,358153 (Sub-request with multiple result as condition)
http://forums.mysql.com/read.php?10,358143 (two tables, one table has value of 2nd table name)
http://forums.mysql.com/read.php?10,354528 (SQL long Join statement and performance)
http://forums.mysql.com/read.php?10,304620 (Query problem)
http://forums.mysql.com/read.php?10,293310 (Need select query for eav modeling)
http://forums.mysql.com/read.php?10,293211 (UPDATE TABLE with TRIM() => Duplicate key problems)
http://forums.mysql.com/read.php?20,292439 (SQL Syntax)
http://forums.mysql.com/read.php?10,290756 (combining multiple rows)
http://forums.mysql.com/read.php?10,289808 (Export Data from multiple tables and Later export it to CSV)
http://forums.mysql.com/read.php?10,289079 (Query between 2 related tables - duplicated records in the result set for every record in the 2nd table that it matches.)
http://forums.mysql.com/read.php?10,288978 (Need help with advanced Joins)

Return to the Artful MySQL Tips page