Home All Groups Group Topic Archive Search About

Data Modeling: Variable Number of Attributes

Author
2 Sep 2006 5:19 AM
Richard S.
This post presents a relatively straight-forward problem and a proposed
solution. I'd appreciate your perspective and suggestions:

I'm working on the design of a new database for a medical clinic which must
track, amongst other things, EVENTS + EVENT_DETAILS for patients. The
database implements a 1:M relationship between PATIENTS and EVENTS.

The proposed EVENTS table would store things like:
   event_id,
   patient_id, (FK to PATIENTS table)
   event_datetime

Examples of events (to which event_id points in a separate data dictionary)
would include things like:
   "Inquiry"
   "Referral"
   "Intake"
   etc.

For any given patient, multiple events typically occur, and any event could
occur zero or more times.

THE CHALLENGE
The challenge I'm attempting to solve is the fact that, for any given event,
very different attributes are to be recorded. That is, each event has
qualitatively and quantitatively different attributes. We therefore do not
want to have a column for each possible "event attribute" in the Events
table or in an EVENT_DETAILS table.

PROPOSED SOLUTION
I am proposing an EVENT_DETAILS table would store attributes values about
each EVENT occurrence. It would look something like this:

EVENT_DETAILS
   event_id (FK to EVENTS table)
   attribute_id (FK to EventAttributes "data dictionary" table)
   attribute_value

This implements a 1:M relationship between EVENTS and EVENT_ATTRIBUTES.

You can see that this design would store the event detail (attribute) data
"vertically" - in one column (attribute_value); thereby enabling the
database to store an arbitrary number of attributes per EVENT.

NULL would [have to?] be implied by an absence of a row in the EVENT_DETAILS
table (for an event that occurred but for which an attribute value is
missing).

This design would enable us to easily add new EVENTS to the system without
having to modify our table structures or add new tables. This is an
important capability that we would like to have.

Thoughts? Opinions? Perspective? Any nods of approval? Or are we painting
ourselves into a corner?

Thanks!

Author
2 Sep 2006 6:39 AM
Anith Sen
http://tonyandrews.blogspot.com/2004/10/otlt-and-eav-two-big-design-mistakes.html

Essentially, based on your narratives, "events" are entity types and they
should be represented in a distinct table with its own attributes.

>> This design would enable us to easily add new EVENTS to the system
>> without having to modify our table structures or add new tables. This is
>> an important capability that we would like to have.

No, that is just a superficial flexibility. Quite often agile/scrum/oo-
crowd re-invent similar approaches under various "paradigms" &  O-R mapping
schemes. Many claim the resulting schema to be flexible by focusing only on
the structural aspects and ignoring the integrity concerns.

Apart from the obvious abuse of NULLs that result in the inability to handle
inapplicable values logically, there are several serious issues with your
proposal: absense of a valid predicate ( what does your event table mean?),
inability to declare appropriate contraints ( how do you make sure
EventAttributes are valid for a given event? ), overlapping data with
metadata ( an FK to data dictionary? what constrains the data in them ? )etc
will definitely cause more problems in the long run.

--
Anith
Author
2 Sep 2006 10:18 AM
Erland Sommarskog
Richard S. (A@B.COM) writes:
Show quote
> For any given patient, multiple events typically occur, and any event
> could occur zero or more times.
>
> THE CHALLENGE
> The challenge I'm attempting to solve is the fact that, for any given
> event, very different attributes are to be recorded. That is, each event
> has qualitatively and quantitatively different attributes. We therefore
> do not want to have a column for each possible "event attribute" in the
> Events table or in an EVENT_DETAILS table.
>
> PROPOSED SOLUTION
> I am proposing an EVENT_DETAILS table would store attributes values about
> each EVENT occurrence. It would look something like this:
>
> EVENT_DETAILS
>    event_id (FK to EVENTS table)
>    attribute_id (FK to EventAttributes "data dictionary" table)
>    attribute_value
>
> This implements a 1:M relationship between EVENTS and EVENT_ATTRIBUTES.
>
> You can see that this design would store the event detail (attribute) data
> "vertically" - in one column (attribute_value); thereby enabling the
> database to store an arbitrary number of attributes per EVENT.

There are situations where a design like this may be warranted, but
your short introduction of the situation makes me think that is not
one of them.

While this model adds some flexibiliy, it makes programming a lot more
difficult, and you also lose some safety. Consider this query fragment:

   WHERE Inquiry.Dcotor = @doctor

SQL Server will tell you that you misspelled Doctor. But if you have

   WHERE attribute_id = 'DCOTOR'

the query will silently return incorrect results.

We have this sort of design in some corners of our database, where we've
found that new attributes are added so frequently, that the cost for
maintaining the schema can not be defended. Typically, these attributes
are used only in a few places, or in very simple queries. From your
description, I get the impression that you could have quite a few queries
against these event details, and they could be fairly complex even
with a normal design. And with your design they would be even more
complex.

You should probably look into having a main event_details table with
common attributes, and subtables with event-specific atttributes.

--
Erland Sommarskog, SQL Server MVP, esq***@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx
Author
2 Sep 2006 3:08 PM
Richard S.
Thank you Anith and Erland for your thoughtful and helpful responses.
Author
2 Sep 2006 3:45 PM
Arnie Rowland
Expanding upon the previous comments, it is quite possible that a
combination of approaches may be needed.

For example: (Quick and NOT well thought out)

Patient comes in for Exam and Tests

Event: Exam Appt
Event: Lab Tests

Event Details: Time, Who, Vitals Taken, etc
Event Details: Time, Who, Visit with Dr,etc
Event Details: Time, Who, Treadmill

Lab Tests:
Blood, Has Blue
Pulse, Still pumps
LDH, too low
HDH, too high

The lab Tests table may benefit from name=value pairing rather than trying
to determine a column for every type of tests -with more to come in the
future.

--
Arnie Rowland, Ph.D.
Westwood Consulting, Inc

Most good judgment comes from experience.
Most experience comes from bad judgment.
- Anonymous


Show quote
"Erland Sommarskog" <esq***@sommarskog.se> wrote in message
news:Xns98327D37ECE56Yazorman@127.0.0.1...
> Richard S. (A@B.COM) writes:
>> For any given patient, multiple events typically occur, and any event
>> could occur zero or more times.
>>
>> THE CHALLENGE
>> The challenge I'm attempting to solve is the fact that, for any given
>> event, very different attributes are to be recorded. That is, each event
>> has qualitatively and quantitatively different attributes. We therefore
>> do not want to have a column for each possible "event attribute" in the
>> Events table or in an EVENT_DETAILS table.
>>
>> PROPOSED SOLUTION
>> I am proposing an EVENT_DETAILS table would store attributes values about
>> each EVENT occurrence. It would look something like this:
>>
>> EVENT_DETAILS
>>    event_id (FK to EVENTS table)
>>    attribute_id (FK to EventAttributes "data dictionary" table)
>>    attribute_value
>>
>> This implements a 1:M relationship between EVENTS and EVENT_ATTRIBUTES.
>>
>> You can see that this design would store the event detail (attribute)
>> data
>> "vertically" - in one column (attribute_value); thereby enabling the
>> database to store an arbitrary number of attributes per EVENT.
>
> There are situations where a design like this may be warranted, but
> your short introduction of the situation makes me think that is not
> one of them.
>
> While this model adds some flexibiliy, it makes programming a lot more
> difficult, and you also lose some safety. Consider this query fragment:
>
>   WHERE Inquiry.Dcotor = @doctor
>
> SQL Server will tell you that you misspelled Doctor. But if you have
>
>   WHERE attribute_id = 'DCOTOR'
>
> the query will silently return incorrect results.
>
> We have this sort of design in some corners of our database, where we've
> found that new attributes are added so frequently, that the cost for
> maintaining the schema can not be defended. Typically, these attributes
> are used only in a few places, or in very simple queries. From your
> description, I get the impression that you could have quite a few queries
> against these event details, and they could be fairly complex even
> with a normal design. And with your design they would be even more
> complex.
>
> You should probably look into having a main event_details table with
> common attributes, and subtables with event-specific atttributes.
>
> --
> Erland Sommarskog, SQL Server MVP, esq***@sommarskog.se
>
> Books Online for SQL Server 2005 at
> http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx
> Books Online for SQL Server 2000 at
> http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx

AddThis Social Bookmark Button