Home All Groups Group Topic Archive Search About

Give me your Thoughts-Storing format in database

Author
17 Jun 2006 1:43 PM
Johnny
Hi all,

I have the following tables:

NumberKey
ID
Name

NumberValue
ID
NumberKey_ID
Value

The values stored in the NumberValue table could be currency,
percentage, or number that require 7 digits to the right of the decimal
place, etc., etc. (This table is not very big, so I'm storing many
different types of numbers here, and not worrying about the most
efficient data types).  I am toying with the idea of adding a format
field to the NumberKey table to use later in the UI to format the Value
field in NumberValue.  I am envisioning using percentage, currency,
etc. as string so the UI knows what format function to use to display
the value in a text box, etc. etc.

Does anyone have any experience with this?  Is this even a good idea?
Should I let business logic handle this?

Thanks,
Johnny

Author
17 Jun 2006 2:53 PM
David Portas
Johnny wrote:
Show quote
> Hi all,
>
> I have the following tables:
>
> NumberKey
> ID
> Name
>
> NumberValue
> ID
> NumberKey_ID
> Value
>
> The values stored in the NumberValue table could be currency,
> percentage, or number that require 7 digits to the right of the decimal
> place, etc., etc. (This table is not very big, so I'm storing many
> different types of numbers here, and not worrying about the most
> efficient data types).  I am toying with the idea of adding a format
> field to the NumberKey table to use later in the UI to format the Value
> field in NumberValue.  I am envisioning using percentage, currency,
> etc. as string so the UI knows what format function to use to display
> the value in a text box, etc. etc.
>
> Does anyone have any experience with this?  Is this even a good idea?
> Should I let business logic handle this?
>
> Thanks,
> Johnny

I'd say that design is a bigger issue for you than formatting. Do you
know what a domain is and what Normal Forms are? With a better design I
think your formatting issue will become a non-issue.

--
David Portas, SQL Server MVP

Whenever possible please post enough code to reproduce your problem.
Including CREATE TABLE and INSERT statements usually helps.
State what version of SQL Server you are using and specify the content
of any error messages.

SQL Server Books Online:
http://msdn2.microsoft.com/library/ms130214(en-US,SQL.90).aspx
--
Author
17 Jun 2006 3:19 PM
David Browne
Show quote
"Johnny" <jmered***@gmail.com> wrote in message
news:1150551794.449752.67970@c74g2000cwc.googlegroups.com...
> Hi all,
>
> I have the following tables:
>
> NumberKey
> ID
> Name
>
> NumberValue
> ID
> NumberKey_ID
> Value
>
> The values stored in the NumberValue table could be currency,
> percentage, or number that require 7 digits to the right of the decimal
> place, etc., etc. (This table is not very big, so I'm storing many
> different types of numbers here, and not worrying about the most
> efficient data types).  I am toying with the idea of adding a format
> field to the NumberKey table to use later in the UI to format the Value
> field in NumberValue.  I am envisioning using percentage, currency,
> etc. as string so the UI knows what format function to use to display
> the value in a text box, etc. etc.
>
> Does anyone have any experience with this?  Is this even a good idea?
> Should I let business logic handle this?
>

You are only considering this because your tables don't model anything
useful.  Why are there different kinds of numbers in the same column?

David
Author
17 Jun 2006 5:51 PM
--CELKO--
>> The values stored in the NumberValue table could be currency, percentage, or number that require 7 digits to the right of the decimal  place, etc., etc. <<

Then **by definition** this is NOT a table.  The mi9stake you are
making is so bad and so popular among non_SQL programemrs it has a name
--  EAV (Entity-Attribute-Value) modeling.

A table **by definition** is a set of things of the same kind.  As
Aristlotle put it, "To be is to be something in particular; to be
nothing in particular or to be everythig gin general is to be nothing."


>>  (This table is not very big, so I'm storing many different types of numbers here, and not worrying about the most efficient data types). <<

You have no idea what data integrity is.

>> I am toying with the idea of adding a format field [sic] to the NumberKey table to use later in the UI to format the Value field [sic] in NumberValue.  <<

You do not know that a field and column are different concepts.  Fields
can change type and format; columns do not.  COBOL programmers often
want to do this because that is how their language works.

Why are you formatting data in the back end?  The basic principle of a
tiered architecture is that display is done in the front end and never
in the back end.  This a more basic programming principle than just SQL
and RDBMS.

>>  Does anyone have any experience with this? <<

Yes.  I bill $1000 per day plus expenses to repair these systems.  I
have found that they get corrupted in about one year in production, and
corrupted beyond repair in less than two years.

I found an old "cut & paste".  Someone like you posted this:

CREATE TABLE EAV -- no key declared
(key_col VARCHAR (10) NULL,
attrib_value VARCHAR (50) NULL);

INSERT INTO EAV VALUES ('LOCATION','Bedroom');
INSERT INTO EAV VALUES ('LOCATION','Dining Room');
INSERT INTO EAV VALUES ('LOCATION','Bathroom');
INSERT INTO EAV VALUES ('LOCATION','courtyard');
INSERT INTO EAV VALUES ('EVENT','verbal aggression');
INSERT INTO EAV VALUES ('EVENT','peer');
INSERT INTO EAV VALUES ('EVENT','bad behavior');
INSERT INTO EAV VALUES ('EVENT','other');

CREATE TABLE EAV_DATA  -note lack of constraints, defaults, DRI
(id INTEGER IDENTITY (1,1) NOT NULL,
bts_id INTEGER NULL,
key_col VARCHAR (10) NULL,
attrib_value VARCHAR (50) NULL );

INSERT INTO EAV_DATA VALUES (1, 'LOCATION', 'Bedroom');
INSERT INTO EAV_DATA VALUES (1, 'EVENT', 'other');
INSERT INTO EAV_DATA VALUES (1, 'EVENT', 'bad behavior');
INSERT INTO EAV_DATA VALUES (2, 'LOCATION', 'Bedroom');
INSERT INTO EAV_DATA VALUES (2, 'EVENT', 'other');
INSERT INTO EAV_DATA VALUES (2, 'EVENT', 'verbal aggression');
INSERT INTO EAV_DATA VALUES (3, 'LOCATION', 'courtyard');
INSERT INTO EAV_DATA VALUES (3, 'EVENT', 'other');
INSERT INTO EAV_DATA VALUES (3, 'EVENT', 'peer');

Ideally, the result set of the query would be Location Event count
(headings if possible)

Bedroom verbal aggression 1
Bedroom peer 0
Bedroom bad behavior 0
Bedroom other 2
Dining Room verbal aggression 0
Dining Room peer 0
Dining Room bad behavior 0
Dining Room other 0
Bathroom verbal aggression 0
Bathroom peer 0
Bathroom bad behavior 0
Bathroom other 0
courtyard verbal aggression 0
courtyard peer 1
courtyard bad behavior 0
courtyard other 1

Also, if possible, another query would return this result set. (I think
I know how to do this one.)

Location Event count
Bedroom verbal aggression 1
Bedroom other 2
courtyard peer 1
courtyard other 1

Here is an answer From: Thomas Coleman

SELECT Locations.locationvalue, Events.eventvalue,
       (SELECT COUNT(*)
          FROM (SELECT LocationData.locationvalue, EventData.eventvalue

                  FROM (SELECT TD1.bts_id, TD1.value AS locationvalue
                          FROM eav_data AS TD1
                         WHERE TD1.key = 'location') AS LocationData
               INNER JOIN
               (SELECT TD2.bts_id, TD2.value AS eventvalue
              FROM eav_data AS TD2
              WHERE TD2.key = 'event'
            ) AS EventData
            ON LocationData.bts_id = EventData.bts_id
      ) AS CollatedEventData
    WHERE CollatedEventData.locationvalue = Locations.locationvalue
      AND CollatedEventData.eventvalue = Events.eventvalue
FROM (SELECT T1.value AS locationvalue
      FROM EAV AS T1
      WHERE T1.key = 'location') AS Locations,
   (SELECT T2.value AS eventvalue
      FROM EAV AS T2
     WHERE T2.key = 'event') AS Events
ORDER BY Locations.locationvalue, Events.eventvalue ,
SELECT Locations.locationvalue, Events.eventvalue
       (SELECT COUNT(*)
          FROM (SELECT LocationData.locationvalue, EventData.eventvalue

                  FROM (SELECT TD1.bts_id, TD1.value AS locationvalue
                          FROM eav_data AS TD1
                         WHERE TD1.key = 'location') AS LocationData
              INNER JOIN
              (SELECT TD2.bts_id, TD2.value AS eventvalue
                 FROM eav_data AS TD2
                WHERE TD2.key = 'event') AS EventData
              ON LocationData.bts_id = EventData.bts_id)
              AS CollatedEventData
    WHERE CollatedEventData.locationvalue = Locations.locationvalue
      AND CollatedEventData.eventvalue = Events.eventvalue)
FROM (SELECT T1.value AS locationvalue
        FROM EAV AS T1
      WHERE T1.key = 'location') AS Locations,
    (SELECT T2.value AS eventvalue
       FROM EAV AS T2
       WHERE T2.key = 'event') AS Events;

Is the same thing in a proper schema as:

SELECT L.locationvalue, E.eventvalue, COUNT(*)
  FROM Locations AS L, Events AS E
WHERE L.btd_id = E.btd_id
GROUP BY L.locationvalue, E.eventvalue;

The reason that I had to use so many subqueries is that those entities
are all lopped into the same table. There should be separate tables for
Locations and Events.

The column names are seriously painful. Beyond the fact that I
personally hate underscores in column names, using underscores at the
end of the column name is really non-intuitive. I removed them for my
example and came across the next column name faux pas. Don't use "key"
and "value" for column names. It means that the developer *has*
surround the column name with square brackets for everything which is a
serious pain.

There is such a thing as "too" generic. There has to be some structure
or everything becomes nothing more than a couple of tables called
"things". The real key (no pun intended) is commonality. Is there a
pattern to the data that they want to store? It may not be possible to
create one structure to rule them all and in the darkness bind them.

"To be is to be something in particular; to be nothing in particular is
to be nothing." --Aristole

All data integrity is destroyed. Any typo becomes a new attribute or
entity. Entities are found missing attributes, so all the reports are
wrong.

ry to write a single CHECK() constraint that works for all the
attributes of those 30+ entities your users created because you were
too dumb or too lazy to do your job. It can be done! You need a case
expression almost 70 WHEN clauses for a simple invoice and order system
when I tried it as an exercise.

ry to write a single DEFAULT clause for 30+ entities crammed into one
column. Impossible!

Try to set up DRI actions among the entities. If you thought the WHEN
clauses in the single CASE expression were unmaintainable, wait until
you see the "TRIGGERs from Hell" -- Too bad that they might not fit
into older SQL Server which had some size limits. Now maintain it.

For those who are interested, there are couple of links to articles I
found on the net:

Generic Design of Web-Based Clinical Databases
http://www.jmir.org/2003/4/e27­/

The EAV/CR Model of Data Representation
http://ycmi.med.yale.edu/nadka­rni/eav_CR_contents.htm

An Introduction to Entity-Attribute-Value Design for Generic
Clinical Study Data Management Systems
http://ycmi.med.yale.edu/nadka­rni/Introduction%20to%20EAV%20­systems.htm


Data Extraction and Ad Hoc Query of an Entity- Attribute- Value
Database
http://www.pubmedcentral.nih.g­ov/articlerender.fcgi?tool=pub­med&pubme...


Exploring Performance Issues for a Clinical Database Organized Using
an Entity-Attribute-Value Representation
http://www.pubmedcentral.nih.g­ov/articlerender.fcgi?tool=pub­med&pubme...






Show quote
>> Is this even a good idea? <<
Author
17 Jun 2006 6:09 PM
Earl
You forgot to answer the last question ;=)

"--CELKO--" <jcelko***@earthlink.net> wrote in message
news:1150566682.741491.323730@f6g2000cwb.googlegroups.com...
>> The values stored in the NumberValue table could be currency, percentage,
>> or number that require 7 digits to the right of the decimal  place, etc.,
>> etc. <<

Then **by definition** this is NOT a table.  The mi9stake you are
making is so bad and so popular among non_SQL programemrs it has a name
--  EAV (Entity-Attribute-Value) modeling.

A table **by definition** is a set of things of the same kind.  As
Aristlotle put it, "To be is to be something in particular; to be
nothing in particular or to be everythig gin general is to be nothing."


>>  (This table is not very big, so I'm storing many different types of
>> numbers here, and not worrying about the most efficient data types). <<

You have no idea what data integrity is.

>> I am toying with the idea of adding a format field [sic] to the NumberKey
>> table to use later in the UI to format the Value field [sic] in
>> NumberValue.  <<

You do not know that a field and column are different concepts.  Fields
can change type and format; columns do not.  COBOL programmers often
want to do this because that is how their language works.

Why are you formatting data in the back end?  The basic principle of a
tiered architecture is that display is done in the front end and never
in the back end.  This a more basic programming principle than just SQL
and RDBMS.

>>  Does anyone have any experience with this? <<

Yes.  I bill $1000 per day plus expenses to repair these systems.  I
have found that they get corrupted in about one year in production, and
corrupted beyond repair in less than two years.

I found an old "cut & paste".  Someone like you posted this:

CREATE TABLE EAV -- no key declared
(key_col VARCHAR (10) NULL,
attrib_value VARCHAR (50) NULL);

INSERT INTO EAV VALUES ('LOCATION','Bedroom');
INSERT INTO EAV VALUES ('LOCATION','Dining Room');
INSERT INTO EAV VALUES ('LOCATION','Bathroom');
INSERT INTO EAV VALUES ('LOCATION','courtyard');
INSERT INTO EAV VALUES ('EVENT','verbal aggression');
INSERT INTO EAV VALUES ('EVENT','peer');
INSERT INTO EAV VALUES ('EVENT','bad behavior');
INSERT INTO EAV VALUES ('EVENT','other');

CREATE TABLE EAV_DATA  -note lack of constraints, defaults, DRI
(id INTEGER IDENTITY (1,1) NOT NULL,
bts_id INTEGER NULL,
key_col VARCHAR (10) NULL,
attrib_value VARCHAR (50) NULL );

INSERT INTO EAV_DATA VALUES (1, 'LOCATION', 'Bedroom');
INSERT INTO EAV_DATA VALUES (1, 'EVENT', 'other');
INSERT INTO EAV_DATA VALUES (1, 'EVENT', 'bad behavior');
INSERT INTO EAV_DATA VALUES (2, 'LOCATION', 'Bedroom');
INSERT INTO EAV_DATA VALUES (2, 'EVENT', 'other');
INSERT INTO EAV_DATA VALUES (2, 'EVENT', 'verbal aggression');
INSERT INTO EAV_DATA VALUES (3, 'LOCATION', 'courtyard');
INSERT INTO EAV_DATA VALUES (3, 'EVENT', 'other');
INSERT INTO EAV_DATA VALUES (3, 'EVENT', 'peer');

Ideally, the result set of the query would be Location Event count
(headings if possible)

Bedroom verbal aggression 1
Bedroom peer 0
Bedroom bad behavior 0
Bedroom other 2
Dining Room verbal aggression 0
Dining Room peer 0
Dining Room bad behavior 0
Dining Room other 0
Bathroom verbal aggression 0
Bathroom peer 0
Bathroom bad behavior 0
Bathroom other 0
courtyard verbal aggression 0
courtyard peer 1
courtyard bad behavior 0
courtyard other 1

Also, if possible, another query would return this result set. (I think
I know how to do this one.)

Location Event count
Bedroom verbal aggression 1
Bedroom other 2
courtyard peer 1
courtyard other 1

Here is an answer From: Thomas Coleman

SELECT Locations.locationvalue, Events.eventvalue,
       (SELECT COUNT(*)
          FROM (SELECT LocationData.locationvalue, EventData.eventvalue

                  FROM (SELECT TD1.bts_id, TD1.value AS locationvalue
                          FROM eav_data AS TD1
                         WHERE TD1.key = 'location') AS LocationData
               INNER JOIN
               (SELECT TD2.bts_id, TD2.value AS eventvalue
              FROM eav_data AS TD2
              WHERE TD2.key = 'event'
            ) AS EventData
            ON LocationData.bts_id = EventData.bts_id
      ) AS CollatedEventData
    WHERE CollatedEventData.locationvalue = Locations.locationvalue
      AND CollatedEventData.eventvalue = Events.eventvalue
FROM (SELECT T1.value AS locationvalue
      FROM EAV AS T1
      WHERE T1.key = 'location') AS Locations,
   (SELECT T2.value AS eventvalue
      FROM EAV AS T2
     WHERE T2.key = 'event') AS Events
ORDER BY Locations.locationvalue, Events.eventvalue ,
SELECT Locations.locationvalue, Events.eventvalue
       (SELECT COUNT(*)
          FROM (SELECT LocationData.locationvalue, EventData.eventvalue

                  FROM (SELECT TD1.bts_id, TD1.value AS locationvalue
                          FROM eav_data AS TD1
                         WHERE TD1.key = 'location') AS LocationData
              INNER JOIN
              (SELECT TD2.bts_id, TD2.value AS eventvalue
                 FROM eav_data AS TD2
                WHERE TD2.key = 'event') AS EventData
              ON LocationData.bts_id = EventData.bts_id)
              AS CollatedEventData
    WHERE CollatedEventData.locationvalue = Locations.locationvalue
      AND CollatedEventData.eventvalue = Events.eventvalue)
FROM (SELECT T1.value AS locationvalue
        FROM EAV AS T1
      WHERE T1.key = 'location') AS Locations,
    (SELECT T2.value AS eventvalue
       FROM EAV AS T2
       WHERE T2.key = 'event') AS Events;

Is the same thing in a proper schema as:

SELECT L.locationvalue, E.eventvalue, COUNT(*)
  FROM Locations AS L, Events AS E
WHERE L.btd_id = E.btd_id
GROUP BY L.locationvalue, E.eventvalue;

The reason that I had to use so many subqueries is that those entities
are all lopped into the same table. There should be separate tables for
Locations and Events.

The column names are seriously painful. Beyond the fact that I
personally hate underscores in column names, using underscores at the
end of the column name is really non-intuitive. I removed them for my
example and came across the next column name faux pas. Don't use "key"
and "value" for column names. It means that the developer *has*
surround the column name with square brackets for everything which is a
serious pain.

There is such a thing as "too" generic. There has to be some structure
or everything becomes nothing more than a couple of tables called
"things". The real key (no pun intended) is commonality. Is there a
pattern to the data that they want to store? It may not be possible to
create one structure to rule them all and in the darkness bind them.

"To be is to be something in particular; to be nothing in particular is
to be nothing." --Aristole

All data integrity is destroyed. Any typo becomes a new attribute or
entity. Entities are found missing attributes, so all the reports are
wrong.

ry to write a single CHECK() constraint that works for all the
attributes of those 30+ entities your users created because you were
too dumb or too lazy to do your job. It can be done! You need a case
expression almost 70 WHEN clauses for a simple invoice and order system
when I tried it as an exercise.

ry to write a single DEFAULT clause for 30+ entities crammed into one
column. Impossible!

Try to set up DRI actions among the entities. If you thought the WHEN
clauses in the single CASE expression were unmaintainable, wait until
you see the "TRIGGERs from Hell" -- Too bad that they might not fit
into older SQL Server which had some size limits. Now maintain it.

For those who are interested, there are couple of links to articles I
found on the net:

Generic Design of Web-Based Clinical Databases
http://www.jmir.org/2003/4/e27­/

The EAV/CR Model of Data Representation
http://ycmi.med.yale.edu/nadka­rni/eav_CR_contents.htm

An Introduction to Entity-Attribute-Value Design for Generic
Clinical Study Data Management Systems
http://ycmi.med.yale.edu/nadka­rni/Introduction%20to%20EAV%20­systems.htm


Data Extraction and Ad Hoc Query of an Entity- Attribute- Value
Database
http://www.pubmedcentral.nih.g­ov/articlerender.fcgi?tool=pub­med&pubme...


Exploring Performance Issues for a Clinical Database Organized Using
an Entity-Attribute-Value Representation
http://www.pubmedcentral.nih.g­ov/articlerender.fcgi?tool=pub­med&pubme...






Show quote
>> Is this even a good idea? <<
Author
18 Jun 2006 1:22 PM
Johnny
Hi,

Basic Problem:
I have a ledger in an accounting system.  I need to store information
about the ledger.  The information could be numbers, strings, or dates.
I need to be able to add any of these value types in the future with
no trouble (i.e. - the UI should allow addition of items to track, and
if the user does not populate the information for every ledger, NULL is
acceptable.)

More Details:
You all are right, I am not a database design person (please don't snip
this and say duh!).  Here's the problem I'm trying to solve.  I'm
trying to come up with a way to store what I will call metadata about
ledgers in an accounting system.  I don't know up front what type of
data we will need to track (I know most of it, but things change; for
example, what happens when legacy systems are replaced (already
happened twice on my watch and broke a department full of spreadsheets
and manual processes).)  Anyway, I need a design where the "things" I'm
storing can be unknown upfront, and furthermore, if the user wants to
create new things to track, the design should allow this to be possible
without changes to the tables.  All of the "real world" items like
Invoices, Employees, Customers, Journal Entries, etc. are in the ledger
databases.  Said differently, I want the user to be able to add fields
without a single change to the database.  These pieces of data can be
strings, dates, numbers, etc.  I don't want to assume I know the value
type of the piece of data either.

I 100% agree that sometimes things can get too generic.  Maybe what I'm
trying to do is a pipe dream.  In this case, so far, the more specifc I
get, the more changes I need to make to the database when I discover
new requirements.  I don't think this is laziness, I think it is
reality.  Requirements are almost non-discoverable when you get zero
budget (I mean money and time).

Does anyone have experience with this sort of problem?  Let me also put
the question another way:  I'm trying to store lists of things that
relate to ledgers (entities in the EAV example Celko describes) and I
don't know ANYTHING about what will be in the lists up front.  Also, I
don't want lists to map to tables because that means I will need to add
tables later.  And also there will not be 100,000 lists.  There may be
more like 50 lists.

What is an alternative to the EAV design if you don't know the specific
"things" that are getting stored in the table up front?  How do I
prevent having a bunch of tables and having to add tables when the data
requirements change (my other ideas led me to this result)?

Thanks,
Johnny
Author
18 Jun 2006 2:55 PM
David Browne
Show quote
"Johnny" <jmered***@gmail.com> wrote in message
news:1150636975.417715.8470@f6g2000cwb.googlegroups.com...
> Hi,
>
> Basic Problem:
> I have a ledger in an accounting system.  I need to store information
> about the ledger.  The information could be numbers, strings, or dates.
> I need to be able to add any of these value types in the future with
> no trouble (i.e. - the UI should allow addition of items to track, and
> if the user does not populate the information for every ledger, NULL is
> acceptable.)
>
.. . .


That sounds like a valid use of the EAV pattern: storing additional metadata
about an existing schema.

Consider a batch process that interrogates your database and populates some
lookup tables with the table names, column names and types.  This will both
make your UI easier and give you better integrity.

And consider posting DDL for your accounting schema and a detailed
explanation of a particuar use case, I'm still not quite sure what you're
trying to acomplish.

David
Author
18 Jun 2006 3:03 PM
Mike C#
I would have thought that Accounting data requirements and format are fairly
well defined by now.  After all, the Phoenecians were doing this thousands
of years ago...  One of the major problems with EAV (apart from not
following any of the principles of good relational database design) is that
you cannot create proper constraints if you don't know the data types of the
columns (items you are storing) in advance.  You also severely complicate
check constraints.  This method effectively bypasses any built-in SQL type
checks since you don't know in advance if you're checking for a valid date,
a valid integer, a valid numeric value, etc.  You also thoroughly complicate
and slow down queries which are the basis of report generation.  Finally
it's a very wasteful methodology since you have to effectively store
everything as a string and either explicitly or implicitly convert it to the
correct type every time you use it; or you have to store several NULLs and
indicators of the type if you plan on storing values in columns of the
proper type, one per row.

A General Ledger has a pretty specific format as do Journals, etc.  I may be
missing the point, but I don't understand what the issue is exactly?  If you
need to store un-typed "metadata" about specific ledgers, maybe you should
look into storing that information as XML?

Show quote
"Johnny" <jmered***@gmail.com> wrote in message
news:1150636975.417715.8470@f6g2000cwb.googlegroups.com...
> Hi,
>
> Basic Problem:
> I have a ledger in an accounting system.  I need to store information
> about the ledger.  The information could be numbers, strings, or dates.
> I need to be able to add any of these value types in the future with
> no trouble (i.e. - the UI should allow addition of items to track, and
> if the user does not populate the information for every ledger, NULL is
> acceptable.)
>
> More Details:
> You all are right, I am not a database design person (please don't snip
> this and say duh!).  Here's the problem I'm trying to solve.  I'm
> trying to come up with a way to store what I will call metadata about
> ledgers in an accounting system.  I don't know up front what type of
> data we will need to track (I know most of it, but things change; for
> example, what happens when legacy systems are replaced (already
> happened twice on my watch and broke a department full of spreadsheets
> and manual processes).)  Anyway, I need a design where the "things" I'm
> storing can be unknown upfront, and furthermore, if the user wants to
> create new things to track, the design should allow this to be possible
> without changes to the tables.  All of the "real world" items like
> Invoices, Employees, Customers, Journal Entries, etc. are in the ledger
> databases.  Said differently, I want the user to be able to add fields
> without a single change to the database.  These pieces of data can be
> strings, dates, numbers, etc.  I don't want to assume I know the value
> type of the piece of data either.
>
> I 100% agree that sometimes things can get too generic.  Maybe what I'm
> trying to do is a pipe dream.  In this case, so far, the more specifc I
> get, the more changes I need to make to the database when I discover
> new requirements.  I don't think this is laziness, I think it is
> reality.  Requirements are almost non-discoverable when you get zero
> budget (I mean money and time).
>
> Does anyone have experience with this sort of problem?  Let me also put
> the question another way:  I'm trying to store lists of things that
> relate to ledgers (entities in the EAV example Celko describes) and I
> don't know ANYTHING about what will be in the lists up front.  Also, I
> don't want lists to map to tables because that means I will need to add
> tables later.  And also there will not be 100,000 lists.  There may be
> more like 50 lists.
>
> What is an alternative to the EAV design if you don't know the specific
> "things" that are getting stored in the table up front?  How do I
> prevent having a bunch of tables and having to add tables when the data
> requirements change (my other ideas led me to this result)?
>
> Thanks,
> Johnny
>
Author
18 Jun 2006 4:44 PM
Johnny
I'm not creating a general ledger here.
Author
19 Jun 2006 1:07 AM
Mike C#
"Johnny" <jmered***@gmail.com> wrote in message
news:1150649075.077038.312450@f6g2000cwb.googlegroups.com...
> I'm not creating a general ledger here.
>

As I undestand the problem you "have a ledger in an accounting system.  I
need to store information about the ledger."  What information are you
trying to store about Ledger #1 that does not apply to Ledger #2 and what
information are you trying to store about Ledger #2 that does not apply to
Ledger #3 exactly?
Author
19 Jun 2006 1:35 AM
Johnny
Ledger #1's code in one accounting system might be "E01T" for example.
Ledger #2's might be "E02T", etc.  Ledger #1's code in another
accounting system might be "00001" and #2's is "00002", etc.  The same
applies for common names, legal names, the currency the ledger's is
stored it, whether or not the currency has a "decimal" in the amount
field (e.g. US Dollars has two places to the right of the decimal for
cents, Japanese Yen has not digits to the right of the decimal place;
the accounting system stores both currencies as 10000: in USD that's
$100.00, in Japanese Yen that is 10,000.)  I need to decorate the
ledgers with this "metadata" for later use in integration projects.

The problem is, as I discover new pieces of data I need to store, I
don't want to have to change tables.  So I don't want to have a table
with "Ledger Codes" and one with "Decimal Codes", etc., etc.

Thanks,
Johnny
Author
19 Jun 2006 1:59 AM
Mike C#
"Johnny" <jmered***@gmail.com> wrote in message
news:1150680955.355312.233540@r2g2000cwb.googlegroups.com...
> Ledger #1's code in one accounting system might be "E01T" for example.
> Ledger #2's might be "E02T", etc.  Ledger #1's code in another
> accounting system might be "00001" and #2's is "00002", etc.  The same
> applies for common names, legal names, the currency the ledger's is
> stored it, whether or not the currency has a "decimal" in the amount
> field (e.g. US Dollars has two places to the right of the decimal for
> cents, Japanese Yen has not digits to the right of the decimal place;
> the accounting system stores both currencies as 10000: in USD that's
> $100.00, in Japanese Yen that is 10,000.)  I need to decorate the
> ledgers with this "metadata" for later use in integration projects.

Does each ledger have a currency type assigned to it, or does each
individual entry in the ledger have its own currency type?  It sounds like
you were probably on the right track in the beginning, before you started
going down the EAV path.  Designing a table that contained columns such as:

Ledger_ID                      -- The ID of the ledger
Accounting_System_ID   -- The accounting system to which this ledger
"metadata" applies
Ledger_Code                  -- The code for this ledger in the accounting
system ("EO1T" or "00001")
Currency_Type               -- Currency Type ("USD", "Japanese Yen", etc.)
....

In such a table each ledger would have a single row for each applicable
accounting system.

> The problem is, as I discover new pieces of data I need to store, I
> don't want to have to change tables.  So I don't want to have a table
> with "Ledger Codes" and one with "Decimal Codes", etc., etc.

This sounds more like a problem with gathering requirements, which should
have been completed way before you even contemplated a database design.
Using EAV as a fallback is often a sign of a failure in the requirements
gathering phase of the process.  If you think about it, even adding a new
Attribute to an EAV style table is going to cause a ripple effect through
your implementation process, since you need to update all applicable
ledgers' (Entities) in the database to include the new Attribute.  You will
also have to update all of the front-end code to use this new Attribute as
well.  EAV doesn't really make life easier; it will make your queries more
complex and slower, and it effectively circumvents many of the checks used
by SQL to prevent data anomalies.  I would recommend putting more emphasis
on the requirements gathering phase of the process to minimize the number of
changes down the road, and then model your database using tried and proven
best practices.  Sit down with the stakeholders in this project and get them
to agree on exactly what "metadata" needs to be stored to bring the project
to completion; then design your database accordingly.
Author
18 Jun 2006 1:12 AM
Tracy McKibben
Johnny wrote:
Show quote
> Hi all,
>
> I have the following tables:
>
> NumberKey
> ID
> Name
>
> NumberValue
> ID
> NumberKey_ID
> Value
>
> The values stored in the NumberValue table could be currency,
> percentage, or number that require 7 digits to the right of the decimal
> place, etc., etc. (This table is not very big, so I'm storing many
> different types of numbers here, and not worrying about the most
> efficient data types).  I am toying with the idea of adding a format
> field to the NumberKey table to use later in the UI to format the Value
> field in NumberValue.  I am envisioning using percentage, currency,
> etc. as string so the UI knows what format function to use to display
> the value in a text box, etc. etc.
>
> Does anyone have any experience with this?  Is this even a good idea?
> Should I let business logic handle this?
>
> Thanks,
> Johnny
>

I suggest picking up a copy of the book "The Art Of SQL"
(http://www.amazon.com/gp/product/0596008945).  You've just provided a
real-world example of a common design flaw talked about in the first
chapter.  :-)

AddThis Social Bookmark Button